Guidelines and designs that specify how data is ingested, integrated, stored, accessed and consumed by business users, models, applications, and systems.
Architecture defines the roles, structures, relationships, and rules by which a collection of components constitutes a cohesive whole – the glue that bonds individual parts into a system. Architecture is an early-stage design activity that precedes detailed design, specification, and construction.
Today’s typical data management architecture is built on a foundation of 1990s principles (relational databases, data warehousing, batch ETL, data latency, etc.) that don’t address more recent developments such as big data and NoSQL. Since the emergence of big data technologies, most organizations have patched new concepts onto the surface of the old architecture, and they continue to add patches in a way that makes the architecture increasingly fragile. Data warehouses and data lakes have become the new data silos, and connecting the dots among them is especially difficult. The time has come to step back and rebuild data management architecture from the ground up.
To most people, a data architecture is the standard set of products and tools an organization uses to manage data. But it is much more than that. A data architecture defines the processes to capture, transform, and deliver usable data to business users. Most importantly, it identifies the people who will consume that data and their unique requirements. A good data architecture flows right to left: from data consumers to data sources, not the other way around.
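The right-to-left idea can be illustrated with a small sketch that starts from consumer requirements and works backward to the sources that must satisfy them. Everything here is hypothetical and chosen only for illustration: the `Consumer` fields, the dataset names, the `UPSTREAM` lineage map, and the use of a latency budget as the "unique requirement" being propagated.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Consumer:
    name: str
    dataset: str              # deliverable this consumer needs (hypothetical name)
    max_latency_minutes: int  # how stale the data is allowed to be


# Hypothetical lineage: each dataset maps to its direct upstream inputs.
# A name that never appears as a key is treated as a raw data source.
UPSTREAM = {
    "sales_dashboard": ["orders_clean", "customers_clean"],
    "churn_features": ["customers_clean", "support_tickets_clean"],
    "orders_clean": ["orders_db"],
    "customers_clean": ["crm_export"],
    "support_tickets_clean": ["helpdesk_api"],
}


def source_requirements(consumers):
    """Walk from each consumer back to raw sources.

    Returns a dict mapping each source to the tightest latency
    required by any consumer that depends on it.
    """
    requirements = {}
    for consumer in consumers:
        stack = [consumer.dataset]
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            parents = UPSTREAM.get(node)
            if parents is None:
                # No upstream inputs: this is a raw source.
                current = requirements.get(node)
                if current is None or consumer.max_latency_minutes < current:
                    requirements[node] = consumer.max_latency_minutes
            else:
                stack.extend(parents)
    return requirements


consumers = [
    Consumer("finance analysts", "sales_dashboard", 60),
    Consumer("churn model", "churn_features", 1440),
]
reqs = source_requirements(consumers)
for source, latency in sorted(reqs.items()):
    print(f"{source}: refresh at least every {latency} minutes")
```

In this toy example the CRM export feeds both consumers, so it inherits the stricter of the two latency budgets. That is the practical point of designing from consumers to sources: the requirements placed on each source are derived from who uses the data, rather than from whatever the source happens to provide.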