How to Develop Data Management Architecture

Nearly every organization today is facing the need to rethink and refresh its data management architecture. Data and technology advances of the past decade bring new opportunities and new complexities to data management, yet most organizations continue to work with turn-of-the-century architecture from the BI era. Patching new components onto the surface of obsolete architecture—a band-aid and duct tape approach—is not sustainable and won’t readily adapt to changes yet to come. Still, many avoid stepping up to modern data management architecture because it is complex and difficult. My goal with this article is to provide guidance that helps to manage the complexities and minimize the difficulties. 

Start with Business Capabilities

The first responsibility of data management is to enable the business to do what it needs to do to get maximum value from its data. Defining data management architecture doesn’t begin with data, or even with goals like “cloud first” or “streaming first.” Those technical goals must be subordinate to business goals. Begin by working with business stakeholders to develop a list of data-dependent business capabilities. Make them tangible by identifying the kinds of data deliverables that enable those capabilities. Start with the reference list shown in the table below, then refine and customize it to represent the needs of your business.

| Business Capability | Enabled With |
|---|---|
| Inform about … | scheduled reports, ad hoc reports, self-service reporting |
| Inquire about … | managed query, ad hoc query |
| Analyze the behavior of … | OLAP, self-service analytics |
| Track … against goals | scorecards |
| Monitor current state of … | dashboards |
| Send/receive alerts about … | event monitoring, automated messaging |
| Examine alternatives for … | analytic models |
| Simulate the behavior of … | simulation models |
| Explore patterns and trends of … | data mining models |
| Discover hidden insights of … | data mining models |
| Predict the future state of … | predictive models |
| Recommend decisions for … | prescriptive models |
| Automate decisions for … | prescriptive models, AI/ML models |

Refine and customize by brainstorming to add capabilities not shown here, to change terminology to match language common to your organization, and to remove any capabilities that you don’t need now and don’t anticipate as future needs.
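One way to keep the customized list manageable is to hold it as a small data structure rather than as a document. The sketch below is only an illustration: the dictionary transcribes the reference table, while the `customize` helper and the three adjustments passed to it (one added, one renamed, one removed capability) are hypothetical examples and not recommendations.

```python
# Reference list transcribed from the table above:
# business capability -> data deliverables that enable it.
REFERENCE_CAPABILITIES = {
    "Inform about …": ["scheduled reports", "ad hoc reports", "self-service reporting"],
    "Inquire about …": ["managed query", "ad hoc query"],
    "Analyze the behavior of …": ["OLAP", "self-service analytics"],
    "Track … against goals": ["scorecards"],
    "Monitor current state of …": ["dashboards"],
    "Send/receive alerts about …": ["event monitoring", "automated messaging"],
    "Examine alternatives for …": ["analytic models"],
    "Simulate the behavior of …": ["simulation models"],
    "Explore patterns and trends of …": ["data mining models"],
    "Discover hidden insights of …": ["data mining models"],
    "Predict the future state of …": ["predictive models"],
    "Recommend decisions for …": ["prescriptive models"],
    "Automate decisions for …": ["prescriptive models", "AI/ML models"],
}


def customize(reference, add=None, rename=None, remove=()):
    """Return a customized copy; the reference list itself is not modified."""
    rename = rename or {}
    customized = {}
    for capability, deliverables in reference.items():
        if capability in remove:
            continue  # capability not needed now or in the foreseeable future
        customized[rename.get(capability, capability)] = list(deliverables)
    for capability, deliverables in (add or {}).items():
        customized[capability] = list(deliverables)
    return customized


# All three adjustments below are hypothetical, for illustration only.
our_capabilities = customize(
    REFERENCE_CAPABILITIES,
    add={"Comply with regulations about …": ["audit reports"]},
    rename={"Track … against goals": "Measure … against targets"},
    remove={"Simulate the behavior of …"},
)

for capability, deliverables in our_capabilities.items():
    print(f"{capability}: {', '.join(deliverables)}")
```

Keeping the reference list untouched and deriving your own copy from it makes it easy to see later which capabilities you added, renamed, or dropped.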

Explore Business Requirements

Good architecture is a tool that helps you to meet business requirements. It is impractical to undertake exhaustive and detailed business requirements analysis as part of architectural definition. You’ll get bogged down in requirements details and find it difficult to get back to working on architecture. Instead, work with representative groups of users to collect a few sample requirements for each business capability. For example:

Inquire about order status.

Inquire about employee compensation.

Analyze the behavior of marketing campaigns.

Track customer loyalty programs against goals.

Simulate the behavior of P&L for a new product launch at various price points.

Recommend decisions for discount offers to customers.

Automate decisions for the next best upsell offer to customers.

Collecting example requirements in this way helps to validate and refine the list of business capabilities. Example requirements are also used later to test the architecture.
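Because each capability is phrased as a template with a “…” placeholder, sample requirements can be grouped under capabilities mechanically, which quickly shows capabilities that still lack examples and requirements that fit no capability. The following is a minimal sketch under that assumption. The helper names `template_to_regex` and `group_requirements` are hypothetical, the capability list is a subset of the reference list, and the requirement wording is modeled on the examples above.

```python
import re

CAPABILITIES = [
    "Inform about …",
    "Inquire about …",
    "Analyze the behavior of …",
    "Track … against goals",
    "Simulate the behavior of …",
    "Recommend decisions for …",
    "Automate decisions for …",
]

SAMPLE_REQUIREMENTS = [
    "Inquire about order status.",
    "Inquire about employee compensation.",
    "Analyze the behavior of marketing campaigns.",
    "Track customer loyalty programs against goals.",
    "Simulate the behavior of P&L for a new product launch at various price points.",
    "Recommend decisions for discount offers to customers.",
    "Automate decisions for the next best upsell offer to customers.",
]


def template_to_regex(capability):
    """Turn a capability template into a regex that captures the '…' subject."""
    head, _, tail = capability.partition("…")
    pattern = (
        re.escape(head.strip()) + r"\s+(.+?)\s*" + re.escape(tail.strip()) + r"\.?$"
    )
    return re.compile(pattern, re.IGNORECASE)


def group_requirements(requirements, capabilities):
    """Group requirement subjects by capability; also return unmatched requirements."""
    grouped = {capability: [] for capability in capabilities}
    unmatched = []
    for requirement in requirements:
        for capability in capabilities:
            match = template_to_regex(capability).match(requirement)
            if match:
                grouped[capability].append(match.group(1))
                break
        else:
            unmatched.append(requirement)  # may signal a missing capability
    return grouped, unmatched


grouped, unmatched = group_requirements(SAMPLE_REQUIREMENTS, CAPABILITIES)
without_examples = [c for c, subjects in grouped.items() if not subjects]

print("Capabilities still lacking sample requirements:", without_examples)
print("Requirements that fit no capability:", unmatched)
```

A capability with no sample requirements is a candidate for removal or for another conversation with users, and an unmatched requirement suggests a capability that is missing from the list.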

Itemize Data Capabilities

Although they are not the place to begin, technical capabilities such as being cloud capable are an important part of architectural definition. Work with technical stakeholders to develop the list of essential data capabilities. For each capability, identify the architectural components needed to support it. Start with the reference list shown in the table below, then refine and customize it to represent the needs of your organization.

| Data Capability | Enabled With |
|---|---|
| Support all data use cases | data consumption layer of architecture |
| Support all data latencies | batch data capture & ingestion, changed data capture (CDC), data stream processing |
| Support hybrid data ecosystem | microservices architecture, containerization, cross-platform orchestration |
| Sustain legacy data warehouse value | legacy warehouse ingestion into data lake |
| Easy access for all data consumers | data access layer of architecture, data catalog |
| Work with all data types | data source layer of architecture, data source connectors, SQL and NoSQL databases |
| Scalable and elastic | cloud platforms |
| Smart and agile data pipelines | data fabric & pipeline automation technology, DataOps tools and techniques |

This reference table illustrates examples of needed data capabilities. You are certain to have new and different needs from those listed here. Two good resources to help you brainstorm data capabilities are Wayne Eckerson’s articles “Ten Characteristics of a Modern Data Architecture” and “Ten Things Companies Want from a Modern Data Architecture.”

Adapt a Reference Architecture

Now that you’ve expressed architectural requirements as business capabilities and data capabilities, it is time to create a diagram that visually represents the architecture. This can be quite an intimidating task if you start with a blank page. A better approach is to work from a reference architecture and adapt it to support your list of needed capabilities. A reference architecture is a template that represents best practices and provides a starting place for architectural definition. A quick web search finds many reference data architectures. Remember that the focus here is data management architecture, so be sure that the reference architecture you choose represents data management best practices. Of course, I recommend Eckerson Group’s reference data management architecture (see Figure 1).

Figure 1 – Eckerson Group Reference Data Management Architecture

There is a lot of detail in this reference architecture. Stay tuned because I’ll write future blogs that describe the various layers and components in greater detail.

Remember that a reference architecture is a template, a starting place from which you’ll adapt to create the architecture that best matches your organization’s needs. As you prepare to adapt, I suggest reading (or rereading) my article “Modernizing Data Management Architecture.” Then adapt by mapping architecture components to your lists of business and data capabilities. Remove any components that you don’t need and add any that are missing. Adjust terminology to match the language used in your organization.
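The mapping step amounts to comparing two sets: the components in the reference architecture and the components your capability lists call for. Here is a minimal sketch of that comparison. The `adapt` helper is hypothetical, the capability entries are a small subset of the reference tables, and `REFERENCE_COMPONENTS` is an invented stand-in, not a transcription of Figure 1.

```python
# Components called for by a few capabilities (subset of the reference tables).
REQUIRED_BY_CAPABILITY = {
    "Support all data latencies": {
        "batch data capture & ingestion",
        "changed data capture (CDC)",
        "data stream processing",
    },
    "Easy access for all data consumers": {"data access layer", "data catalog"},
    "Monitor current state of …": {"dashboards"},
}

# Hypothetical stand-in for the component names of a reference architecture.
REFERENCE_COMPONENTS = {
    "batch data capture & ingestion",
    "data stream processing",
    "data access layer",
    "data catalog",
    "dashboards",
    "simulation models",
}


def adapt(reference_components, required_by_capability):
    """Compare reference components with those the capability lists require."""
    required = set().union(*required_by_capability.values())
    return {
        "keep": sorted(reference_components & required),
        "remove": sorted(reference_components - required),  # candidates only
        "add": sorted(required - reference_components),
    }


plan = adapt(REFERENCE_COMPONENTS, REQUIRED_BY_CAPABILITY)
for action, components in plan.items():
    print(f"{action}: {components}")
```

Treat the "remove" list as candidates for review rather than automatic deletions, since a component may support a capability you have not yet written down.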

Finally, revisit your collection of example business requirements. Walk each example through the architecture to test that the data, the processing, and the use case are all supported by the architecture. Continue to adjust the architecture iteratively until all of the example business requirements are supported without compromising data capabilities such as low latency, large data volumes, and high throughput.
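The walk-through can be treated as a simple repeatable test. The sketch below assumes each example requirement is annotated with the data it depends on, the processing it needs, and the use case that serves it. The `Requirement` class, the `walk_through` and `find_gaps` helpers, and every source and component assignment are hypothetical. It checks only functional coverage, so capabilities such as latency and throughput still need separate review.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirement:
    text: str
    data: str        # data source the requirement depends on
    processing: str  # capture or processing style it needs
    use_case: str    # deliverable that serves it


# Hypothetical draft architecture, reduced to three layers.
architecture = {
    "data": {"order management system", "CRM"},
    "processing": {"batch data capture & ingestion"},
    "use_case": {"managed query", "scorecards"},
}

requirements = [
    Requirement("Inquire about order status.",
                "order management system", "changed data capture (CDC)", "managed query"),
    Requirement("Track customer loyalty programs against goals.",
                "CRM", "batch data capture & ingestion", "scorecards"),
]


def walk_through(requirement, architecture):
    """Return the layers where the architecture lacks what the requirement needs."""
    return [
        layer
        for layer in ("data", "processing", "use_case")
        if getattr(requirement, layer) not in architecture[layer]
    ]


def find_gaps(requirements, architecture):
    """Map each unsupported requirement to the layers that fail it."""
    gaps = {}
    for requirement in requirements:
        missing = walk_through(requirement, architecture)
        if missing:
            gaps[requirement.text] = missing
    return gaps


gaps_before = find_gaps(requirements, architecture)
print("First pass:", gaps_before)

# Adjust the architecture, then test again.
architecture["processing"].add("changed data capture (CDC)")
gaps_after = find_gaps(requirements, architecture)
print("Second pass:", gaps_after)
```

Repeating the check after each adjustment mirrors the iterative refinement described above: the architecture is ready for review when no example requirement reports a gap.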

Help to Grow Architecture Discipline

As you put this process to work, please provide feedback. Let me know what works well for you and what you may have done differently. Share your lists of business capabilities and data capabilities so I can continue to evolve the reference lists. Describe the what and why of your changes to the reference architecture. With your feedback, we can crowdsource to grow the discipline of modern data management architecture. 


Dave Wells

Dave Wells is an advisory consultant, educator, and industry analyst dedicated to building meaningful connections throughout the path from data to business value. He works at the intersection of information...
