Ten Things Companies Want from a Modern Data Architecture
Our most popular article from 2018, “Ten Characteristics of a Modern Data Architecture”, defined data architecture and explained that it’s more than just a data warehouse and different from a data platform. It described a data architecture as “a living, breathing organism that detects and responds to changes, continuously learns and adapts, and provides governed, tailored access to every individual.” It then defined ten characteristics: customer-centric, adaptable, automated, smart, flexible, collaborative, governed, simple, elastic, secure, and (as a bonus) resilient.
These ten features are shaped by the intersection of customer demand and vendor capabilities. They represent an ideal state, something to shoot for when designing a modern data architecture. At Eckerson Group, we help a lot of organizations create data architectures to support current and future business needs. Along the way, we’ve noticed common themes for what drives customers to modernize a data architecture and what they want it to look like.
Fear and Opportunity
There are two main drivers behind the push to implement a modern data architecture: fear and opportunity. Of the two, fear is the more compelling and the one more likely to secure funding for a data team that is pitching a new data strategy and architecture. Fear drives most business executives to do something about data, while opportunity motivates most data leaders to push the organization into the modern data world.
Fear. Business executives fear being “Uber’d” or “Amazon’ed”: blindsided by a digital upstart that quickly gains market share and mind share with a digital and data strategy that is hard to replicate. They also fear being surpassed by an industry competitor whose new digital or data strategy gives it market momentum. And they are terrified of getting hacked, which would expose sensitive customer information and corporate secrets, damage their hard-won reputation, and invite regulatory fines.
Opportunity. Data leaders, on the other hand, see a whole host of reasons to modernize a data architecture. They see the power of new data and analytics technology to build flexible, automated supply chains, implement self-service analytics, and design predictive models that streamline business processes, improve customer service, and boost financial results. They also see the opportunity to retire less-than-optimal data warehouses that were never properly designed and cost too much for the value they deliver.
Company “Wants”
When it comes to defining a future data state, the data leaders we work with have a common litany of requirements. None of them are outrageously novel or revolutionary—most can be lifted from strategy documents from 2005 or earlier. But this time, they believe the requirements are no longer pipe dreams, thanks to a flood of new data and analytics technology that has dramatically reshaped the industry.
To be clear, I’m not talking about leading-edge organizations with legions of hard-core developers who eagerly embrace novel technologies. I’m referring to mainstream data teams that are conservative by nature and eager to follow the crowd—or at least not get left behind it. These folks cultivated their data expertise building data warehouses and OLAP cubes using graphical ETL and BI tools. These are business-facing developers who embrace technology to meet business needs first and foremost.
Here is what these types of data leaders and teams want:
A bridge. They want a bridge from the old world to the new. They want to leverage their existing skills and mental frameworks (i.e., BI/ETL/DW) while gaining the benefits of modern data technology (e.g., virtually unlimited scalability, elasticity, multi-structured data, managed services).
Cloud-first. They have a “cloud-first” strategy, but often hedge, saying, “only if it makes sense.” Few companies these days want to run their own data center, even if it is sometimes cheaper and more secure. Innovation happens in the cloud, not on premises, and the cloud is often more secure than on-premises data centers.
Streaming-first. They also want a “streaming first” architecture to give business users fresher data and address shrinking batch windows. Adding sensor and log data adds urgency to this requirement. The goal is to stream everything and batch events as needed for less time-sensitive jobs.
A data lake. Data leaders like the idea of a holding tank for all types of raw analytic data that can be repurposed across multiple data pipelines, including a data warehouse. In turn, the data pipelines can feed refined data (e.g., dimensions, master data, metrics) into the data lake for reuse.
Self-service. They want business users to “converse with data” and create their own reports without IT assistance. This requires granular permissions that give different types of users access to different parts of the data architecture. (See “A Reference Architecture for Self-Service Analytics.”)
Data science. Data leaders want to move into the world of “predictive analytics” even though they admit they don’t have a clue what they would use it for or how they would create a data science team. They know it’s time to move beyond the reactive world of reporting and dashboarding to the proactive world of machine learning.
Scalability. They want a data architecture that holds all their data—relational, semi-structured, unstructured, external—at both summary and granular levels where it can be queried directly at speed by large volumes of concurrent users.
Best of breed. Currently, data leaders seek best-of-breed tools for key architectural components. As the industry matures, they will be more inclined to purchase all-in-one data analytics platforms from a single vendor.
Fast time to value. Data leaders like the cloud because they don’t have to wait to purchase, configure, install, and tune systems and networks. They also like free trials and extended proofs of concept so they can try before they buy.
Portability. Data leaders fear getting locked into a vendor’s platform and losing control over pricing and vendor responsiveness. Most want to play one cloud vendor off another in a hybrid (i.e., multi-cloud and on-premises) strategy.
BONUS: Predictable pricing. Most business and data leaders are OK with subscription pricing, but not with usage-based pricing that is difficult to predict or core-based pricing that could cause steep bumps in fees.
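To make the predictability concern in the pricing item concrete, here is a minimal Python sketch that compares a flat subscription with usage-based billing over six months. Every number and name in it (the monthly fee, the per-unit rate, the usage figures, and the helper functions) is a hypothetical assumption for illustration, not data from any vendor or client.

```python
from typing import Dict, List


def subscription_bills(monthly_fee: int, months: int) -> List[int]:
    """Flat subscription: the same fee every month (hypothetical model)."""
    return [monthly_fee] * months


def usage_bills(rate_per_unit: int, units_per_month: List[int]) -> List[int]:
    """Usage-based pricing: each bill tracks that month's consumption."""
    return [rate_per_unit * units for units in units_per_month]


def summarize(bills: List[int]) -> Dict[str, int]:
    """Total spend plus the gap between the cheapest and priciest month."""
    return {
        "total": sum(bills),
        "min": min(bills),
        "max": max(bills),
        "spread": max(bills) - min(bills),
    }


if __name__ == "__main__":
    # Hypothetical consumption with one spike in month four.
    units = [800, 1200, 950, 3000, 1100, 700]
    print("subscription:", summarize(subscription_bills(1500, len(units))))
    print("usage-based: ", summarize(usage_bills(1, units)))
```

In this made-up scenario the usage-based plan costs less in total, yet its monthly bills swing widely while the subscription never moves. That swing, rather than the total, is what makes such plans hard to budget for.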
Although this list is not comprehensive, it aligns with our experience working with many companies on data architectures during the past several years. Anyone who works with companies to build modern data architectures should heed these ten requirements.
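As an illustration of the streaming-first item in the list above (“stream everything and batch events as needed”), the following minimal Python sketch hands each event to a real-time consumer as it arrives and also groups events into batches for less time-sensitive jobs. The function name, callback parameters, and event shape are hypothetical; a production system would typically rely on a streaming platform rather than an in-process loop.

```python
from typing import Callable, Dict, Iterable, List

Event = Dict[str, int]  # hypothetical event shape, e.g. {"id": 7}


def stream_and_batch(
    events: Iterable[Event],
    on_event: Callable[[Event], None],
    on_batch: Callable[[List[Event]], None],
    batch_size: int,
) -> None:
    """Deliver every event immediately, and again in fixed-size batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    pending: List[Event] = []
    for event in events:
        on_event(event)          # time-sensitive consumers see it right away
        pending.append(event)
        if len(pending) == batch_size:
            on_batch(pending)    # less urgent jobs receive a full batch
            pending = []         # start a fresh list; the old one is handed off
    if pending:
        on_batch(pending)        # flush the final, partially filled batch


if __name__ == "__main__":
    fresh: List[Event] = []
    batches: List[List[Event]] = []
    stream_and_batch(({"id": i} for i in range(5)), fresh.append, batches.append, 2)
    print(len(fresh), "events streamed;", len(batches), "batches delivered")
```

The design choice worth noting is that the stream is the single source: batches are derived from it, rather than maintained as a separate pipeline.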