Nov 15, 2021 / by Wayne Eckerson Data Strategy Insider

Rethinking the Data Mesh: Apply it Piecemeal

ABSTRACT: The data mesh is gaining popularity as a distributed architecture that mirrors reality. But its practical limitations mean it should be applied piecemeal.

Our clients now ask us about the “data mesh,” an emerging distributed architecture that addresses the shortcomings of the traditional centralized data architectures that power most enterprises today.

For those of us who have been around this field for a while (almost 30 years for me), the term “data mesh” conjures up a déjà vu of past distributed technologies, whose monikers long ago entered the dustbin of analytics history: virtual data warehouse, enterprise information integration, logical data warehouse, data virtualization, and others. Whatever the name, these technologies have all provided a global semantic view of distributed data and a mechanism to federate queries across systems in real time.

To be fair, the data mesh as conceived by Zhamak Dehghani is broader than a traditional data virtualization tool. It embeds the technology within a decentralized ownership model, distributed data development, and federated governance. It also requires a common data platform that abstracts the complexity of building and managing data products and semantic models so ordinary business users can publish data to the mesh.

For companies exploring the architectural possibilities of a data mesh, it’s noteworthy that no one seems to understand it. There is confusion even among people who have read deeply on the subject[1]. And it doesn’t help that Zhamak’s writing style is a bit opaque, which prompts skeptics to wonder if the concept is simply “old wine in new wineskins” and gives vendors license to proclaim that their technology powers it. If nothing else, the data mesh is now a huge marketing bonanza for vendors who want to hitch themselves to a hot industry buzzword.

There are two questions here: 1) How seriously should we take the data mesh, and is there any redeeming value in the concept? 2) What is the nature of the data mesh technology that vendors are hawking today? We’ll tackle the first question in this article and the second next month.

Data Mesh Perspectives

Centralization versus decentralization. The data mesh is a reaction to centralized data organizations, architectures, and governance models that are prone to rigidity and backlogs. To address these shortcomings, the data mesh pivots 180 degrees and embraces a decentralized approach. The data mesh gives business domain owners who oversee an operational system or function complete jurisdiction over data in their domains. With a data mesh, data never moves; anyone who wants the data goes to the domain to get it. That puts the onus on domains to model and publish data in a form that others can use.

Unfortunately, decentralized organizations and architectures foster a hornet’s nest of problems: they proliferate data silos and fragment data, making it impossible for executives to get quick answers to simple questions, like “How many customers do we have?” To me, the data mesh reinforces bad organizational habits and renegade behavior. If you want to modernize your data environment, do you double down on data dysfunction and reinforce data silos? This is akin to paving the cow paths. But that is what the data mesh approach advocates.

Importance of Federation. The only route to enduring value is to balance the imperatives of centralization and decentralization. Central teams deliver standards, scale, and technical expertise, while distributed teams deliver agility, adaptability, and domain knowledge. One without the other creates huge problems. The best way to design data organizations, architectures, and governance approaches is to federate them: align central and local resources and activity so you get the best of both worlds with few drawbacks.

Bottom-up federation. Fortunately, Zhamak recognizes this, at least partially, and builds federation into the data mesh concept. For example, each domain needs a product owner who must understand the needs of enterprise users; the self-service platform is built centrally (presumably) to help domain owners manage and publish their own data; a central team (presumably) needs to build cross-functional data products that no single domain can or will build; and perhaps most importantly, cross-functional teams of product owners and developers must convene to hammer out global models, governance standards, and application interfaces.

Top-down federation. This sounds a lot like what organizations with centralized data architectures do. The best ones align central and local resources by assigning technical specialists to data domains, establishing cross-functional governance boards, and scheduling regular strategic planning meetings with department leaders. In addition, central data teams build platforms and tools that empower business units to meet their own data and analytics needs, and they deliver cross-functional applications that no department has the time, money, or interest to build.

In the end, it doesn’t matter whether you federate from below or from the top as long as you do! While many centralized data teams have learned to federate best practices, standards, and development, the reality is that some departments (e.g., finance, sales, marketing) have their own data engineers and data analysts and don’t want to give them up. But with the right incentives, these departments can be coaxed to outsource costly IT-related tasks to a central team without forfeiting local control. One such task is maintaining brittle data systems that are managed by part-time data engineers and provide only a local view of data.

Data Mesh Drawbacks. Smarter people than me have evaluated the data mesh in depth and found shortcomings. Many of the drawbacks stem from the lack of expertise and interest within departments to publish and manage data on behalf of themselves and others. Here is a sampling of the 30+ areas of concern mentioned by James Serra and followers of his blog. Serra is a well-known data architect who was a data evangelist for Microsoft and is now a data platform architecture lead at EY.

Some Challenges Facing Data Mesh Implementations

Domains will only think of their own data product and not how to work with other products…
Each domain could be using different technology, some of which could be obscure, and may not have the experience to pick the right technology.
Convincing operational dev teams that they need to do their own data quality, stewardship, metadata and master data management, on top of their main day-to-day activities.
Getting product owners to prioritize data management tasks over customer-focused user stories on any sprint.
Some domains may not want to deal with data and may prefer to focus on what they are good at (i.e., serving their customers).

Our Recommendation: Apply the Data Mesh Piecemeal

The data mesh is an interesting concept, and I’m glad it’s being vigorously debated. Since many companies have an abundance of data silos, I can see how the approach has widespread appeal. But there are drawbacks.

First, most domains don’t have the time or resources to manage data for themselves, not to mention others. Second, the technology to implement the data mesh, namely the self-service data platform, doesn’t exist yet, although there are promising developments, such as no-code data pipelining tools and data exchanges that make it easy to share data. Finally, the coordination costs of implementing global semantics, governance, and interfaces are daunting.

In the final analysis, the data mesh works if you implement it piecemeal, not as an enterprise architecture. The data mesh works for data domains with their own data engineers who can manage and publish their own data, especially if the central team develops a mesh-like self-service data platform that makes it easy for domains to model and manage data sets. But most domains will continue to rely on corporate IT to service their data requests and build complex analytic applications. It is much more realistic to support these groups using a traditional centralized ownership model and architecture.
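For readers who think in code, the piecemeal rule above can be sketched roughly as follows. This is only an illustration under my own assumptions: the `Domain` fields, the `Ownership` labels, and the example departments are hypothetical names invented for the sketch, not part of any data mesh specification or product.

```python
from dataclasses import dataclass
from enum import Enum


class Ownership(Enum):
    DOMAIN = "domain-owned (mesh)"
    CENTRAL = "centrally owned"


@dataclass(frozen=True)
class Domain:
    name: str
    has_data_engineers: bool  # assumption: the domain staffs its own data engineers
    willing_to_publish: bool  # assumption: it will manage and publish data for others


def assign_ownership(domain: Domain) -> Ownership:
    """Apply the mesh only where a domain can and will own its data."""
    if domain.has_data_engineers and domain.willing_to_publish:
        return Ownership.DOMAIN
    # Everyone else stays on the traditional centralized model.
    return Ownership.CENTRAL


def plan(domains: list[Domain]) -> dict[str, Ownership]:
    """Map each domain name to its ownership model."""
    return {d.name: assign_ownership(d) for d in domains}


if __name__ == "__main__":
    # Illustrative departments only; the flags are made up for the example.
    example = [
        Domain("finance", has_data_engineers=True, willing_to_publish=True),
        Domain("hr", has_data_engineers=False, willing_to_publish=False),
        Domain("marketing", has_data_engineers=True, willing_to_publish=False),
    ]
    for name, model in plan(example).items():
        print(f"{name}: {model.value}")
```

The point of the sketch is that the decision is made domain by domain rather than once for the whole enterprise; a real assessment would weigh far more than two yes/no flags.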

[1]: “Data Mesh: Centralized Ownership vs Decentralized Ownership” by James Serra, July 23, 2021.

Wayne Eckerson

Wayne Eckerson is an internationally recognized thought leader in the business intelligence and analytics field. He is a sought-after consultant and noted speaker who thinks critically, writes clearly and presents...
