Data Products: Part of a Data Mesh Initiative or a Stand-Alone Strategy
ABSTRACT: The data mesh principle of data as a product can help you get more value from data, as part of a data mesh initiative or as a stand-alone strategy.
Sponsored by Starburst (blog 2 of 3)
Read the first blog in this series: Best Practices For Developing And Scaling Data Products
Despite constant innovations in data architecture, infrastructure, and analytics, most organizations today still struggle to realize the promised value of data. While this state of perpetual change introduces many new possibilities, it also creates a moving target for achieving the business outcomes we strive for: improving decisions and actions, uncovering potential problems, and creating new opportunities.
Data mesh, conceived by Zhamak Dehghani, offers a new paradigm for conquering the ever-changing demands and opportunities of modern data. It rejects long-standing centralized data architectures such as the data lake and the data warehouse and their centralized teams. Instead, data mesh aims for flexibility and responsiveness by decentralizing data and distributing responsibility for it.
A key pillar of data mesh is the data product, which is a reusable data asset designed for a particular use and delivered according to agreed-upon standards and schedules. In this article, we’ll explore how data products can help organizations get more value from their data, either as part of a data mesh initiative or as a stand-alone strategy.
The Four Pillars of Data Mesh
The data mesh approach centers on four pillars that encompass the organizational and technical changes it espouses.
Domain ownership. Business domain teams handle data quality, sharing, and analytics for the data they know best.
Data as a product. Domain teams package data into useful data products. These must be accessible, usable, secure, and trustworthy.
Self-service data infrastructure. A data mesh platform streamlines data development and data sharing to empower both data providers and users.
Federated computational governance. Stakeholders share decision-making and accountability for data. Domain teams apply governance standards and policies by building computational governance—i.e., automated controls and observability—into their data products.
Addressing all four pillars of data mesh involves some significant organizational changes. Distributing responsibility for data to business domain teams can provide flexibility and increased responsiveness to stakeholder needs. But not every organization can or should commit to data mesh’s realignment of people, processes, and technology. You need sufficient data volume and variety, a sizable staff of skilled resources, and the business needs and opportunities to justify the effort of implementing data mesh. However, there is value in treating data as a product even if your organization is not ready to fully commit to data mesh’s organizational changes.
The Role of Data Products in Data Mesh
Proper Mindset. Beneath the notion of a data product is the foundational concept of data as a product. The former is a deliverable, the latter is a mindset. It’s important to adopt this mindset before creating the deliverable. The data-as-a-product mindset focuses on what stakeholders need from data to recognize opportunities, spot problems, plan strategies, and otherwise contribute to the mission of their organization. It applies product thinking and product management techniques to data.
Many organizations are familiar with product management concepts because they apply them to developing the functional applications that drive their business. They have well-established ways of learning what functionality stakeholders need, validating their designs with users, building applications, and refining them over time.
Data Products Without Data Mesh
Many data leaders see the benefits of applying product thinking to data. Developing data products in close collaboration with stakeholders helps them better understand business needs and deliver data solutions with proven value through iterative refinement. While data mesh is not the right approach for everyone, even a small, centralized data team can better serve their stakeholders by applying product thinking to their deliverables.
Challenges Common to Implementing Data Products
Implementing data products as part of a data mesh initiative or as a stand-alone strategy comes with some challenges that you should be prepared for. In this section we’ll explore three key challenges and crucial practices that address them.
Changes to people, process, and technology. Managing data like a product changes the roles and responsibilities of team members, the way you develop data solutions, and the tools you use. Adapting to changes like these can present challenges for many organizations.
Assembling the right skills. With new roles, processes, and tools comes the need for new skills, especially product management expertise. Growing and acquiring the right skills for developing and managing data products creates a set of hurdles to overcome.
Avoiding data chaos. Unmanaged proliferation of data products is a serious risk: creating too many products too fast, many of them duplicative and inconsistent, leads to debilitating chaos.
The diagram below illustrates how three important practices—starting small, creating a data product platform, and data product governance—work together to address these common challenges. The practices are important whether you’re starting a full data mesh initiative or implementing data products on their own.
Practices for Addressing Data Product Challenges
In our last blog, Best Practices for Developing and Scaling Data Products, we discussed the importance of starting with one or two small projects, each with well-defined scope and well-understood business value. By starting small you limit the amount of change to people, process, and technology that your organization must absorb. This also enables your existing data team members to practice operating differently, especially elevating the degree of collaboration and iterative refinement that data products require. Starting small helps avoid the chaos of undocumented and unmanaged data products. A few fully developed products are better than many half-baked ones that end up cluttering the data landscape.
Data Product Platform
A data product platform that makes it easy for consumers to discover, understand, access, and trust data products both addresses the skills gap and helps avoid the chaos of too many products. Sophisticated search capabilities, together with automated publishing controls that ensure ample product documentation, data and query samples, and quality indicators, enable data consumers to find what they need. Otherwise they ask for a new product to be developed, which consumes data team resources and creates duplicative detritus that clogs up the works.
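To make the discovery idea concrete, here is a minimal Python sketch of a catalog search that a consumer might run before requesting a new product. The catalog structure, field names, and the search_catalog function are hypothetical, and simple term matching stands in for the far richer search a real platform would provide.

```python
def search_catalog(catalog, query):
    """Rank catalog entries by how many query terms appear in their metadata."""
    terms = query.lower().split()
    scored = []
    for entry in catalog:
        # Search across name, description, and tags (all illustrative fields).
        text = " ".join([entry["name"], entry["description"], *entry["tags"]]).lower()
        score = sum(term in text for term in terms)
        if score:
            scored.append((score, entry["name"]))
    # Highest score first; ties broken alphabetically for stable output.
    return [name for score, name in sorted(scored, key=lambda s: (-s[0], s[1]))]


catalog = [
    {"name": "customer_orders", "description": "Daily order lines per customer", "tags": ["sales"]},
    {"name": "web_sessions", "description": "Clickstream sessions", "tags": ["marketing"]},
    {"name": "customer_churn", "description": "Monthly churn scores per customer", "tags": ["sales", "ml"]},
]
print(search_catalog(catalog, "customer churn"))
# prints ['customer_churn', 'customer_orders']
```

A consumer who sees customer_churn at the top of the results can evaluate the existing product instead of commissioning a near-duplicate.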
A data product platform also removes overhead for product developers by accelerating their continuous build-test-deploy cycles. It lets developers focus on working with stakeholders rather than the mechanics of releasing data product revisions. Thus, if your data team is small, the platform makes it mightier.
Governing data products involves defining a set of standards that all products must meet and then enforcing them. These common standards describe a minimum viable data product (MVDP). Minimum viability defines what’s good enough for stakeholders to start using a data product even if it doesn’t yet have all the requested features. MVDP standards can help ease the impact of organizational change by helping stakeholders understand and trust the new approach.
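As an illustration of how MVDP standards could be checked automatically before a product is published, consider the Python sketch below. The DataProduct fields, the specific rules, and the thresholds are assumptions chosen for the example, not a prescribed standard; each organization would define its own.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class DataProduct:
    """Hypothetical descriptor for a data product; field names are illustrative."""
    name: str
    owner: str = ""
    description: str = ""
    sample_queries: List[str] = field(default_factory=list)
    quality_score: Optional[float] = None  # assumed scale: 0.0 to 1.0
    refresh_schedule: str = ""


def mvdp_violations(product: DataProduct, min_quality: float = 0.9) -> List[str]:
    """Return the unmet minimum standards; an empty list means publishable."""
    violations = []
    if not product.owner:
        violations.append("missing owner")
    if len(product.description.split()) < 10:
        violations.append("description shorter than 10 words")
    if not product.sample_queries:
        violations.append("no sample queries")
    if product.quality_score is None or product.quality_score < min_quality:
        violations.append("quality score missing or below threshold")
    if not product.refresh_schedule:
        violations.append("no refresh schedule")
    return violations


draft = DataProduct(name="customer_orders", owner="sales-domain")
for problem in mvdp_violations(draft):
    print(problem)
```

Returning every violation at once, rather than failing on the first, gives the product developer a complete to-do list for reaching minimum viability.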
Automated enforcement of governance standards and policies is critical for staying ahead of and managing data chaos while also getting out of the way of innovation. For example, an automated process that identifies and tags sensitive data, combined with attribute-based access controls applied across your data environment, enables data consumers to use products with confidence while mitigating risks of breaches and compliance issues.
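The tagging-plus-access-control example can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the name patterns, the tag labels, and the rule that a user must hold every tag on a column to see it. A production classifier would typically inspect data values as well as column names, and enforcement would live in the query engine or platform rather than in application code.

```python
import re

# Hypothetical patterns for spotting sensitive columns by name.
SENSITIVE_PATTERNS = {
    "pii": re.compile(r"(email|phone|ssn|birth)", re.IGNORECASE),
    "financial": re.compile(r"(salary|card_number|iban)", re.IGNORECASE),
}


def tag_columns(columns):
    """Map each column name to the set of sensitivity tags it matches."""
    return {
        col: {tag for tag, pat in SENSITIVE_PATTERNS.items() if pat.search(col)}
        for col in columns
    }


def visible_columns(column_tags, user_clearances):
    """Attribute-based check: a column is visible only if the user holds
    every tag attached to it. Untagged columns are visible to everyone."""
    return [col for col, tags in column_tags.items() if tags <= user_clearances]


tags = tag_columns(["order_id", "customer_email", "card_number", "order_total"])
print(visible_columns(tags, {"pii"}))
# prints ['order_id', 'customer_email', 'order_total']
```

Because the tags are attached once and the access rule is evaluated against them, new products inherit the policy without anyone writing product-specific permissions.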
Data mesh is an influential contributor to the art and science of data management. It’s designed for the high-volume, distributed, and diverse world of modern data environments. It asserts that we should decentralize both the ownership of data and the data itself, shifting them from centralized data teams to the functional domains that create and use data to run their business.
However, organizations implementing data mesh must work within the bounds of what their culture, existing technologies, and resources will allow. A data mesh strategy is not right for every organization. But treating data as a product is. Data products yield benefits regardless of a company’s size, data landscape, or business challenges by keeping stakeholder needs and value delivery front and center.
Investing in universal capabilities such as a data product platform and automated governance functions makes sense for the efficiency advantages they offer. But also keep in mind that your organization may evolve and become better suited to data mesh down the road. If you’ve developed these capabilities, you’ll be in a much better position to implement data mesh, having already done some of the work.