A data mesh is an organizational framework for collecting, managing, and sharing data across the enterprise. It is a method of data management that empowers domain experts to own the data they create and make it available to consumers across business lines.
Put more simply, within a data mesh, the people who work most closely with a specific subset of data are responsible for maintaining its quality, and they become the go-to resource for accessing that data for your organization at large.
The term “data mesh” was coined by Thoughtworks’ Zhamak Dehghani in 2019. It describes a socio-technical approach that marries product thinking with a move toward domain-driven data management. The methodology rests on four principles, or “pillars”: domain ownership of data; data as a product; self-serve data infrastructure; and federated computational governance.
It’s important to understand that you cannot buy a data mesh. Data mesh tools are needed to support the methodology, but they are not the solution in and of themselves.
To implement an effective data mesh, you first need to understand the principles upon which the methodology is based.
The Principles of a Data Mesh Architecture
A data mesh architecture is defined by the following four principles:
Domain Ownership
Decentralizing data ownership gives business domains (Sales, Marketing, Finance, etc.) control of the data they create. The domain’s consequent familiarity with the data provides deeper insight into where, why, and how it should be used.
Data as a Product
Treating data as a product makes it discoverable, understandable, and usable, in much the same way that a search engine or e-commerce platform lets you find, evaluate, and purchase a product. When the domain that publishes the data considers it a product, domain owners become fully responsible and accountable for it, including its quality, representation, and cohesiveness.
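To make the product analogy concrete, here is a minimal sketch in Python of a data product listed in a searchable catalog. The `DataProduct` fields, the `search` helper, and the sample entries are all hypothetical names chosen for illustration; they are not part of any data mesh standard or specific tool.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class DataProduct:
    """Hypothetical descriptor a domain publishes alongside its data."""
    name: str
    domain: str               # the business domain that owns the product
    owner: str                # the accountable team or person
    description: str          # what makes the product understandable
    schema: Dict[str, str]    # column name -> type
    tags: Tuple[str, ...] = ()


def search(catalog: List[DataProduct], term: str) -> List[DataProduct]:
    """Return products whose name, description, or tags mention the term."""
    term = term.lower()
    return [
        p for p in catalog
        if term in p.name.lower()
        or term in p.description.lower()
        or any(term in tag.lower() for tag in p.tags)
    ]


# Illustrative catalog entries (made-up data).
catalog = [
    DataProduct(
        name="monthly_revenue",
        domain="Finance",
        owner="finance-data@example.com",
        description="Recognized revenue per month and region.",
        schema={"month": "date", "region": "string", "revenue": "decimal"},
        tags=("revenue", "reporting"),
    ),
    DataProduct(
        name="campaign_performance",
        domain="Marketing",
        owner="marketing-data@example.com",
        description="Clicks and conversions per campaign.",
        schema={"campaign_id": "string", "clicks": "int", "conversions": "int"},
        tags=("campaigns",),
    ),
]

for product in search(catalog, "revenue"):
    print(product.name, "owned by", product.domain)
```

The point of the sketch is that ownership and description travel with the data, so a consumer in another business line can find the product and see who is accountable for it without asking a central team.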
Self-Serve Data Infrastructure as a Platform
Self-service data infrastructure makes data readily accessible to the members of the organization who need it to make informed business decisions. It simplifies data discovery and enables data democratization, making it quick and easy for anyone to surface relevant insights.
Federated Computational Governance
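Federated computational governance establishes a governance policy that’s standardized across all decentralized domains, ensuring every domain owner operates within a consistent framework. The “computational” part means that, wherever possible, those policies are enforced automatically by the platform rather than by manual review.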
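One way to picture a standardized policy applied to every domain is as a set of automated checks. The following Python sketch assumes each domain describes its data product as a plain dictionary; the policy functions, the `owner` and `columns` keys, and the sample products are hypothetical and only illustrate the idea.

```python
from typing import Callable, List, Optional

# A policy inspects one product description and returns a violation
# message, or None if the product complies.
Policy = Callable[[dict], Optional[str]]


def require_owner(product: dict) -> Optional[str]:
    """Every data product must name an accountable owner."""
    return None if product.get("owner") else "missing owner"


def require_pii_classification(product: dict) -> Optional[str]:
    """Every column must state whether it contains personal data."""
    unclassified = sorted(
        column
        for column, meta in product.get("columns", {}).items()
        if "pii" not in meta
    )
    if unclassified:
        return "columns without PII classification: " + ", ".join(unclassified)
    return None


# The same global rule set is applied to every domain.
GLOBAL_POLICIES: List[Policy] = [require_owner, require_pii_classification]


def evaluate(product: dict, policies: List[Policy] = GLOBAL_POLICIES) -> List[str]:
    """Run all policies against one product and collect the violations."""
    violations = []
    for policy in policies:
        message = policy(product)
        if message is not None:
            violations.append(message)
    return violations


# Illustrative products from two domains (made-up data).
sales_orders = {
    "name": "orders",
    "owner": "sales-team",
    "columns": {"order_id": {"pii": False}, "email": {"pii": True}},
}
marketing_campaigns = {
    "name": "campaigns",
    "owner": "",
    "columns": {"campaign_id": {}},
}

for product in (sales_orders, marketing_campaigns):
    print(product["name"], evaluate(product))
```

In this sketch the domains keep control of their own products, while the shared rule set stays in one place and runs identically for each of them.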
Benefits of a Data Mesh Architecture
A few short years ago, enterprise businesses began migrating their data to data lakes equipped with real-time data availability and stream processing functionality, with the goal of ingesting, enriching, transforming, and serving data from a centralized data platform. Unfortunately, while well intentioned, this type of centralized architecture often led to disconnected data producers, impatient data consumers, and backlogged data teams struggling to keep pace with the demands of the business.
By contrast, domain-oriented data architectures, like data meshes, give teams the best of both worlds: a centralized database (or data warehouse) with domains (or business areas) responsible for handling their own pipelines. This gives data owners greater autonomy and flexibility, encourages data experimentation and innovation, and lessens the burden on data teams that would otherwise be trying to meet the needs of every data consumer in the enterprise through a single pipeline.
At the same time, the data mesh principle of self-serve infrastructure provides data teams with a universal, domain-agnostic, and often automated approach to data standardization, data product lineage, data product monitoring, alerting, logging, and data product quality metrics.
To learn more about data mesh and how you can implement it for your enterprise, read the whitepaper The Data Mesh Governance Framework You Can Implement Today.