It’s ok if you’re not sure. In fact, the topic is still debated by data governance industry experts on a regular basis. But there are differences, and those differences are important.
Here, we’ll define both data fabric and data mesh, provide use case examples for each, then highlight the important differences between the two.
But before we do, we want to make one thing absolutely clear: You cannot buy a data fabric or a data mesh. Both are integrated data management frameworks, but data fabric leans more toward the technical side, while data mesh is formed via a combination of technology and people. Yes, you need technology to support these methodologies, but technology is not the solution in and of itself.
What is a data fabric?
As we explain on our website, “An enterprise data fabric is a data architecture that connects data and knowledge at scale in a distributed and decentralized manner. It simplifies data integration, providing a single, semantically organized view of your trusted data.” A data fabric doesn’t require moving data to a centralized location like a data warehouse or data lake, and instead relies on active data governance to unify data and metadata across an enterprise.
Gartner offered an expanded definition of a data fabric in 2021, explaining:
A data fabric leverages both human and machine capabilities to access data in place or support its consolidation where appropriate. It continuously identifies and connects data from disparate applications to discover unique, business-relevant relationships between the available data points. The insight supports re-engineered decision-making, providing more value through rapid access and comprehension than traditional data management practices.
To put it more simply, data fabric is a unified enterprise architecture underpinned by an integrated set of data platforms, technologies and services, designed to deliver the right data to the right data customer at the right time. (And with a data architecture that’s a fabric, the right time can mean real-time.)
But there are countless definitions, and the name “data fabric” can still mean different things to different people depending on their needs and use cases. (Just listen to the Jan 2021 episode of Catalog and Cocktails, “Is Your Data Fabric a Mesh,” to find out what we mean.)
Example of a data fabric use case
Modern fast-food restaurants now accept orders by phone, website, app… or a good ol’ fashioned face-to-face conversation. But with hundreds or even thousands of locations, a dozen ways to place an order, and likely millions of customers, how do they compile all this data and make it usable for analytics that drive business improvements? By building a data fabric architecture that integrates data from these many pipelines, ensures its quality, and makes it easily available to users across the company on an as-needed basis.
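To make the idea concrete, here is a minimal sketch in Python of what "integrating in place" might look like for the restaurant example. Everything in it is an assumption for illustration: the `Order` record, the `DataFabric` class, and the source names are hypothetical, and the in-memory lists stand in for real systems such as an app backend or a phone-order database. It is not a description of any vendor's product.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterator


@dataclass(frozen=True)
class Order:
    """Hypothetical order record shared by every channel."""
    channel: str       # where the order was placed, e.g. "app" or "phone"
    location_id: str   # which restaurant received it
    total: float       # order value


class DataFabric:
    """Hypothetical unified access layer: each source stays where it is."""

    def __init__(self) -> None:
        self._sources: Dict[str, Callable[[], Iterator[Order]]] = {}

    def register_source(self, name: str, reader: Callable[[], Iterator[Order]]) -> None:
        # The fabric stores a way to read the source, not a copy of its data.
        self._sources[name] = reader

    def orders(self) -> Iterator[Order]:
        # One logical view across all sources, with a simple quality check.
        for reader in self._sources.values():
            for order in reader():
                if order.total >= 0:
                    yield order

    def revenue_by_location(self) -> Dict[str, float]:
        totals: Dict[str, float] = {}
        for order in self.orders():
            totals[order.location_id] = totals.get(order.location_id, 0.0) + order.total
        return totals


# In-memory stand-ins for two ordering channels (illustrative data only).
fabric = DataFabric()
fabric.register_source("app", lambda: iter([
    Order("app", "store-1", 12.5),
    Order("app", "store-2", 8.0),
]))
fabric.register_source("phone", lambda: iter([
    Order("phone", "store-1", 20.0),
    Order("phone", "store-2", -1.0),  # fails the quality check and is skipped
]))

print(fabric.revenue_by_location())  # {'store-1': 32.5, 'store-2': 8.0}
```

The point of the sketch is the shape, not the code: analysts query one view, while each channel keeps its own storage and is read only when needed.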
What is a data mesh?
Popularized by Zhamak Dehghani in 2019, data mesh is a paradigm shift away from a centralized data architecture to a modern, distributed architecture. It’s an organizational framework for collecting, managing, and sharing data assets across the enterprise, and it eliminates silos by empowering domain experts to own the data products they create and make them available to data consumers across business lines.
What differentiates a data mesh approach from data fabric is that data mesh is a socio-technical approach toward data governance, with an emphasis on the “socio”: the people. Unlike data fabric, which is more technology-centric, data mesh marries product thinking with a move toward domain-driven data management, where the people who work most closely with the data are called upon to both maintain its quality and serve as subject-matter experts for the data products they “own.”
A data mesh methodology is defined by four pillars:
- Domain Ownership: Decentralizing data ownership gives business domains (Sales, Marketing, Finance, etc.) control of the data they create. The domain’s consequent familiarity with the data provides deeper insight into where, why, and how it should be used.
- Data as a Product: Treating data as a product makes data discoverable, understandable, and usable in the same way you search for and purchase products using your favorite search engine and/or e-commerce platform. When data is considered a product by the domain that publishes it, domain owners are empowered to become wholly responsible and accountable for their data products, including data quality, representation, and cohesiveness.
- Self-Serve Data Infrastructure as a Platform: Self-service data and analytics makes data readily accessible to members of the organization who need it to make informed business decisions. It simplifies data discovery and enables data democratization, making it quick and easy for anyone to surface relevant insights.
- Federated Computational Governance: Federated computational governance establishes a governance policy that’s standardized across each decentralized domain, ensuring all domain owners operate within a consistent framework.
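The four pillars can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `DataProduct`, `Catalog`, and `governance_violations` are hypothetical names, and the policy (every product needs an owner and a description) is an invented example of a shared rule, not a standard defined by the data mesh literature.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataProduct:
    """Data as a product: a published asset with a clear owner."""
    name: str
    domain: str            # owning business domain (domain ownership)
    owner: str             # team accountable for quality
    description: str = ""  # helps consumers understand the product
    tags: List[str] = field(default_factory=list)


def governance_violations(product: DataProduct) -> List[str]:
    """Federated governance: one hypothetical policy applied to every domain."""
    problems = []
    if not product.owner:
        problems.append("missing owner")
    if not product.description:
        problems.append("missing description")
    return problems


class Catalog:
    """Self-serve platform: domains publish, consumers search."""

    def __init__(self) -> None:
        self._products: Dict[str, DataProduct] = {}

    def publish(self, product: DataProduct) -> None:
        # Every domain passes through the same checks before publishing.
        problems = governance_violations(product)
        if problems:
            raise ValueError(f"{product.name}: " + ", ".join(problems))
        self._products[product.name] = product

    def search(self, keyword: str) -> List[DataProduct]:
        keyword = keyword.lower()
        return [
            p for p in self._products.values()
            if keyword in p.name.lower()
            or keyword in p.description.lower()
            or any(keyword == tag.lower() for tag in p.tags)
        ]
```

In this sketch the Sales domain would call `publish` with its own product, a consumer in Finance would call `search("revenue")` to find it, and a product submitted without an owner would be rejected by the same rule regardless of which domain sent it.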
Because of these pillars, a data mesh architectural approach gives data and business teams the best of both worlds: governance standards that are shared across the enterprise, with domains responsible for handling their own pipelines. This allows for greater autonomy and flexibility for data owners, eliminates data bottlenecks, facilitates greater data experimentation and innovation, and lessens the burden on data teams that would otherwise, in a data fabric architecture, be attempting to meet the needs of every data consumer in the enterprise through a single pipeline.
Example of a data mesh use case
data.world customer OneWeb, a communications company building a 700-satellite constellation to provide global satellite Internet broadband services to people everywhere, needed to share massive amounts of satellite data, each piece of which was being sent to one of 31 different data lakes around the world. By establishing a data mesh architecture, supported by data.world’s knowledge-graph-powered data catalog, OneWeb empowered its engineers to find and access data across these separate data sources, and improved its product design and functionality by harnessing the collective knowledge of its workforce.
Data Fabric vs. Data Mesh: The Difference
If you think the definitions of data fabric and data mesh sound similar, you’re not wrong. And though there is some overlap, there are also some crucial differences between the two architectures.
The idea of a data fabric is largely focused on the collection of technical tools that combine to produce an interface for the end users who consume data. In some ways, a data fabric can be considered part of a data mesh architecture, but data mesh takes the idea a step further.
The major differences are the four pillars upon which a data mesh architecture is based. These pillars lead to deeper expertise and insight into data, consistency and accuracy in data representation and quality, and easy self-service access to data across the organization. They not only increase understanding and quality of data throughout the enterprise, they also democratize data access for business users who want to make data-driven decisions, and they help to build a data-driven culture.