Why is a data catalog important?
of companies have not created
a data driven organization.
Data is the foremost competitive battleground of our time. The threat to every business is existential. Therefore, today’s winners prioritize giving employees accurate, clear, and fast answers to every business question.
However, most companies are losing.
Even after years of effort and money spent, companies still don’t pass the only three tests that matter:
- Clarity: Do your people understand data well enough to answer business questions?
- Accuracy: Do they believe in and rely on data’s accuracy when answering business questions?
- Speed: Do they answer business questions fast enough to matter?
You must be able to pass these tests to succeed, because you can’t afford to fail.
The problem: Your data is meaningless to most people with business questions.
As a result, complexity skyrockets, lakes flood with meaningless data, and your people still can’t answer business questions. Garbage in, garbage out.
The more money, misinformed solutions, and supposed silver bullets you throw at data, the faster the data’s meaning declines and the less use it has to the people with business questions.
This is our current, unfortunate status quo.
The solution: bridge the gap between data and meaning to create explainable data.
Explain your data to make it meaningful
Two existing technologies, when combined, create new explainable data and rescue meaningless data from uselessness.
The first technology is the data catalog.
“Forty-seven percent of respondents report untrustworthy or inaccurate insights from analytics due to poor data quality. Only 14 percent of stakeholders had a very good understanding of the data and that less than 60 percent of the data was well understood by stakeholders.”
What is a data catalog?
Data catalog tools make meaning discoverable and accessible
The second technology is the knowledge graph.
What is a knowledge graph?
How knowledge graphs connect data with meaning within the enterprise
If knowledge graphs can power the most widely used search engine, they can rise to any enterprise data challenge without disruption.
Data catalogs powered by knowledge graphs are the future.
What to look for in a data catalog solution
Your data catalog must empower your workforce so they can get more information from your data investments and make smart decisions quickly. If your data catalog can’t do that, it’s not an enterprise-ready data catalog.
How will you know which one is which? Gartner identifies three distinct subclasses of data catalogs and how they differentiate themselves in the market.
“A data catalog maintains an inventory of data assets through the discovery, description, and organization of datasets. The catalog provides context to enable data analysts, data scientists, data stewards, and other data consumers to find and understand a relevant dataset for the purpose of extracting business value.”
For data science and data engineering use cases
Although these data catalogs provide plenty of information to your data teams, they cannot, on their own, help companies achieve self-service business intelligence. As a result, building a data-driven culture becomes increasingly difficult. Many technical people think of a data catalog this way, but luckily newer kinds of catalogs are available.
For specific vendors or tools
Simply put, no one. Having one data catalog that connects to all of your data sources and serves as a single source of truth is far better. Don’t believe us? Eckerson Group says an enterprise data catalog should work with all of your other data investments:
“The value of a seamless user experience throughout the analytics lifecycle is evident, so the trend in [data] catalog evolution is toward convergence. Most tools will mature to become fully integrated solutions supporting all three capabilities – cataloging, preparation, and analysis. Convergence, however, does not eliminate the need for interoperability, as self-service analysts often want to make their own choices of preparation and analysis tools.”
For everyone in the business
An enterprise data catalog is truly the foundation of data empowerment. It’s not just a place to index all of your information; it can also unify your people, data, and analysis, making it easier to build a data-driven culture.
Just as Google revolutionized its search engine with the knowledge graph, you can supercharge your data culture with an enterprise data catalog.
Now that we’ve talked about the broad categories of data catalog tools, here’s how you should go about choosing one to adopt.
How should you evaluate a data catalog?
Data catalog tools are exciting because they can democratize data across an organization. However, data is only meaningful to business decision makers if it is enriched with context, which comes from people and metadata.
Connecting data to its context is the difference between making the right and wrong decisions with data. For example, confusing imperial and metric units when hanging a shelf might not be a big problem. At the scale of a business, however, this gap between data and meaning is part of the reason the U.S. economy lost $3.1 trillion to bad data in 2016.
Why data and analytics leaders need a data catalog
So what’s holding data & analytics leaders back from cracking the code and investing in a data catalog? After all, only a third of CDOs consider themselves successful at creating a data-driven culture despite their efforts.
“Early CDOs were focused on data governance, data quality, and regulatory drivers, but today’s data and analytics leaders are becoming impactful change agents who are spearheading data-driven transformation.”
Data-driven transformation takes more than just data.
To drive business change, data and analytics leaders need to solve problems that come from multiple directions:
From on-premises systems to the cloud, to hard drives and home laptops, data lives almost everywhere. Reliable and useful data is the core of modern-day business; however, some data may not be completely accurate, and its sources may not be known.
Despite the use of analytics tools, analysis is actually a thought process that predominantly occurs in people’s minds. Therefore, nothing gets documented or reproduced. You can’t see the assumptions, data, or insights behind the discoveries that analytics generate. Since it’s not preserved, determining what data and what approach to use becomes tedious and has to be repeated for every project. To solve this issue, treat analysis like data: archive it, catalog it, and understand it.
Almost everyone in business works with data, but each person operates at a different level of data literacy. So to truly achieve a data-driven culture at your company, data must be accessible to everyone, not just to elite data practitioners.
It takes a village to become data-driven.
During your data culture transformation, no person can be left behind. Creating a data-driven culture requires convincing employees to adopt updated data practices, supporting cross-team collaboration, and empowering your people with data catalog products to help them work better, together. Most importantly, CDOs or other D&A leaders need more power to foster these changes.
However, CDOs and their counterparts don’t need just any enterprise data catalog; they need one that makes data easy to find, understand, and use to drive business change.
Part of driving adoption for your data catalog is choosing the right problem to solve with it at the beginning of your launch. Here are some examples of the kinds of challenges you could solve with the right data catalog.
Top data catalog example use cases
Close the discoverability and meaning gap with an active inventory of your assets
The discoverability (and meaning) gap
Finding and understanding relevant information is laborious and can cause you to miss valuable opportunities or make uninformed business decisions. This is common for companies that don’t have a well-maintained, active inventory of data and analysis.
So help your business reduce the time and labor gap between asking a question and producing an answer by inventorying your data resources, enriching them with useful metadata (meaning) and validations, and connecting them to meaningful business concepts.
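As a rough illustration of what such an inventory entry might hold, the sketch below models a catalog asset with descriptive metadata, a validation flag, and links to business concepts, then looks assets up by concept. Every name in it (`CatalogAsset`, `find_by_concept`, the sample assets and concepts) is hypothetical and does not reflect any particular product’s data model.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogAsset:
    """One inventoried data resource plus the metadata that gives it meaning."""
    name: str
    source: str                      # where the data lives (hypothetical labels)
    description: str = ""
    owner: str = ""
    validated: bool = False          # has someone confirmed its accuracy?
    business_concepts: set[str] = field(default_factory=set)


def find_by_concept(assets: list[CatalogAsset], concept: str) -> list[CatalogAsset]:
    """Return assets linked to a business concept, validated ones first."""
    wanted = concept.lower()
    matches = [a for a in assets if wanted in {c.lower() for c in a.business_concepts}]
    # sorted() is stable, so inventory order is kept within each group.
    return sorted(matches, key=lambda a: not a.validated)


# Hypothetical inventory; in practice this would come from your data sources.
inventory = [
    CatalogAsset("orders_raw", "warehouse", "Unfiltered order events",
                 business_concepts={"Revenue"}),
    CatalogAsset("orders_monthly", "warehouse", "Orders rolled up by month",
                 owner="finance", validated=True,
                 business_concepts={"Revenue", "Forecasting"}),
    CatalogAsset("support_tickets", "spreadsheet", "Exported help desk tickets",
                 business_concepts={"Customer Satisfaction"}),
]

for asset in find_by_concept(inventory, "revenue"):
    print(asset.name, "(validated)" if asset.validated else "(unvalidated)")
```

The point of the sketch is that a business question ("what do we have on revenue?") is answered through the concept link and the validation flag rather than through table names.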
Reduce the relevance and reusability gaps by curating the best data and analyses
The relevance (and reusability) gap
When data is disconnected from its relevant business concepts and initiatives, its context is lost. As a result, you have to start from the ground up on new analysis without building upon previous work.
Searching for the right data for an analysis can feel like being lost in a forest with no compass. So think like a cartographer and use your data catalog to create a map of your best data, because making your data assets accessible is the key to making them reusable.
This curated library of data sources can be anything from a slice of data from your data warehouse to your most popular, shared spreadsheets. Either way, the goal is to point the company to the 20% (or much less) of assets that provide 80% (or much more) of the value.
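One way to find that small, high-value subset is sketched below. It assumes, purely for illustration, that usage counts are an acceptable proxy for value; the function name `curate_top_assets`, the asset names, and the counts are all hypothetical.

```python
def curate_top_assets(usage_counts: dict[str, int], target_share: float = 0.8) -> list[str]:
    """Pick the most-used assets until they cover `target_share` of all usage."""
    total = sum(usage_counts.values())
    if total == 0:
        return []
    curated: list[str] = []
    covered = 0
    # Walk assets from most to least used, stopping once the target is reached.
    for name, count in sorted(usage_counts.items(), key=lambda kv: kv[1], reverse=True):
        if covered / total >= target_share:
            break
        curated.append(name)
        covered += count
    return curated


# Hypothetical query counts per asset over some period.
usage = {
    "orders_monthly": 550,
    "customer_master": 300,
    "web_sessions": 80,
    "legacy_export": 40,
    "tmp_scratch": 30,
}
print(curate_top_assets(usage))
```

Usage alone will miss assets that are valuable but rarely queried, so a list like this is better treated as a starting point for human curation than as the curated library itself.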
Bridge the impact and reproducibility gaps through data analysis and reuse
The impact (and reproducibility) gap
Our work with data is meaningless if it doesn’t influence the decisions we make. This is why sharing information with stakeholders ineffectively or incompletely increases risk and slows productivity. Lost cycles may cost hundreds of thousands of dollars, but a bad decision can cost millions.
Therefore, we need to ensure IT, data stewards, data engineers, analysts, and business people are collaborating. With cross-functional collaboration, analyses can be documented and shared in a way that is agile, iterative, and easily consumable. Workflows can also be reused and reproduced easily to deliver more consistent answers.
Which use case is most relevant for you?
- If your biggest problem is understanding what data assets the company has and what they mean, inventory your most impactful assets.
- If it’s figuring out which data assets are most accurate and reusable for any given situation, curate what’s useful.
- And finally, if you have a recurring analysis or business challenge, encourage your colleagues to analyze, share, and iterate their analyses to make them reusable.
Check out these real-life stories from data.world customers who successfully launched our cloud data catalog and saw value from these use cases.
Stories of how data catalog software drives impact
One of the world’s largest software companies created a business glossary and dashboard catalog.
A global management consulting company enables their data to be found faster than ever before.
This company uses our data catalog to create a curated, user-friendly data portal. Consultants can find the right data faster and use it more often with the organization’s new portal, which contains owned, purchased, and derived analyses. Because data.world automatically gathers context and ongoing analysis and identifies relationships between datasets, projects, and teams, the firm’s employees are more connected and efficient with data.
The Associated Press uses curated datasets to transform the way news is reported.
AP and data.world make data journalism accessible by transforming the way data reaches local newsrooms. Technical users can now create and share queries faster without leaving the platform or spinning up a database. Additionally, less technical users can slice data for their local news markets without any prior coding or data science knowledge. Now with the option of exporting results in common formats, anyone can dig in and get clean data faster. Newsrooms across the country now have actionable data that can be used to inform the public on how national events affect their local communities.
Mirum, a global digital experience agency, streamlines their data projects for thousands of people around the world.
Mirum has over 2,500 people in 25 countries. Data, and data-literate people, are the key to how Mirum creates unforgettable experiences for clients like Mazda and Qualcomm. With their already sophisticated approach to data analysis, Mirum wanted to take the next step and better package their data to make their expertise even more valuable.
data.world helped Mirum streamline their new data practices and improved processes seamlessly across projects and teams. Discussion between coworkers, between agencies, and with client stakeholders shifted from email to dedicated project comment threads. Now, the full data project lifecycle lives on a single platform, data.world. Teams at Mirum not only do the work through data.world, but also deliver it to their customers through the platform.
Aceable, an innovative tech startup, saves time by streamlining workflows and providing self-service data access.
Aceable creates easily consumable, mobile- and digital-first content for defensive driving courses. To recognize more revenue, the company needed a quick way to retrieve data without exhausting the resources of its business analysts. With data.world, a single person at Aceable can now consume, integrate, and query the data to calculate revenue recognition. Streamlining this workflow reduces analysts’ workloads and avoids the time-intensive analysis bottleneck, so C-suite executives receive important business data more quickly.
Ready to begin writing your own success story?
Now it’s your turn. Prepare for your data catalog launch with these tips.
How to launch a data catalog for maximum value and adoption
Consult and collaborate with your evaluation team
First, work with your evaluation team and executive sponsors on determining and tracking key performance metrics, so you can measure the impact of your data catalog tools. Don’t skip this step! You need to welcome differing perspectives from your colleagues and align everyone around the same goal from the get-go, or you could jeopardize the launch of your data catalog.
Most importantly, you want to track the impact of your data catalog use cases at every stage of the data lifecycle and for every role to see if it’s working. In order to do that, you need to understand how each of your teams currently works with data, what they want to improve, and how they envision that improvement materializing in their day-to-day work.
To do that, take these three steps while launching your enterprise data catalog:
Understand their unique perspectives
Polish your processes before onboarding others
Data catalogs become more valuable as more people use them, so creating hype and developing buy-in is your bridge to a data-driven culture.
These three critical components of your measurement plan will ensure that your whole organization benefits from your enterprise data catalog pilot.
What else can you do to ensure the success of your data catalog launch?
Track your data catalog’s impact beyond usage metrics
Looking beyond platform usage, be sure to also measure the impact of your data catalog on team productivity, organizational culture, and overall business results. If this seems unclear at first, don’t worry. This will be an ongoing process to refine as you grow.
Remember, you invested in a data catalog to bring people, data, and analysis together and to give employees clear, accurate, and fast answers to any business question. Design your measurement plan to reflect that.
Need advice on how to start? Try categorizing metrics in these four buckets as you build your measurement plan.
- PRODUCTIVITY: Are you working faster and getting more done?
- DATA-DRIVEN CULTURE: Are more people collaborating with data?
- USAGE: Is the right data being used for the right projects?
- BUSINESS: Do you have a clear way to measure impact in dollars and cents?
These categories should reflect your most important priorities as a data and analytics leader. Be sure to benchmark your current state before launching your data catalog. Productivity metrics are particularly great to record and measure from the start, since you can capture them while determining success goals and metrics with the evaluation team.
On the other hand, metrics such as usage will probably be useful only after you launch your enterprise data catalog software, so keep that in mind as you move forward.
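A measurement plan along these lines could be kept as simply as the sketch below, which records a pre-launch baseline where one exists and reports the change per bucket. The `Metric` class, the metric names, and all of the numbers are hypothetical placeholders, not benchmarks from any real rollout.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metric:
    bucket: str                       # one of the four buckets above
    name: str                         # hypothetical metric name
    current: float                    # value measured after launch
    baseline: Optional[float] = None  # pre-launch benchmark, if one exists
    lower_is_better: bool = False

    def improvement_pct(self) -> Optional[float]:
        """Percent change from baseline, signed so that positive means better."""
        if not self.baseline:
            return None               # no usable baseline (missing or zero)
        change = (self.current - self.baseline) / self.baseline * 100
        return -change if self.lower_is_better else change


plan = [
    Metric("PRODUCTIVITY", "hours_to_answer_question", current=4.0,
           baseline=16.0, lower_is_better=True),
    Metric("DATA-DRIVEN CULTURE", "people_collaborating_on_data", current=70, baseline=40),
    # Usage has no pre-launch baseline here, so it is tracked from launch onward.
    Metric("USAGE", "projects_using_curated_data", current=25),
    Metric("BUSINESS", "analyst_hours_saved_value_usd", current=30000, baseline=20000),
]

for m in plan:
    pct = m.improvement_pct()
    shown = f"{pct:+.1f}%" if pct is not None else "n/a (no baseline)"
    print(f"{m.bucket:<20} {m.name:<32} {shown}")
```

Keeping the baseline optional mirrors the advice above: productivity can be benchmarked before launch, while usage only becomes measurable afterward.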
Connect people, data, and analysis
Doing this brings you one step closer to making your organization truly data-driven. And that’s your goal, right?
The business value of a data catalog
Given today’s challenging times, you may expect the ROI from your data initiatives to be lower than in years past. But according to Gartner, companies that offer a “curated catalog of internal and external data to diverse users will realize twice the business value from their data and analytics investments.” In fact, data catalogs can have an outsized impact at times when data is increasingly important.
We are seeing all kinds of businesses – from banks to restaurants to tech companies – make abrupt and, in some cases, multi-million-dollar changes to their operations. For companies trying to forecast sales two quarters out or assess the stability of their supply chain, data catalogs are an incredibly effective tool for ensuring that the data and metadata supporting the analysis and decision-making process are up to date, accessible, and understandable.
Ready to infuse clarity, accuracy, and speed into your data work?
Want to see it for yourself? Get a demo of our enterprise data catalog and see what it’s like to begin connecting your data sources and building your datasets for collaboration!
We’ll show you how data.world makes it easy for everyone—not just the “data people”—to get clear, accurate, fast answers to any business question.
Our cloud-native data catalog maps your siloed, distributed data to familiar and consistent business concepts, creating a unified body of knowledge anyone can find, understand, and use. data.world is an Austin-based Certified B Corporation and public benefit corporation and home to the world’s largest collaborative open data community.