Before we answer the question, “What is a data catalog?” keep this in mind…
What does it take to bring data and meaning together?
Once upon a time, searching Google for your favorite band was a serious challenge. If you typed the rock group “Chevelle” into the search bar, for example, you probably got results for the Chevrolet muscle car of the same name. It was all a bit confusing.
Enter the knowledge graph
Since then, Google has made numerous updates to its search algorithms, but the Knowledge Graph is, arguably, the most significant. It powers the information panels on search engine results pages. It optimizes the “people also ask” boxes that offer additional suggestions based on search queries. It makes searching for information so much easier. The Knowledge Graph made Google think as a real person would. It connected complicated concepts and separated irrelevant topics. It searched for things, not strings.
Enter the data catalog
While data catalogs might promise a Google-like experience when finding, interpreting, and collaborating on data in the enterprise, most of them lack the usability, effectiveness, and speed of the Knowledge Graph. When you don’t have a data catalog product that’s built to connect and scale, offers more than just the basics, and is good for the whole business, it’s so much harder to build a truly data-driven culture in your business. It’s as simple as that. You just won’t get as much value from your data investments or outsmart your competitors.
Aren’t those your top priorities as a Chief Data Officer?
Chief Data Officers spend 45 percent of their time on value creation and/or revenue generation, 28 percent on cost savings and efficiency, and 27 percent on risk mitigation, according to research from Gartner.
It’s no wonder, then, that businesses like yours need a modern data catalog to help the entire workforce, not just your data elites, become more productive, confident, and skilled with self-service data and analysis. After all, the stakes for failing to build a data-driven culture are high, but the promised land awaits those who succeed.
But how do you find a modern data catalog among all the others? First, you have to be able to explain it to others in your business when they ask, “What is a data catalog?”
So, what is a data catalog?
Let’s hear what Gartner had to say about data catalogs in 2018:
“A data catalog maintains an inventory of data assets through the discovery, description, and organization of datasets. The catalog provides context to enable data analysts, data scientists, data stewards, and other data consumers to find and understand a relevant dataset for the purpose of extracting business value.”
– Data Catalogs are the New Black in Data Management and Analytics (Gartner, 2018)
We couldn’t agree more. In fact, our own data catalog definition is similar.
A data catalog is a metadata management tool that companies use to inventory and organize the data within their systems. Typical benefits include improvements to data discovery, governance, and access.
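To make that definition concrete, here is a minimal sketch of the idea in Python. It is only an illustration, not how any particular product works: the class names, field names, and sample datasets are all hypothetical. It shows the three benefits named above in miniature: registering assets (inventory), searching their metadata (discovery), and listing assets by owner (governance).

```python
from dataclasses import dataclass, field


@dataclass
class DatasetEntry:
    """Metadata describing one data asset (all field names are illustrative)."""
    name: str
    description: str
    owner: str
    source_system: str
    tags: set[str] = field(default_factory=set)


class DataCatalog:
    """A toy in-memory inventory of dataset metadata (hypothetical design)."""

    def __init__(self) -> None:
        self._entries: dict[str, DatasetEntry] = {}

    def register(self, entry: DatasetEntry) -> None:
        # Inventory: record the asset under its unique name.
        self._entries[entry.name] = entry

    def search(self, term: str) -> list[str]:
        # Discovery: case-insensitive match on name, description, or tags.
        needle = term.lower()
        return sorted(
            e.name
            for e in self._entries.values()
            if needle in e.name.lower()
            or needle in e.description.lower()
            or any(needle in t.lower() for t in e.tags)
        )

    def owned_by(self, owner: str) -> list[str]:
        # Governance: list the assets a given owner is responsible for.
        return sorted(e.name for e in self._entries.values() if e.owner == owner)


# Sample data, invented purely for this example.
catalog = DataCatalog()
catalog.register(DatasetEntry(
    name="sales_orders",
    description="Daily order lines from the web store",
    owner="finance-team",
    source_system="warehouse",
    tags={"revenue", "orders"},
))
catalog.register(DatasetEntry(
    name="support_tickets",
    description="Customer support cases and resolution times",
    owner="support-team",
    source_system="helpdesk",
    tags={"customers"},
))

print(catalog.search("customer"))
print(catalog.owned_by("finance-team"))
```

A real catalog adds much more on top of this, such as automated discovery across sources, access controls, and collaboration features, but the core remains an organized, searchable inventory of metadata.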
But does every data catalog meet the criteria? Not quite.
Think about it. Your data catalog must empower your workforce so they can get more information from your data investments, gain better data insights as a whole, and make smart decisions quickly. If your data catalog can’t do that, it’s not a modern data catalog. So how do you tell the modern from the old-school?
What types of data catalogs are there?
The best data catalog is the one that aligns most closely with your data strategy and organizational priorities. After all, the whole reason you’re using a data catalog is to make your company more data-driven.
“A company only needs a Chief Data Officer when it is ready to fully consider how it wishes to compete with data over the long term and start to build the organizational capabilities it will need to do so.”
A modern data catalog that offers more than just the basics, is good for your whole business, and is built to connect and scale is just the type of data catalog you need.
Don’t believe us? Gartner feels the same way.
They identify three distinct subclasses of data catalogs and how they’re different from each other.
1. Data catalogs for data science and data engineering use cases
These data catalogs collect and classify all the information in your data lakes. They primarily cater to the data elite and, as a result, tend to leave everyone else behind. While these data catalogs provide loads of information to your data teams, they won’t help you achieve self-service business intelligence. Nor will they build a data-driven culture unless your entire company becomes data literate overnight. This used to be the right answer to, “What is a data catalog?”
2. Vendor or tool-specific data catalogs
While these data catalogs provide businesses and data scientists with a way to find and analyze data, they have limited capabilities. Put another way: Do you really want to dig through a data catalog for every one of your data tools in order to find what you need? Or would you rather have a single data catalog connected to all of your data sources that provides you with a single source of truth? If you believe Eckerson Group, a modern data catalog should work with all of your other data investments:
“The value of a seamless user experience throughout the analytics lifecycle is evident, so the trend in [data] catalog evolution is toward convergence. Most tools will mature to become fully integrated solutions supporting all three capabilities – cataloging, preparation, and analysis. Convergence, however, does not eliminate the need for interoperability, as self-service analysts often want to make their own choices of preparation and analysis tools.”
3. Modern data catalogs for analysis and teamwork
Gartner defines this type of data catalog as “generalist, business-oriented data catalogs for broader use in information governance and infonomics – targeted at the Chief Data Officer (CDO).”
Doesn’t that sound like the right answer to, “What is a data catalog?”
That’s because a modern data catalog is truly the foundation of data empowerment and not just a place to index all the information you have. Modern data catalogs unify your people, data, and analysis in a way that makes it easier to build a data-driven culture. If a data catalog can do those things, it’s the modern one you need.
There are many enterprise data catalogs to choose from today, so think about how Google revolutionized its search engine with the Knowledge Graph. Then think about how you can supercharge your data culture with a modern data catalog. If you want a Knowledge Graph-like experience for your enterprise data, a modern data catalog is the way to go.
So, what now?
By now, you should feel confident in your answer when someone asks you, “What is a data catalog and why do I need one?” Congrats! You can now begin your data catalog initiative and start helping your company find, understand, and use your data to drive business outcomes.
Want to know what the next step is? Find out how to ensure a successful launch for your enterprise data catalog in this blog.