How to bring clarity, accuracy, and speed to your data and analytics with an enterprise data catalog.

What is a data catalog?

A data catalog is a metadata management tool that companies use to inventory and organize the data within their systems. Typical benefits include improvements to data discovery, governance, and access.

Why is a data catalog important?

Data is the foremost competitive battleground of our time. The threat to every business is existential. Therefore, today’s winners prioritize giving employees accurate, clear, and fast answers to every business question.

However, most companies are losing.

“Only 31.0% of companies say they are data-driven. This number has declined from 37.1% in 2017 and 32.4% in 2018. We are headed the wrong direction.”

NewVantage Partners (2019)

Even after years of effort and money spent, companies still don’t pass the only three tests that matter:

 

  1. Clarity: Do your people understand data well enough to answer business questions?
  2. Accuracy: Do they believe in and rely on data’s accuracy when answering business questions?
  3. Speed: Do they answer business questions fast enough to matter?

Created a data-driven organization    2017     2018     2019
Yes                                   37.1%    32.4%    31.0%
No                                    62.9%    67.6%    69.0%

 

Note: We’re not asking can they, but do they. Today.

You must be able to pass these tests to succeed, because you can’t afford to fail.

“For every 100 employees, finding data and reproducing analysis is a $1.7M problem.”

–IDC (2018)

The problem: Your data is meaningless to most people with business questions.

Shared business meaning is essential for productive communication and cooperation across people, teams, and systems. It is precisely what’s missing from the majority of a company’s data.

As a result, complexity skyrockets, lakes flood with meaningless data, and your people still can’t answer business questions. Garbage in, garbage out.

The more money, misinformed solutions, and supposed silver bullets you throw at data, the faster the data’s meaning declines and the less use it has to the people with business questions.

This is our current, unfortunate status quo.

The solution: bridge the gap between data and meaning to create explainable data.

Explain your data to make it meaningful

Revive meaningless data by expressing it in familiar and consistent business concepts. That way, anyone can discover, comprehend, and use data to answer important questions. With unexplainable data, you overlook its value, leave crucial business problems unresolved, and prevent many knowledgeable and talented employees from making data-driven decisions.

“Forty-seven percent of respondents report untrustworthy or inaccurate insights from analytics due to poor data quality. Only 14 percent of stakeholders had a very good understanding of the data and that less than 60 percent of the data was well understood by stakeholders.”

–Syncsort (2019)

Explainable data is becoming the new status quo.

In fact, there are two existing technologies that, when combined, create new explainable data and rescue meaningless data from uselessness.

The first technology is the data catalog.

What is a data catalog?

An enterprise data catalog is a data and metadata management tool companies use to inventory and organize the data within their systems.
Data catalog tools make meaning discoverable and accessible 
Data catalogs improve data clarity, accuracy, and speed in several ways:

Clarity: Everything needed to understand data is kept and maintained, from the beginning. As people use their data catalog, the data’s context deepens and its meaning becomes clearer.

Accuracy: A wider array of people can validate, improve, and correct data and analysis when they use a premier data catalog solution.

Speed: People can find what they need faster because a data catalog organizes data and analysis in discoverable, business-friendly ways, provides Google-like search, and keeps all context within reach.

“44% of data worker time is wasted every week because of unsuccessful activities. 51% of searching activity is wasted, and 47% of preparation work is wasted.”

–IDC (2019)

This is why enterprise organizations are so fired up about data catalogs.

The second technology is the knowledge graph.

What is a knowledge graph?

Knowledge graphs take data clarity, accuracy, and speed to the next level.

Clarity: Knowledge graphs enable explainable data, expressing it in consistent, familiar, and understandable business concepts. Data from knowledge graphs can be exported to user-preferred formats that are compatible with the tools they know.

Accuracy: Knowledge graphs map meaning to data regardless of how it’s structured and where it’s located. Graph architectures are well-known for their ability to reference and collect disparate data sources, earning their spot in Gartner’s Top 10 Data and Analytics Technology Trends for 2019. Once you understand the data, you’re better equipped to evaluate its accuracy and correct errors.

Speed: Don’t waste your time searching for the naming, relationships, business meaning, and quality of your data. Knowledge graphs provide one clear view of data from multiple sources, so anyone can find data-driven answers quickly by using concepts that make sense within their professional domain. Graphs are flexible by design: you can add new data with no sweat no matter how many changes your business goes through.

How knowledge graphs connect data with meaning within the enterprise

“Knowledge graphs are large networks of entities and their semantic relationships”, as our CEO Brett Hurt has been explaining to executives in CIO. For example, Google’s knowledge graph allows users to search for things, not just strings (aka not just matching strings in search queries with strings in Web documents). It helps people find the right thing (e.g., Turkey the nation, not turkey the bird), display related facts from multiple sources (e.g., Turkey’s summary card contains a description from Wikipedia, a map from Google, flight info, etc.), and discover helpful information not explicitly searched for (e.g., “People also ask: Which is the best month to visit?”). This “things, not strings” approach has enabled Google to fulfill its mission to “organize the world’s information and make it universally accessible and useful.”
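
To make "things, not strings" concrete, here is a minimal Python sketch of a knowledge graph stored as subject-predicate-object triples. The entity identifiers (such as `ex:Turkey_country`), the predicate names, and the `build_index` and `find_entities` helpers are hypothetical and chosen only for illustration; this is not how Google or data.world actually model their graphs.

```python
from collections import defaultdict

# Each fact is a (subject, predicate, object) triple.
# All identifiers below are made up for illustration.
TRIPLES = [
    ("ex:Turkey_country", "label", "Turkey"),
    ("ex:Turkey_country", "type", "Country"),
    ("ex:Turkey_country", "capital", "ex:Ankara"),
    ("ex:turkey_bird", "label", "turkey"),
    ("ex:turkey_bird", "type", "Bird"),
    ("ex:Ankara", "label", "Ankara"),
    ("ex:Ankara", "type", "City"),
]


def build_index(triples):
    """Index triples by subject so facts about one entity are easy to look up."""
    index = defaultdict(dict)
    for subject, predicate, obj in triples:
        index[subject][predicate] = obj
    return index


def find_entities(index, text, entity_type=None):
    """Return ids of entities whose label matches `text` (case-insensitive),
    optionally narrowed to one type. Filtering on type is what separates
    a thing from a bare string."""
    matches = []
    for entity, facts in index.items():
        if facts.get("label", "").lower() != text.lower():
            continue
        if entity_type is None or facts.get("type") == entity_type:
            matches.append(entity)
    return sorted(matches)


index = build_index(TRIPLES)
print(find_entities(index, "turkey"))             # both the country and the bird
print(find_entities(index, "turkey", "Country"))  # only the country
```

A plain string search for "turkey" cannot tell the nation from the bird. Because each entity carries a type and relationships, the second query can return only the country and then follow its `capital` relationship to related facts.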

If knowledge graphs can power the most widely used search engine, they are well placed to take on enterprise data challenges without disruption.

“Nine out of ten of the most value-creating companies in the world in 2018 were using knowledge graphs.”

–PWC (2019)

Data catalogs powered by knowledge graphs are the future.

Now you know how data catalogs and knowledge graphs work together to give data meaning without disrupting its surrounding processes. This model can multiply—not merely increase—your data’s value by making it explainable. With a data catalog powered by a knowledge graph, your people will be empowered to answer business questions with clarity, accuracy, and speed.

What to look for in a data catalog solution

“A data catalog maintains an inventory of data assets through the discovery, description, and organization of datasets. The catalog provides context to enable data analysts, data scientists, data stewards, and other data consumers to find and understand a relevant dataset for the purpose of extracting business value.”

– Data Catalogs are the New Black in Data Management and Analytics (Gartner, 2018)

The best data catalog is one that helps make your company more data driven. It should align closely with your organization’s priorities and data strategy.

Your data catalog must empower your workforce so they can get more information from your data investments and make smart decisions quickly. If your data catalog can’t do that, it’s not an enterprise-ready data catalog.

How will you know which one is which? Gartner identifies three distinct subclasses of data catalogs and how they differentiate themselves in the market.

For data science and data engineering use cases
These data catalogs collect and classify all the information in your data lakes. They are predominantly used by the most experienced data practitioners and therefore tend to leave everyone else at your company out of the loop.

Although these data catalogs provide tons of information to your data teams, they cannot deliver self-service business intelligence on their own. As a result, building a data-driven culture becomes increasingly difficult. Many technical people still think of data catalogs this way, but luckily there are newer kinds available.

For specific vendors or tools
Although these data catalogs give businesses and data-literate people a way to find and analyze data, they still have limited capabilities. Who wants to dig through a data catalog for every data tool in order to find the one you need?

Simply put, no one. Having one data catalog connected to all of your data sources with a single source of truth is far better. Don’t believe us? Eckerson Group says an enterprise data catalog should work with all of your other data investments:

“The value of a seamless user experience throughout the analytics lifecycle is evident, so the trend in [data] catalog evolution is toward convergence. Most tools will mature to become fully integrated solutions supporting all three capabilities – cataloging, preparation, and analysis. Convergence, however, does not eliminate the need for interoperability, as self-service analysts often want to make their own choices of preparation and analysis tools.”

– Dave Wells, Practice Director, Data Management, Eckerson Group

For everyone in the business
Gartner defines enterprise data catalogs as “generalist, business-oriented data catalogs for broader use in information governance and infonomics – targeted at the Chief Data Officer (CDO).”

An enterprise data catalog is truly the foundation of data empowerment. It’s not just a place to index all of your information; it can also unify your people, data, and analysis so that it is easier to build a data-driven culture.

Just as Google revolutionized its search engine with the knowledge graph, you can supercharge your data culture with an enterprise data catalog.

Now that we’ve talked about the broad categories of data catalog tools, here’s how you should go about choosing one to adopt.

How should you evaluate a data catalog?
Before embarking on a data catalog evaluation, you must first figure out what you want to accomplish with one.

Data catalog tools are exciting because they can democratize data across an organization. However, data is only meaningful to business decision makers if it is enriched with context, which comes from people and metadata.

Connecting data to its context is the difference between making the right or wrong decisions with data. For example, confusing imperial and metric units when hanging a shelf might not be a big problem. At scale, however, this gap between data and meaning is part of the reason the U.S. economy lost an estimated $3.1 trillion to bad data in 2016.
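
A small Python sketch shows why context matters: the same number means different things depending on its unit. The `Measurement` class, the conversion table, and the shelf dimensions are hypothetical illustrations and not part of any data catalog product.

```python
from dataclasses import dataclass

# Conversion factors to centimeters. One inch is exactly 2.54 cm.
TO_CM = {"cm": 1.0, "in": 2.54}


@dataclass(frozen=True)
class Measurement:
    value: float
    unit: str  # the metadata that gives the number its meaning

    def to_cm(self) -> float:
        if self.unit not in TO_CM:
            raise ValueError(f"unknown unit: {self.unit!r}")
        return self.value * TO_CM[self.unit]


shelf_width = Measurement(30, "in")
wall_space = Measurement(60, "cm")

# Comparing raw numbers ignores the units and suggests the shelf fits.
print(shelf_width.value <= wall_space.value)      # True, but misleading
# Comparing with context converts both values to the same unit first.
print(shelf_width.to_cm() <= wall_space.to_cm())  # False: the shelf is too wide
```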

Why data and analytics leaders need a data catalog
Chief Data Officers spend 45 percent of their time on value creation and/or revenue generation, 28 percent on cost savings and efficiency, and 27 percent on risk mitigation, according to research from Gartner. With priorities that broad, it’s no wonder that businesses need an enterprise data catalog for the entire workforce, so that everyone can answer business questions with clarity, accuracy, and speed.

So what’s holding data & analytics leaders back from cracking the code and investing in a data catalog? After all, only a third of CDOs consider themselves successful at creating a data-driven culture despite their efforts.

Data-driven transformation takes more than just data.
Today, Chief Data Officers (CDOs) and other data leaders tackle much broader challenges than just bringing data under control.

“Early CDOs were focused on data governance, data quality, and regulatory drivers, but today’s data and analytics leaders are becoming impactful change agents who are spearheading data-driven transformation.”

Valerie Logan, Research Director, Gartner

To drive business change, data and analytics leaders need to solve problems that come from multiple directions:

Data

From on-premises systems to the cloud, to hard drives and home laptops, data lives almost everywhere. Reliable and useful data is the core of modern-day business; however, some data may not be completely accurate, and data sources may not be known.

Analysis

Despite the use of analytics tools, analysis is a thought process that predominantly occurs in people’s minds. As a result, much of it never gets documented or reproduced. You can’t see the assumptions, data, or insights behind the discoveries that analytics generate. Since that work isn’t preserved, determining what data and what approach to use becomes tedious and has to be repeated for every project. To solve this issue, treat analysis like data: archive it, catalog it, and understand it.

People

Almost everyone in business works with data, but each person operates at a different level of data literacy. So to truly achieve a data-driven culture at your company, data must be accessible to everyone, not just to elite data practitioners.

Data and analytics leaders need to solve problems beyond just the technical ones. In fact, according to Gartner, “The top internal roadblock to the success of the office of the CDO is ‘culture challenges to accept change’.” Additionally, “93% of executives identify people and process issues” as the barrier to building a data-driven organization. The Harvard Business Review found that “the difficulty of cultural change has been dramatically underestimated in these leading companies — 40.3% identify lack of organization alignment and 24% cite cultural resistance as the leading factors contributing to this lack of business adoption.”

It takes a village to become data-driven.
“We hear little about initiatives devoted to changing human attitudes and behaviors around data. Unless the focus shifts to these types of activities, we are likely to see the same problem areas in the future that we’ve observed year after year in this survey,” according to Randy Bean and Thomas H. Davenport at NewVantage Partners.

During your data culture transformation, no person can be left behind. Creating a data-driven culture requires convincing employees to adopt updated data practices, supporting cross-team collaboration, and empowering your people with data catalog products to help them work better, together. Most importantly, CDOs or other D&A leaders need more power to foster these changes.

However, CDOs and their counterparts don’t need just any enterprise data catalog; they need one that makes data easy to find, understand, and use to drive business change.

Part of driving adoption for your data catalog is choosing the right problem to solve with it at the beginning of your launch. Here are some examples of the kinds of challenges you could solve with the right data catalog.

Top data catalog example use cases
There are many valuable ways to use a data catalog. Read how our customers have benefitted from the following use cases to help them make critical business decisions with clarity, accuracy, and speed.
Close the discoverability and meaning gap with an active inventory of your assets
The discoverability (and meaning) gap
Finding and understanding relevant information is laborious and can cause you to miss valuable opportunities or make uninformed business decisions. This is common for companies that don’t have a well-maintained, active inventory of data and analysis.

So help your business reduce the time and labor gap between asking a question and producing an answer by inventorying your data resources, enriching them with useful metadata (meaning) and validations, and connecting them to meaningful business concepts.
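
As a rough sketch of what inventorying, enriching, and connecting can look like, the Python below models a catalog as a list of entries with metadata and business-concept tags, plus a simple keyword search. The `CatalogEntry` fields, the sample assets, and the `search` function are hypothetical; real catalog products define their own schemas and search behavior.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    name: str                 # technical name of the asset
    location: str             # where the asset lives
    description: str = ""     # human-written meaning
    owner: str = ""           # who to ask for help
    concepts: set = field(default_factory=set)  # linked business concepts
    validated: bool = False   # has someone vouched for its accuracy?


def search(catalog, term):
    """Return entries whose name, description, or concepts mention `term`.
    Validated entries are listed first, then entries are sorted by name."""
    term = term.lower()
    hits = [
        e for e in catalog
        if term in e.name.lower()
        or term in e.description.lower()
        or any(term in c.lower() for c in e.concepts)
    ]
    return sorted(hits, key=lambda e: (not e.validated, e.name))


# Made-up assets for illustration only.
catalog = [
    CatalogEntry("cust_ord_v2", "warehouse.sales", "One row per customer order",
                 owner="sales-ops", concepts={"Order", "Revenue"}, validated=True),
    CatalogEntry("tmp_rev_export", "shared-drive/finance", "Ad hoc revenue export",
                 concepts={"Revenue"}),
    CatalogEntry("hr_headcount", "warehouse.hr", "Monthly headcount by team",
                 owner="people-team", concepts={"Employee"}, validated=True),
]

for entry in search(catalog, "revenue"):
    print(entry.name, "(validated)" if entry.validated else "(unvalidated)")
```

Note that `cust_ord_v2` is found through its linked business concept even though the word "revenue" appears nowhere in its technical name or description. That link is the meaning the inventory adds.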

Reduce the relevance and reusability gaps by curating the best data and analyses
The relevance (and reusability) gap
When data is disconnected from its relevant business concepts and initiatives, its context is lost. As a result, you have to start from the ground up on new analysis without building upon previous work.

Searching for the right data for an analysis can feel like being lost in a forest with no compass. So think like a cartographer and create a map of your best data with your data catalog, because making your data assets accessible is the key to making them reusable.

This curated library of data sources can be anything from a slice of data from your data warehouse to your most popular, shared spreadsheets. Either way, the goal is to point the company to the 20% (or much less) of assets that provide 80% (or much more) of the value.
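
One way to find that high-value slice is to rank assets by how often they are used and keep the smallest set that covers a target share of total usage. The sketch below assumes hypothetical usage counts (for example, queries or downloads over a quarter) and a hypothetical `top_assets` helper. Usage is only a proxy for value, so treat the result as a starting point for curation.

```python
def top_assets(usage_counts, target_share=0.8):
    """Return the smallest list of assets, most-used first, whose combined
    usage reaches `target_share` of total usage."""
    total = sum(usage_counts.values())
    if total == 0:
        return []
    selected, covered = [], 0
    ranked = sorted(usage_counts.items(), key=lambda kv: kv[1], reverse=True)
    for name, count in ranked:
        if covered >= target_share * total:
            break
        selected.append(name)
        covered += count
    return selected


# Made-up usage counts for illustration only.
usage = {
    "orders": 500,
    "customers": 310,
    "web_sessions": 80,
    "old_export_2015": 40,
    "tmp_scratch": 30,
    "survey_raw": 30,
    "misc_a": 10,
    "misc_b": 5,
    "misc_c": 3,
    "misc_d": 2,
}

curated = top_assets(usage)
print(curated)
print(f"{len(curated)} of {len(usage)} assets cover at least 80% of usage")
```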

Bridge the impact and reproducibility gaps through data analysis and reuse
The impact (and reproducibility) gap
Our work with data is meaningless if it doesn’t influence the decisions we make. This is why sharing information with stakeholders ineffectively or incompletely increases risk and slows productivity. Lost cycles may cost hundreds of thousands of dollars, but a bad decision can cost millions.

Therefore, we need to ensure IT, data stewards, data engineers, analysts, and business people are collaborating. With cross-functional collaboration, analyses can be documented and shared in a way that is agile, iterative, and easily consumable. Workflows can also be reused and reproduced easily to deliver more consistent answers.

Which use case is most relevant for you?
Now that you have a good sense of the three high-impact data catalog example use cases, it’s time to consider which one you might want to take on for your company. It’s best to start by picking the most pressing problem to solve or gap to close, and then eventually working your way up to doing all three.

  • If your biggest problem is understanding what data assets the company has and what they mean, inventory your most impactful assets.
  • If it’s figuring out which data assets are most accurate and reusable for any given situation, curate what’s useful.
  • And finally, if you have a recurring analysis or business challenge, encourage your colleagues to analyze, share, and iterate their analyses to make them reusable.

Check out these real-life stories from data.world customers who successfully launched our cloud data catalog and saw value from these use cases.

Stories of how data catalog software drives impact
All of these organizations have made a conscious decision to close the data and meaning gap by launching a cloud data catalog. While each story is different, their goals were the same: answer business questions with more clarity, accuracy, and speed.

One of the world’s largest software companies created a business glossary and dashboard catalog.

A multi-billion-dollar software as a service company, focused on financial and human capital management, uses data.world’s cloud data catalog to index and organize its data assets, connecting them and the people who use them through a common business glossary. By building a single source of truth, the company’s employees are able to find what they need easily, ask the right colleagues for help, and use reliable data in a consistent way. As a result, their data is more accessible, valuable, usable, and free of redundancy.

A global management consulting company enables their data to be found faster than ever before.

This customer is renowned for being on the leading edge of innovation, applying data to help its clients answer their business questions. Any delay in answering a client’s question could mean lost revenue for the organization, so they aimed to streamline the process for finding, understanding, and utilizing data to produce analyses.

This company uses our data catalog to create a curated, user-friendly data portal. Consultants are able to find the right data faster and use it more often with the organization’s new portal, which contains owned, purchased, and derived analyses. Since data.world automatically gathers context and ongoing analysis and identifies relationships between datasets, projects, and teams, the firm’s employees are able to be more connected and efficient with data.

The Associated Press uses curated datasets to transform the way news is reported.

On any given day, more than half the world’s population sees local news from the Associated Press (AP) through local media runs and reports. However, delivering story-relevant data to the right hands in local newsrooms is a daunting task. Previously, data would be distributed to the wrong people at the wrong time, getting lost in inboxes far from those who could use it. The most time-consuming aspect of this work (estimated at 80 percent of total project time) was finding, vetting, and cleaning data. As a result, the barrier to entry for using data was high, and local newsrooms usually lacked the time, staff, and tools.

AP and data.world make data journalism accessible by transforming the way data reaches local newsrooms. Technical users can now create and share queries faster without leaving the platform or spinning up a database. Additionally, less technical users can slice data for their local news markets without any prior coding or data science knowledge. Now with the option of exporting results in common formats, anyone can dig in and get clean data faster. Newsrooms across the country now have actionable data that can be used to inform the public on how national events affect their local communities.

Mirum, a global digital experience agency, streamlines their data projects for thousands of people around the world.

Mirum has over 2,500 people in 25 countries. Data and data-literate people are the key to how Mirum creates unforgettable experiences for clients like Mazda and Qualcomm. With their already sophisticated approach to data analysis, Mirum wanted to take the next step and better package their data to make their expertise even more valuable.

data.world helped Mirum streamline their new data practices and improved processes seamlessly across projects and teams. Discussion (between coworkers, between agencies, and with client stakeholders) shifted from email to dedicated project comment threads. Now, the full data project lifecycle lives on a single platform, data.world. Teams at Mirum not only do the work through data.world, but deliver it to their customers through the platform as well.

“Mirum has always believed in data, and data.world has helped us extend its power to every aspect of client work.”

– Amanda Seaford, CEO of Mirum US

Aceable, an innovative tech startup, saves time by streamlining workflows and providing self-service data access.

Aceable creates easily consumable, mobile- and digital-first content for defensive driving courses. In order to recognize more revenue, they needed a quick way to retrieve data without exhausting the resources of their business analysts. With data.world, a single person at Aceable can now consume, integrate, and query the data to calculate revenue recognition. Streamlining this workflow reduces analysts’ workloads and avoids the time-intensive analysis bottleneck. As a result, C-suite executives receive important business data more quickly.

“data.world allows me and my team the freedom to explore and analyze data quickly and make decisions faster than we could before.”

– Erin Defossé, Chief Product Officer at Aceable

Ready to begin writing your own success story?

Now it’s your turn. Prepare for your data catalog launch with these tips.

How to launch a data catalog for maximum value and adoption
Consult and collaborate with your evaluation team

First, work with your evaluation team and executive sponsors on determining and tracking key performance metrics, so you can measure the impact of your data catalog tools. Don’t skip this step! You need to welcome differing perspectives from your colleagues and align everyone around the same goal from the get-go, or you could jeopardize the launch of your data catalog.

Most importantly, you want to track the impact of your data catalog use cases at every stage of the data lifecycle and for every role to see if it’s working. In order to do that, you need to understand how each of your teams currently works with data, what they want to improve, and how they envision that improvement materializing in their day-to-day work.

To do that, take these three steps while launching your enterprise data catalog:

  1. Understand their unique perspectives: Qualitative feedback is just as important as quantitative metrics. So encourage your pilot team members to voice any opinions, good or bad, about their data catalog experience.
  2. Polish your processes before onboarding others: Now that you have gathered both qualitative and quantitative feedback on your catalog implementation, it’s time to work out all the kinks. Measure what works and what doesn’t to give you a) proof of a successful initiative and b) more opportunities to improve your processes before onboarding the rest of the organization.
  3. Bring proof: Elevate your credibility as a data and analytics leader by showcasing the quantifiable results and good ROI that the data catalog brings to your company. Highlighting tangible success will encourage others to support and contribute to your data catalog initiative.

Data catalogs become more valuable as more people use them, so creating hype and developing buy-in is your bridge to a data-driven culture.

These three critical components of your measurement plan will ensure that your whole organization benefits from your enterprise data catalog pilot.

What else can you do to ensure the success of your data catalog launch?

Track your data catalog’s impact beyond usage metrics

Looking beyond platform usage, be sure to also measure the impact of your data catalog on team productivity, organizational culture, and overall business results. If this seems unclear at first, don’t worry. This will be an ongoing process to refine as you grow.

Remember, you invested in a data catalog to bring people, data, and analysis together and to give employees clear, accurate, and fast answers to any business question. Design your measurement plan to reflect that.

Need advice on how to start? Try categorizing metrics in these four buckets as you build your measurement plan.

PRODUCTIVITY: Are you working faster and getting more done?
DATA-DRIVEN CULTURE: Are more people collaborating with data?
USAGE: Is the right data being used for the right projects?
BUSINESS: Do you have a clear way to measure impact in dollars and cents?

These categories should reflect your most important priorities as a data and analytics leader. Be sure to benchmark your current state before launching your data catalog. Productivity metrics are particularly great to record and measure from the start, since you can capture them while determining success goals and metrics with the evaluation team.

Other metrics, such as usage, will probably only be useful after you launch your enterprise data catalog software, so keep that in mind as you move forward.
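
A measurement plan along these lines can be sketched as a baseline-versus-current comparison grouped by the four buckets. The metric names, the numbers, and the `percent_change` helper below are hypothetical placeholders; substitute whatever your evaluation team agrees to track.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Metric:
    bucket: str                 # PRODUCTIVITY, DATA-DRIVEN CULTURE, USAGE, or BUSINESS
    name: str
    baseline: Optional[float]   # value before launch; None if not measurable yet
    current: float              # value after launch
    lower_is_better: bool = False


def percent_change(metric):
    """Improvement relative to baseline, as a percentage.
    Positive means better. Returns None when there is no usable baseline."""
    if metric.baseline is None or metric.baseline == 0:
        return None
    change = (metric.current - metric.baseline) / metric.baseline * 100
    return -change if metric.lower_is_better else change


# Illustrative numbers only.
plan = [
    Metric("PRODUCTIVITY", "hours to find a dataset", 5.0, 2.0, lower_is_better=True),
    Metric("DATA-DRIVEN CULTURE", "people sharing analyses per month", 20, 35),
    Metric("USAGE", "weekly active catalog users", None, 120),
    Metric("BUSINESS", "cost per analysis (USD)", 1000, 800, lower_is_better=True),
]

for m in plan:
    change = percent_change(m)
    status = "no baseline" if change is None else f"{change:+.0f}% vs. baseline"
    print(f"{m.bucket}: {m.name}: {status}")
```

The usage metric has no baseline because the catalog did not exist before launch, which is why it only becomes informative once you can compare later periods against each other.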

Connect people, data, and analysis
When you measure success, you quantify the business impact of accessible and integrated data, collaboration, and analysis. You provide the necessary momentum to make your business faster, smarter, and more efficient as more teams and projects come on board.

Doing this brings you one step closer to making your organization truly data-driven. And that’s your goal, right?

“As data and analytics become pervasive across all aspects of businesses, communities and even our personal lives, the ability to communicate in this language – that is, being data-literate – is the new organizational readiness factor.”

– Valerie Logan, Research Director at Gartner

Here’s one last thing to consider as you begin building your business case for a data catalog.
The business value of a data catalog
To truly understand the business value of a data catalog, consider how much your company spends annually on data products (data lakes, warehouses, data security, cloud infrastructure, etc.) and people (data scientists, engineers, analysts, stewards, etc.). It’s likely a significant portion of your company’s overall IT budget.

Given today’s challenging times, you may expect the ROI from your data initiatives to be lower than in years past. But according to Gartner, companies that offer a “curated catalog of internal and external data to diverse users will realize twice the business value from their data and analytics investments.” In fact, data catalogs can provide outsized impact at times when data is increasingly important.

We are seeing all kinds of businesses – from banks to restaurants to tech companies – make abrupt and, in some cases, multi-million-dollar changes to their operations. For companies trying to forecast sales two quarters out or assess the stability of their supply chain, data catalogs are an incredibly effective tool for ensuring the data and metadata that support the analysis and decision-making process are up to date, accessible, and understandable.

Ready to infuse clarity, accuracy, and speed into your data work?
Now that you know what to do to find, evaluate, and launch the right enterprise data catalog for your business, it’s time to do just that!

Want to see it for yourself? Get a demo of our enterprise data catalog and see what it’s like to begin connecting your data sources and building your datasets for collaboration!

Request a demo of our cloud data catalog!

We’ll show you how data.world makes it easy for everyone—not just the “data people”—to get clear, accurate, fast answers to any business question.

About data.world

data.world makes it easy for everyone—not just the “data people”—to get clear, accurate, fast answers to any business question. Our cloud-native data catalog maps your siloed, distributed data to familiar and consistent business concepts, creating a unified body of knowledge anyone can find, understand, and use. data.world is an Austin-based Certified B Corporation and public benefit corporation and home to the world’s largest collaborative open data community.

 

 

 
