How to Evaluate a Modern Data Catalog

May 17, 2022 | 2022, data architecture, Data catalogs, data tools

The data catalog market has exploded in recent years, and for good reason. Organizations have a ton of data and metadata, but it’s often difficult to find and understand, and people are not sure how to apply the data to their business. A modern data catalog serves as a centralized knowledge repository that makes it easy to discover, access, and use relevant data assets. But with so many choices for catalog partners – and near-identical messaging – how do you choose the right one for your organization? A request for information (RFI) is a great place to start.

Advice for selecting the right data catalogs to RFI

Before we jump into modern data catalog RFI evaluation criteria, let’s talk about vendors. Namely, how do you narrow down the list from dozens of available solutions to those you want to RFI? 

Here are a few tips to help you get started:

  1. Use cases: What is it that keeps you up at night that you want a data catalog to solve? Many catalog vendors promise the world, but the truth is no single solution can solve every data and analytics problem. (And if they claim to, run.) Be specific and keep this top of mind when reviewing platforms.
  2. Total Cost of Ownership (TCO): What is your budget for this project? Think bigger than the base price of the platform. How easy is it to implement? Who manages updates? What is the cost to add users as adoption grows? Be aware that some catalog vendors differentiate by the type of “seat,” charging more for power users or editors than viewers.
  3. Personas: Who will be accessing your catalog? Are you seeking a solution your governance team can use to manage data access requests? Do you need a hub for your data engineers to monitor pipelines? Will the catalog be the go-to data discovery platform for business users? How you answer these questions will help inform the criteria specific to your search.
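
The TCO questions in tip 2 can be turned into simple arithmetic. The sketch below is only an illustration, assuming a flat annual platform fee, a one-time implementation cost, an annual support cost, and per-seat pricing that differs by seat type. The names (SeatTier, total_cost_of_ownership) and every figure are hypothetical and do not reflect any vendor's actual pricing.

```python
from dataclasses import dataclass


@dataclass
class SeatTier:
    """One type of seat, e.g. editor or viewer (hypothetical pricing)."""
    name: str
    annual_price: float  # assumed price per seat per year
    seats: int           # number of seats you expect to need


def total_cost_of_ownership(base_annual_fee, implementation_cost,
                            annual_support_cost, seat_tiers, years):
    """Estimate TCO over a number of years under the assumptions above."""
    # Seat cost per year, summed across all seat types.
    seat_cost = sum(t.annual_price * t.seats for t in seat_tiers)
    # Recurring costs are paid every year; implementation is paid once.
    recurring = (base_annual_fee + annual_support_cost + seat_cost) * years
    return implementation_cost + recurring


# Made-up example: 10 editors and 100 viewers over a 3-year term.
tiers = [SeatTier("editor", 1200.0, 10), SeatTier("viewer", 200.0, 100)]
tco = total_cost_of_ownership(
    base_annual_fee=50000.0,
    implementation_cost=20000.0,
    annual_support_cost=5000.0,
    seat_tiers=tiers,
    years=3,
)
print(f"Estimated 3-year TCO: {tco:,.0f}")
```

Rerunning the estimate with larger seat counts shows how quickly tiered seat pricing can outgrow the base price as adoption grows.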

Aside from the three considerations listed, you should also think about your preferred data catalog deployment model (on-prem or SaaS) and its underlying architecture (relational datastore or knowledge graph).

Modern data catalog evaluation criteria

Now that you know your use cases, personas, and budget, it’s time to assess potential solutions. We like to bucket data catalog capabilities within these 10 categories:

Discovery

If you want to maximize the value of your data (who doesn’t?), you first need to understand it. Use the data discovery criteria to assess the sophistication of the catalog’s ability to crawl, scan, profile, and establish a rich metadata baseline for your existing and future data sources in an agile manner.

Curation

At the end of the day, business users care about one thing: can I find the information I need, in a way I understand? Your data catalog must meet these users where they are. Consider a solution that demonstrates fresh ways of delivering a business glossary, linking business terms to data attributes, and building a semantic layer.

User Experience

A data catalog is your portal to discover, connect, and unlock the potential of your data assets. Your catalog must be intuitive, democratize knowledge, and become an indispensable part of daily data analysis for every role in the organization.

Integration

Data catalogs are the glue that binds multiple components of the data analytics pipeline. Your solution should seamlessly extend catalog capabilities into adjacent areas of data governance, quality, management, and privacy, while letting you develop your own integrations.

Governance

Automations increase productivity and reduce the time to value for data explorations. Choose a catalog with functional governance capabilities to simplify catalog setup, onboard users, derive end-to-end lineage, enable data quality, and audit usage.

Collaboration

Dynamic business environments tap into subject matter experts across domains to build a single source of truth. Consider a data catalog with rich collaboration features to annotate, describe, certify, rank, and build context for metadata.

Architecture

As your business grows, your catalog must expand in tandem. It must always be available and reliable, and it must scale with increasing demands. Your catalog must also deliver effortless updates while ensuring the utmost security.

Vendor

Your catalog vendor is your partner in your ongoing success. The vendor should support efforts to build workflows, deliver strong value for your catalog investment, share customer successes and best practices, and continually introduce new innovations in its roadmap.

Business Outcomes

Data-driven organizations are constantly innovating as data producers and consumers modernize their business practices. An essential capability of your data catalog is supporting modern use cases, such as self-service analytics, cloud migration, data mesh, and data privacy risk mitigation.

Ecosystem

Your data catalog can be a change agent in your organization. Use this opportunity to assess the ease of installing, configuring, and linking your data catalog to third-party integrations and external sites.
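
One way to compare RFI responses across the 10 categories above is a weighted scoring matrix. The sketch below is an assumption about how you might do that, not a method prescribed by any template: the weights, the 1-to-5 score scale, the vendor names, and the function names (weighted_score, rank_vendors) are all hypothetical.

```python
# The 10 evaluation categories described in this article.
CATEGORIES = [
    "Discovery", "Curation", "User Experience", "Integration", "Governance",
    "Collaboration", "Architecture", "Vendor", "Business Outcomes", "Ecosystem",
]


def weighted_score(weights, scores):
    """Weighted average of one vendor's category scores."""
    missing = [c for c in CATEGORIES if c not in weights or c not in scores]
    if missing:
        raise ValueError(f"missing categories: {missing}")
    total_weight = sum(weights[c] for c in CATEGORIES)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(weights[c] * scores[c] for c in CATEGORIES) / total_weight


def rank_vendors(weights, vendor_scores):
    """Return (vendor, score) pairs sorted from best to worst."""
    scored = [(name, weighted_score(weights, s))
              for name, s in vendor_scores.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical weights: this team cares most about discovery and governance.
weights = {c: 1 for c in CATEGORIES}
weights["Discovery"] = 3
weights["Governance"] = 2

# Hypothetical scores on an assumed 1-to-5 scale.
vendor_a = {c: 3 for c in CATEGORIES}
vendor_b = {c: 3 for c in CATEGORIES}
vendor_b["Discovery"] = 5

ranking = rank_vendors(weights, {"Vendor A": vendor_a, "Vendor B": vendor_b})
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```

Setting the weights from your use cases and personas before you read any vendor responses helps keep the comparison tied to your own priorities.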

Dive deeper with data.world’s Modern Data Catalog RFI Template

Now that you have an idea of the high-level criteria for selecting a modern data catalog, it’s time to dive deeper. Download data.world’s Modern Data Catalog RFI template to get started with your own evaluation.