We all know that data is growing rapidly in size and complexity, and so is the demand for faster and deeper insights. As we try to tame the data chaos while speeding up integration, processing, analysis, and access, data quality suffers. Let’s explore how we can have our cake and eat it too.

Data quality is contextual

Data quality means implementing the right processes and tools to achieve insights quickly while ensuring information is trustworthy and usable. How you define quality depends on how the data will be used. As Thomas C. Redman writes in Data Driven, high-quality data is “fit for [its] intended uses in operations, decision making and planning.”

So…

You need to answer these questions about how your data will be used to identify the data catalog use cases that matter most to your business. This is key to implementing an iterative, agile approach to data governance.

Put data quality under your data governance umbrella

Agile data governance provides the principles, policies, business workflows, and technology solutions that accelerate data access and insights while keeping data safe. Data quality is a key pillar of governance: unless people can trust the data and quickly put it to work, you cannot achieve democratization, self-service, risk mitigation, or compliance.

Build a company culture that embraces data quality 

Your culture must emphasize high-quality data and take responsibility for it. Otherwise, it’s hard to make progress on your data governance initiatives and broader data strategy, because they will get bogged down in finger-pointing and hand-waving.

So…

Now it’s time to size up the quality of your data

With your data catalog use cases, governance framework, and company culture embracing data quality, it’s time to measure impact and effectiveness. The “big 6” categories used to measure data quality include:

- Accuracy: the data correctly describes the real-world entities and events it represents
- Completeness: all required records and values are present
- Consistency: values agree across datasets and systems
- Timeliness: the data is available when needed and reflects the current state of the world
- Validity: values conform to the expected type, format, and range
- Uniqueness: each entity is recorded once, with no duplicates
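As a rough illustration, here is how three of these dimensions (uniqueness, timeliness, and consistency) might map to concrete, computable checks. This is a minimal sketch using pandas; the table, column names, and reference list are hypothetical assumptions, not prescribed rules.

```python
import pandas as pd

# Hypothetical customer extract; names and rules are illustrative.
customers = pd.DataFrame({
    "customer_id": [101, 102, 102, 103],
    "country": ["US", "DE", "DE", "US"],
    "updated_at": pd.to_datetime(
        ["2024-05-01", "2024-05-02", "2024-05-02", "2024-04-01"]
    ),
})

# Uniqueness: each customer_id should appear exactly once.
duplicate_ids = customers["customer_id"].duplicated().sum()

# Timeliness: how stale is the most recent record, as of a given date?
staleness_days = (pd.Timestamp("2024-05-03") - customers["updated_at"].max()).days

# Consistency: country codes must agree with a reference list from
# another system (hypothetical billing-system codes here).
valid_countries = {"US", "DE", "FR"}
inconsistent = (~customers["country"].isin(valid_countries)).sum()

print(duplicate_ids, staleness_days, inconsistent)  # 1 1 0
```

The value of the “big 6” is that each dimension can be turned into a check like these. The trap, as described next, is turning every dimension into a check on every dataset.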

One of the biggest mistakes we see over and over again is trying to apply one-size-fits-all quantitative metrics. Companies will say “okay, the percent of nulls will be completeness… we’ll track percent change in values and anything over 5% change is a red flag… we’ll run data type validations to ensure strings are strings and integers are integers.” While these feel right in spirit to a Chief Data Officer, this approach very quickly leads to “boiling the ocean” and inevitably results in a high noise-to-signal ratio: alerts constantly fire for changes that don’t matter, while true problems go unnoticed.
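To make that concrete, here is a minimal sketch of those one-size-fits-all checks. It uses pandas on a hypothetical orders table; the column names, the 5% drift threshold, and the expected types are illustrative assumptions.

```python
import pandas as pd

# Hypothetical "orders" extract; column names and types are illustrative.
current = pd.DataFrame({
    "order_id": [1, 2, 3, 4, 5],
    "amount": [10.0, None, 8.5, 12.0, 9.0],
    "status": ["new", "shipped", "new", None, "shipped"],
})
previous_row_count = 4  # last run's row count, for the naive drift check

# Completeness: percent of nulls per column.
null_pct = current.isna().mean() * 100

# Drift: flag any run where row count changes by more than 5%.
change_pct = abs(len(current) - previous_row_count) / previous_row_count * 100
drift_alert = change_pct > 5  # fires on routine growth, not just real problems

# Validity: ensure expected data types.
expected_types = {"order_id": "int64", "amount": "float64", "status": "object"}
type_alerts = {
    col: str(current[col].dtype)
    for col, expected in expected_types.items()
    if str(current[col].dtype) != expected
}

print(null_pct.round(1).to_dict())  # {'order_id': 0.0, 'amount': 20.0, 'status': 20.0}
print(drift_alert, type_alerts)     # True, {} -- a noisy alert on a healthy table
```

Every metric here computes “correctly,” yet the drift check fires on a perfectly healthy table that simply grew. Multiply that across every column of every dataset and alert fatigue is guaranteed.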

Start with an agile data governance approach

The challenges described above are why we strongly advocate focusing on the highest-value use cases and the related high-value datasets and data systems. This ensures context-aware, fit-for-purpose metrics and business processes are in place, so quality efforts go first to where you get the biggest return on investment. You can focus on the minimum valuable metrics that emphasize trustworthiness and usability. This is what we call agile data quality.
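As a sketch of what “minimum valuable metrics” can look like in practice, the configuration below scopes checks to the datasets behind one high-value use case, with thresholds chosen for that context. The use case, dataset names, rules, and thresholds are all hypothetical.

```python
# Hypothetical scoping of data quality checks to one high-value use case.
AGILE_DQ_CONFIG = {
    "use_case": "monthly_revenue_reporting",
    "datasets": {
        "finance.orders": [
            # Completeness matters here: every order needs an amount.
            {"check": "null_pct", "column": "amount", "max": 0.0},
            # Revenue data arrives daily; staleness is a real failure.
            {"check": "freshness_hours", "max": 24},
        ],
        "finance.refunds": [
            # Refunds are sparse, so row-count swings are expected; use a
            # loose threshold instead of a blanket 5% rule.
            {"check": "row_count_drift_pct", "max": 50},
        ],
    },
}

def checks_for(dataset: str) -> list[dict]:
    """Return only the checks scoped to this dataset, or none at all."""
    return AGILE_DQ_CONFIG["datasets"].get(dataset, [])

# Datasets outside the use case get no alerts -- by design.
print(checks_for("finance.orders"))
print(checks_for("marketing.web_events"))  # [] -> no noise
```

The point is not these specific rules but the scoping: every check exists because a named use case depends on it, so every alert is worth acting on.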

Data cataloging and governance platforms can ensure decisions are driven by good data, power a use-case-driven approach, and support the metrics and business processes for agile data quality. Supporting catalog capabilities (mapped to the “big 6” categories) include data profiling, lineage, and quality inspections.

You will be tempted to evaluate the quality of your data through blanket quantitative metrics, but this will lead to boiling the ocean. Instead, focus on the data behind your highest-value use cases. You can achieve data quality through a combination of technical features (such as data profiling, lineage, and inspections) and agile data governance processes.
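Lineage deserves a quick illustration, since it is what turns a quality alert into a root cause. Below is a minimal, hand-rolled upstream-lineage walk; a real catalog harvests these edges automatically, and the table names here are hypothetical.

```python
# Minimal upstream-lineage sketch. Table names are hypothetical; a
# catalog would capture these dependency edges automatically.
LINEAGE = {
    "dashboards.revenue": ["marts.monthly_revenue"],
    "marts.monthly_revenue": ["finance.orders", "finance.refunds"],
    "finance.orders": ["raw.orders_extract"],
}

def upstream(asset: str) -> set[str]:
    """Walk the graph to find every source an asset depends on."""
    sources: set[str] = set()
    for parent in LINEAGE.get(asset, []):
        sources.add(parent)
        sources |= upstream(parent)
    return sources

# When a completeness check fails on the revenue dashboard, lineage
# narrows the investigation to a handful of upstream candidates.
print(sorted(upstream("dashboards.revenue")))
# ['finance.orders', 'finance.refunds', 'marts.monthly_revenue', 'raw.orders_extract']
```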

Remember, data quality is a team sport, not just a technical issue. Your company culture must embrace data responsibility!