This is Part Three of a four-part series about Agile Data Governance. In Part One, we covered lessons learned from software development history and how they should guide us in today’s data challenges. Part Two profiles stakeholder types and explains the process at a high level.

If you try to build a data-driven culture with a top-down approach where every detail is planned far in advance, you will fail. But there’s wisdom in the saying, “How do you eat an elephant? One bite at a time.” (data.world does not condone eating literal elephants, just the metaphorical sort.) Agile Data Governance gives us a way to build an efficient data supply chain and create a data-driven culture one bite at a time.

In his book, Winning with Data, Tomasz Tunguz describes five main challenges companies must overcome to create data-driven cultures. Here’s how Agile Data Governance helps solve each of them.

1. Data breadlines

These are bottlenecks at the data producer/consumer threshold.  Data producers can’t keep up while servicing one ad-hoc request for data after another. Data consumers become frustrated with the delay in getting what they need. Analytics projects turn into endless email chains. By using agile principles, data producers, data consumers, and domain experts iterate together to build reusable assets that lower the frequency of ad-hoc requests.  New ad-hoc requests will be preserved next to the cataloged data assets and analysis for the next person to find and use before asking data producers for help.

2. Data silos and rogue databases

Everyone has encountered that “one person” who gatekeeps a special dataset and is the only one who can create a necessary report.  Perhaps this person built a one-off system to produce some analytics or scripts that only run on his or her laptop.  With Agile Data Governance, data consumers have a direct, clear way to request and iterate on data assets. This reduces the prevalence of “emailed spreadsheets.” Plus, data assets will be well-documented, so more people can find, understand, and use them. 

3. Data obscurity and lack of understanding

In most organizations, those who try to understand the availability and use cases of data assets encounter inefficiencies, partial answers, and confusing systems. This is primarily a documentation problem: disconnected tools that aren’t built for agile processes make documentation both a chore and an afterthought. In Agile Data Governance, you do the documentation while you do the work. This near real-time documentation increases organization-wide knowledge about what data exists, what it means, and how to use it.

4. Data brawls

When data work isn’t transparent, people don’t trust it. People show up with different versions of the same analysis after months of work. They argue about small details, data sources, even project goals.  With Agile Data Governance, transparency means course correction and peer review happen as analysis unfolds. This creates a shared understanding which can be poured into business glossaries and other alignment tools. No more tense meetings with three different answers to the same question. 

5. Data literacy

You can’t have a data-driven culture if your people don’t understand the basic workings of statistics. They need to appreciate, and have simple ways to follow, the scientific method and other best practices that make analytics valid and useful. This may be the biggest long-term benefit of practicing Agile Data Governance. Humans learn by copying and doing. An agile process encourages participation with, and observation of, talented people doing amazing work. This increases data literacy and skill across your entire company.

Why you need tools designed for the job 

Agile Data Governance calls for tools built for the job, just as the growth of agile and open-source software development demanded new tools. Agile software development meant throwing out heavyweight requirements docs and architecture diagrams that would take weeks to write and sometimes consumed entire office walls. It required tools that help teams iterate and collaborate in real time, such as new issue tracking and documentation systems, distributed source control, and continuous test and integration tools. In data and analytics, this means data catalogs designed for inclusivity, crowdsourcing, exploration, access, iterative workflow, and peer review.

How do you build a data-driven culture with Agile Data Governance? One bite at a time.

When you're ready, here's Part Four, Principles of Agile Data Governance.