The meaning of “data governance” has expanded over time. What passed for effective data governance yesterday wouldn’t make the cut today. From data and analytics advancements to AI/ML considerations, what does the new data governance look like? And how do we build an AI-ready data governance foundation that will still be effective tomorrow?

At data.world, we’ve been thinking about how to create a framework for effective data governance that’s as future-proof as possible. Bring on larger volumes of data. Bring on a proliferation of data sources. We’re setting the stage in the following five ways:

  1. Shift to a “center of excellence” model for governance: This shift can be facilitated by AI and automation, so that data teams can focus on their core business instead of acting as a police force that enforces policies at the micro level. Of course, some businesses demand a centralized, top-down approach, especially when dealing with highly regulated data. But even in those situations, the rules of the road can be well understood and codified internally.

  2. Adopt a privacy-by-design approach: With concern over data privacy increasing, consider privacy and trust at every stage of an AI project, from initial design to final output. Implement robust encryption, use anonymization techniques, and establish secure data storage practices. AI-readiness requires protecting sensitive information and building trust.

  3. Plan for failed adoption: Adoption of data governance programs will vary, so don’t expect a 100% adoption rate. In fact, it helps to be a little pessimistic: most data governance efforts are ad hoc and spread across a wide variety of teams. Many teams start a governance program from a security, access control, or sensitive data standpoint, without much thought to best practices. As a result, fervor and momentum fade, and governance can feel antagonistic in the long run. Plan for that, and then plan for how to mitigate it.

  4. Formalize a data governance reporting structure: The reporting structure can strongly affect the success of governance. Governance can be implemented in a variety of ways, but some things must be formalized, namely accuracy, transparency, and security. Will the data you train models on ensure the accuracy of results? Are AI models being trained on the right data for their purpose? How are you protecting the data as you train these models? You need a formal way of quantifying and tracking all of these things, with key roles and reporting structures in place. Establish formal responsibilities in the form of “product owners” or similar, so that a direct line of accountability can be defined and, if necessary, enforced.

  5. Create ways to measure success: One way to measure success is to track how many of your AI models make it to production and return value. Making it to production indicates that a level of trust in AI has been achieved. Data governance should not be self-serving. Instead, it should align tightly with the value chains your organization is working to execute. Then, assessing whether your AI-ready data foundation is working becomes more straightforward. Has a given initiative or effort actually moved the needle relative to business- or mission-centric measures and metrics? If so, you’re on the right track.
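
The measure described in point 5 can be sketched as a simple ratio. The Python below is a minimal illustration, not a data.world feature: the `ModelRecord` type, its fields, the `production_rate` function, and the sample inventory are all hypothetical, and what counts as “returning value” is an assumption your organization would need to define for itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelRecord:
    """Hypothetical inventory entry for one AI model."""
    name: str
    in_production: bool
    returning_value: bool  # assumption: confirmed by the accountable owner


def production_rate(models: list[ModelRecord]) -> float:
    """Share of models that are in production AND returning value."""
    if not models:
        return 0.0  # avoid dividing by zero on an empty inventory
    successful = sum(
        1 for m in models if m.in_production and m.returning_value
    )
    return successful / len(models)


# Illustrative data only; these are not real models or results.
inventory = [
    ModelRecord("churn-forecast", in_production=True, returning_value=True),
    ModelRecord("invoice-classifier", in_production=True, returning_value=False),
    ModelRecord("demand-planner", in_production=False, returning_value=False),
    ModelRecord("support-triage", in_production=True, returning_value=True),
]

rate = production_rate(inventory)
print(f"{rate:.0%} of models are in production and returning value")
```

Tracking this ratio over time, rather than as a one-off number, ties it back to the business- or mission-centric measures that governance is meant to serve.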