Don't lift and shift: Data governance to AI Governance with Karen Meppen

Tim Gasper [00:00:01] Hello, everyone. It's time once again for Catalog & Cocktails presented by data.world. It's your honest, no-BS, non-salesy conversation about enterprise data management with tasty beverages in hand. I'm Tim Gasper, longtime data nerd, product guy, customer guy at data.world. Joined by co-host, Juan Sequeda.

Juan Sequeda [00:00:17] Hey, Tim. I'm Juan Sequeda, principal scientist at data.world. As always, it's a pleasure. It is Wednesday, middle of the week. Today, we shifted things a little bit so it's still early in the day, but I know there's folks on the other side of the pond for whom it's probably now appropriate to have a cocktail. We are here Wednesday and super excited to finally have a guest who is a longtime listener. And I'm super excited that she's finally here to chat about all stuff AI governance, data governance. Karen, how are you doing?

Karen [00:00:49] Hey, great. I am excited to be here. And yes, longtime listener, first time caller to join in on a conversation that is definitely a hot topic and definitely something that I think warrants more discussion.

Juan Sequeda [00:01:06] Yeah. No, inaudible-

Tim Gasper [00:01:09] And again, it can be for our topic today.

Juan Sequeda [00:01:09] Yeah. We're going to kick it off. Karen is the director of client services at Hakkoda. Let's kick it off with our tell and toast. What are we drinking and what are we toasting for today? Karen, how about you?

Karen [00:01:20] Yes. I had to think about that considering it's 8:30 in the morning for me. My first thought was Bloody Mary but that sounds like a lot of work for me, so I found something that I thought might be appropriate. I have a Shibui Whiskey, and that's a little nod to Hakkoda. Hakkoda is a nod to... It's one of the snowiest places on earth and Hakkoda is a Snowflake partner. And so, I thought there's a theme. So here is my whiskey in the morning.

Juan Sequeda [00:01:58] Wow.

Tim Gasper [00:01:58] Nice.

Karen [00:01:59] A Japanese whiskey to get it going.

Tim Gasper [00:02:03] You are truly dedicated to the cocktail cause. Even I have not done it properly. I'm just drinking some coffee.

Karen [00:02:09] I mean, I have a coffee just here, too. But hey, great.

Tim Gasper [00:02:11] And I will highlight my really cool Austin FC cup. I do like my little YETI cup here.

Juan Sequeda [00:02:15] Tim put some interest from my inaudible or something like that. So, some whiskey into my coffee here and do that. I'm having coffee, but... Cheers to you, Karen. Cheers to you for listening to us for so long. I just want to say cheers to all our listeners. We've been doing this for close to four years and we just do it because it's just so freaking cool that we have the opportunity to meet so many cool people like you, Karen, and somehow get to impact people's lives around this stuff. So thank you, Karen.

Tim Gasper [00:02:47] Yeah, cheers.

Juan Sequeda [00:02:49] Toast to you, Karen. Cheers.

Karen [00:02:50] I got you. Thank you.

Juan Sequeda [00:02:52] So given that our topic today is about lifting and shifting, our warm-up question is: how many times have you moved?

Karen [00:03:00] How many times have I moved? Geez. I'm getting closer, if I were to guess, around 10 times. It's a very traumatic experience actually. I've heard somewhere that up there with funerals and managing the death of a loved one, moving is equally traumatic and up there in terms of how it affects you.

Juan Sequeda [00:03:19] Yeah, I'm counting right now how many times.

Tim Gasper [00:03:24] How many times have you moved, Juan?

Juan Sequeda [00:03:26] I think I've done six major city-country movements. But within cities, I have also moved a lot. My parents, they moved. My parents loved to change things up and get a new house and change of house. And to the point that even still today, they don't even tell me. I remember once during COVID, they did this weird thing, that they built a house and they didn't tell anybody. And then I came one day to go visit them right after COVID and I'm like, "Where are you taking me?" "Well, look at what we did." I'm like... So, I grew up in stuff like that. I probably have switched houses between 15 and 20 times.

Karen [00:04:09] Whoa.

Juan Sequeda [00:04:10] Yeah.

Karen [00:04:11] That is a lot.

Juan Sequeda [00:04:12] And I freaking hate it. I'm super happy that in the house I'm in right now, I've been there since 2019. My wife and I are like, "We're not leaving anywhere." We're staying put. We love it.

Karen [00:04:25] You can't say that out loud because you put it into the universe and it just seems inaudible.

Juan Sequeda [00:04:28] Oh, crap.

Tim Gasper [00:04:29] Knock on wood.

Juan Sequeda [00:04:31] How about you, Tim?

Tim Gasper [00:04:35] I've traveled a bunch of different places, especially in the States, but never... I haven't moved that often in terms of cities. I grew up and I went to school in Cleveland, and then I moved to Austin. I've been here in Austin for almost 15 years now. I think I've lived in maybe four or five different places in Austin, but I have a house now in Austin so I have the luxury of not having to have moved very often. My wife and I keep on thinking like, "Should we move to a bigger house?" And then we think about the pain of moving and all of that, and we're like, "Nah."

Juan Sequeda [00:05:03] No, I tell you not to. Anyways, let's not talk about that. Let's talk about other things that people are considering moving, between data and AI. All right, Karen. Honest, no-BS, what is AI governance and how is this related to data governance?

Karen [00:05:18] AI governance is a reference to using, what I'll say, machine learning and the creation of some type of what I'll call a data product or generative AI. Within that family, it means making sure that all the guideposts are met so that your data product, whether that is a machine learning tool or gen AI, for example, is aligned to getting your desired outcomes, that it meets your regulatory obligations, that it is respectful of the human beings' data that is included to provide you those insights, and that it is also something that is easily understood. I'm sure we'll talk more about that. One of the things that I wanted to call out as well is that... One, everybody has their own definition of governance. But even within AI governance, I noticed a tension when you're defining it between what I'll call more GRC-type governance, GRC meaning governance, risk, and compliance, which is more like the legal validation, audit-type approach for data management, and the AI governance focus from an analytics standpoint, which aligns more with what you see in the DAMA domains and practices as it relates to data quality and fundamentals. And so, what I've noticed is that there are a lot of silos even between the folks who are attorneys or privacy engineers, pursuing compliance from that perspective of checking the boxes and making sure that you're following ethical or privacy-aware best practices, and the analytics folks who are coming from a data team. They're shaking the stick about what AI governance means. And even then, there's a disconnect.

Tim Gasper [00:07:35] Are both of them right? Because you have an expansive definition, I think here, and I think that's a good thing around AI governance. Is it all of this? Is that all of AI governance?

Karen [00:07:49] Short answer is yes, and it depends, in terms of the context of what you're doing and the nature of the family of, hand-wave, "AI," because that can mean a lot of different subdomains within that broad umbrella of AI. But yes, it is something that does need to be addressed. I do think it means all of it, in truth. And you definitely have to take the analytics data management perspective and then also the GRC standpoint. All of it really needs to be done and they're all making valid points about what's being missed or what should be prioritized.

Tim Gasper [00:08:34] I think maybe to go to another sort of honest, no-BS question here that's closely related to what we started with, is there a difference between, fundamentally, AI governance and broader data governance? Because there seems to be a feeling in the data community, the governance community, that there's something different. AI governance is a new beast to be tamed and certainly, there's new technology at play here. But how different are we talking?

Karen [00:09:08] Yes, I think what is happening is more that tension where those that have been working in, I guess we'll say, more traditional analytics and data management are noticing that the GRC folks are making a lot of noise as it relates to regulations, compliance, and the harms, those secondary effects, whenever you have an LLM put into production that affects people's lives or decisions on getting a loan or something like that. So I think that's the true disconnect. In truth, yes, you do need to do the basics before you can do the shiny things, which I've mentioned a few times. That is something the folks who have been in the analytics, data management, data governance realm have been making a lot of noise about: yes, data quality matters, as it relates to making sure you have datasets to train on that you can do something with, which is a challenge unto itself. Where I think we get into a different definition is that you can pick DCAM, DAMA, whatever domains of what we define as traditional data management and data governance. There are distinctly different concerns to be aware of when it relates to using AI and ML. When you say accurate in the context of data management, that refers to a dimension of your data quality domain. Accuracy as it relates to AI or an LLM, I'll just call it AI for now so we can move it forward, relates to the focus of your use case and how you're delivering your insights or answers, or the focus of your AI agent, for example. And to me, I think there's that difference. And then, it comes to explainability of the model. That is something that is different and not something that we've been doing. That matters for many reasons. Just that transparency of, "What exactly are you doing?" And privacy awareness, yes. 
Privacy awareness, you should do no matter what as it relates to the data that you're providing insights on in whatever context. Where it's different is that once you have your trained model, it becomes intellectual property, the totality of it. And so if you have training data that may be providing really powerful insights to make decisions or change behavior for a business or something that's valued in the market, if it's including someone's PII, then that is something that creates big problems. One, even in the US, that's not okay. But two, if there's a violation that's called out by the FTC, then you can't easily pull out all of the data that was used to create this LLM or AI agent. And so, there are concepts like disgorgement, or just being asked to destroy the model that you're using, which is not a desired outcome after you've spent a considerable amount of time and money getting there. And so going through there, I think there are different data handling and security controls that are just different because of the nature of how you can corrupt and modify the data. And then I think something that I see constantly on LinkedIn is this gotcha of fairness and bias. Put in, "Show me a picture of a CEO," and I feel like several times a day, I see someone posting a picture of an older white man, like... See the bias? Yes, that's a thing too. And it does have consequences when you're focusing on other things beyond creating pictures, yes. I think the point is that much of what's used in the training data is a reflection of what's in real life, of who we are, how we think, and what we do. There is bias all around us. What you do about that, though, is something that can get very complicated or controversial. And it is also something that's different from what we do on a day-to-day basis within analytics, data management, and data governance.

Juan Sequeda [00:14:22] There's a lot to unpack here. What's going through my head is seeing this Venn diagram and an overlap between what we're calling AI governance and data governance. And in particular, I really liked how you called out the GRC, the governance, risk, compliance, and the analytics governance. The overlap there is things when it's about... The PII is usually one of the things, the GDPR, all this compliance type of stuff. So, those are all the things that we are... Regulations are being afraid that bad things will happen right there. So, there's that overlap. And then there's other things that still fall into the GRC governance which may not be such a big focus or not a focus at all for the analytics governance. Things like biases that come from images and stuff like that. In analytics, that's not your world, so that type of stuff is not specifically for you. Now, maybe if you start creating these larger models that... If you're building a model based on the data that you have, which is your customer base or whatever, that's the facts of your organization, right? You remember-

Karen [00:15:36] Right.

Juan Sequeda [00:15:36] So, you're not trying to figure out more biases or balance the biases. You just train with what you have right there.

Karen [00:15:42] Right.

Juan Sequeda [00:15:44] But then there's also things that the governance folks, call it the analytics governance, are doing, but it's not such a big focus in AI governance, especially because that ends up being more about images and text. So, going back to the data quality and all these things. That is a focus when you're talking about structured data. But when you're seeing a lot of this governance over AI, it's more about this unstructured data, specifically text, documents, images. So I think that's the difference that we have to do. And I think to call out a post that Malcolm Hawker put out there today, this was basically a conversation Tim and I were having with him a couple of days ago. It's like, wait, a disconnect. I think you're bringing up also the silo. The silos that we're seeing here is that you have the data folks saying, "Oh, AI needs to have governance. It has to have quality data." But actually, the AI work that's happening, it's not dealing with the structured data that has to deal with the data quality. So a lot of people are just like, they just want to be loud. As you said, they're making noise but sometimes that noise is not helping anybody. Like, just go figure. I mean, you haven't even figured it out yourself yet. I don't know, I'm ranting now. I'm going to stop. Does that make sense?

Karen [00:16:57] Absolutely. And I think this is part of... Let's acknowledge that this is new to everyone, and that there are AI experts that are bubbling up left and right, often in our LinkedIn echo chamber. What I've noticed though is you can see the difference between those in the community who have done it and have a little scar tissue to show for it, and those that are posting for clicks. I think it's important to acknowledge that we're learning and that there are a lot of valid points that we need to bring in to the iteration or the life cycle development experience of delivering value, which is the whole point of it. Of something that supports the strategic goals of whatever business you're working in, and that there are real consequences to what you're putting out into the world that affect people's lives, and to be aware of that. I think there are so many frameworks that folks are competing over as, like, the end-all, be-all of what needs to be considered. And I think the point is that that's just calling out the disparity of how everybody's experiencing and working with the umbrella of all the different domains within AI, to deliver what's meaningful within their organization or how they interact with their customers. And that's really the point of that. Yes, there are different governance expectations and outcomes for unstructured data or semi-structured, and then what we're more used to is the more structured data and what you can do with it. It's not that there... It isn't being discussed so much, but I think that it's being managed differently. And the GRC folks, as far as from their perspective, from a privacy engineering perspective, are the ones that are more hands-on in that domain. And that's perhaps why there's a perception from the analytics governance folks that it's not being looked at, because we don't talk to each other often.

Juan Sequeda [00:19:16] Is this just jealousy from the...

Karen [00:19:20] I don't know about that.

Juan Sequeda [00:19:23] Look, it's like... We all talk about... We're in the data world, right? We're trying to make organizations be data-driven, right? The data scientist is the sexiest job. Like, "Oh, we have to look at what you do." And now comes the wave of generative AI which is very first, kind of text and stuff. And then all eyes just go to that, all funding goes to that. And I'm like, "But hey, me, me, me, me." And then everything goes there. Then they start talking using the word governance and they were like, "But we've been talking about governance over here. You're not paying attention to my..." But it's a different game, so I don't know. Part of me says we're not being part of that. The cool kids club has moved somewhere else and we're not part of it.

Karen [00:20:18] I would argue, at least from my perspective, that the folks that are in data or the analytics data space are probably driving the narrative a lot more than the GRC folks. But I have my own bias that I'll bring to the table. What I will say though is that it's undeniable that there's the hype cycle of the expectations of what AI is supposed to do. I think where we've been seeing the most adoption is in the spaces where there's a manipulation of data as it relates to selling and marketing to people, to customers. And so, that's also where you have a high degree of harms that can be experienced by those human beings. In the US for example, there's very little agency that an individual has to pursue harms against them. So, you're really just left to sit and take it. And that's really more where I think the GRC folks are speaking up. I think the more brick-and-mortar or more traditional enterprise organizations are still... For the ones that did get budget and got caught up in the FOMO, I think what you're seeing now is the reality of what it takes to really create a meaningful production-ready LLM or agent that helps the business, and also realizing what it can and can't do. It is not going to fold your socks and make sure that... And predict the future like the Oracle of Delphi. To be realistic about your expectations, I don't know if you're familiar with the Cynefin framework of just making sure that it's a problem that... The Cynefin frame, I'm just about to tangent on sounding like a big dork for-

Juan Sequeda [00:22:29] No, please.

Tim Gasper [00:22:30] Please continue.

Juan Sequeda [00:22:31] Please continue. I might feel more comfortable and I want to listen to this.

Karen [00:22:36] It's a four box of just how you define a problem and whether it can be solved or how you solve it. And the point, fast-forwarding, is to make sure your goals for what you're asking of your AI agent are realistic for what it's made to do, and to not overextend it by thinking that it can do more than what it is. We are still... I think it's best to qualify it. I've heard, "Think of it as your best intern." It's something that can facilitate a lot of more manual repetitive processes, for example, with your oversight, as opposed to fully deploying something to be standalone, to do something, to duplicate your work. We still need a thinking subject matter expert to decide whether it's right or wrong, to give feedback for tuning purposes. I think that's more where a lot of folks just don't know and are learning along the way, and I think it's more important to call that out. Similarly, within our AI governance experience, we're learning, and there are so many frameworks. In truth, you need to find the framework that works best for you, your use cases, and the team that you work with, as it relates to who you're interacting with or who the beneficiary is of your AI agent, let's say. And then also, just the best practices. As much as we have, like, SDLC practices and what's involved in DataOps, I think there's also something to be said that it's a blend of a lot of that. Plus, plus, bringing in the right people, human beings, at the right time, to make sure that you're talking through the implications. And then on top of it, even once deployed to production, monitoring. To make sure it's doing what you're expecting it to do, and also being able to intervene at the appropriate time and recognize when there are problems accordingly, because just like you have observability activities for whatever's in your data stack, similarly, you have different guideposts. When we talk about model poisoning, bad things can happen. 
There's a lot of different things you can do to your dataset that can have cascading negative consequences. Plan for that accordingly. I think that's the part that everyone's still learning, and I think everyone should be okay calling that out and acknowledging that this is a learning experience. But also, if you're using that AI agent within a business context, interacting with customers, you need to be very thoughtful about how it's being used because it is affecting human beings. I think in the news we had, Air Canada just... There's been a few things in the news but most recently, we have a chatbot. Sounds pretty benign. Somebody was talking with a chatbot at Air Canada about booking a bereavement fare and asked a question that's related to, "Okay, what should I do with..." What's the bereavement policy when his grandmother died? And the feedback that it gave the guy talking to the chatbot was that he should book the travel and then submit for a refund. And as a result, there are material consequences that his... Air Canada's being held accountable for that feedback. I mean, that's a real consequence that also has monetary impacts. And then separately, what I've been seeing is that there's been some traction in calling out an oldie but a goodie, a law in the US for wiretapping, the Wiretapping Act. And then if it's a third-party chatbot that's capturing your communications on somebody else's... on your data platform or on your website, that's something that has material consequence as well. And so, these are things to talk about but I think also are being tested in the courts. And also, from a data management perspective, what's realistic, and also what it costs from an FTE standpoint. Meaning all the human beings, all the intelligent human beings that have to contribute, and the time and money, the significant budget that it costs for training on your data and tuning it to bring it to production. 
Is that realistic for your desired outcomes to make it a worthwhile expense for everyone?

Juan Sequeda [00:27:38] I love this whole... This is a T-shirt, "Be realistic." Right? Is that realistic?

Karen [00:27:44] Right.

Juan Sequeda [00:27:45] I love it. By the way, what is the name of this framework?

Karen [00:27:49] The Cynefin framework? Let me... Oh. C-Y-N-

Tim Gasper [00:27:53] C-Y-N-

Karen [00:27:54] Framework. Yeah, C-Y-N-E-F-I-N.

Juan Sequeda [00:28:00] I got it.

Tim Gasper [00:28:01] Okay, interesting. I think this is an interesting topic area we just hit, which is around use case governance, not just around what's realistic but also... For a second there, I thought we were in analytics governance land, but it then came back to GRC again too, because there's an impact on our legal obligations. I mean, if a chatbot says, "Hey, you can do something," which actually isn't the policy, is a corporation liable for bad advice that a chatbot offers? And the answer is, probably. I mean, that is a representative of your company, right? And this goes into architecture too, and there's a technical aspect of this, too. You want to control these experiences, govern these experiences, where maybe in that situation, rather than it actually prescribing the policy, it should have said, "Hey, here is where you can read up on our bereavement policy." Or, "Here's the phone number you can call in order to learn more about the policy." But it actually went and it proposed something, so this creates a lot of interesting things here. I think what I'm seeing from this conversation is that even though they share common foundations, broader data governance and AI governance aren't wholly the same. You can't just go into this whole lift and shift. Or, the kind of analogy we want to give here: there's not just a lift and shift of data governance to AI governance. There's new things to take into consideration and extended aspects of the foundation which you have to take into account.

Karen [00:29:38] Absolutely, yes. And I think that's the point, in that I find myself in that push and pull. It's its own culture unto itself, from the privacy practitioners at so many large organizations that are jumping in and doing what's called privacy engineering, to those that are on the cybersecurity side. And the point is, I think the new normal is that that's what governance is and will be, and I've seen tension from every point of view of that. I've learned in working with the privacy folks that they don't have an awareness of what master data management is and the impact of that as it relates to consistency in your data for changes that may occur or updates that... That's one thing that I've noticed. And then separately, as you're calling out, the analytics governance folks, those that have been doing data governance thus far, see their boundaries as it relates to a lot of the frameworks we've already been talking about, DAMA or DCAM or whatever. They usually see a hard line when you go over to the cybersecurity folks, the privacy folks, where they say, "Oh, that's a legal team. I'm not an attorney." But it's really something that you have to have awareness of because you have to do all of it. And then the question is, what's the most efficient way to do it? I've learned it more because I feel like I've needed to MacGyver my way through a lot of things, like, "I need to figure this out." But what does that look like in real life for us going forward? For a tenable governance experience to do what's right and also to deliver something that the CEO or the CFO gets excited about in terms of what we delivered for... From what we originally set up in our business case when we started the experience, right?

Juan Sequeda [00:31:45] What I find really interesting about this is that you're saying, if I understand correctly, this new normal is going to be more about having a big focus on privacy and cybersecurity. But then, that means that... Again, I always talk about the pendulum swings. It seems like we're swinging the pendulum also to one side because of like, "Oh, bad things can happen." But there's also these core data management practices that are not well-known coming from the privacy security side. There may be privacy and security issues there, but there's just more education that needs to be happening across all these different silos that we're seeing right now. So, it begs the question: where in an organization, and how, is this all being managed? You have data governance offices, and then you now have people doing AI, and then people doing security in other places. Just by how things are structured, everything is siloed. How is this all going to start connecting or not? I don't know.

Karen [00:32:48] I mean, the reality is nobody's talking to each other, and I think that's pretty consistent. What I've learned is that it's often confusing, when I land in a data team where we're working on pulling together their data stack and perhaps creating some data products together, to ask, "Hey, who leads your compliance team? Who is responsible for your risk management? Can we bring them in now and make them a part of the experience?" And often, there are a lot of follow-up questions like, "Why would you do that?" So, I think that's just a new motion for everyone. What does it look like going forward? That's a great question, and what I think will happen is that organizations will change when it's worth their while to do it. As with anything else, if it behooves them to improve value streams and turn it into something that affects the P&L, I think that's when there will be future change. And that's more where I also think that there is that tension, which is I definitely understand where the GRC folks are coming from, absolutely. I also understand what the folks in the MarTech space are working on and what they're trying to do. That's where I think the point is to show that you can deliver really powerful, profitable multiples of revenue by doing the right thing, by being privacy aware, by being compliant. And I think that's really what the point is, as compared to what I have seen, which is there is often zero awareness among those data scientists and data engineers. They are working together to just ingest all the data and come up with some incredible insights and behavior, but without any acknowledgement that it is some human being's information that, really, is surveillance-type information about behavior and location, so that they can be better marketed to or all of that sold to somebody else as a package. 
I think either there's going to be an inflection point where something really catastrophic happens to get everyone's attention, or, I'm hoping, more by setting patterns of doing the right thing, we can better align organizations so that all the folks within the data teams, compliance and risk, and privacy or the legal teams all work together. We all must work together in a cross-functional team. That really is the only way to work together, primarily because it takes a whole lot of time to go speak with everybody individually. But also, it's the right thing to do. And I think it's more moving away from a more tribal, fiefdom-like, feudal system, which is really where we're at at the moment for most organizations.

Tim Gasper [00:36:16] Yeah. I know that inaudible.

Karen [00:36:17] I have my goals, you have your goals that are aligned for the year. I'm looking out for me and don't get in my way. And often, what I've noticed as well is that they may conflict in some way when it comes to reaching said annual goals and stuff. So yeah, it's a challenge. And great call-out, Juan.

Tim Gasper [00:36:41] No, that's really a huge challenge. One thing that gets triggered, it's related to this but a slightly different topic is I know before we started doing our live show today, we were talking a little bit about data products and data mesh approach and things like that. How do you see data products fitting into the picture around AI and AI governance? Is that going to help with some of these problems around fiefdoms and these different silos?

Juan Sequeda [00:37:15] I will add to that. We continue to talk about all the silos, just the culture of people silos, and I wonder if this is a way we all rally together to start the communications. We may all have different perspectives, we have different goals, but what we can agree on is that this thing here would make value for all of us. So, let's go figure out this thing. And I'm holding something. I mean, this is the product which I'm doing this-

Tim Gasper [00:37:41] The product and then the governance, I don't know.

Juan Sequeda [00:37:44] Yeah, really.

Tim Gasper [00:37:45] We're gesturing for our listeners.

Karen [00:37:47] That's right, yeah. So let's take a few steps back. I'm a founding member of the data product leadership community that Brian O'Neill started. And what really caught my attention many years ago was listening to a podcast with him talking about something that I saw over and over again, which is that the data team creates a report of some sort, analytics. It's really moved into ML and AI, but the gist of it is they put in a lot of time and effort. They get requirements, they deliver the report, and then the end user says either, "That's not what I wanted at all," or they just never use it. I kept seeing that over and over again. The point is that data product is more a verb, in the sense that it's the process of how you create that outcome, which engages a cross-functional team with all the folks that should be involved at the beginning. And you create together, first, the process flow that you're looking to evaluate and all of the points where the personalities or personas interact. And you decide together the intersection of the persona, let's say a customer, and the order cycle at a particular point where you see the highest-value starting point to create that data product. And then you work together with someone from the tech team who understands what's under the hood, someone from finance and accounting who understands how you make your money, a senior leader who understands your strategic goals, someone who interacts with the customer, and subject matter experts on the domains, to make sure that they're all involved at the beginning. And then you take that prototype and do a short round with, typically, five folks to interview and interact with your rough prototype before you even write a line of code. And the power of that is that, one, you're collaborating with everyone at the beginning rather than them seeing it at the MVP phase for the first time. And you get, as I think you hit on, the org change management part of it.
That you took the time to ask, and that they're now getting muscle memory to talk to each other. So speaking of governance, you basically have your governance council for that use case when you're done. And when you're going through and creating it, it's also very structured, using the Cynefin framework, so that you are asking the right question and solving the right problem at the beginning. And the power of that is really the output of the data product, as opposed to it being more like an analytics insight or an AI agent. And so that's where it's very powerful, I think. The data product is meaningful in a decentralized data mesh, meaning your domains for finance and marketing, where you're bringing in the subject matter experts that are closest to the business, who know the business best, and bringing down those barriers to identify the insights quickly, but also having a repeatable process that everybody can get on board with to do that verb, to deliver the data product. And the data product then, obviously, you can work with. Governance agility is the concept of what you see in a data ops platform, which is the permutation from SDLC to what we do in the cloud. It's included, so it's not that you have this great idea and then you spend months or years because of your technology. It's more an enabler of the business rather than the technology being the outcome. And similarly, the data product is the outcome of the people and processes, which is the most important part.

Juan Sequeda [00:42:03] This segment right now has so many golden nuggets. And what I really, really enjoyed listening to you right now is that it tied together basically everything that we had discussed earlier today. I mean, talking about... I love how you say it's a process around this, and you want to have the people there from the beginning. That's what we were previously talking about. It's like, "You should bring in folks from the privacy stuff." That's where you start bringing them in, from the beginning around that stuff. And then you tie it back to the framework, the Cynefin framework. And now, we know that we're actually focusing on something, we're solving the right problem, that it's something realistic. We're not just going off and throwing things out, like coming up with something that we... Some magic stuff that we really can't do or can't do yet. And then tie this back to the analytics governance, the AI governance, and then the risk compliance governance, and also thinking about the structured data and unstructured data. Heck, if it is just a data product about unstructured data, like, "Here's all the documentation about our product. And we're going to go create an AI agent that's going to build this. We're going to do a chatbot." Then, the first product right there is going to be all the sets of documents that we know that we... Is this the latest version of it? Do we know that this is the latest stuff? Can we go trust the stuff that's in here? Now, we start bringing in those same practices that the data folks have been doing when it comes to what is trust? Is this high quality? I knew that.

Karen [00:43:35] Absolutely.

Juan Sequeda [00:43:35] And then-

Karen [00:43:36] There's a big difference between a single source of truth and master data if we want to get super pedantic. But yes, absolutely. Trust is the point, that it's trusted, right?

Juan Sequeda [00:43:49] And then we actually know who the consumer of that's going to be, and what the implications are of putting this all together and training on that stuff, right? We just have all those conversations from the beginning. Then at that point, you say, "Yeah, this first version is only going to use this text." Fine. But as good product management, you're like, what could happen later? What could be the next steps, where we could dive then into some text, some structured data stuff? I think we need to start thinking more about data products as, again, that verb, the process. And it's not just about the structured data, which I feel is where it ends up, right? "Oh, it's living in my warehouse," or stuff like that. No. It can be any type of data, from structured to unstructured. That's the way. So, having this conversation about products is what's really going to start uniting all these different silos that we have. I don't know, that's the takeaway I'm having. I'm feeling really happy about this because I've been seeing a lot of disconnect and this is helping me figure out how to connect them. I'm actually going to ping Malcolm right now saying, "Hey, I'm curious inaudible."

Karen [00:44:55] I feel like we should have a big discussion. I'm a big fan of Malcolm as well, and he makes a valid point. We all need to come to the table. I would argue we need to bring in some of the folks that are making a lot of noise from, what I'll say, the GRC space. I don't know if they would appreciate being called inaudible but representatives who are speaking for the more cybersecurity voice being in the room, as well as privacy leaders. It's all important. It all needs to be discussed and included in the experience.

Juan Sequeda [00:45:30] I just texted Malcolm right now.

Karen [00:45:31] All right.

Juan Sequeda [00:45:35] Tim, you got any? Because we can keep going but we got to start wrapping up with our lightning rounds and takeaways.

Tim Gasper [00:45:41] Yeah. Maybe a lightning round question before the lightning round, except it's an open lightning round, is there's some good... Malcolm had a great post today around AI governance. There's also a lot of... I think Karen when we inaudible theater. There's social theater going on right now as well. Do you have any recommendations to listeners or folks that are watching on how to navigate all this advice out there? That maybe, it can be construed more as sort of blockers or... What's your advice on navigating some of the information that's getting kind of loudspeakered out there?

Karen [00:46:21] I think the best recommendation is to just get hands-on and do it yourself in whatever context is meaningful for you. For example, I have my own use case where I have a bunch of cookbooks. And when I'm cooking, I don't know where all my recipes are. In particular, if I have a bunch of extra, I don't know, cucumbers or something that I need to get rid of. It's great to have a real problem for you to solve. And what I'm trying to put together is how do I scan my existing cookbooks and pull up a recipe that uses cucumbers so I can power through all of them before they spoil? Find something meaningful for you and give it a try. It doesn't have to be something really sophisticated, like creating a new value stream for increasing customer traffic. Find something that's meaningful for you. I think that gives you a lot more context. There's a lot of information out there for tutorials, often free. They tend to be biased to particular flavors of data platforms, but the whole point is to get started. You know best after you do it. And then I think that that's where you improve the feedback and voices in the room. And you can tell from the feedback on posts, not just on LinkedIn but talking to folks on Substack and other places, who is really doing it versus someone who churned it through ChatGPT for whatever buzzwords to get their post for the day or whatever.

Juan Sequeda [00:48:14] I love how we're calling out the folks who just want to post things. I mean, I really respect the folks who are like they're really hustling out there. But I think as a reader, as a consumer, you also got to dig in and find out the honest, no-BS behind things.

Karen [00:48:32] That's right. Yep.

Juan Sequeda [00:48:35] All right.

Tim Gasper [00:48:35] Lightning round.

Juan Sequeda [00:48:36] Lightning round. Man, looking at our notes, we got so much stuff in here. Okay. First question: is AI governance in the scope of the data governance teams?

Karen [00:48:50] Wow. I would think yes.

Juan Sequeda [00:48:56] inaudible.

Karen [00:48:55] If it has to be a binary, I would say yes, and... But I'll leave it at yes. It's got to start somewhere and I would think that the point is that it's a little bit more than that, and I think that's what everyone's learning.

Tim Gasper [00:49:07] Maybe the tip of the spear on it, but it's bigger.

Karen [00:49:09] Right.

Tim Gasper [00:49:09] It's bigger than just the data governance team.

Karen [00:49:11] Exactly.

Tim Gasper [00:49:13] Interesting. Second question, will there be or is there already interesting software and tools that are focused on AI governance?

Karen [00:49:27] Yes, and there's many. I think it reflects the bias of the folks we've been talking about today as to how they're targeted and released into the market. I think there's a missed opportunity for the totality of it to do it efficiently, to serve business goals, to bring all those voices into the room programmatically.

Tim Gasper [00:49:52] So, a GRC bent to it?

Karen [00:49:55] It's either GRC or it's more the DNA of analytics governance.

Tim Gasper [00:50:02] Okay. So, it could be biased in one direction or the other.

Karen [00:50:04] And that's my point. I know that there are tools coming from each perspective, but I haven't seen one that actually does all the things in one place.

Tim Gasper [00:50:13] Yeah, okay. Interesting.

Karen [00:50:15] Without Frankensteining your way through it.

Juan Sequeda [00:50:18] Yeah. No, it makes sense because also the people building those tools come from one of the sides. They're not talking to the other folks. So yeah, this is going to be interesting on how-

Karen [00:50:29] When you say Conway's Law, it reflects the nature of the organization or the good or bad, the differences in them working separately.

Juan Sequeda [00:50:42] Yeah. That's why, to avoid getting to those Frankenstein moments, right? Again, it's people and process, but also I think the golden nugget that you shared today was how we can start thinking about this from a product perspective, and that's how we start putting the people together. All right, next question. Will the GRC concerns around AI overpower the AI governance conversation?

Karen [00:51:12] I think yes, in the sense that when and if there are regulatory compliance rules in whatever domain or jurisdiction you work in that have teeth, that forces the conversation pretty quickly. And if you can't do business because of the consequences of what you've put out into the public domain, that gets folks' attention pretty quickly.

Juan Sequeda [00:51:43] Yeah, the clickbait things. All that gets... You don't want to be in the news, right? So, you want to avoid being in the news.

Karen [00:51:48] Yeah, there's some interesting stuff going on with the FTC and then as it relates to what's going on in Europe as well. It's still early days, but we'll see.

Tim Gasper [00:51:58] Good advice there. All right, last lightning round question. Can enterprises innovate around AI without a well-formed AI governance strategy?

Karen [00:52:13] I have my bias but I'll say, can they? Yes. Will they do anything that someone wants to pay for, that will add value for whatever the business strategy is? I'm going to say probably not. But I know that there are a lot of propeller-heads out there that are cobbling some stuff together to help them write code or something that is exclusively on the technical side, that doesn't rely so much upon datasets as it does processes. I know that there's a lot of that going on successfully. It doesn't have as much of a lift or value, like a multiple at the end, but it's happening.

Tim Gasper [00:52:57] Yeah, no offense to the people who are enjoying playing around with the technology and finding cool use cases to throw this at. But if you really want to take it to the next level and do this in a scalable way, one that both protects you and predictably creates value.

Karen [00:53:14] Right. Predictive and prospective outcomes are really where the value's at, where everybody's shooting for. And that's where you do absolutely need to have a full governance framework, AI or whatever you want to call it. That's where it matters.

Juan Sequeda [00:53:38] All right. Tim, takeaway time.

Tim Gasper [00:53:42] Takeaways. All right, so many good things. I'm going to do my best to keep this brief. We started off with the honest, no-BS question of what is AI governance and how is it related to data governance. And Karen, I think you did a great job trying to really explain what it is and decomposing it. I think you had a lot of well-thought-out and carefully chosen words: AI governance is a reference to machine learning and generative AI data products, so, across both of the AI-oriented domains, and making sure that all the guideposts are met so that your desired outcomes around ML and AI, your regulatory obligations, and your respect for human beings and their data are all achieved. You also talked about the value and it being something that's easily understood. I thought that was a broad but really effective definition with a lot of carefully chosen words there, which I think was very, very helpful for those who are listening and trying to understand what this domain is that they're exploring here, and what it covers that is familiar but also not so familiar. Everyone has different definitions of governance, but there tend to be two angles on governance in general, and that's certainly very true with AI governance. There's the GRC angle: the governance, risk, and compliance angle tends to be more driven by legal and regulatory concerns. And then there's analytics governance, which is more driven by the data insights and the data quality. Both of them obviously are very important to AI governance. We talked about the scope of AI governance and I posed the question, "Are they really that different?" And I think the conclusion is that although they share a lot of the same foundations, there are some pretty major differences and new things to really explore and understand here.
For example, you gave an example of a chatbot that was being used by Air Canada and some of the complexity there with the responses that it gave, that then put them into some... Not just user experience jeopardy but also probably into some legal jeopardy. And these are new issues around privacy. Privacy is not a thing that we're unfamiliar with. But privacy and legal obligation, accuracy, quality, these things have a new light and new ways that we have to try to understand and enforce them in this situation. Training bias, content infringement, all of this is introducing some new things that as organizations we have to navigate. And of course, very importantly, value because we don't want to overindex just on the GRC side of this. We also want to make sure we're focused on the right use cases, leveraging our money in the right ways to have the biggest return on investment. So much more but Juan, I'm going to pass it over to you. What were your big insights?

Juan Sequeda [00:56:33] One of the things I'm reading so much about right now is this Cynefin framework. I'm looking it up. It is five decision-making contexts or domains: clear, complicated, complex, chaotic, and confusion. We should be able to apply this right now to AI, to make sure that we're actually trying to solve the right problem, something that's realistic. Again, also be realistic about what problems this technology can actually solve well. The copilot experience, as a good intern. We've got to be honest with ourselves about what we can and should be doing right now. And bringing in human beings at the right time, again, this is also the clear takeaway here. We need to call out that we're also learning around this stuff. I was telling you I'm going around doing hackathons with our customers because I'm like, "We don't know. We need to figure this stuff out." Let's all be honest about it. We're figuring it out. We need to be thoughtful about how AI agents are going to be used and how that affects people. We were just talking about cases like the Air Canada example. And I love this. Just ask yourself, is it realistic? I think this is a very important one. Then we talked about the new normal because of this balance of having the analytics governance and the GRC, right? There's all this focus right now on cybersecurity and privacy, and this is going to be a new normal. But we do need to find out how we can be efficient within this new normal. And also, there's all these silos. How do we get everybody talking to each other? Do the privacy folks actually know what's in the master data and how that can be affected? What are the implications of that when it comes to privacy and so forth? Can we bring in the risk, the legal folks from the beginning? And people will ask, "Why even go do that?" That sounds weird to go do. We'll talk about that in a second here.
But organizations will change when it's worth it. When it's going to show the value, it's going to affect the P&L. So, these are the things we need to take into account to make sure that people are actually going to go drive these changes. And I love how you said we can provide multiples by being privacy-aware and compliant. And the way to go do this, we need to start talking to each other, bringing all this stuff together. We need to manage the politics, which is like, "I got my goal. You got your goal to go do this." That section, I was like, "Okay. I'm understanding all the issues here. How are we going to go address this?" And I really, really am enjoying figuring out how to take the data product notion as a way to tie together all these different political, cultural, people silos that you have. Because the data product is a verb, right? It is the process of how you engage a cross-functional team at the beginning so they can decide together. There's persona inaudible perspectives. We want to know what the highest-value starting points are. You want to be able to go meet with those subject matter domain experts. You want to collaborate before writing a single line of code. And so, we build this muscle memory of how we're talking to each other. Ask the right questions and solve the right problem at the beginning. This is where the Cynefin framework comes in. I think these processes that we... There are all these different processes, these frameworks. This is the time to start saying, "Okay, we're going to go put this all together with the goal of how this is going to provide value to the organization, taking into account all the different perspectives and all the different personas you have in your organization." And at the end of the day, you just got to get your hands dirty. There's probably things you can do for your own personal products, so you can map it to what's working and what you're doing at your day-to-day job.
Just figure it out and go have fun, but play. Put your money where your mouth is. Karen, how did we do?

Karen [01:00:08] I feel like we can keep going for a long time, but I know we're out of time.

Juan Sequeda [01:00:13] Yeah. Again, as always, this is all your content. We're just taking notes from what you said. So, thank you so much for this very valuable content you're sharing with everybody. I want to wrap it up. Three questions for you at the end. What's your advice? Who should we invite next? And what resources do you follow?

Karen [01:00:33] You bet. What's my advice? This is non-technical but I feel it's very powerful. There was one thing that my dad taught me at a young age, which is you should learn to say please and thank you in every language. I found that to be a great challenge but also something that is really meaningful to honor folks, even if you can't say anything else. At a minimum, being able to say thank you and please is a big deal. So, I'll throw that out there. Who should you invite next? Following along the data product themes we discussed, I'm going to throw out Anna Bergevin. She's a senior data product manager at ResMed, and she's really iterating in her own experience of AI and how to make it applicable in the real world along the way. I think you'd have a great conversation with her.

Juan Sequeda [01:01:35] I've been following her on LinkedIn. I really, really like her content.

Karen [01:01:40] She's awesome.

Juan Sequeda [01:01:41] One of the things she's been experimenting a lot of publicly is about, " Oh, I'm going to use ChatGPT for this." Like, " Oh, this worked. Oh, this sucks." And then she's doing this. So, I'm giving her just... We're in channels with Shane Gibson.

Karen [01:01:56] Yes, she's good too. And I joke for Anna that she's... I told her, " You're livestreaming your whole learning experience," which I think the point is the authenticity of it.

Juan Sequeda [01:02:07] Awesome. Inaudible we have to appreciate.

Karen [01:02:07] That we really are plodding along. You don't have to be a polished expert. We really are.

Juan Sequeda [01:02:15] It's just the honest, " We're all figuring this out." And I really appreciate that she's just being public about it, right?

Tim Gasper [01:02:20] Let's learn together.

Juan Sequeda [01:02:21] Let's learn together, and it's awesome.

Karen [01:02:25] Yes. As far as resources that I follow, that's a tough one. I am all over the place. Just off the top of my head right now, I'm a big fan of Benny Benford. He's on LinkedIn and I think he's making... You can really tell that he has gone through it from a lot of the commentary he's made, on LinkedIn primarily. I'm also throwing out John Cook, who is a part of my data product group but is also really doing some, I'll say it, bleeding-edge things and making some insights that I think are really powerful in terms of what it actually looks like when you are doing it. Meaning, the AI enablement for business benefit that someone like a CEO can actually get excited about. And then the last one is someone from the GRC space who I think has been really great about AI governance specifically, Katarina Koerner. She is really great about bringing in the context of whatever domain, whether healthcare, financial services, or customer enablement or interactions, staying close to what's going on on the GRC side, and making it attainable and engaging. I think she's great as well. Did I get all the questions? Do I have more?

Juan Sequeda [01:03:55] Yeah, you did awesome. And then Katarina Koerner, I want you to please connect us with her. I'd love to go meet with her and also invite her to the podcast.

Karen [01:04:04] I think you'd have actually a good conversation because it's kind of bringing together the two-

Juan Sequeda [01:04:09] Yeah, it was great. And Benny and John have both been on the podcast and I highly, highly recommend them. They're both... I love how Benny is also sharing publicly his entire world, leaving corporate and starting his own consultancy. And John's also just a true honest, no-BS builder guy. I love that. So much stuff. There's so much great content and people to go follow. Karen, thank you so much. Quick reminder, next week we have Scott Taylor, the Data Whisperer. We're going to have... Scott is such an awesome guy to go watch. If you have not seen Scott, just look him up on YouTube. He's freaking awesome. And Karen, thank you so much. As always, thanks to data.world, who lets us do this every week. This was a phenomenal discussion because it helped clear up a lot of the questions we've had and I know many people listening, too. Karen, have a great rest of your week. Thank you so much.

Tim Gasper [01:05:03] Cheers, Karen. Thank you.

Karen [01:05:03] Thank you. All right, thank you. Cheers.
