What is the Future of Data Catalogs? with Malcolm Hawker

63 minutes

About this episode

Malcolm Hawker, Chief Data Officer at Profisee and host of CDO Matters Podcast, recently sparked a heated discussion on LinkedIn about where data catalogs are heading. His view? Data catalogs today are being commoditized and we need to pivot from data management to knowledge management. Join Malcolm, Tim, and Juan as they break down this debate and explore what other data professionals had to say about this shift in thinking.

Tim Gasper [00:00:32]:
Hello. Hello. Welcome. It's time for Catalog & Cocktails. It's your honest, no-BS, non-salesy conversation about enterprise data management, with tasty beverages in hand. I'm Tim Gasper, longtime data nerd and product guy at data.world, joined by Juan Sequeda. And we have a special guest today, Juan, don't we?

Juan Sequeda [00:00:50]:
And this is super exciting for multiple reasons. First of all, we're repeating a guest, which we've only done a couple of times, I think. So that's a big thing.

Tim Gasper [00:01:05]:
You're part of the VIP club.

Juan Sequeda [00:01:07]:
Yeah. So we have the podcast alumni, but now I think there's a podcast super alumni or something. So welcome to that group. And so who are we talking to? We're talking to Malcolm Hawker. And if you don't know who Malcolm is, I think you've probably been under a rock for the last many, many, many, many years in the data world. And Malcolm, how are you?

Malcolm Hawker [00:01:28]:
I'm fantastic. I'm beyond thrilled to be a return guest, which is awesome. I'm not being featured on my side with the Zoom, but I assume I'm being featured on your side, which is the oddity of Zoom. Yes. And not the Restream. Okay, good. I don't have to worry about it. That's just my ego talking.

Juan Sequeda [00:01:48]:
We can see. And just to give you all some honest, no BS about what's happening in the background: this is a pre-recorded episode, and something happened with our platform. So we're doing this on Zoom right now and then we're streaming it. So you're seeing us streamed on Wednesday, October 30th.

Malcolm Hawker [00:02:06]:
Yes. Cool. All right, so anyway, thank you for having me back. But I do believe, and now keep me honest, that our first topic, "MDM Is Dead," was number one or close to being number one in terms of your 2023 total number of views. Was it not?

Tim Gasper [00:02:27]:
It was a list topper for a long time.

Juan Sequeda [00:02:30]:
And I think when we got the Spotify update, the Spotify Wrapped, yes, the number one episode was "MDM Is Dead." I will say the honest, no BS is that I suspect one of the reasons is that your episode, I think, was in February that year. So it was able to.

Malcolm Hawker [00:02:48]:
Oh, I had the whole year. Okay.

Malcolm Hawker [00:02:52]:
I had the whole year. Yes.

Malcolm Hawker [00:02:53]:
All right.

Juan Sequeda [00:02:54]:
Well, you beat the ones from January, I guess, right?

Malcolm Hawker [00:02:57]:
I'm just saying. Okay. All right.

Malcolm Hawker [00:02:59]:
I'll take it. But.

Juan Sequeda [00:03:01]:
But okay, before we get to the topic, we do this very quick tell and toast. So what are we drinking? What are we toasting for right now?

Malcolm Hawker [00:03:08]:
For me, this is a lovely, fully oaked California Chardonnay. It's a relatively cheap one, but it's still okay. It is the, you know, well-known Clos du Bois. You can buy it anywhere. You can buy it at Publix, you name it. But it's actually not bad for like a $10 bottle of wine.

Juan Sequeda [00:03:25]:
Yeah, I like it when you find and really enjoy a $10 bottle of wine. Not only is it a great wine, but knowing that it's $10 makes it taste better. So.

Juan Sequeda [00:03:35]:
Yep.

Juan Sequeda [00:03:35]:
Tim, how about you?

Tim Gasper [00:03:37]:
You know, we're at like 250 episodes or something now, and I'm running out of cocktails that I can make easily and quickly. And so I'm starting to loop back to some favorites. This is just the classic Manhattan, you know. Got my Amarena cherry in there. So, yeah, doing all right.

Juan Sequeda [00:03:55]:
Nice.

Juan Sequeda [00:03:55]:
And I am tasting something that my good friend Tim gave me for my birthday, which is the Port Charlotte heavily peated one. This is beautiful, man.

Malcolm Hawker [00:04:03]:
Thank you.

Malcolm Hawker [00:04:04]:
Peated one what? Scotch?

Juan Sequeda [00:04:05]:
Yeah, yeah, the Port Charlotte. Just a single malt Islay Scotch.

Malcolm Hawker [00:04:09]:
Scotch. Scotch, Scotch, Scotch.

Juan Sequeda [00:04:10]:
All right. Cheers, everybody.

Tim Gasper [00:04:13]:
Cheers.

Malcolm Hawker [00:04:13]:
Cheers.

Juan Sequeda [00:04:14]:
Okay, so we are here because a couple of days or weeks ago, whatever, Malcolm wrote a post which was, "What is the future of data catalogs?" And oh, my, oh, my, that thing really exploded. I mean, just as of right now, there's 200 likes or 150 comments. This is a phenomenal post, and for anybody who is thinking about data catalogs right now, there's so much knowledge in here. So we reached out to Malcolm saying, you know what? This is such an important topic. Obviously, this is Catalog & Cocktails for a reason. Catalog is in the name. Let's really digest what happened: your position, Malcolm, our position too, and go through a lot of the topics here. Rules of the game: we're still honest, no BS, non-salesy. We're not going to talk about any vendors. We will acknowledge that Tim and I are employed by data.world, which is an enterprise data catalog vendor. And we obviously have some biases because, I mean, Tim is a chief product officer and I'm the scientist here. But nevertheless, we're not going to be pushing data.world, and we're not going to be calling out any catalogs. So we're not going to talk about vendors. I just want to be very clear with everybody here. Honest, no BS, non-salesy. And with that, Malcolm, honest, no BS: what's the future of data catalogs?

Malcolm Hawker [00:05:31]:
It's a really good question, and I think there are two answers here. There's what I would like to see, what I want the future to be, and there's what I think the future will be. So let's start with what I want the future to be and what I think a CDO or anybody leading a data function would want to see. What I'd want to see in the future is a data catalog that bridges the gap between data management and knowledge management. One that starts to play a role in helping people in the world of data to deeply understand how data is being leveraged to drive business value within an organization, to go away from the idea of data lineage to business process lineage, or at the very least be able to map them. It's anecdotally interesting to know that data, like this customer record, goes from point A to point B. But what happens there? Is there a contract executed? Is there an item shipped to a customer? What business process is actually being enabled because that data has now been integrated into some downstream system where it wasn't before, right? How often is data being used? How often is it being referenced? How often is it associated with a successful business transaction, and how often is it associated with an unsuccessful business transaction? Yes, I fully understand that a lot of this starts to look like analytics, and I get it. And I'm not saying data catalogs need to become analytical platforms, because they don't need to become that. But they need to start acting like this connective tissue between knowledge and insight, what is actually driving the business, and what we are managing as data librarians. That's the future I would love to see. I wish, I wish, I wish. But evidence seems to suggest that the path we're on is this fully commoditized, very, very librarian-centric one. And I use that term lovingly, librarian-centric.
Where it is this thing where we manage a list or inventory of all the stuff that we manage, and it's got basic lineage, and it's got glossaries in it, and it's got some basic governance rules in it, and it's doing the full, you know, metadata management that it has always done, but it doesn't really progress much beyond that. And that's my fear, because there's a world of business outcomes and incredible transformative value, and then there's the world of what we've been doing, and I want to see the former and not the latter.

Tim Gasper [00:07:56]:
So maybe to try to interpret your two worlds here. One is a dystopian future and one is maybe a utopian future. I don't know, it depends on what you like. But in this utopian future, it's this active kind of brain for the organization, maybe metadata management on steroids. Right. Of like, it's the decisions, it's the business context, it's the explanation of why things are happening. It's for humans, it's for robots, it's the man behind the machine. It's this sort of, this fabric. Right.

Malcolm Hawker [00:08:40]:
Well, I love the fabric metaphor. I love it. I mean, I think, I think that's, I think that's a metaphor. And I think it's also maybe potentially literal.

Tim Gasper [00:08:48]:
Yeah, yeah. If we start to go down that data fabric conversation path, right. And then the dystopian version of this is maybe still okay. There's probably some business value there, right? But it's: what's the cheapest thing that gives you the basic features that you want, and maybe is interoperable enough that, come renewal time, if you want to find a new catalog, you can port your metadata over to the new thing so that you can be a librarian over a new user interface.

Malcolm Hawker [00:09:21]:
And it's interesting, Tim, because I see this spectrum thing playing out across so many different worlds. Data products, for example, right? Shift left, shift right. A data product can shift all the way left, where it's just this container of data that is used as a raw material to create analytics. Right. And I think that you could make the same metaphor here. A data catalog could be shifting left, and that's all it kind of does. That's the dystopian world that you're talking about. Or it could be the thing on the other end of the spectrum, where it is an integral part of understanding how our business operates and how data is driving business value. Now again, I do understand that the utopian future starts to get a little murky, because the Venn diagram here starts to overlap a lot. Right. It starts to overlap with things like MDM, it starts to overlap with integration, it even starts to overlap with advanced analytics. Right. We could have a very interesting conversation to say, okay, we can agree that in this knowledge-driven world, and here's where I'm poking at Juan, in this knowledge-driven world where knowledge graphs are playing an integral part in understanding the context of data, which is near and dear to his heart, does that knowledge graph have to live in a data catalog, or can it live somewhere else? I don't know. Right. Maybe it does, maybe it doesn't. But the lines here that have historically demarcated data management solutions, MDM, advanced analytics, even data science, integration, data quality tools, all these worlds are starting to coalesce. And I don't know what that really looks like, but things are getting murky. Yeah.

Juan Sequeda [00:11:06]:
So what we want to do here is go through several of the comments that came up, and I'll read them verbatim, because I just want to be very spot on about what people said. But I'll start, sorry, with what I wrote in my comment on your post, to follow up given your questions here. So I said, Tim and I put out a while ago that we need to shift from a data-first world to a knowledge-first world, which means people, context and relationships first. And of course data is part of that mix. Right. And I say I dislike the term data catalog because it shouldn't be just about cataloging data, it should be about cataloging data and knowledge. Like today, with the catalog, we are cataloging that technical metadata: columns, tables, dashboards, ETLs. That needs to be connected to the people, to the decisions, to the metrics, to the outcomes, and model the business processes of the organization. Call it the brain of the organization or that digital twin. We should evolve from data lineage to business lineage. We should find the data about customers and also ask the question, "How many customers?", with the response coming back with all that context. And this is why I'm a strong advocate that knowledge graphs are the way to manage this, because they treat knowledge as a first-class citizen. And so I want to go into the part where you say, well, how do we manage all this knowledge? Is it the graph, and does it live in one place? It doesn't need to live in one place. And the reason why I'm very adamant about that is this really mimics the way the web has been designed and how it is created and how it lives today. The beauty of the web is that anybody can say anything about anything, right? Anybody can create any website. I can link to another website and then I can follow that link and I jump to another website and suddenly I'm on a different server and I can keep following. And there is certain governance.
There are incentives for people to go do things. That's why we have standards around that. But I think there is no need to say that this needs to live in one place. Having said that, we have things like Google and search engines, which do bring in everything, and they catalog the web such that they can provide great user experiences. So when you start searching for that stuff, right, that's why you can get a million results in 0.02 seconds, right? Because they did decide to go save it, right? They crawled the web to do that. So if I take that analogy of the web, which we take for granted (again, my parenthesis: it's the web, the Internet), I think that's the mentality that we should head towards. So you have systems like MDM, quality; they're all going to be doing these types of work that involve metadata and connecting data and knowledge together. But we need to have the ability to say, hey, that thing over there in system A can be connected to system B. And if you have other things that want to be able to pull that together, that's fine. And that can be a catalog, that can be an MDM tool, that can be a quality tool, that could be whatever you want. But I would argue that a catalog would be kind of like the Google and I think in some of the.
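An illustration of the "data lineage to business lineage" idea discussed above, and not something shown in the episode: a minimal sketch, assuming a small in-memory graph in which technical assets and business concepts are nodes of the same structure. Every node name, relationship label, and prefix below is hypothetical, and a real knowledge graph would more likely sit on a graph store or RDF tooling than on a Python dictionary.

```python
from collections import defaultdict, deque

# Hypothetical edges: (source, relationship, target). All names are invented.
EDGES = [
    ("crm.customer", "feeds", "warehouse.dim_customer"),
    ("warehouse.dim_customer", "feeds", "dashboard.revenue_forecast"),
    ("warehouse.dim_customer", "enables", "process.contract_execution"),
    ("dashboard.revenue_forecast", "informs", "decision.quarterly_hiring_plan"),
    ("process.contract_execution", "owned_by", "person.sales_ops_lead"),
]

# Assumption: nodes with these prefixes count as "technical" metadata.
TECHNICAL_PREFIXES = ("crm.", "warehouse.", "dashboard.")


def build_graph(edges):
    """Index the edges by source node."""
    graph = defaultdict(list)
    for source, relation, target in edges:
        graph[source].append((relation, target))
    return graph


def lineage(graph, start):
    """Breadth-first walk: every (relation, node) reachable from start."""
    seen, queue, reached = {start}, deque([start]), []
    while queue:
        node = queue.popleft()
        for relation, target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                reached.append((relation, target))
                queue.append(target)
    return reached


def business_lineage(graph, start):
    """Keep only the non-technical nodes: processes, decisions, people."""
    return [
        target
        for _, target in lineage(graph, start)
        if not target.startswith(TECHNICAL_PREFIXES)
    ]


if __name__ == "__main__":
    graph = build_graph(EDGES)
    for node in business_lineage(graph, "crm.customer"):
        print(node)
```

The same walk answers both questions: the unfiltered `lineage` is conventional data lineage, while `business_lineage` is the subset that says which processes, decisions, and people depend on a given table.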

Malcolm Hawker [00:14:09]:
Well, yeah, but does it have to be standards based, Juan? You're pointing to the web and W3C and stuff that I know you're deeply involved in and know a lot more about than I do, but getting into OWL and RDF and all this other stuff, does it have to be standards based, or can the AIs just kind of figure this stuff out? And I know that I'm being really pithy and glib, but they seem to be pretty good at language, they seem to be pretty good at context. Right? Does this world need to be standards based, or can it be more probabilistic? Given all these factors, given all the things that we see in the metadata (this is kind of Gartner's view of active metadata, right?), can't we just figure it out? Does it have to be standards driven?

Juan Sequeda [00:14:54]:
And one thing is it being standards driven, another thing is, is it being non deterministic in a way? Right. I think we shouldn't separate that. We should, we should.

Malcolm Hawker [00:15:05]:
Fine. Yeah. Okay.

Juan Sequeda [00:15:07]:
And I would argue, remember that this gen AI stuff is trained on text, and this metadata is not necessarily text. So you've got to go extract all that stuff. So bottom line, I mean, this is a possibility. But at the end of the day you do want to be able to track the observations of the world. And I would rephrase the argument that you're bringing as: I don't need a database anymore, LLMs can answer anything. So maybe let's screw databases.

Malcolm Hawker [00:15:39]:
And I think maybe, maybe. Do you remember a conversation we had? I don't know where we were, because you and I run into each other in these amazing cities in the world and, yeah, we live crazy lives. But I asked you somewhere, probably after several glasses of wine or beer, I can't even remember, the question of, okay, what would happen in a world where business applications like CRM and ERP or even transactional systems, e-commerce sites, it doesn't matter, any transactional system or any business application: why do we put things into tables? Can't we just start capturing the narrative of the business as literally a story, right? What we are capturing is a story that describes the business. Right. And I know this is getting really way out there. But you told me at the time that you were going to go to Japan and figure it out. I'm not entirely sure if you went to Japan to figure it out. You probably didn't. But this describes two ends of the spectrum: one is the highly deterministic, rules-based intersection of a row and column, and I know I'm oversimplifying, but that's our current world. And the future world is a story of the business. And how do we bring these things closer together? I think a data catalog will play an important role in that in the future. I don't have any magic forward-looking glasses. I wish I did. I'd be, you know, a crypto billionaire. But I do know that we need to start thinking about these things differently. And you touched on something, right, which is non-deterministic. And to me that's a paraphrasey way of saying context matters, and it can't be one rule rules them all. It's multiple rules based on the context, based on who's creating the data and who's using the data.
So anyway, can we start capturing a narrative of the business? And what role would a catalog play in being the historian, as it were, of the business, or telling the story of the business and telling the story of the things that matter the most to the business? I don't know, I'm being rhetorical here and I don't have all the answers, but I love the idea of a knowledge-based world and catalogs helping provide a bridge between where we are and where we need to be. And it's the stuff that you're working on day in and day out. It's all the research you're doing around knowledge graphs and vectors and finding ways to stop LLMs from doing crazy hallucinations.

Tim Gasper [00:18:18]:
Juan, you want to go first? You can go first.

Juan Sequeda [00:18:19]:
You go first.

Tim Gasper [00:18:20]:
All right, well, I'll let Juan chew on that for a minute. The one thing that I wanted to jump in on is that I think that if we want catalogs to be more active, to be more involved and closer to the business, to be that fabric that we talked about in the utopian vision, it needs to do different things than it traditionally has done, right? Because there's a version of the future that is just more of the same, right? More sources, more lineage, more rules and policies. And I think that leads more towards that commoditized path. But what you're talking about, I think, and Juan, the research you've been doing and the direction you've been trying to push, is a new direction. And what that new direction is, I think, is a little bit opaque right now. I don't think we're 100% sure what that new direction is. Right. Is it just all in on AI, and let's let the LLMs run loose and, you know, see what exciting things they can do? I for sure know that AI is part of the solution, but is it the solution? I don't know. Right. Another interesting X factor: unstructured data. Something that's super important to LLMs. They speak that language. Traditionally, catalogs don't really care about or deal with unstructured data. You may hear other things from vendors. But the honest, no BS is that in general, other than the metadata about unstructured data, catalogs tend not to have to do with the unstructured data itself. And there's a lot of knowledge in there. Right.
The third and last thing I'll mention in this kind of rant is, interestingly, and this is potentially the most salesy thing that I'll say in this entire conversation today, but don't interpret it as a salesy thing, interpret it as a proof point: at data.world, the customers that we see having the most adoption are the ones that aren't hyper-focused on the technical aspects of a catalog. It's the ones who are building a marketplace. They're building data products, they're focusing on AI use cases, they're building out a catalog of recipes for the kinds of analysis that you want to do. Find the recipe to accelerate your ability to get the answer to those questions. It's not the customers that are hyper-focused on lineage, which is a little bit counterintuitive. Right. So I think that's an interesting data point towards what this knowledge-oriented future could be.

Malcolm Hawker [00:20:59]:
Yeah, well, so while Juan is stewing, not stewing, processing: to number one, are LLMs a parlor trick or are they the future? I really see them being the latter. Right. And don't necessarily just think about, you know, ChatGPT and Bard and Gemini, and don't think of them as just LLMs, because I think it is the combination of LLMs plus all the AI that we've always been doing. Right. Natural language processing, you know, even just garden-variety correlations, machine learning. Put all these things together. But I think how we'll interact with those insights will be through LLMs, because they kind of look and talk and think like us. Right. It's the chatbot, agent-type nature of LLMs. It's their interface that I think is going to draw us naturally towards them. And I honestly, truly believe, Tim, that they will form the backbone of, in essence, a new computing platform, just like the Internet did and just like the cloud did. I think LLMs will become a next-generation computing platform. And if you are a data management vendor, or if you are a CDO, and you're not thinking about this stuff, you need to be thinking about this stuff. So to point number two, unstructured data. Couldn't agree more. Good grief. Like, yes, it's outside the scope of most data catalogs because it's outside the scope of most governance programs. Right. How many CDOs out there know how many files, how many PDFs are sitting in their marketing SharePoint sites? None. I mean, marketing is just doing its stuff out there. Customer support is doing its stuff. They're creating FAQs, they're creating all of this unstructured data, all these PDF docs, all these Word docs, you name it, that are the narrative of the business, right. That are describing the core product.
They're describing even customers, what the customer wants, and they're sitting out there in unstructured data, and it is the wild, wild west. For two years data leaders have been saying we need to double down on foundations, meanwhile largely continuing to ignore unstructured text, while it is the thing that LLMs are built on and optimized by. Right?

Tim Gasper [00:23:15]:
So just think about Microsoft Copilot and these different tools. They're getting pointed at your SharePoint and, oh God. As a governance professional, when's the last time you thought about governing your SharePoint? Probably never.

Malcolm Hawker [00:23:29]:
Well, right. Like cold shivers, right? Oh my God, what's out there? I don't even know what the people in marketing are saying, right? Just basic stuff beyond access controls. It's like, even, what are they saying? This is the candidate use case: it's where you've got something out there floating on a SharePoint server that is woefully inaccurate, and an LLM picks it up, and then all of a sudden you're getting sued. So yeah, we've been talking about getting our hands around foundations, but we've been doubling down on our old foundations, not the foundations we need to be focused on in the future, which absolutely, positively will include unstructured data. And between the dystopia and the utopia, unstructured data, to me, that's a layup, right? Profiling this stuff, tagging this stuff, getting that into a data catalog.

Malcolm Hawker [00:24:09]:
If you're not doing that.

Malcolm Hawker [00:24:11]:
Yeah, yeah. If you're not doing that, I mean, you've got to be doing that. So the third point. What was the third point? I only had one glass of wine. But.

Tim Gasper [00:24:24]:
The fact that adoption really comes not from the most technical use cases, but the knowledge orientation.

Malcolm Hawker [00:24:29]:
Yeah, I mean, yes, absolutely. I do see catalogs playing a role in being the marketplace, as it were. But as an end user, I don't want to go search for an object. Right. In most data catalogs that I look in, if I do a search for a customer, I get like 400 things that spew out at me. I mean, it's not actionable. What I want to do is, I'm trying to cross-sell more. Hey, catalog agent, hey, catalog copilot, I'm trying to cross-sell more. What do I do? Right. And maybe we could argue that's not a use case for a catalog, but I see data observability as being this little tiny microcosm of what is possible in, let's call it, a fabric-enabled future. Right. If we can look at, you know, scads of web logs and transactional data, if we can apply some basic ML to start understanding when pipelines will fail, we most certainly can start doing the same thing to business processes and say, you know what, some salesperson did something really stupid and inputted something incorrectly or missed a field or something, and we know this process is going to slow down because we've got all this transactional data that tells us the process is going to run slow and the data looks like this. So then take this action to fix it. So go ahead.
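To make the salesperson example above concrete, here is a minimal sketch of a "business observability" check, and not something shown in the episode. It assumes hand-written rules rather than the ML approach mentioned, and it assumes a simplified opportunity record; the `Opportunity` fields, the stage names, and the alert wording are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass
class Opportunity:
    """A simplified, hypothetical CRM opportunity record."""
    opp_id: str
    owner: str
    amount: Optional[float]
    close_date: Optional[str]  # ISO date string, e.g. "2024-12-15"
    stage: str


def find_issues(opp: Opportunity) -> List[str]:
    """Return data problems that would distort a revenue forecast."""
    issues = []
    if opp.amount is None or opp.amount <= 0:
        issues.append("missing or non-positive amount")
    if not opp.close_date:
        issues.append("missing close date")
    return issues


def forecast_alerts(
    opps: Sequence[Opportunity],
    forecast_stages: Sequence[str] = ("negotiation", "commit"),
) -> List[str]:
    """Alert only on records in stages assumed to feed the forecast."""
    alerts = []
    for opp in opps:
        if opp.stage not in forecast_stages:
            continue  # early-stage records do not affect the forecast yet
        for issue in find_issues(opp):
            alerts.append(
                f"{opp.opp_id} (owner: {opp.owner}): {issue}; "
                "revenue forecast may be affected"
            )
    return alerts


if __name__ == "__main__":
    sample = [
        Opportunity("OPP-1", "alice", 50000.0, "2024-12-15", "commit"),
        Opportunity("OPP-2", "bob", None, "2024-12-20", "commit"),
        Opportunity("OPP-3", "carol", None, None, "prospecting"),
    ]
    for alert in forecast_alerts(sample):
        print(alert)
```

The point of the sketch is the framing rather than the rules: the alert is phrased in terms of the business outcome (the forecast) and names a person who can act on it, instead of reporting a failed column check.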

Juan Sequeda [00:25:55]:
That example that you just gave, that should be kind of a very clear stepping stone of what's next. And do we need LLMs for that? Whatever, I don't know, probably not. But at this moment I don't care. Hold on, don't talk about the LLMs for a second. But I think you just nailed it right there that we, we continue focusing so much on the technical stuff. Right. And then we need to kind of elevate it more. I talk about the business lineage you just said right there, like, I want to have like business observability. It's like, oh, right. I think we have all these, we have all these kind of analogies. We can do this. This is the stuff. This is the tech we do for this data. How about if I use that for the business and then I think that should be the evolution. What you just said is what I hope becomes commoditized in a future type of data cattle whatever it's called. It should be hopefully in a couple years. It shouldn't be like obviously we have a system that automatically detects when a salesperson puts in bad data in here and it's going to affect how we're calculating our, calculating our projections for our revenue this quarter. Like that's the type of stuff that the freaking CRO wants to know about this immediately. Right? So that's the. I want to go push the notifications to people that's going to drive to end users. That's going to drive money. So I think we go goes back to the topic I want to hit on kind of connecting to the business value but going back to the whole. We'll take it back to the, to the LLMs. I think the LLMs and all this generative AI is also going to help commoditize more of the data. The data catalogs we like I don't want to fill in descriptions for these things and find relationships like I know nobody wants to do that. That technology is going to go and commoditize that. So that is something hopefully in the next year it's like done like this is something that we know is happening. 
I think the, the other question you said is if I'm searching for data, right? Because why are you searching for data? Because I'm trying to do some cross sell analysis or whatever. It's like so you're not really searching for data. That's a means to an end. What you're trying to find the cross sell that stuff and definitely that's where the LLMs are going to go. Help. Because I can be able to have that interaction and be able to say this is the problem I'm trying to solve. Help me solve this problem. And the system will say well one way to solve that problem is that you need data and I know where that data is. And another thing that you need to know to solve that problem is the context because you want customers. Well we have all the different customers so we want to have like that you want to be able to talk to that brain, whatever. I mean this is why I don't, I don't want to call it anymore catalog. The evolution of this is it's called the enterprise brain, right. Which enterprise brain is cataloging has all this knowledge about what's in your brain. So that's another thing how alarms are definitely going to help. So number one, make commoditizing it faster. Number two is having this, it's your. The way one of our colleagues, VIP Farmer from wbp, he says AI now has a UI and it's great because he gave us this UI to be able to go interact. That's, I think that that's the second one and a third one. It's going to be an accelerator to catalog all of this knowledge that exists not only in text but also in people's brains that they've actually never put it down where I can finally say I can interview you or record a conversation that lasts one or one minute and that's enough for me to say, boom, I'm going to go pull that in. So I think that's going to be another accelerate. So I definitely, I'm with you about how the, these LLMs are going to be, are definitely part of the future but it's all going to be focus on thinking not just about the technical stuff. 
And I think thinking about the knowledge management is, is, is already a path to, to think about and then always, always, always driving the business value.
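
Editor's sketch: the "business observability" idea Juan describes above (detect bad data entered by a salesperson, work out the effect on the revenue projection, and notify someone) could look roughly like the following. This is a minimal illustration under stated assumptions, not anything described in the episode: the `Opportunity` record, the two validation rules, and the weighted-forecast formula are all hypothetical, and simple deterministic checks stand in for whatever a real system (with or without LLMs) would do.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Opportunity:
    """Hypothetical CRM record; field names are illustrative only."""
    owner: str
    amount: Optional[float]
    close_probability: Optional[float]


def problems(opp: Opportunity) -> List[str]:
    """Return the data-quality issues found on one record."""
    issues = []
    if opp.amount is None or opp.amount <= 0:
        issues.append("missing or non-positive amount")
    if opp.close_probability is None or not 0.0 <= opp.close_probability <= 1.0:
        issues.append("close probability outside 0..1")
    return issues


def forecast_with_alerts(opps: List[Opportunity]) -> Tuple[float, List[str]]:
    """Sum amount * probability over clean records; collect alerts for the rest."""
    forecast = 0.0
    alerts = []
    for opp in opps:
        issues = problems(opp)
        if issues:
            # In a real system this is where a notification would be pushed.
            alerts.append(f"{opp.owner}: {'; '.join(issues)}")
        else:
            forecast += opp.amount * opp.close_probability
    return forecast, alerts


pipeline = [
    Opportunity("Ana", 100000.0, 0.5),
    Opportunity("Ben", None, 0.9),      # amount never entered
    Opportunity("Cy", 50000.0, 1.4),    # impossible probability
]
forecast, alerts = forecast_with_alerts(pipeline)
print(forecast)
for alert in alerts:
    print(alert)
```

The point of the sketch is only that the alert is phrased in business terms (which records are excluded from the quarter's projection and why), rather than as a technical pipeline failure.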

Malcolm Hawker [00:29:38]:
Well, so what I see happening is, you know, I gave a keynote last week at a semantic layer symposium in Munich and I was talking about this transition from data management to knowledge management, which is really metaphorically what you're talking about in the last rant, Juan, which is the knowledge of the business. And I made a couple of posts last week about this, and some people rightfully said data management is not going to go away. Right. And maybe the metaphor here is like photography. We went from dark rooms to digital photography, and the best photographers will still know how developing actually happens. But I honestly think that data management as we do it today will largely be obfuscated and automated. Right? It'll be automated and then it can be obfuscated if there's appropriate levels of trust, if things are appropriate for the use case. And I think that AI will largely help do that. Is this a two-year horizon? No. But is it a five-year horizon? Yeah, yeah, I actually think so. And it's not a stretch to start thinking about fully automating a lot of the data management we do. Getting back to the use case about the, you know, business observability, I love that. Trademark business observability. Brilliant. If we get back to that, what we can start to do is basically, you know, if I can hold a phone up between somebody speaking Mandarin and my brain and tell the phone to tell me in real time what that person is saying, they're speaking Mandarin and what I'm seeing is English, we can put something between the business application and the database layer to immediately correct problems or bad data or problematic data. And the AI will know that this is a problem given the context, given the use case. So the more we do that, the more we automate this, the more the business processes will improve, the more efficient we're going to be as businesses.
And that's a virtuous cycle that will continue to feed on itself over and over and over again. So I would love to hear comments and I've had a lot of I've made a lot of posts about this and it's kind of like poking the bear on LinkedIn. When I start talking about automating data management and automating data governance, there's a lot of pearl clutching that goes on, man. But, but I think we can do it. I think we necessarily will do it.

Juan Sequeda [00:32:02]:
So before we go in there, I do want to read some comments here on the topic. Several of these came up on your post. So one, Ole Olesen-Bagneux, actually also a former podcast guest: "I hope everyone here gets that knowledge management is an existing domain with existing..." It is, and that's really, really important. I mean, I'll acknowledge that I come from the data world and I don't know so much myself about the existing knowledge management world, and there's so much over there to do. So that's an interesting point. Jessica Talisman, also a former guest, says: "I believe the tool or technology exists in the knowledge management space like anything else. Catalogs are evolving tools in the data domain and the knowledge management aspect is part of this evolution. While cliche, libraries cannot exist without catalogs, and I believe in the data space catalogs have been implemented incorrectly, are underutilized, or have not evolved with the business. Hate to say it, but a data catalog that can leverage a knowledge graph is pure gold and can support knowledge management." Anna Bergevin, also a former podcast guest. This is, Tim...

Malcolm Hawker [00:33:09]:
It's the who's who. I brought him out.

Juan Sequeda [00:33:12]:
We're so freaking lucky. We get to meet with all these people. And she said: "I agree that knowledge management is the way of the future for data catalogs. The structuring of context, from one where AI has access to free text and a small amount of metadata to one where AI can access known facts between the business concepts, metrics, reports and systems: that's the game changer." Parenthesis: Malcolm, that's what we're talking about right now. Like, I love it. I'm here to go solve this problem about cross-selling, right? That's what I want to go to. She continues: "So much knowledge sits in the minds of humans or at best long-form text documents. We need more verified facts and relationships to help AI move towards greater precision and reliability. And in the meantime that documentation also makes things clear for the humans who are accessing that catalog." Andrea Gioia, who's actually going to be a guest next week or this week: plus one for knowledge management. Anyways, there's more knowledge management over there. But I think what was really interesting is that when you wrote this post, knowledge management became a theme. So that's just an interesting observation.

Malcolm Hawker [00:34:18]:
Well, I'm a recent convert. I mean, I'm not entirely a recent convert. We've been talking about context for a while. You know, I live and breathe in the MDM space, and our first interaction, Juan, got back to context: what is the context, and, you know, customer to one person may be something different than customer to another person, and they're both right. I love the comment about business facts. I love that. And it kind of aligns to the business observability thing. Right. Which is, could I turn to the catalog through some sort of agent or copilot maybe and say, if I increase the price by 5%, and maybe this isn't the data catalog, I don't know. Or maybe, if I improve the data quality of our customer-related data by 5%, what happens? Right? What are the downstream impacts of that? What are the downstream impacts on customer service? What are the downstream impacts on logistics, on supply chain, on other... I don't know. Right. Like, that's kind of where we're starting to get to, because those are the relationships that matter. They're buried in the relationships of the data. But ultimately it's the relationships that the business cares about and not necessarily the data. It's the relationship between customer satisfaction and revenue. Like, that's the stuff that we need to figure out. We've got the relationship from, like, a join perspective. We know that this table is linked to this table, and we may even have this existing knowledge management world, to Ole's point, to others' point. We may have that, but we don't have that connection between the data, the ontology of the data, and what actually happens at the business.

Tim Gasper [00:35:56]:
How can we actually make that happen? And just to give that a little bit more context, I feel like, you know, we've been stuck in this, I'll call it a local maxima, right? Assuming that there's this promised land of AI and fabric kind of in the future. Right. But we're in this local maxima right now where it's like, you know, we've got databases, we've got ETL tools, we've got analytics tools and BI tools. Right. All of these things are ultimately hammers. Right. And, and, and you know, to tie back to your example you gave about 10 minutes ago, like cross sell, I want to cross sell better. Like that is something conceptual today. The interface between sort of the conceptual use case and, or the intent, right. And the data and everything like that is some combination of metadata, business logic, rules and frankly probably the vast majority of it is floating around in people's heads. And that's why we have data analysts. Right. How are we going to bridge these things?

Malcolm Hawker [00:37:11]:
Well, you're really talking about, I think, you know, articulating the business value of better data management, or of the insights that we're talking about. And let's stick with the cross-sell use case because I think it's a really, really easy one. And my dogs are wrestling in the background. So if you hear growling, it's my dogs wrestling in the background. This is the joy of working from home. Let's take the cross-sell use case. I mean, this is a simple relationship, right? Where you've got customer A related through company to product A, but they're not related to product B. Right. Like, that's a simple graph we can run right now that says they're buying product A, but they're not buying product B. Right, done. How do I, like, fuel that and provide that insight? Which we can do today. What you're really kind of talking about, Tim, though, is what is that actually worth to me? Right. And how do I calculate that? That's a separate podcast issue. I will go to my grave, Mel Gibson Braveheart-ish style, screaming: we can do this business valuation. We can quantify the value of better data. We can quantify the value of these insights, and anybody that tells you you cannot, I think, is perpetuating a myth. Can we do causal relationships? No, we cannot do causal relationships. But there are business units like marketing, like HR, like procurement that are running entirely on attribution models. Can I say that $1 spent on marketing, $1 spent on the honest, no-BS Catalog & Cocktails podcast, will equal $1 of sales? No, I cannot. But I certainly can get to the point of helping attribute sales or attribute brand awareness or attribute other things. The marketing function literally runs on it. And we could easily do something similar in the data world.
So kind of getting into a little bit of a rabbit hole related to valuation and how do we value these things. But again, it's all in the data. It's all in the data. It's in the metadata, it's in the transactional data. Right? It's in the knowledge graphs that we need to be deeply embracing in the world of data and analytics. It's all there. And it may not be causal, but the relationships most certainly exist.
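
Editor's sketch: the cross-sell relationship Malcolm describes above (a customer is linked to product A but not to product B) can be expressed as a very small graph lookup. The following is a minimal illustration in plain Python rather than a real graph database; the customer names, the purchase edges, and the function names are all hypothetical and do not come from the episode.

```python
from collections import defaultdict

# Hypothetical purchase edges (customer -> product); illustrative data only.
PURCHASES = [
    ("Acme", "Product A"),
    ("Acme", "Product B"),
    ("Globex", "Product A"),
    ("Initech", "Product B"),
]


def build_graph(edges):
    """Map each customer to the set of products they are linked to."""
    graph = defaultdict(set)
    for customer, product in edges:
        graph[customer].add(product)
    return graph


def cross_sell_candidates(graph, has_product, missing_product):
    """Customers linked to has_product but not to missing_product."""
    return sorted(
        customer
        for customer, products in graph.items()
        if has_product in products and missing_product not in products
    )


graph = build_graph(PURCHASES)
print(cross_sell_candidates(graph, "Product A", "Product B"))  # ['Globex']
```

This only answers the "who buys A but not B" question. The valuation question Malcolm raises next (what that insight is worth) is a separate attribution problem and is not modeled here.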

Juan Sequeda [00:39:30]:
I want to bring up a couple of comments on this topic about the business value and sorry, I'm going to be honest and honest, no BS and bias. I'm picking up a lot of former guests here. So Aaron, Aaron Wilkerson.

Malcolm Hawker [00:39:42]:
Love Aaron.

Juan Sequeda [00:39:43]:
We need a reset and a fresh start. As an industry, we took the business too literally. When they said, I want to know what data we have, we took the phrase and uploaded our entire database, data warehouse, lakehouse environments into a data catalog, only to realize that was way too much for a reasonable human to understand. Does anyone really need to see the thousand-plus tables you maintain? We need more simplified, curated experiences for these catalogs. We need to adopt the business-first mindset. But we also need to understand the use cases that require catalog implementation and dedicated resources to maintain. This is where the knowledge-first perspective is helpful. How do we track business objectives and initiatives back to the data team's work and present that in a platform? So I think that's bingo, bingo on here. And then another former guest, Neil Burge. We have an episode with him about all his research on why people are buying catalogs. And he says that in his research there are four broad buckets that drove people to buy one. So, right, number one: someone else just got a compliance fine; buy a cheap data catalog so we can tick the box and go back to... Number two: we've got McKinsey, we've got Accenture, right, we've got the big folks coming in here; the board are on a path to a data-driven business. The data catalog is a stepping stone towards the success, or not, of the CDO being fired in two years. That's what you require. Number three: we have a sprawling data estate and an inefficient data team. We need a single pane of glass and the data catalog will provide it: a long, hard slog and mostly a technical exercise. And then number four: we have grown using data and outgrown our data landscape, so our growth is at risk if we fail to buy tools for the data team. A data catalog might help if we focus on the data driving our growth. So I think that's also very clear. Right. Compliance. Right. That's what I'm being told to go do.
Making efficiencies and actually supporting for growth. And I think at the end of the day the, I think it's kind of obvious that a future data catalog should be, well, it's connected directly to the business, what we're trying to go do, how to, how to be more efficient and how to drive growth.

Malcolm Hawker [00:41:48]:
Yeah. So, so we can, we take just like one minute to acknowledge like two things. One is that I continue to be overwhelmed by the fact there remains a place online, there remains a social media, we can call it social media, but there remains a social media platform where people can have great discussions about this stuff. I find that amazing in our day and time that there is still, yes, it's LinkedIn, but that it exists and we can have conversations and we can disagree and we can grow and we can learn. And I've learned day in and day out from LinkedIn. The second thing I want to acknowledge is that people take the time to add their input. Right. And to add these comments. And whether it was Neil, whether it was Aaron, whether it was Jessica, whether it was Ole, whether it was anybody, we've talked about like I, I, I, when I think about this stuff, like, thank you, thank you, thank you. Because I'm smarter, I'm more enriched, I know more because of what they're doing. And I just kind of, honestly, when I step back, I'm like, I get a little, get a little overwhelmed by it because it's so awesome. So maybe it's an advertisement for LinkedIn, maybe it's an advertisement for podcasts like this. But you know, it's a really, really good thing that we're having these conversations as data leaders because we need to be having more. So anyway, yay, business value. Cheers.

Juan Sequeda [00:43:03]:
Cheers for that.

Juan Sequeda [00:43:04]:
Amen to that.

Juan Sequeda [00:43:05]:
Yes, for sure.

Malcolm Hawker [00:43:06]:
Which is awesome. And all those guys, like, you know, so smart and have so many good things to offer. It's just I love being on LinkedIn and I like hearing what they have to say about all of this stuff because my intentions. Yeah, yeah.

Tim Gasper [00:43:21]:
For those that are listening, you know, follow and connect with us and the community, the discussion is happening. So if you're not a part of it, come, come join in.

Malcolm Hawker [00:43:32]:
Yep, yep, the water's warm. Lots of room for, lots of room for everybody. So, you know, to finish on the last point. Yes. I mean Neil's points I think are extremely well taken, you know, and if you were building a product roadmap and you know, maybe, maybe Tim, this is, this is where we can start talking about some of the, some of the difficulties faced in building a product roadmap here. But if you were just building a product roadmap based off of that type of customer feedback, right. You would be building a fairly defensive, fairly status quo, ish, fairly MVP type catalog. And you know what? Honestly, for a lot, it would be enough. It would be enough. Right? And that's okay, right? If that meets your requirements and you are, and that's all you need to do. But I, but I do see, I honestly, truly believe that there's transformative value here. And yes, we need to acknowledge it's not just a silver bullet, it's not just a technology. We've been talking about a different way of managing our businesses, a different way of managing governance, a different way of managing data. But I just, I just aspire for a lot more because I get really excited about this stuff. I get really excited about it because we've been talking for literally decades about how data can transform businesses. And I think we are on a precipice here. I think a lot of it has to do with AI and people who figure this stuff out will be driving transformative value. I have no doubt about that. No doubt.

Juan Sequeda [00:44:57]:
So I'm connecting all this stuff in my head right now and I see a lineage, a path around this. I think the three main topics we've had in this discussion have been: AI, gen AI, LLMs; the whole knowledge management piece, and I'm going to put bringing in the unstructured stuff and the people under knowledge management; and then of course the business value around this. Right. Connecting always to the business. And I feel that we're always having these conversations about, oh, how do we show the value of the data to the business. I'd argue that a part that has been missing has been that knowledge management. And I think the reason we are now starting to talk about knowledge management and realizing that we actually need it is the LLMs and generative AI. So I think it's fascinating that, the way I'm seeing it, generative AI is this catalyst, this inflection point, to say, oh, it's amazing what it can do. And I see it from a consumer point of view. I want to bring it to my enterprise, but for my enterprise I need more context. I need more of that knowledge, some of which is already there in all the technical metadata that I'm managing. But there's just so much that I need help to automate a bunch of that stuff. So that's one thing. But also, I now need to kind of move from this data-first to this knowledge-first, context-first world if I want to do that for the AI. And guess what, for the business, if I'm talking about the metrics, talking about the cross-selling, whatever, that's really the knowledge of what's in people's heads, and that's the stuff that has been missing. So I kind of feel that we're making this really great connection between generative AI and LLMs and all this knowledge management, which altogether will help drive the business value.
And I think the question for me, which is kind of where we started off, is: is this one place, is this one tool, is this one technology? And I think it's kind of hard to say it's going to be just one thing, because I think the pendulum always swings, right? It goes one way and it goes back, and I think we'll find kind of a balance of things. Like, what's the minimum, what's an MVP of a best... I know we say this a lot, the best-of-breed stack of things. But I think this MVP will of course include LLMs and generative AI, it will include the whole knowledge management piece, which itself has so much to unpack, and then what's overarching it altogether is very clear: the business focus, the business value.

Malcolm Hawker [00:47:35]:
Yeah, I don't know if it's going to be one thing. It probably isn't, because most IT buyers and most IT leaders have kind of these antibody responses to monolithic tech stacks that do everything and purport to solve world hunger. You know, our BS meters just kind of immediately go up. Speaking of BS meters, you know, I led a data analytics function for many years, and when I would hear people like me and analysts talk about all this future-related stuff, I'm like, okay, that's fine, dude. You know, that's interesting. But let's bring it home, right? Because I've got problems to solve right now. In that vein, if you align with and are excited by some of the things that we've been talking about, if you see the future of data catalogs going this way, if you see the future being more aligned to knowledge management than traditional data management, what could you do today as a data and analytics leader to start moving gradually in that direction? There's a few things that I'd be doing. Tim touched on this earlier. One: unstructured data. We need to expand the scope of our governance and management practices into unstructured data. No ifs, ands or buts. Two: knowledge graphs, complex RAG patterns, vector databases. Implementation of some idea of a chatbot or copilot today, right? Using maybe more of a complex RAG pattern that is optimized through text, right? Or optimized using a knowledge graph to answer maybe some fairly basic use cases that are internally focused only, so you can mitigate and manage any of the fallout that may be associated with hallucinations. Because you're never going to eliminate them completely, and you probably don't have the resources to put a human in the loop at every touch point between a copilot or smart agent and somebody, you know, looking up your HR policies through your new HR chatbot, right? You probably don't have the manpower to put a human in the loop all the time.
So there will be spillage, there will be hallucinations. But if you are looking to be a forward-leaning data leader and you want to check the box on AI, there's some great stuff here and there's some steps that you can start taking today, right? I'm not advertising for Microsoft, it doesn't matter. But there's a lot of tools out there to go build copilots right now. They're out there right now. Learn by doing. Start tinkering, start playing with some of this stuff now. Get familiar with knowledge graphs. Knowledge graphs are not going away. They're going to be part of this foundational stack. If you don't have this technology, if you aren't familiar with this technology, you need to learn it. Another way: start thinking differently about data. Start thinking differently from the perspective that it's not just these deterministic rules. It's not if ands or if. It's not these kind of highly structured ways of looking at the world. Start thinking about context and a more probabilistic approach to the world. Start thinking more probabilistically about things. Start thinking about how context matters. What is true to somebody in marketing may not necessarily be true to somebody in finance. And they can both be correct at the exact same time. That's okay, right? We need to break away from all-or-nothing thinking. We need to break away from deterministic ways of thinking that pigeonhole us into this false belief that things are either all one thing or all another. Because they're not. They're not. These are some simple steps we can take as data leaders today. Up our game around AI. Up our game around, right, what do LLMs actually run on? How were they built? How were they created? What data went into creating OpenAI's models? What data is needed by these systems to run better, to run faster, to eliminate hallucinations? These are simple things we can do now. Just right now.
Sorry, I'm ranting a little bit, but as a data person, you start talking about future stuff, it's like, okay, that's great, but what can I do today? And there's some things that we need to be doing right now.

Tim Gasper [00:51:17]:
Yep, I think you stated that beautifully. And part of that is, part of that are things that organizations can do just programmatically and via initiatives and via the projects that they're running. You don't necessarily need a vendor or a product to help you start to tackle the governance question around unstructured data, or to start to develop your copilot for your business, or to start thinking differently about data. You can start to do that today as a data leader or as a data manager, trying to align yourself with where exciting innovation is happening and where the best value is happening.

Malcolm Hawker [00:52:02]:
Yeah, couldn't agree more. There's one thing that I didn't really touch on, and we had talked about this earlier, which is this idea of a catalog as a marketplace. And I would aspire for more, because I don't want to be a service desk, right? As a data leader, I don't want to be a service desk. Take your ticket, get your solution and move on. I don't want that. And I don't want to be just a store. I don't want to be transactional. I don't want to be where somebody walks in, gets their stuff and walks out. That's not what I want. I want to consult to my business partners. I want to help them understand better ways to run their business. This is a consultative interaction, not a transactional interaction. So if we can start thinking about it that way. Yeah, I love a catalog as the storefront, right? As the door that people would come through. But instead of there being this transactional, I'll give you this and they'll go away, here's your dashboard, right? Or total DIY. Maybe some DIY is appropriate for some use cases, for people that, you know, know SQL. Not NoSQL: know SQL. For some, maybe DIY is okay, but for others DIY is not going to be okay. Your CIO or your CEO is not going to go DIY Tableau. It's not going to happen. Right. So maybe a more bespoke interaction. But again, this idea of the catalog becoming the place where you go for business insights instead of just going for a dashboard. I would argue a part of that would necessarily include this idea of product management. Not data products. Data products are the how, you know, and I like data products. But the real why and the what here is product management, and putting the customer at the center of everything you do, and design-centric thinking, and other things that Neil has talked about and many, many others have talked about. So yeah, yeah, I love that.
I love the data catalog as a storefront, but I don't want to be transactional because there's very little value in being transactional and just selling stuff like you were on Amazon. It's really about becoming a business consultant.

Tim Gasper [00:53:58]:
To interpret kind of what you're saying in another light, and you tell me if I've got this take wrong here: the future of data catalogs is not to become a data marketplace. That's potentially very useful, but it's a very small aspect of the future.

Malcolm Hawker [00:54:14]:
Well, maybe there's an agent there that is your internal version of a Deloitte consultant. Right? Maybe that. Maybe the face of the catalog, for lack of a better term, is like, you know, here's, here's your internal HR consultant, your internal BCG or your internal McKinsey consultant. Maybe I'm totally spitballing here, like. Yeah, but not just being transactional.

Tim Gasper [00:54:34]:
I mean, this is, this is what I think bridges into what could the future look like. And maybe as a sort of a final question, sort of a two part question here, you know, what does the future look like in the medium term for catalog and metadata management? And then what does the Future look like longer term for catalog. And I have some thoughts on this too, but I want to kind of start with you, Malcolm. What's your take? What's the medium term look like? What does the farther future look like?

Malcolm Hawker [00:55:07]:
I think that the medium term is data catalogs and the practitioners that are using them day in and day out start using them as a way to move closer to this world of data management or of knowledge management. And a good starting step there is using the catalog to manage and govern unstructured data. Just spread your wings over all of that unstructured data. Help your catalog to do it. Push your vendors to do things like profiling and discovery of this data, tagging of this data. If your catalog isn't doing that, push your vendors to do that. That's, to me, that's the medium term where it is like, okay, wow, I know now that I've got 15 different versions of the same FAQ that all say different things, right? That's, that's, that's worthwhile, right? Because I can deploy data quality to that. I can start to fix that. Longer term is everything that we've been talking about is, is becoming, is becoming this place where the facts of the business. I love that term, the facts of the business. It's the repository of the facts of the business. It is holding the narrative of the business, of the truths of the business in a highly contextualized way. I love that.

Tim Gasper [00:56:19]:
I like that language as well. The repository of the facts and the narrative and the story of the business. So I think that's a great vision of the medium term and the long term here. I think I agree with a lot of your perspective here. The one additional thing that I'll add, which is just sort of a who-knows-what-might-happen kind of comment, and then, Juan, I'm curious for your thoughts on all this, is about the traditional boxes in the architecture diagram. In the medium term I think it's going to be pretty stable. But I think in the longer term, AI and LLMs, and whatever LLMs become as they continue to iterate, are going to very much confuse, hopefully in a good way, the data space. Because, to your comments earlier, right, we're going to get to a point where we automate data management. Like, you're going to get to a point where you can say, I've got this much money to spend. Here are the systems that I use. Hey, agents, manage this stuff for me. And if you're confused about anything, you let me know and I'll weigh in. Right. And it's basically going to be your infrastructure or data engineer, you know, automated, on autopilot. Right. That's going to happen. Is it going to be five years from now? Is it going to be ten years from now? We don't know. Right, but that's going to happen.

Juan Sequeda [00:57:40]:
One quick parenthesis. Yeah, I just like, I believe with that like writing the code, the pipeline code, the observability that things that, that all will be automated.

Tim Gasper [00:57:49]:
Yep, yep. All of that's going to get automated. Right. It's going to be conversational and when it has questions, it's going to converse with you. But more and more, if you're, if you're crawling your unstructured data and all your knowledge and things like that, it's going to know there's going to come a point where your AI knows more about the business than if you got your 10 smartest people in the room talking to each other. The AI will know more. Right. We'll get to that point pretty soon here. So then what happens to the data stack? Like what is going to become of that? Do you need analytics tools? Do you need ETL tools? Or does that really start to look much more like knowledge capture, knowledge governance and where appropriate, human in the loop? What is that? We don't even have a name for that thing right now. Right. Like that to me is like 10, 15 years from now. The future is, might look pretty different.

Malcolm Hawker [00:58:38]:
Well, if you ask me, there are indicators of this already happening, and I hate to use a specific vendor, but I'll use a specific vendor because from my perspective they are a little ahead in this. I look at what Microsoft is doing with their instantiation of the fabric, and it's a V1 product. It's got a long way to go. But what I would argue they are doing, and yes, it's a conversion to Delta Parquet and it's fine, but what they are doing is commoditizing persistence. Right. For IT, for a long time, persistence mattered. This vendor had this type of persistence, and this vendor did this and this vendor did this. And we are fast approaching the point where, who the hell cares, right? Is persistence really a value driver to the organization? So if we think about a stack, if we think about it literally as a vertical stack, where we are moving is that the bottom of that stack will be increasingly commoditized, and as we commoditize persistence, we will increasingly commoditize the management. Then we will commoditize analytics to a certain degree, but there will still be value at the higher levels of the stack. And that's why the focus on business and business outcomes and business management and business processes is so important. Because that's where the value is moving. We are increasingly commoditizing the base layers of the stack. So I would even argue the data management software will increasingly be commoditized. Tim, we talked about this earlier. To the point where there will be no fundamental difference between one and another, because this will all be a function of the AI. But I think we're already seeing this. I think we're already seeing it with what is arguably the commoditization of the persistence layer.

Juan Sequeda [01:00:29]:
This moment right now is why I feel extremely lucky that we get to do this podcast and talk to so many people. What you just said, Malcolm, just connected so directly with a podcast episode that we did with Dale Williamson. He's the field CTO for EMEA at Databricks. And we had this exact same conversation.

Malcolm Hawker [01:00:50]:
Interesting.

Juan Sequeda [01:00:51]:
Like the stack going from data to knowledge. And he's like, look at the bottom of the stack, which is where we focus. It's data and kind of technical, it's engineering mindsets. We're able to organize around standards because, heck, yeah, it's easier to go do. We have Parquet, we have all these table formats, and then we go up the stack. That's why we have SQL, all these things. But the moment we start getting higher, we start getting to this fuzziness, and then we have to go deal with the people and the knowledge and the politics and all that stuff. That's the part where it's harder today, because you deal with all this stuff. By the way, the LLMs will actually help us drive that context. So we're driving up that stack, and as data professionals, we get into that fuzziness and what happens? Well, that's too complicated, humans. And we go back to our stack. And I think this is where we're going to start seeing the definitive leaders, people overcoming that and using the tools and technologies that we have today with this new generation to help us overcome that fuzziness and feel comfortable getting into the knowledge world, working with the people. So, yep, that is a great combination. So Tim and I were like, do we have a takeaway? Tim and I have been taking notes, but I don't think we would do it a service by summarizing, because this conversation is the takeaway of, like, 150 comments of your podcast. And I think people need to listen to this entire episode. Or do you want to do any final words, comments? Because we've already been here for, I don't know, over an hour now.

Malcolm Hawker [01:02:29]:
No, for me. No. It's my honor to be here. It's my honor to have a platform to share my cockamamie ideas about what's going to be in the future. And I reserve the right to be completely wrong. But you'll never say that I'm short on passion.

Juan Sequeda [01:02:45]:
I think if there's something people take out of it, it's that the three of us are very passionate about what we do.

Tim Gasper [01:02:53]:
We're excited to see a better future. And I don't know. Malcolm, any final thoughts? Any final parting thoughts for the listeners?

Malcolm Hawker [01:03:03]:
Well, if you're still here, for one, thank you. I hope that you have found value in the last hour. That is my goal. If we're not already connected on LinkedIn, please connect with me on LinkedIn. DM me. Tell me I'm crazy. Tell me I'm right. I don't care. I will have a conversation. Love to continue the conversation online. Please, let's connect on LinkedIn. And thank you, Juan and Tim, for doing this. This is fantastic.

Tim Gasper [01:03:29]:
Absolutely. Let's make the utopian future become a reality, because I think that all of us, in the data industry and beyond, have our finger on the button and we can control where this thing goes. So let's make the future that we're excited about.

Juan Sequeda [01:03:47]:
Thank you so much, Malcolm. Cheers.

Malcolm Hawker [01:03:49]:
Thanks, all. Cheers to you guys.

Tim Gasper [01:03:50]:
Thanks, Malcolm.

Special guests

Malcolm Hawker Chief Data Officer at Profisee and host of CDO Matters Podcast