Generative AI, Data Products and Business Value with Jon Cooke

62 minutes

About this episode

With all the hype around generative AI, how do we stay focused on the goal of delivering the right valuable, business-facing analytics and data use cases with as little friction and as much agility as possible? Jon Cooke, founder of Dataception, has a lot of honest, no-BS thoughts to share!

Tim Gasper [00:00:06] Hello everyone, welcome. It's time for Catalog & Cocktails, your honest, no-BS, non-salesy conversation about enterprise data management with tasty beverages in hand, presented by data.world. I'm Tim Gasper, longtime data nerd, customer guy, product guy, joined by Juan Sequeda.

Juan Sequeda [00:00:22] Hey Tim, how are you doing?

Tim Gasper [00:00:23] I'm good.

Juan Sequeda [00:00:24] It's Wednesday, middle of the week, end of the day, really late where our guest is coming in from tonight. We're glad we've been on the road for the last couple of weeks. The honest, no-BS thing here is that we've kind of taken a little bit of a break, but without taking a little bit of a break, we hadn't had any guests. But anyways, super excited to be back. And today we have Jon Cooke, who is the founder of Dataception. Tim and I met Jon last year at Big Data London, finally in person, and it was a blast to meet up with you again a couple of weeks ago, and here we are. How are you doing, Jon?

Jon Cooke [00:00:56] Awesome. Yeah, really good guys. Yeah, really good to be on. And yeah, things are going really well. As you said, it's a little late now, so I'm a little bit bleary-eyed. I haven't quite got the matchsticks out, you know. It's all good. Yeah, things are going really good. The whole data product thing, which is obviously what I've been looking at the last year or so, has really taken off, and generative AI, LLMs. I mean, I've been in industry 30 years now. Every epoch that happens, like big data and cloud and server architecture and whatever, I think that's going to be the last one, but then a few years later there's another one. So I still love being in it 30 years later, right?

Juan Sequeda [00:01:28] This is truly exciting. And every time we can say that we're in the very exciting times [inaudible]. So really excited. But hey, before we get into it, let's kick it off. What are we drinking and what are we toasting for?

Jon Cooke [00:01:40] Cool. So I am drinking, I don't know if you can see this, a gin and tonic with Aperol. So it's a proper cocktail and yes, it's very tasty. It's quite a fruity type stuff and I'm, I guess toasting again, the whole data product movement, which I'm loving and just having these great conversations with people and just toasting a great time in tech and data. So there we go. Cheers.

Juan Sequeda [00:02:05] Love that. Cheers. Tim, how about you?

Tim Gasper [00:02:07] Yeah, I'll toast to the data product movement as well, as somebody who is a product guy through and through. To see product management become more adopted by the data community in its own way is very, very exciting. And by the way, I've never tried Aperol in my gin and tonic. That's something that I got to try out. That sounds-

Jon Cooke [00:02:24] It's nice. Don't put too much in though. You can overdo it. Just a little dash, it's very toasty. Yep.

Tim Gasper [00:02:30] Okay. All right, I'm going to try that. But today I am drinking a little bit of a Treaty Oak Ghost Hill Texas Bourbon. They're single barrel, so pretty close to Austin, about maybe 45 minutes away is a distillery called Treaty Oak. One of my favorites. A good local place.

Juan Sequeda [00:02:46] Well I have a gin, I found some gin from Stockholm in my bar. I don't know who left it there. And then I am testing it with some cranberry sparkling water and then I put some strawberries and lime. Nice refreshing drink for a Wednesday afternoon.

Tim Gasper [00:03:04] The tail end of summer.

Juan Sequeda [00:03:05] Yeah. Texas is still kind of summer hopefully. I think it's supposed to get cool in the next day or so.

Tim Gasper [00:03:12] This weekend. Yeah, this weekend going to get-

Jon Cooke [00:03:14] Guys, I'm from England, so talk about cool, and I mean in terms of temperature...

Tim Gasper [00:03:20] We'll still be wearing shorts and short sleeves but comfortable instead of sweating our balls off, right?

Juan Sequeda [00:03:27] Anyways, cheers to the data product. I think that's something that we definitely...

Jon Cooke [00:03:31] Cheers.

Tim Gasper [00:03:31] Cheers.

Jon Cooke [00:03:31] Cheers.

Juan Sequeda [00:03:33] All right, so we got our warmup question today. It's not really that funny, it's really serious, but let's start with some seriousness here of generative AI. So what's the most surprising thing you've seen with generative AI? Or actually what is the greatest failure?

Jon Cooke [00:03:50] So yeah, really interesting question. I mean, I did my first generative AI project back in 2016, '17, before transformers, and it wasn't even using neural networks; it was actually using Naive Bayes and basic machine learning. And the most surprising thing was, A, we got it working. It was a chatbot and we were querying that kind of stuff. But what was really surprising was the amount of actual effort, of manual handcrafting, we had to do to try and get the corpus of data into a shape where it could actually be queried, and you had to build the whole chatbot stuff. What's really surprising to me in the last probably 12 months is I can do that probably with a few lines of code. It took us three months to do it that long ago, but now with LangChain and LLaMA and other open source products, you can actually do this really, really quickly. So that's really the surprising thing for me.

Juan Sequeda [00:04:45] I think for me right now, like combining with the vision stuff, now it's just getting crazy. The vision stuff that got announced this week or last week, whatever, every week you get amazed in things that you didn't even think about. So I think this combination of vision is something that is-

Jon Cooke [00:05:02] The LLMs are still no good at doing forecasting stuff they don't know about. So that's a real interesting piece around this, something I'm looking at at the moment. So they're very good at getting into a bit more, but seeing the facts, what's happened, and even a little bit of the why, but actually when you start looking and say what's going to happen, what do I do next? All that kind of stuff, which we can get into, there's still a little way to go for that.

Tim Gasper [00:05:24] Yeah, I agree with that. Yeah, I agree with Juan on the vision piece. That's been very exciting. There's all the gen AI hype Twitters and one of the threads, somebody took a picture of a really convoluted parking sign.

Jon Cooke [00:05:39] Yes, I saw that. Yes.

Tim Gasper [00:05:40] It was basically like, am I allowed to park here? And it was like, yeah, you're within the hours. It's like, oh good because I had no idea what the sign was saying, right?

Jon Cooke [00:05:47] I wonder, could you sue it if it gets it wrong and you get a ticket?

Tim Gasper [00:05:50] That's the thing, right? Nobody wants to have the liability, right?

Juan Sequeda [00:05:55] All right, so let's kick it off. Our topic today is data product generative AI business value. And I want to start off with honest, no BS, do you have a pedantic definition for data products?

Jon Cooke [00:06:08] I do and it doesn't matter. This is the thing. And we've talked a lot about this in Brian's Data Product Leadership Community and actually he has a very nice way of saying it. The verb is more important than the noun, i.e. the process of product management, and what actually comes out the back of it, is much more important than how you actually clinically define a data product. And the other thing I would say is, in the real world, we don't define a product. If we're going to buy a running shoe or I go and get a mortgage, those are products but they have very different classifications. So why do I try and classify them? So why in data do we feel the need to actually clearly define a data product as this, this, and... you know? So I tried it a year and a half ago. I put out my definition of what it is, and these are the things I talk with businesses about. So I'll go to a business person and say, "You want a forecast or a metric or a credit model or what have you." I'll talk about products in that way, but I won't use the words data product. But actually that's the language they understand and that actually resonates with them. So coming back and saying a data product is this and it's only this, to me it's about the business outcome and what the product management process to get there is. That's actually more important than defining a product, in my mind.

Juan Sequeda [00:07:18] So I wanted to start off with this. I was very curious what you were going to say because a year ago, I think the whole conferences, right, we're getting out of COVID, people were going off and then meeting with people and it was all... I mean, data mesh was the big topic last year. Then out of data mesh you hear data product, but then everybody was like, "Well what do you mean by data product?" Tim and I, we didn't want to give a definition for data product. We also were kind of giving our definition, we came up with our framework, the ABCs of data product. And then I think the argument's like, well if you say it's anything, then anything could be a data product. And then it defeats the purpose, which I guess it's true, but at some point then we just become too pedantic about it and it's like, wait, we're forgetting about, it's the process you're doing, but again, the business value. So I think we all got tired of figuring out those pedantic definitions. If you hear somebody trying to be pedantic, so what do you tell them now?

Jon Cooke [00:08:14] I used to be quite vocal and try and say, "No, that's not it," and what have you. I could be a bit spiky, you know me guys, I'm not backwards in coming forwards, if you know that expression. But now I'm just like, well, I've actually been through this process a number of times with business people and I've actually gone through and actually come out with products that actually define their business strategy. And the way I did this with a company a few months ago, and they're a small startup and they do credit models and we came out with three actual products at the end of it, I sat in a room for a day with the CEO, the COO, the head of product and the CTO, and we thrashed it out. We came up with three products and we said, "This is the data we think we need. This is the strategy," this kind of stuff. Then they came out and now they're building it. It's like that, to me, is a win. The clear definition for them was what these things were, but for someone else it might be different. That's the process we need to get in. So now I actually get, I'm much more relaxed about it. If you want to call a dataset a product, then go for your life. I don't think datasets are data products. They're important pieces, but the main thing is solve the business problems and resonate with the business people because data doesn't really resonate with business people. Those kind of things that solve their problems resonate with them. That's where you get the outcome. So yeah, I tend to talk through that sort of process rather than strictly getting into some sort of debate over what a product is or isn't.

Tim Gasper [00:09:30] So you mentioned credit models, right?

Jon Cooke [00:09:33] Yeah.

Tim Gasper [00:09:34] And so I think that some people think very narrowly about data products and they're kind of like, "Well, it's got to be the data itself and it's something about the data itself." And credit models obviously are more of a complex thing. It's sort of a combination of statistics and a model, more of a math model behind it that has various data sources and things, right? And interestingly, so I was visiting a customer of ours a few weeks ago and they had a very, very expansive kind of view of data products, like hey, if it's a dashboard and somebody created a derived version of that, that's a derived data product, and because it's something we're going to maintain. They had their list of things that they thought kind of made it a data product, but it was a very inclusive and expansive view. Kind of curious from your perspective, do you think it's kind of better to take a more expansive view? Do you encourage that?

Jon Cooke [00:10:31] So it really depends on the maturity of the organization I think as well. The first thing I do is actually define within an organization what does it mean to them? What's the unit of delivery that solves the business use case for them? So if it's the dashboard, you could argue either way. And again, what you really don't want to get is this philosophical kind of debate because that doesn't get you anywhere. What you want is something that actually there's tangible value and actually solves the problem, and actually has a customer or internal and what have you. So for me it's defining that. If someone wants to call a dashboard a product, if it's got a product management process, if you're looking at solving a business problem, if it's your customers, you've got users, that sort of stuff, then that's absolutely fine. But the interesting thing is it's that customer-user interaction. It might be a direct one from the business person, operations or what have you, or it might be an indirect one, like a recommendation engine. It might be a really simple one, like a simple item collaborative filtering type recommendation engine that's serving an e-commerce website. That's a data product in my mind. It hasn't got any direct users, it's obviously publishing recommendations on the website, but that's got a lifecycle to it, it's got business use case, it's that kind of stuff. A simple metric could be a data product. Imagine the total monthly sales that the CFO, the CRO and CEO all agree and that gets deployed to a container, whatever it is, but they can all hit that metric and that gets the same calculation. That, to me, is a very, very simple data product. And that's really kind of the angle I come in, that the data itself is a communication mechanism. It's not really, unless you productize it by sticking a nice, like a flight information deck on an aircraft, all raw data, but it's all being presented to the users, that kind of stuff.
So that data has now been productized. Data itself isn't the product, it's the whole thing and the whole experience, in my mind.
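
As a rough illustration of the "simple metric as a data product" idea Jon describes, here is a minimal Python sketch. The sales records, the function name `total_monthly_sales`, and the per-calendar-month rule are all assumptions made up for illustration; the episode does not specify how such a metric would be implemented or deployed.

```python
from collections import defaultdict
from datetime import date

# Hypothetical sales records as (order date, amount); not from the episode.
SALES = [
    (date(2023, 9, 3), 1200.0),
    (date(2023, 9, 17), 800.0),
    (date(2023, 10, 2), 500.0),
]


def total_monthly_sales(records):
    """The single agreed definition: sum of order amounts per calendar month."""
    totals = defaultdict(float)
    for order_date, amount in records:
        totals[(order_date.year, order_date.month)] += amount
    return dict(totals)


# Every consumer (say, the CFO and the CRO) calls the same function,
# so nobody re-derives the number from raw data with their own rules.
cfo_view = total_monthly_sales(SALES)
cro_view = total_monthly_sales(SALES)
```

The point of the sketch is only that the calculation lives in one place and every consumer hits it, rather than each department rebuilding it from the raw data.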

Tim Gasper [00:12:16] So you've got some flexibility in terms of how you, as an organization, may want to define it. Obviously there's a clear couple of trends right around sort of outcome oriented. There's a customer, which in our ABC framework we think of as D, downstream consumers, or there has to be a customer, usually user, somebody's using it, right?

Jon Cooke [00:12:33] Yeah, exactly.

Tim Gasper [00:12:35] And also there's some kind of a repeatability aspect to it. Like it's a reusable component.

Jon Cooke [00:12:40] Yeah. I mean I guess sort of the equivalent is there's a market. I mean, if you want to go into product management theory, there's a market, there's more than one consumer. Obviously if it's internal versus external, there's a subtle dimension because obviously if it's external, it's simple. Is someone going to pay you money for it, right? That's the thing. If it's internal, you tend to be more pushing it out as part of an operating model for multiple users and that kind of stuff. But it's also trying to get multiple customers, not just from a technical perspective reusable, but to agree on the business rules and the business logic and what that's doing. Like I said, you've got to agree that if three different departments are going to use a business component, business orientated component, it is the same thing and they all understand what it is and they all agree on that kind of stuff because otherwise you get three different versions of it which we've all seen, right, "Just give me the data and I'll recreate that sales dashboard or that finance report or whatever." That's what you want to avoid. You want to say, if I want to give you a number, we want to agree what that number is and everyone agrees on the same number effectively.

Tim Gasper [00:13:41] Right. Yeah.

Juan Sequeda [00:13:42] Okay, so the data products, I think we're in agreement here and I think we've seen this evolution that it's really about the process, make sure you have a process behind it and there is business value, there's a market for it, there's users. So I think that's a clear thing and then we don't have to be pedantic about it. So by the way, you said something I think is another thing we should put on a T-shirt, because we're going to talk about T-shirts in a bit. Data itself is just a communication mechanism. I like that. Goes back to we got a startup T-shirt-

Jon Cooke [00:14:15] Yeah, indeed. The side hustle.

Tim Gasper [00:14:16] I feel like that's going to become a running joke of this show now. People who listen for a long time are going to be like, "You guys have been saying since episode 30 that you're going to start a T-shirt store."

Juan Sequeda [00:14:27] Hopefully somebody actually starts it and then... Anyway, okay.

Juan Sequeda [00:14:31] Brilliant. Where do gen AI and LLMs fall into this?

Jon Cooke [00:14:35] So this is interesting. I did a presentation a few weeks ago, actually just before Big Data London, around data products and the nexus of data products and LLMs. And the way I see it, again, this is the way also I've been using it, I've done a number of generative AI projects over the last four or five, six years. And from a data product perspective, there's kind of three different kind of use cases, archetypes, patterns you use for LLMs in a data product kind of approach. The first one is basically where you're doing what I call copilot type use cases where you're asking the LLM to generate some intermediate representation of code or something like that. So write some SQL or a data contract or something like that, and you're getting the LLM to write that for you and then you're executing that separately out of that. And that's kind of the copilot use case. And what I'm actually doing at the moment is actually looking to generate data product definitions using LLMs. So create me a forecast, create me a metric, bring in that dataset, that kind of instruction-based stuff where you're actually building, I've got a thing called the data product pyramid, which is basically a dependency graph of data products, of analytics components. And I'm doing that all manually at the moment through UX and that kind of stuff. So actually, you have an LLM doing that for you in front of the business person. That's super powerful. So that's like the copilot version. The second archetype is basically what we call a platform feature or an infrastructure feature. So when you're bringing in data, that's the classic model, like doing data point extraction. That was a customer use case I did fairly recently, or sentiment analysis, or I'm looking to language translation or to augment the data as it's coming into the system. That would be like a platform feature.
And that's really where the LLM, it doesn't have to be a huge one like an LLaMA, could be BERT or something, a really much smaller one doing classification or NER or something like that as part of the platform feature. That's another classic use case. And the third one is where you've actually got the LLM as the core part of the product. ChatGPT, in my mind, is a really good data product. It's a very complicated one, but it is actually a whole data product and the LLM's right in the middle of it and it's been the core USP. We've seen this kind of explosion of domain orientated LLMs, like Bloomberg's got one, I saw there was a medical NER one that's just been published where you can actually go through medical classification of terms and stuff like that. That might form the core of the third one. So actually, you can actually use it to do things like query the data product graph. I think we've talked a little bit about this one. Imagine you've got this ecosystem of data products with all the APIs on them. I want to be able to query in that, so what's the forecast going to be? What's happened, what does it mean? What's the credit rating of this kind of stuff when you're interacting with it using natural language, but it's actually calling through either RAG or through functions and that kind of stuff to actually call the products themselves. So there's lots of scope where LLMs can come into the data product process and architecture.
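
Jon describes his data product pyramid as "basically a dependency graph of data products." A minimal Python sketch of that idea follows, assuming a made-up pyramid: the product names and the `build_order` helper are hypothetical, and this is not Jon's actual tooling. It uses the standard library's `graphlib.TopologicalSorter` (Python 3.9+) to work out an order in which each product is built only after the products it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical pyramid: each data product maps to the products it depends on.
# The decision sits at the top; raw datasets sit at the bottom.
PYRAMID = {
    "pricing_decision": {"sales_forecast", "monthly_sales_metric"},
    "sales_forecast": {"sales_dataset"},
    "monthly_sales_metric": {"sales_dataset", "returns_dataset"},
    "sales_dataset": set(),
    "returns_dataset": set(),
}


def build_order(graph):
    """Return an order in which every product comes after its dependencies."""
    return list(TopologicalSorter(graph).static_order())


order = build_order(PYRAMID)
```

In the copilot archetype Jon outlines, an LLM would be generating definitions like the entries of `PYRAMID` from natural-language instructions; the sketch only covers the graph those definitions would form.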

Tim Gasper [00:17:25] Interesting. I think on your LinkedIn you have this image, it says business ask, gen AI/LLM, data product, business value. So what does that mean and why is gen AI so early in that flow?

Jon Cooke [00:17:43] It's based on the process that I've just talked about. So when I go into a business, I go to a business person and I say, "Right, let's build some analytics for you, let's define the decision you want to make, the outcome you want to make and oh, that's a decision model. What do you need to make that? Oh, you need a forecast of what's going to happen, you need to understand what's happened before." That's a metric, that's a forecast, that's a credit model, all that kind of stuff. I'm doing all that stuff on whiteboards at the moment, whiteboards and PowerPoint and this kind of stuff. The idea is that you actually then can use natural language processing to actually create those things in front of the user. So actually you'll create the metric in front of their eyes, fundamentally. So if you can imagine, you're going through that process, you're using the LLM to actually instruct to build that dependency graph of data products, you're prototyping them and then you obviously can industrialize them and bring in the data and all that kind of stuff. But that's actually giving you the business output and then you publish them straight out to a mesh or fabric or some of the [inaudible] architecture and you've got an instant living, breathing ecosystem of data products that actually give the business value. And because you defined them with the business, which is very unusual for that kind of breadth, you've got this end-to-end kind of what we call the OODA loop, observe, orient, decide, act loop, which you can continually go round very quickly, very easily, but you're using an LLM to really accelerate that whole process. So that's really what that's describing.

Tim Gasper [00:19:06] Interesting.

Juan Sequeda [00:19:07] I'm processing where you're going and let me rephrase this in my own words here. You give three different kind of descriptions or characteristics of three types of things around this data, products and LLM.

Jon Cooke [00:19:19] Yeah.

Juan Sequeda [00:19:20] So the first one you call it the copilot, and the way I'm interpreting this is you use LLMs to help you build the data product?

Jon Cooke [00:19:28] Yes.

Juan Sequeda [00:19:29] Then the second one is like, oh, it can help me do classification sentiment. So it's just another feature that I go use in my data product later on. So that's another one.

Jon Cooke [00:19:40] Data product or part of the infrastructure that sits underneath it or... Yeah, absolutely right, yeah.

Juan Sequeda [00:19:44] And then the third one is that the LLM itself becomes the data product, that it's really the brain, the chat with your data, chat with the organization approach.

Jon Cooke [00:19:56] Exactly right, yeah.

Juan Sequeda [00:19:57] And then at the end of the day it kind of... it's like a feedback loop. It all gets together.

Jon Cooke [00:20:03] Exactly right. In the process, again, what it does, it supports that genuine product management process where you're asking the business, you're prototyping, you're ideating, and then short-cutting the production of those obviously into production and then you're iterating around as the business changes, and that's really the thing that I've been sort of trying to push probably the last couple of years, is really that agility. And something I've always struggled with a bit with a lot of data architecture strategy, you don't have that agility. It's like we've got to get the data all in one place before we can do anything with it. And for me it's actually the product management approach, it's actually the other way around. If you look at the startup world doing product market fit, you're getting a bare prototype out as quickly as possible, genuine MVP, and then you're iterating with the business, with your customers to actually see if it's going to work. If it doesn't and you pivot, you can do that quickly and you iterate around it. And that's what I'm trying to bring to really the analytics world in general with a process, with some tech, with some thought leadership and some IP in there as well.

Juan Sequeda [00:21:03] This is what I'm personally really excited about. The generative AI, it's a fact. This is fact now. The amount of productivity gains that you generate, it's just outstanding. This has been very well documented. So I think this is one of the things that we should be figuring out. What are tasks that we can reduce the amount of time, what can give me more bang for my buck? I love that you're bringing up the whole, well, I'm talking with the users, I'm trying to be able to extract stuff. Getting that process, that knowledge out of people's head and taking that out of their head and putting it into a computer. That has always been such a hard thing. I mean fucking wikis is what we've been doing a wiki, right? Wikipedia has been an amazing thing and we try to go to wikis internally and then they start out with excitement. It dies, right? We have Confluence and things goes off, but if we had a way to just kind of streamline it, that would be ideal. And I think this is one of those approaches. The beauty of it, I think, is that we can really structure this knowledge. At the end, you know me right now, I'm all about graphs. Everything that we talk about, you said it to yourself, I draw it on the whiteboard. You draw these graphs, it's like I can literally start talking to people, say, let's just codify this and then everything starts getting connected. So I think this is, I'm truly on board with this and I think we need to have more, what I've always been calling the knowledge engineer, the knowledge scientist, the data translator role, the data product managers, the people who work with both sides, this should be your tool that you use every single day to make sure that I can extract as much knowledge of what people are talking about and just turn this into code as fast as possible.

Jon Cooke [00:22:53] Exactly correct. But it's not just extracting knowledge, because one thing I found working a lot with the business, sometimes it's about leading them, maybe not driving them but leading them. Sometimes you've got to reframe the problem. And I've had this many times where, how many times do you ask a business, "What do you want?" And they're just like, they don't know or they can't articulate or what have you. And actually, when you drill down into it, it's actually something completely different. What they need and what they want, that's sometimes very, very different. And that's something that's really, really resonated with me the last 30 years is how do we actually get something in front of them, especially something visual and say, "Is this what you mean?" Or, "Is this actually what you need to do?" And this is why the pyramid process I talk about, it doesn't talk about data, it doesn't talk about analytics, it says what decision do you want to make? I want to get away from I need a lake or I need this kind of model, I need my KPIs. No, like what decision do you want to make? So you've got an ops manager saying, "I want to reduce cost." Okay, so let's have a look at what are the areas you want to reduce cost in? Oh, you want to reduce basically staff cost, which is not great, or you want to make people more efficient or reduce the number of tickets or you want to get lower cost. Those are the sort of conversations that you need to have as an analytics person. In my mind it's not what KPI do you want or what data do you need, it's that type of stuff because actually when you start reframing the problem or framing it properly in business terms, the light bulbs sometimes go on. Sometimes it goes the other way. It's like, I don't want to do that because that's going to change my job and I don't want to do that, which is part of a business change thing.
This is the thing I've also been driving through, that analytics is a business change process and it should be, and it very rarely is seen as that. We could talk a lot about CTOs and stuff, not having the empowerment, not having seats at the table. But part of that is actually not... If you're going to go and build some analytics for your business and it tells the business to do something different, the business has got to accept that. There's no point going to the chief revenue officer and saying, "I can increase your conversion rates from 10:1 to 3:1, but you've got to change your sales process." And CRO says, "I'm not going to change my sales process." That's a key part of it. And being able to actually have those kind of conversations and drive those sort of outcomes and change them as they go is a super important piece, in my mind.

Tim Gasper [00:25:01] Do you think that between gen AI and data products that this business change process is going to get easier or is it actually either harder or the same, it's just more important?

Jon Cooke [00:25:21] Well, it should get easier. The whole point around it is actually applying a UX kind of process where you design something, put it in front of the business quickly and then iterate around. Trying to get that kind of mentality into analytics, because what you don't want to say is, "Right, we're going to go and build a credit model or do a forecast or come back in six months' time and we'll show it to you then because we have to go and get the data, we have to do this kind of stuff." And then the UX, oh here's some scatterplot charts as well. So there's the UX piece in that as well. How do you actually show to the business how to do that? And it's also these things like not just show this is what the analysis we've got, it's like let's do some what-if analysis on that. So you might say, "I want to optimize your sales funnel." It's like, let's change the ratio between SDRs and salespeople. Let's decrease the unit cost, let's muck around with that kind of stuff. And that is absolutely super powerful, but very few organizations really get to that point. Most of the time it's like, oh, we spend six months just getting the data and trying to do the KPIs and the metrics, the information layers as I call it. But actually when you want to start, what does it mean to me, which is the next level up really, the knowledge and kind of decision piece of it, that's where it's really important. Obviously LLMs don't really help with that too much. What they're very good at doing is obviously the lower level stuff, accelerating the lower level stuff. Like you said, capturing knowledge and translating it into knowledge graphs or curating it, that kind of stuff. And also saying what's happened and a little bit why it's happened, if you've got enough in your corpus to be able to do that. But the real value is actually getting up the level and being able to do that quickly and turning it, because businesses change.
I mean there's some businesses, we all know the market is very turbulent at the moment. If you can't react to it, if you can't change it, you can't measure it, then actually you're going to be in a lot of trouble. So it's really trying to get that drive, that whole kind of process.

Tim Gasper [00:27:12] That makes sense. And related to this, just thinking about whether it's gen AI or if it's data products, is there a way to leverage those things where you don't necessarily need to have dedicated roles around it? I think about DevOps and Agile, even though yeah, there are people who have DevOps engineer and things like that in their title, like by and large, DevOps and Agile became a part of how engineering is done. Is data product management going to be similar? Are data people just going to do that as part of what they do or are you more on the side of, oh man, we need the role, we need the data product manager who's going to come in and kind of be the change agent?

Jon Cooke [00:27:56] So I'm on the second one. I actually do believe that ultimately data teams need to become product teams, fundamentally. There's a lot of shifting data from one side to the other, which takes a lot of time and energy and that sort of stuff. But actually if you go from the business first, because you can go either way, go from data first, we've got this data, what can we do with it, or you can go for the business first, let's try and see, solve the business problem, if we don't have the data, let's go and try and find it, augment it, create it, change it, what have you. I'm very much of that kind of second bit. And actually if you have a product management team, and I do believe data product management is a subset of product management, and there's even an argument to say, actually you should put them under the CPO, the chief product officer, just as another type of products that get delivered which are analytics based rather than digital and the other side.

Juan Sequeda [00:28:42] All right, sorry, a couple of topics I want to hit you on with this one, because you're talking about where should they report. I think this is really important. I want to hold that thought for a second, but I'm going back to you Tim. Like you brought up maybe it's a role versus it should be embedded and I'll have that product management, but if you bring in that product management as part of who you are, that means that you actually would have the role then of a product manager.

Jon Cooke [00:29:11] Absolutely.

Tim Gasper [00:29:12] Well, and it makes you think about the topologies of software teams. And interestingly, even though things like Agile and DevOps have been more kind of embedded in with the way that teams work, or even modern testing, like modern testing tends to be done by the engineers themselves. What we haven't seen is product managers as a role go away. That is still a role and a very important one. And so when we think about the data side, maybe that role, to your point Jon, is an important one. And there's kind of a theory around software team topologies, where as you grow as an engineering organization, you kind of create these teams, you create these pods, but you get a little bit of specialization, right? You get more of your infrastructure and your platform teams that need to manage the platform of what the product is built upon for repeatability, for scalability, for performance, all that. But then you have the product teams that are building certain functional areas, certain features, certain user interfaces, and so that kind of a topology, maybe some data teams are embracing that, but most aren't, and maybe that's where things need to go, will go.

Jon Cooke [00:30:23] Agreed. If you look at cloud, right? Cloud's a classic example. We go back 10, 15 years ago, large infrastructure teams, they're all being shrunk down because you've got the cloud team, which is effectively Azure or GCP or AWS, they're performing that exact function. And then you've got the teams here, the business teams and the customers who are actually assembling the services. Okay, they do sometimes have cloud engineers and stuff, but they're not like armies of them. There's maybe one or two who just understand, they can navigate the many services. But it's much more customers are building their business apps using cloud services. And it's the same thing I think in data. We need to get to that point where the business teams are building their own analytics and doing that kind of process I talked about with their own business stakeholders, that type of stuff. And the data team, again, split into a platform team or an infrastructure team, if you like, and the product teams get federated into the business teams where appropriate. It doesn't have to be like that because I've built pooling systems where you've got a pool of experts and they get spun up as part of projects and all that sort of stuff. It doesn't have to be like you have to embed everything into the business unit. What you need is those people building the analytics to be business facing, whether they align management up to it or not, we could argue either way, but that's the key point.

Juan Sequeda [00:31:43] So going now to the reporting structure, so I think traditionally what we're seeing is that data teams are kind of underneath the CIOs because they're just like that. I mean, it's technology and then you're here because of reporting and you're like, let's get the machine running, keep the machine going. Now I'm starting... I mean, we're here starting to go see a little bit more of a shift to saying, oh, should they actually go underneath the COO? Because if it's truly going to be about the operations of the business, then it's not just about keeping the lights on. I mean just normal IT things, but here it's about how we can be more efficient, right? Let's go have those, understand what those business processes are and figure out what are the things that we need to go improve and so forth. So that's a trend that I'm starting to go see, but I do hear once in a while people saying it should be under this chief product officer. And so I'm thinking if the organization is actually generating, creating... in this case almost like software products themselves, then that would make sense to go in there, but not everybody does that. So I feel that if you are a software company, you're a data team and want to report on the metrics of your own building and all that stuff, you report up to the chief product officer, otherwise it's usually going to be the CIO, but it should end up as fast as you can moving into the COO. That's kind of what I've been observing. I'm curious to get your thoughts on this.

Jon Cooke [00:33:16] Yeah, I mean I've done a lot of product based data, product platform stuff for operations, and I did a system back in 2014 for a large bank, which is very mesh-like where we spun up metric engines for different parts of the organization, transaction reporting, settlements, confirmations, and there was about 30 different domains and it was very, very mesh-like. And you could see they're all under the CEO, but that's a use case. That's not really a cross-functional capability because you want products to be able to actually do process optimization, you want products to be able to help finance, you also want products to be able to drive business. I've got a customer who's built a whole analytics, they're transitioning from a service company to really an analytics company. They've got four million customers and they've built this analytics app where you can download and you can do all sorts of interesting things in it. That's a business facing piece. And really, you don't want two teams, one doing operations, process optimization, efficiency, and one doing customer facing. Because really, there's a lot of similarities in there really. You want a similar process, obviously there's actual money involved in customers, it's slightly different. But you don't want to bifurcate that, in my mind, because like I said, if you can get the process cooking, you can actually build internal and external products in a very, very similar way. And you can actually have cross-products on there as well because they all depend on each other, because you're going to end up with an ecosystem of products across your whole organization, running your organization, which is really the nirvana in my mind.

Juan Sequeda [00:34:45] Tim, are you ready to go on something?

Tim Gasper [00:34:48] No, I'm just processing and thinking about all of this because we're thinking about where the biggest impact is going to be for a data team and where are they going to be the most positioned to succeed. And there's a part of me that thinks it isn't a specific place in the organization that's the right answer. Maybe a better way to frame the question and I'm curious if, Jon, this resonates with you, maybe it will based on what you just said here, is well, where's the center of power in the organization? That's where the data team should live.

Jon Cooke [00:35:26] Right. I mean that's really interesting because one, in large organizations there isn't one, and also the other thing, it can change. An org change or a very powerful exec leaves and someone else comes in or there's a reorg and stuff,

Tim Gasper [00:35:42] A new CTO comes in who's like really just a real go-getter, like oh man. Maybe the data team would be best there, right?

Jon Cooke [00:35:49] Yeah, I remember again-

Juan Sequeda [00:35:50] That's why they do it sometimes, right?

Jon Cooke [00:35:52] Exactly right. I remember a bank a few years ago, they got a CDO in and he said, "The only reason I'm coming in is you give me the whole organization, like you give me the analytics, you give me the platform, you give me everything." He came in with a 5,000-person organization. So suddenly the power was all centered around him, but he would negotiate that on the way in. So it absolutely changes. We talk about culture and data-driven, that kind of stuff. Ultimately this comes down to, in my mind, the culture from the CEO downwards, how committed are they to using analytics to drive the business? And it comes back to that business changing and that really, if they're not, then it's going to be pockets of power around and you've got to find your exec and you've got to find where that is, and that could be anywhere. It could be the CEO, could be the CFO, and we're seeing a lot more CFOs in this market getting a lot more power. So actually, do they actually go to the CFO? Because then they're going to get money, it's going to have clear run, clear objectives, all that kind of stuff. But really in a mature organization, and there are very few of them out there, the CEO's got to say, "Right, we are going to use analytics to really improve our business. And that includes business process reengineering, that includes business change." So actually, the data team or the data product team rather should come under the digital or the transformation, the change organization fundamentally. So that's where the product managers sit, they have a branch of it. We can have implementation across the board because obviously each business area could have their own small implementation team. Maybe, again, there's different ways to slice and dice this, but fundamentally that's where it's got to come from, because ultimately the business leader's got to drive that kind of change and that kind of appetite to use analytics in an agile way across their business.

Tim Gasper [00:37:34] Yeah, no, that's super interesting. One more question that's related to all of this is around... So imagine you're a data team. You're somebody who works on the data team, and you're trying to figure out how to start to move in this direction. Like, how do I start pushing more business value in the organization? How do I start to push more of a data product approach? How do we start adopting gen AI more in what we're trying to do, whether it's more on the data product management side or it's more in the data products themselves that we want to deliver to make more value in the organization. How do you get started? Especially somebody who's not a CDO or somebody who can be like, "There's now an initiative," right? How do you get started?

Jon Cooke [00:38:23] So that's a really, really good question because a lot of it comes down to empowerment and authority and trust, and you can have great trust, but I've also seen many data scientists come up with great models that can do the business and the business don't want to do it. So the first thing I would be doing is testing, can I push back and can I really have that business conversation and actually steer the business in a particular way? Because if you can't, that's going to be very tricky to do that. That's going to be my first. The second one is basically really how much of the product management ethos is in the organization already? Because if we talk about data product management, it's a new thing. If you look at standard product management, that's really sparsely adopted across the world, you've got very few companies really nailed product management generally across the board. So again, that's another thing, if you look at doing product management and processes, and I think part of what I've been trying to do is come up with a really simple, quick and easy process that a business can understand and actually see value in. And I think that's the thing that's really, I found worked for me. Lots of times in my career I've said, right, what do you want to do? And you flip-flopped and that sort of stuff. But if you actually have a process or a language that you can actually have a business conversation and get their trust to actually drive that conversation, as we talked about for the framing and that kind of stuff, not talking about data, not talking about analytics, but talking about their problems and how they actually solve them, which is, let's just face it, classic product management. If you can get into that position, then you're going to go a long way. The other option is obviously you do a massive sales pitch to the senior exec and say, "Look, we can 10x your revenue or whatever, or do this initiative," that kind of stuff.
You do a more formal approach where you're doing an actual pitch as if you're an external coming in trying to do that. So that's another way of doing it, in my mind.

Tim Gasper [00:40:14] That's kind of the bottoms up versus the go to the top and see if you can make something happen that way, right?

Jon Cooke [00:40:20] Yeah.

Juan Sequeda [00:40:21] But it's clear that you need the role. I mean, you need a product manager. This is the takeaway. You need a product manager and in today's data teams there are no product managers, so you need to bring in these people here, right?

Jon Cooke [00:40:40] Yeah. It's not strictly to this. I do know a few, but you're absolutely right on the majority, yeah, most. I mean the thing about data teams, and I really feel for them, I've been working with them for 20 years, so I know they tend to be kind of pushed quite down in the organization, and a lot of them have come from, they're seen as reporting teams by the business. It's like, oh, they just do reports. It's like, what a big deal. I'm a salesperson, I'm going to get my bonus by getting my comp by pushing this many deals, and oh, there's some reporting I need to do over here, that's them over there. And I feel sorry for them because actually they're doing a massively important job, especially if they're doing regulatory stuff. I did a lot of work in banking and that's a nightmare doing all the regulatory stuff because you have to run around doing, especially in large organizations. And there's still that, I see that a lot in businesses. They're seen as this kind of something down off to the side rather than a core instrument of the business or the support. Even IT is a bit like that, right? In a lot of businesses they're seen as that's just IT. It's like, well, your system, whole business runs on IT. That's why I do think we need to reframe it into that product management piece, up into the product thing. You start having conversations about org charts and stuff where you start talking about product management and business change and stuff. Business people, their ears prick up, right? It's like we need to build a new data platform, we need to do data quality or governance and stuff, and the eyes roll back in the head and it's just like, you do need to do it, but the business people say, "Well, I don't care about that." Right?

Tim Gasper [00:42:05] Yeah. This is another interesting, I feel like analogy that ties into the software world, is that I feel like you do hear a lot of CDOs or heads of data say things like, "Oh, we really need to establish our governance framework and we really need to modernize our data infrastructure and we really need to improve the performance of our data warehousing capabilities." And that's the way that they're kind of describing their roadmap. But if you think about it from a software lens, if the VP of product comes and stands in front of the organization or in front of customers and says, "Okay, what we're going to do is we're going to invest in really scaling out our MongoDB database, and we're going to really make sure that we refactor all the JavaScript, very important." People will be like, "What does that have to do with value?"

Jon Cooke [00:42:52] Yeah. Yeah. You're absolutely right. I talk a lot about data friction. Data friction really comes down to the cost, time and effort, and risk of using data for a business problem. That's it, fundamentally. Especially in analytics, we're trying to use secondary use cases from the transactional data and sometimes you can't because the transactional data isn't fit for purpose. That's friction. I want to do this segmentation as a natural use case for a customer. They've got customers from 60, 70 years ago, and they've still got them, and they didn't require date of birth to go in there because it wasn't a requirement. But the business runs perfectly with that. So they can run these customers and run all the services without, you know, 10% of the customers having date of birth in their CRM system. Marketing are tearing their hair out, saying, "We want to do some sort of age-based segmentation around marketing campaigns, but we can't do it." So is that a data quality issue? Well, no, it's not. It's a secondary use case that wasn't thought about when the data was captured. And so what it comes down to, if we want to do that, if you want to go to your, they've got four million customers, so go to 400,000 customers, there's a cost, there's a risk, and there's an effort associated with getting that, capturing that data. So is the cost benefit around doing that greater than actually the effectiveness of the campaigns we're going to get into? And that's the way I think about the whole thing about data, isn't it? Or governance, all the tech debt and data debt and process debt and everything else. It's basically, you go to the business and say, "You want to do X, Y, and Z. Okay, it's going to cost you X to do that and the risk is this, do you want to do it?"
And that's a better conversation to have than say, we need a governance framework, and actually they do need a new governance framework, new infrastructure, scalability, all this kind of stuff. But you're having it in a language and a currency that the business can really get behind and understand.
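
The cost-versus-benefit framing Jon describes can be put into a few lines of arithmetic. The sketch below is only an illustration: the four million customers and the 10% missing date of birth come from the conversation, while the per-record capture cost and the expected campaign gain are made-up numbers, and risk and effort are folded into that single cost figure for simplicity. The class and field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class DataFrictionCase:
    """Hypothetical inputs for one 'is it worth capturing this data?' decision."""
    total_customers: int
    missing_share: float           # fraction of customers lacking the field
    cost_per_record: float         # assumed all-in cost to capture one missing value
    expected_campaign_gain: float  # assumed benefit once the data is complete

    @property
    def records_to_capture(self) -> int:
        # Number of customers whose data would have to be collected.
        return round(self.total_customers * self.missing_share)

    @property
    def capture_cost(self) -> float:
        # Total cost of closing the gap.
        return self.records_to_capture * self.cost_per_record

    def worth_doing(self) -> bool:
        # The business conversation: does the benefit exceed the cost?
        return self.expected_campaign_gain > self.capture_cost


case = DataFrictionCase(
    total_customers=4_000_000,         # from the conversation
    missing_share=0.10,                # from the conversation
    cost_per_record=1.50,              # illustrative assumption
    expected_campaign_gain=500_000.0,  # illustrative assumption
)

print(case.records_to_capture, case.capture_cost, case.worth_doing())
```

With these invented figures the capture cost outweighs the expected gain, so the answer to "do you want to do it?" would be no; different assumptions would flip the result, which is the point of having the conversation in those terms.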

Tim Gasper [00:44:41] Right. Why does marketing care about that? Why does the finance group care about that, right?

Jon Cooke [00:44:45] Right.

Juan Sequeda [00:44:47] I have to say that I was expecting to go talk more about generative AI.

Jon Cooke [00:44:52] Oh, right.

Juan Sequeda [00:44:52] No, no, but here's the thing. I think that's a good thing because it's just another tool. And I think in the way you described it originally, it's like, oh yeah, I just want this tool so we can be able to go generate business value faster. So I think that's actually a nice kind of outcome out of all of this. A reminder, like this is fun, it's cool, but it's just another tool to help us generate business value.

Jon Cooke [00:45:22] So I always ask the question, do you want to build an LLM or do you want to use one to solve a business problem? What's the goal here? If it's the second one, which is great, we don't know what that business problem is, let's do some investigation and discovery, but that's the goal. And ultimately, it's not about using a hundred billion model, you can use a hundred million. BERT models do half the stuff that you want to do. Because the other thing about transformer architectures, you can actually break the problems down into smaller models, which you can train them in like a day or two, rather than try and put everything in a big Falcon or Mosaic or whatever model as well. So that's, again, top down. You start with the problem, start with things you're trying to do, and then see if the LLMs, and what LLMs and what AI and what models, because you don't have to have necessarily a generative model to do that, you could have a classic machine learning model or a deep learning or even stochastic or whatever. Again, exactly right. It's a great set of tools in your tool bag, it's just your toolkit's just got massively expanded, which is fantastic, but it is, you've got to know its limitation of where it works and where it doesn't.

Tim Gasper [00:46:19] Yep.

Juan Sequeda [00:46:19] All right, well as I knew this was going to happen, like just keep talking and talking. One more question before we hit the lightning round and all that stuff. Your T-shirt, The Great Data Race, please describe it because I love it so much and I know, you have a post on LinkedIn to go see it. So yeah, just give a quick description of it.

Jon Cooke [00:46:38] Yeah, so The Great Data Race, I was trying to again frame those sort of conversations with business people around the process of building analytics, really. And you kind of start with, again, the first entry point is with a racetrack because everyone knows about racing, whether they like it or not, you can understand that whole kind of thing. The first entry point is do you start with a business problem or do you start with the data? And in my mind, you should know this obviously by now and on the call, I try to start with the business because if you go down the data route, you might end up in a dead end because you actually don't know. And then you get to Business Understanding, which you've got to get. That's the first bit. And then obviously you don't want to get lost in Data Sourcing, you go through Data Product Chicane, which is actually you start to think about product management inside of your data as well, that kind of stuff. Data Modeling Roundabout, that's one of my favorites, don't get lost in that. You can keep going round and round, model your way round and round and round. And then you obviously then go through gen... Oh, it's gen AI before that. Sometimes I read this back to front. It's been a while since I've done this, but you've got to look at, again, what we talked about. Can gen AI actually help accelerate what you want to do? Go around, then you've Data Model Roundabout, and then obviously Data Contract Curve. You've got to take the right apex around Data Contract Curve. It's not going to solve everything, there's still a lot of race left to come. And then I've got Data Governance Pitstop at the end, but there's argument to say it should be further up and that kind of stuff. So there's definitely some nuances around where these things can go. User Testing, again, user testing you always got to do that. Again, is that the end or is that the start? That could be anywhere. And then you're into the Production Straight. 
So the idea is you've got this kind of race and really what we should do is actually almost have a circular race where it goes round and round, and I thought about having a rally section where you could have different terrains of different business problems, have the business as your copilot in the car, all this sort of stuff. I thought I'd lay it out in a simplistic, even though it's not quite right, a simplistic way to start with to get people talking about it and get people to understand, not just technical people or data people like us, but business people and other people understand what the process is around data.

Tim Gasper [00:48:39] That's so fun.

Jon Cooke [00:48:40] That was the idea.

Tim Gasper [00:48:41] A great conversation starter and a great thing to think about. And yeah, it's a good analogy, and on this show we love analogies, especially analogies taken a little bit further than usual.

Juan Sequeda [00:48:55] All right, so let's hit our next segment here, the AI Minute. We talked a little bit about AI, but I just want to give you one minute so you can rant and just share all your thoughts, rant about AI. Ready, set, go.

Jon Cooke [00:49:08] So yeah, I mean I've been doing AI for six, seven years. My first project was really interesting, completely failed because I didn't really understand AI and understand how to actually go in, and actually if you don't have the data, it can't work. And there was a six-week engagement, completely failed, we failed to come up with it at the end of it. And that's something that's really taught me a lot about AI. Understanding the limitation, understanding what tools you want to use, understanding the fact that it's all about small incremental gains rather than this kind of binary. So I come from the software world originally 20 years ago, you have business rules and they either work or they don't theoretically, but then AI comes along. It's like you can do this. Actually no, AI is a completely different mental model. You want to be able to do small incremental gains and sometimes that can do that. If you can do a 1% gain on your P&L, fantastic. But then there's a lot of work around that. You've got to understand all the hallucinations in gen AI, which is a big problem, all the tail events, all the training, just understand training's very different. And yeah, it's a lot of work but it's a lot of fun so I really like it.

Juan Sequeda [00:50:13] I like that. There's a lot of these small incremental gains to think about it. All right.

Jon Cooke [00:50:17] Yeah.

Juan Sequeda [00:50:18] Lightning round. Let's go through this. We got four questions here with yes or no, a little bit of context. I'll kick it off. Number one, can it be a data product if it doesn't provide business value?

Jon Cooke [00:50:32] I don't think so, no. Can it be a product if it's priced but there's no business value?

Juan Sequeda [00:50:37] All right. Very, very clear here. All right Tim, you go.

Tim Gasper [00:50:41] What about, will data product managers actually leverage LLMs almost as like an equal? Like is it going to be the yin to their yang or are LLMs more of a point tool?

Jon Cooke [00:50:51] I think they're more of a point tool.

Tim Gasper [00:50:54] Point tool for them?

Jon Cooke [00:50:55] Yeah, yeah. Point tool for everyone. I call them the AI butler. I tell them what to do and they go away and do it a lot more quickly, more efficiently for them and then come back and do it for me. But it's a tool, it's not a... Yeah.

Tim Gasper [00:51:06] Okay. Interesting. Yeah, I think some folks try to get really excited about how expansive it's going to be and like, "This changes everything!" And I mean, obviously it changes a lot, but it's good to keep that framing in mind.

Jon Cooke [00:51:18] Yeah. Cool.

Tim Gasper [00:51:19] Next one, should the data teams be the ones pushing generative AI innovation for the organization?

Jon Cooke [00:51:25] Well, if they're product orientated, then maybe, but if they're not then I think that might be challenging. Unless you just talk about they're using gen AI to speed up their own processes, so they're generating SQL and that kind of stuff. If it's about process efficiency of them getting analytics out quickly, then potentially, but if it's more around business process reengineering, new products and that sort of stuff, then really if they're product management, then absolutely. If not, they need to get to that point.

Juan Sequeda [00:51:53] All right.

Tim Gasper [00:51:54] Interesting. Last question. Is the CDO the head of data product management?

Jon Cooke [00:52:02] Is the CDO the head of data product? No, the head of data product management is the head of data product management, fundamentally. So the CDO could become that and become their remit, but then it's the head of data product management and that, to me, is more of a CPO type role than a CDO type role.

Juan Sequeda [00:52:23] Interesting on that one.

Jon Cooke [00:52:26] It went very silent there, didn't it?

Juan Sequeda [00:52:29] I did not... I agree and I disagree at the same time. Well, we'll have to get some other CDOs and see what they think about it, but all right. We did a lot, got a lot done. Tim, kick us off with your takeaways.

Tim Gasper [00:52:43] All right, well we started off with whether or not you had a pedantic definition of data products and you said yes, but it doesn't really matter. The noun or whatever you want to think of as data products is not the goal, that's not the point. The point is the verb, managing data products, the act of data product management, that's what matters.

Jon Cooke [00:53:03] The noun/verb thing is Brian T. O'Neill's, not mine. So I want to give him credit for that.

Tim Gasper [00:53:06] All right, so reinforced, further supported by you. And it doesn't have to be that specific, we don't have to be that specific, but if you do have to be more specific then really it's that thing that's providing value for the organization. It has a customer, it has a user. You gave this story of an organization you worked with where they were working on building out the credit models and those credit models were the data product, which I think it goes into this idea that is it resonating with business people, is it outcome oriented, is it value oriented? And it doesn't have to be a narrow definition, it's the definition that makes sense for your organization. And you mentioned how expansive that definition kind of depends on the maturity of your organization. What is the unit of solving a problem in your organization? Maybe it's dashboard, maybe it's a recommendation engine, but depending on your organization, maybe that doesn't make sense. So don't get too pedantic, don't get stuck in Data Modeling Roundabout when trying to figure out what the hell is a data product. But definitely have a definition, have a vocabulary in your organization. You mentioned, really think about is there a market for it? And product management is this idea like is there a market for this thing? And so similarly, is there an internal or external market for this data? And then we talked a little bit about data products and gen AI working together and you mentioned that there's three kinds of data products, or I'm sorry, of sort of gen AI data products. One of them is more sort of a copilot aspect, which is LLMs creating the code or the contract and then you run it. There's more of a classification sentiment kind of use case or model around gen AI. And then the third was where the LLM is the data product itself. You mentioned ChatGPT is a data product, and actually I hadn't heard that before. That was new for me to hear that sentence said. 
And I think that that is helpful because I think a lot of folks out there think of ChatGPT as a tool, they think of it as a service, but they're not thinking of it as a data product, which I think puts a new and important lens on it. So I think that was good. LLMs can help for the different stages of the data product management lifecycle. So you mentioned how it can help with mapping the business questions in, it can help with developing the data product. It could be the data product, it can help you to deliver and enable around the data product. So thinking about how gen AI fits into the data lifecycle more broadly is important because it can help in lots of different areas. And if you're talking to business users, sometimes you need to reframe the problem with them, and that's sort of a skill around product managers and also similarly around data product management. Really think about how things are going to get adopted, how they're going to be used, how they're going to provide value. And so much more, but Juan, I'm going to pass it over to you. What were your takeaways?

Juan Sequeda [00:56:03] We talked a lot about roles and these topologies between the platform teams versus product teams, like this is how it works for software teams, something similar should probably happen in data. In software you have product managers who are a specialized role, isn't like DevOps or Agile, it's really a job. So the data product managers will probably be something similar. We talked about where should the power be, right? The data team should be where the power is, but we also know that the power moves in large organizations, and large organizations, the power's all over the place. So I think that's something interesting to think about. So as a data professional, how do you start to take advantage of generative AI innovations and data product management? I love what you said, start to push on things, test and see how much you can push on these things and how folks react. I think that's a really important takeaway right there. Understand that very few companies will actually nail software product management. They don't nail it so you have to set the true expectations because you're probably not going to nail data product management from the beginning. Develop that language to speak to the business value, the problems that people are experiencing, start to move into that position. Communication is so key on this and getting that right language. And this is kind of more of the organic bottoms-up approach. I mean, also you can do the top-down, just bringing the pitch from leadership and making sure that you get an initiative over there. And then we talk about this data friction. We need to be able to go focus on the business value of data. We need to understand the cost benefit of things and understand why does a finance group care about, why does the marketing group care about this stuff? Like really, truly understand that. And going through this whole discussion, LLM wasn't like the focus of our discussion because it's just a tool.
And I like what you said, do you want to build an LLM or you just want to go use an LLM to go solve a business problem? And then finally, wrap it up with your T-shirt, definitely just look up Jon, his T-shirt on LinkedIn. But just in a nutshell, start with the business problem first because if you start with the data first, it's a dead end. You go through the wilderness of data sourcing, there's the Data Modeling Roundabout and be careful because you can end up a lot over there. There's the data contracts, user testing, and then you have a production straightaway to hit your final value. But at the end I think we acknowledged that this should be going back.

Jon Cooke [00:58:13] Yeah, it should be like, yeah, absolutely right.

Tim Gasper [00:58:15] A loop.

Juan Sequeda [00:58:16] How did we do? What did we miss?

Jon Cooke [00:58:19] I think we covered a lot actually. We didn't dive into much technical stuff, did we?

Juan Sequeda [00:58:26] This was just less than an hour. I knew that we would need more time, but.

Jon Cooke [00:58:29] Indeed.

Juan Sequeda [00:58:32] Well, we'll have another episode sooner than later, dive into more technical issues.

Tim Gasper [00:58:36] We'll do a follow-up, a techie follow-up.

Jon Cooke [00:58:38] That'd be awesome.

Juan Sequeda [00:58:39] To wrap up, three questions. What's your advice about data, about life, whatever you want, who should we invite next, and what resources do you follow?

Jon Cooke [00:58:49] So my advice, oh, blimey. So me, I'm always curious, and it's the way I learn, right? Because I've got some neurodiversity and stuff. So I struggle to read textbooks, so I always try and do stuff. So I was playing with GGML the other day, I was brushing off my C and actually writing some C in it for [inaudible] 30 years. For me, it's all that curiosity and understanding really, I want to understand how things work. I think there's lots of challenges with people just saying, oh, it's going to do this and this and this, and make an assumption. Go and check it out and be curious and try it out. That's kind of my mantra. That's why I'm still in this industry for 30 years later. I love going and playing and actually still being curious around that. What was the second one? Sorry, it's getting a bit late now so my brain's not working very well.

Juan Sequeda [00:59:29] Who should we invite next?

Jon Cooke [00:59:30] Who should you invite? So I reckon you should get some data product managers on this, ones who are actually doing the job.

Juan Sequeda [00:59:37] And is there anyone you want to call out here publicly?

Jon Cooke [00:59:39] Yes, so there's Nick, I can't pronounce his last name, Zervoudis? He's a guy, there's Brian T. O'Neill who's very much around that.

Juan Sequeda [00:59:48] Brian has already been on the podcast. Definitely recommend. And Nick's podcast is awesome.

Jon Cooke [00:59:53] Yeah, he's actually doing the role, he's actually doing the data product management role. So really doing the life, the trenches of actually someone doing this kind of role. So that'd be quite a nice, good follow-on. And the third one?

Juan Sequeda [01:00:05] What resources do you follow? I mean people, books, magazine, podcasts, conferences?

Jon Cooke [01:00:12] So I get a lot of feeds from things like Medium and DZone and stuff like that. That's where I get a lot of my content around understanding about LLMs and this kind of stuff. That's typically where I get most of my thoughts and that kind of stuff. I obviously do a lot of googling when I say, oh, what about that? And I'll go and try it out. So I tend to be more trying to find the information and actually work stuff through. Obviously chats like this just are absolutely brilliant and we have a lot of these. I had a drunken chat in a taxi with Chris Tabb coming back from Big Data London today, which was fantastic. We did a podcast last night around that kind of stuff, around data products and stuff.

Tim Gasper [01:00:50] I wish I could have been a fly on the wall there.

Jon Cooke [01:00:52] Yeah, exactly right. And had certain conversations at Big Data London around various different things. So that, to me, that is absolutely gold because I'm also very impatient, I just want to get to the nub of the thing very, very quickly. So to me it's trying to zero in.

Juan Sequeda [01:01:07] Wow. Jon, this was fan... I'm so glad we finally got this.

Jon Cooke [01:01:10] Absolutely.

Juan Sequeda [01:01:11] It all worked out, we scheduled it while we were at Big Data London. Thank you, thank you so much. As a reminder, next week our guest is Samia Rahman. She's the director of enterprise data strategy and governance at Seagen, and I think I'm actually going to be in Chicago and she's in Chicago, so we're going to do a live over there. So it'll be a fun discussion. A lot of data mesh and data product discussion, continuing that one, and governance too. But Jon, thank you, thank you, thank you so much. Really excited that you're finally here as a guest. And cheers, happy to hear you.

Jon Cooke [01:01:47] Cheers. Thank you so much for having me on. It's been absolutely awesome. Really, really enjoyed it. And I'm awake, so it's all good.

Juan Sequeda [01:01:47] All right.

Tim Gasper [01:01:47] Cheers, Jon.

Juan Sequeda [01:01:47] Cheers.

Jon Cooke [01:01:47] Thanks a lot.

Special guests

Jon Cooke CTO and Founder, Dataception