Tim And Juan Rant From Amsterdam: Where Is The Data Budget?

12 minutes

About this episode

Tim and Juan share their latest thoughts about where the budget for data projects is heading.

Tim [00:00:17] I'm doing really great. It's cool to be in a new spot.

Juan [00:00:20] Yes, a different place. We're doing a little bit of a European tour this week here. We're chatting with a lot of folks, meeting with customers. It's always just fun to be on the road, actually meet people, live in person. And just figure out how are things going, where's people's heads at? And I think our goal today is to really share what we've been learning over the last couple of... I think the last couple of weeks has been really intense, talking to a lot of folks. And figuring out where is not just the data team's heads at, but also where are the board, not even CEO, the board level, where is everybody's heads at?

Tim [00:00:56] I think that's a great place for us to do our little Tim and Juan rant today. And before we do that, for our warmup question or comment, just want to let everybody know we were looking on dictionary.com. And we were looking at what the word of the day was. And the word of the day is skookum, S-K-O-O-K-U-M, which means powerful and mighty, but a little bit in a scary way.

Juan [00:01:22] Yeah. So ChatGPT is skookum.

Tim [00:01:27] It is skookum, yeah. So for those of you that want to get to know us a little bit better, we've been trying to figure out how to use skookum in a sentence. And so I was thinking, for the 4th of July, I went to a fireworks show and man, it was skookum. That was a skookum fireworks show. I don't know. We're failing, I think.

Juan [00:01:46] All right, then we just stick with our data thing.

Tim [00:01:48] Okay, back to data, back to AI. So we've been talking a lot about where are people focused? And where are budgets going? Where are new initiatives focusing? And it seems like there are two areas where there's investment, maybe a couple more. And the two biggest areas that we're hearing a lot, from all the data leaders that we're talking to are, so budgets are tight, economic times are uncertain. But AI, new investments in AI. People don't want to fall behind. They want to make sure that they're on top of this. There's a lot of excitement. Also, caution though. And then, security, on the topic of caution because part of that is who knows what AI is going to mean from a security standpoint. But, also, of course, just in general, cybersecurity continues to be an area of focus. Got to protect the business, got to make sure that we're taking care of our customer data.

Juan [00:02:44] Yeah, so those are the two aspects that we've seen across the board, everybody that we've been talking to. That's what's on the mind of CEOs. Actually, that's what the board is telling the CEOs and companies to be focusing on, AI and cybersecurity. What is your AI strategy? What are you doing for security? Now, that's an overlap across the board. But what's also interesting is that what we're seeing more in Europe is that a third one, where there is a lot of, not just interest, but where there's budget, is for ESG.

Tim [00:03:09] Mm-hmm. Environment, sustainability-

Juan [00:03:12] And governance.

Tim [00:03:13] And governance.

Juan [00:03:14] So that's interesting. And what's fascinating is that it's a very European thing because it's not a topic that's shown up in the US as much. Now, we've been thinking about this together. From the data space, and data and catalogs and so forth, you want to be able to justify your investments. You want to be able to say, "How are we providing value?" Well, it's very clear that if you want to fast track or make sure that you are heading for success, talking to the executives and to the board, is that you need to be able to tailor your conversation about data on those three things, especially AI and security in the US. And if you're in Europe, you probably can talk about ESG. ESG's not something that's coming up in the US, from what we've been just talking to a lot of folk. So what we are learning, and a message to everybody who's listening, is you're trying to show value. You're trying to get by. And you're trying to push the conversation to that executive level. You have to be tying your work to, how is this impacting, or going to deal with our AI strategy? How are we going to deal with security for our customer? Because that is what's on CEOs' minds, that's what's on the board's mind.

Tim [00:04:22] Yeah. And this is where you can, if you're a data person or you're a data leader, either way, this is where you can align the work that you're doing, and the value that you're providing, to something where there's new investment and new focus. Because, otherwise, you're stuck just tying what you're doing to efficiency. How do I drive cost efficiency and performance efficiency in our company? AI, security, and then especially in Europe, but a little bit in the US as well, ESG is another area of increased focus and investment. I think AI, especially, is interesting to tie your data initiatives to because there's so much excitement. People want to try out new things. And I think that in all of our conversations in the data community, as well as with our customers and enterprises, is that people know that AI relies on good data, well-governed data. And I think this is good. It's a good surprise because I think there's always a worry. Juan and I are always worried about the honest, no BS. And we were worried that the honest, no BS is that people were just hyped up about AI, and not going to keep their house in order, before they start building rocket ships on top of their house. And the reality is that people know garbage in, garbage out.

Juan [00:05:34] Yeah, that is a phrase that everybody gets. And I think, also, talking to the folks is, the word governance is like, "No, that's a bad word." But they know they need it. So I think, something we were talking about earlier today is, we go from boring governance to exciting AI. And I think this is the opportunity to connect that stuff that we all... It's we need to eat our vegetables.

Tim [00:05:58] Yeah. You have to do good governance. You have to catalog your data. You need to document.

Juan [00:06:04] To invest, invest in having quality, good data.

Tim [00:06:05] Invest in good data systems and things like that. Yeah, exactly. You have to do those things. But AI is the thing that, it's the steak and the dessert that can go with your vegetables.

Juan [00:06:14] So, now, you're able to go justify, this is why we need to invest in data quality. This is why we need to invest in having good catalogs. Invest in having people to have a stewardship program and so forth. Because you are setting yourself up for success for this AI strategy that everybody is looking to define. And the other thing is that, it's very clear that the generative AI, these large language models does not have the knowledge and the context of your organization. So you need to be able to provide that context of your organization. And that is going to be through knowledge graphs. It's going to be through RAG, and through embeddings, and vector databases. But that means that whatever you're organizing that's going to be given to a large language model, that needs to be clean, beautiful. That's the context. So, now, you're able to go justify why you need to go invest in your data quality, invest in your metadata, and your catalog initiative. So this is the way how to go sell it. We're the non-salesy podcast, but at the end, everybody who's listening is trying to figure out how to go sell their work that they're doing to their own stakeholders, to their executives of their company. So, hopefully, giving you some of the tips of the conversations you should be having.

Tim [00:07:26] Yeah. Whether you're doing it directly, or you're indirectly connected, metadata management, governance, data quality documentation, you do these things, and it's going to make your AI better.

Juan [00:07:38] Yeah. So if your executives are saying, "We want AI." You're like, "Yeah, we want AI too. Everybody wants AI. But to get there, we need to get our house in order." So I think this is actually the perfect timing to be able to go tie all your metadata, your catalog initiatives to AI. And then, I think, from the AI side, there's always these two sides. One is on the unstructured, like, "Oh, we have all these texts you want to go do." And I think we're seeing a lot of the summarizations, code generations. That is really strong work that is, from the unstructured side, where I guess not a lot of the data teams are working. But when it comes to a lot of the vision of chatting with your data, this question answering, a lot of the stuff that we've been pushing on too, that's where context is key. And a lot of the questions that executives want to answer with this new vision of AI is going to require all that data that is stored in your structured databases and your SQL databases. And that's where context is key.

Tim [00:08:34] Exactly. Context is key. And that's why we get so excited about knowledge graphs and how they can help provide the context and facts to LLMs. And then LLMs can both feed things into the knowledge graph, but also LLMs are going to provide that natural language interface. So I think that's exciting. And just to round out this topic, we've been talking to a lot of different organizations. Just a couple of days ago, we were with a very large consultancy. And talking to some very, very smart people around AI and around data science. And we were all really agreeing, pretty strongly, that an ideal architecture around AI is the combination of LLMs, vector databases and taking embeddings and being able to use that for better results, in combination with prompt engineering, and then knowledge graphs. And that those three things, vector databases, LLMs, and knowledge graphs are probably the three key pillars of the next generation AI architecture. Everything is so early, it reminds me of the early days of Hadoop. Hopefully, it's going to be a lot more sticky and valuable than Hadoop ended up being. But it reminds me of those early days where, what are the right pieces? What do we need? How do we make sure this is secure? How do we make sure this is performant? How do we make sure this is valuable? And so it's exciting to see this all pan out.

Juan [00:10:06] I love how you're bringing this up right now, and it's a great way to wrap up, is that from a technology standpoint, we're seeing again the LLMs, the knowledge graphs, and the vector databases all coming together. Now, the devil's in the details. And those details, we still need to figure them out. Everybody is, right now, experimenting on this. And figuring out what a combination these things are, and then how much are... Is it from prompt engineering, to fine-tuning, to training? And what do you put in embeds? What do you put into knowledge graphs? How are these compliant? Are they-

Tim [00:10:33] You're even doing some of your own benchmarks to try to figure out, where do we push the envelope here?

Juan [00:10:37] Exactly. So this is really a fun time. But at the end of the day, it's always focused on the value, what that organization needs, and where they're investing, and where they're seeing or where the budget is heading. And, again, back to that, it's AI, security, and for the Europeans, ESG.

Tim [00:10:54] Yep, agreed. So for those of you in the community, give us a shout on Twitter, on LinkedIn. Tell us your thoughts about where you see these investments going, where you see the focus being. Talk to us about AI and what you think that architecture is going to be. And let's figure this all out together. Let's experiment. Let's share knowledge.

Juan [00:11:11] And next week we are going to be in London and we'll be at Big Data London. So catch us over there, and we'll have Chris Tabb as our guest next week. Looking forward to it.

Tim [00:11:20] Looking forward to it.

Juan [00:11:21] Bye, everybody.

Tim [00:11:21] Cheers, y'all.
