Live From Snowflake Summit 2023 In Las Vegas

19 minutes

About this episode

Juan and Tim sit down at the end of Snowflake Summit 2023 in Las Vegas to debrief on everything that happened at the conference, and there's even a sneak peek at a future guest.

00:00:00 Tim
Hello, everyone. Welcome. It's time once again for Catalog & Cocktails, coming to you live from the Snowflake Summit in Las Vegas. This is day four-

00:00:10 Juan
Four.

00:00:11 Tim
...of the conference.

00:00:11 Juan
We're tired.

00:00:12 Tim
It's been a long week. We've been talking with lots of people, sometimes with cocktails, sometimes with coffee. Either way. It depends on the time of day. Just so much excitement, so much energy. There's a lot going on in the data ecosystem right now, and obviously Snowflake is at the center of it.

00:00:29 Juan
Yeah, for sure. I think this is... We wanted to do this episode, record it on the last day because we're just now summarizing everything that we have learned and talked to throughout the last couple of days. Let's kick it off.

00:00:41 Tim
Yeah.

00:00:42 Juan
Well, I'll start. I think the most obvious thing right now that everybody's talking about is AI and these large language models. The whole conference kicked off Monday evening with the keynote with Frank and Jensen. Frank, the CEO of Snowflake, and Jensen, CEO of Nvidia, about their partnership. I think this is one of the most exciting things of how we're starting to go see Snowflake with Nvidia, but bringing these two things together. We're also seeing Databricks made their announcement. All the different vendors, all the different clouds are having their large language models and AI partnerships [inaudible].

00:01:14 Tim
Yeah. This is just taking everything by storm. You can even see with Snowflake, they were saying data plus apps, data plus apps. And that was about a year ago. And now it's data plus apps plus AI, right?

00:01:25 Juan
Exactly.

00:01:26 Tim
You have to say AI.

00:01:27 Juan
A couple things that we're seeing, and this is the trend that I've been seeing across the industry, and now it's validated and solidified here with Snowflake, is they're going to be offering a marketplace of all these foundational models. You can go get these foundational models all within Snowflake, and then just... We've seen cool demos. Just write a bunch of SQL, then it's right there and you can just start using it, and it makes it super easy with Streamlit and stuff to be able to go access these foundational models. They'll be offering partnerships with different foundational models, and then also bring in different open source foundational models in there. And then still have other partnerships with other models out there. I think this is the trend that we're seeing is that every single cloud vendor is offering their marketplace of, "Yeah, now that you're in our ecosystem, you should be able to go in and choose which large language model [inaudible]." I think that's one. But I think it's very clear that the foundational model is a start, but you will not want to go put everything and train or fine-tune that one specific foundational large language model. I think where everything's going is having a large set of small language models, which those are going to be more specific to your particular task, to your particular department or industry. And actually, the training of that's going to be cheaper than being able to go train or fine-tune the large language model around [inaudible].

00:02:48 Tim
So instead of one big monolithic model that's maybe generic, let's specialize and have the different models work together.

00:02:55 Juan
Yeah. I think we can see it as the monolith versus the microservices approach. I think people will start figuring out, what are the micro little services? Which are going to be the smaller language models. I want to be careful. I'm not saying the smaller large language models, because it's LSLMs in this sense.

00:03:10 Tim
Right. And do you think that everybody is going to be training their own models? What do you think is the adoption curve [inaudible] here?

00:03:16 Juan
This continues to be the open question. And today, this morning at the panel keynote on large language models, I think an open question is how much you're going to be able to do through prompt engineering or the zero-shot stuff. Or you're actually doing some fine-tuning. If you're doing small language models, you probably will be fine-tuning these things. But I think we're still on the stage of when to go do this, how to go do this. I think a lot can be done by prompt engineering. I mean, the tests that I've been doing... I mean, I can just do... Really go prompt engineering and that's the way how I'm "training it," in a sense. So still an open question. And then it goes back to the cost around that. Even just the people that you need to go have. Who's trained to go do this? Do I have enough data to actually make an impact around this stuff or not?

00:04:04 Tim
Right. And I'm worried about the compute cost too. I thought it was funny during the keynote when Frank was talking and he got asked the question, how much is this going to cost? And he paused and he was like, "It's GPUs. It's going to be expensive." So I mean, it'll be interesting to see how different companies handle those dynamics.

00:04:21 Juan
No, and I think conversations I've been having the last couple of months. I think cost is something that is not really on the top of people's mind right now because they're like, "Oh, we want to go see you-"

00:04:29 Tim
There's all the excitement.

00:04:30 Juan
The excitement.

00:04:31 Tim
People just want to see the cool demos.

00:04:33 Juan
But then there's like, "Okay, this was really productive." Okay. How productive was it? You're chatting with the data, you're getting all this cool stuff, but it costs you this to go do that.

00:04:41 Tim
Yeah. And it cost you a million dollars a year. Did the ROI go more than that? Right? Yeah.

00:04:47 Juan
So it leads me to another big topic, which is, what are the use cases? So what are the AI use cases of people? And talking around and even talking to folks at Snowflake in the hallways, there's excitement, but the valuable business use cases are still very high level. "Oh, yes. I want to be able to find my inefficiencies," is like, "Yeah, no shit, Sherlock. I get that. [inaudible] want to go do." So I think that the very high level... I mean, in the first keynote, it was like, "We should now be able to go discuss what churn is and figure out what churn is." That's always been a problem. But the obvious one, the low hanging fruit is chatting with the data. So I think that's the first immediate use case that we're starting to go see. And I'm sure now with all the apps and stuff, we're going to have this large marketplace of chatting with the data apps [inaudible]. So I think that's one. And then talking about chatting with the data, one thing I'm really, really happy about is that a lot of the people are acknowledging and realizing that semantics and knowledge graphs are going to play a key role. Because if you just do the natural language and translate it to the SQL query by itself, this all works cool with just a very... sample, small schema. But the moment we get a complicated schema, that's where the hallucination, all that stuff comes in. And maybe you'll get good enough SQL to go... Somebody else can go fix it. But it's not going to be at a point that it's answering your question.

00:06:11 Tim
Yeah. I feel like we're very quickly tapping out on all the low hanging fruit. And then all of the hard problems become... It's going to take a lot longer.

00:06:23 Juan
Hallway conversations, people are realizing that the semantic layer is going to be critical, because then translating your natural language questions to the semantic layer, that can be done more effectively than just translating it to an underlying inscrutable database schema that we don't know, don't understand what it means.

00:06:38 Tim
Yeah. There's only so much context you can cram into a prompt. Right?

00:06:41 Juan
Exactly. So I think that's the key thing that we're seeing. And I think one of my predictions here is that... I mean, my heart and soul is semantics, and knowledge graphs are going to be the key to making sure that businesses are able to go use these large language models effectively. So anyway, that's what I've been seeing through the AI, LLM world. How about you, Tim?

00:07:03 Tim
I think that that was a great take there, and I think I'll broaden this a little bit, because in addition to a lot of announcements around AI and around ML and around LLMs, there were a lot of really big additional new capabilities that were added to Snowflake. So a lot of enhancements to Snowpark, which is their large workload engine, Spark-oriented, on Snowflake. They announced Snowpark Container Services. So now you can actually run these different applications, data applications, or just regular applications on Snowflake. Native applications on Snowflake, so you can actually deliver your app through Snowflake. And a bunch of vendors are already thinking of starting to jump into that. The biggest thing I heard yesterday that got me excited about that is especially for smaller companies that... Or SaaS companies that have to deal with enterprise InfoSec and things like that, that can be a very complicated thing. If they already have Snowflake, then maybe you can just piggyback off of that relationship already. There was native data quality, there were improvements to Streamlit. So there's obviously this theme here of all of this where it was mentioned multiple times, in the keynote and elsewhere at this conference, that data has gravity. I think Snowflake wants to be the sun in that solar system, and they want the planets to revolve around them. Bring the compute to the data instead of bringing the data to the compute. And that's a big shift. It's a shift in power. And it also brings up big questions about, what does that mean for the big cloud vendors, right? AWS and Microsoft, et cetera, right?

00:08:45 Juan
No, definitely, Snowflake is all about, you're all in with Snowflake. And they're making it as easy for anybody to go be incentivized. Once you're all in on Snowflake, you get all of these things part of the Snowflake ecosystem. I mean, now with the containers, you have your apps and your data all together. You have an app store, the marketplace with the apps that you can go share all this stuff. They're making it so easy. People can go off and monetize all these apps [inaudible]. So I think this is the big thing, and it is fascinating to see people's... And actually senior folks when they're like, "I'm all into Snowflake." I'm like, "Wait, but I mean, you have so much experience of working with these legacy vendors who are all this one monolith and you're doing it here. Aren't you scared? [inaudible] the risk?"

00:09:30 Tim
Why isn't this just a new monolith, right?

00:09:32 Juan
But they're like, "This just works. It just makes it so easy. Why wouldn't I, if I don't have to go pull all these things together?" There's a case. So I think that's an important one that I'm seeing.

00:09:41 Tim
Maybe you're okay with a monolith if you're enjoying it, right? It's an all-in-one resort and the drinks taste good.

00:09:49 Juan
Yeah. A really high-end, all-inclusive place, right?

00:09:52 Tim
Yeah.

00:09:53 Juan
So a prediction I have is, if they're really going to go all in, I would not be surprised if in the next year or two, Snowflake's going to announce that they have their own data centers and stuff.

00:10:03 Tim
Their own data centers. Really?

00:10:05 Juan
I mean, why not? They're just another cloud vendor. I mean, why use [inaudible]-

00:10:11 Tim
I mean, that would be bold. I think that they're going to want to keep their relationship in a really positive way with AWS. Interestingly, they're great partners, but also there's a friction there at the same time. But I think more likely is that I think Snowflake's going to really want to be the network. I used to work at Akamai. And Akamai was a big content delivery network. And so their key was having a really fast network that goes across the globe. And I think that's where I would see especially Snowflake going, is maybe not creating their own data centers, but really wanting to make it fluid in terms of "Oh, you want to move from Microsoft to AWS to whatever? Oh, well, Snowflake is the thing that stays constant."

00:10:50 Juan
Regardless of how things are implemented, it's all about, "I'm all in on Snowflake. I want to go create my apps and everything. I can deploy everything on Snowflake. I don't have to think about anything else." And it's going to make people's, a lot of developers', engineers' lives easier, faster to go do things.

00:11:06 Tim
Yeah. And there was a lot of-

00:11:07 Juan
Faster time to value.

00:11:07 Tim
...And there was a lot of developer-oriented language here this week. It was a lot of building apps, building data apps on Snowflake.

00:11:15 Juan
That's it.

00:11:16 Tim
And just before we wrap today, even though we've been hanging out at Snowflake, there has been this parallel conference going on, the Databricks conference, which has also been happening, and they were making a lot of really big announcements over there as well. So for those of you that are listening that have been following along, you probably heard of a few of these. So for example, Databricks' $1.3 billion acquisition of MosaicML, an AI model development company. So that's a huge move. Massive purchase there. And then also, they made a bunch of announcements this week, such as, they added Lakehouse Federation. So now you can connect to other data sources and actually be able to connect that into the Databricks ecosystem. Also, another one that caught my attention was Lakehouse Monitoring, where now they're actually building sort of data observability capabilities into the Lakehouse architecture to monitor your data quality, pipelines and things like that. So just yet again, another large planet, or I'm sorry, a sun, that's got all these planets revolving around it.

00:12:21 Juan
So what I'm seeing is either the Snowflake approach is you're putting everything in Snowflake, versus the Databricks approach is like, "Yeah, put everything in Databricks, but also, we're okay with an open ecosystem, so we'll federate our own stuff." So I think those are the two different approaches that we're starting to see.

00:12:35 Tim
And I think it'll be interesting to see as we go forward, is Snowflake going to invest in federation virtualization capabilities or cataloging capabilities? And it'll be interesting to see how these things play out. Yeah.

00:12:47 Juan
All right. Well, this has been a fascinating week. So much stuff. I'll be writing my takeaway posts on LinkedIn to follow that. And next year, we're not in Vegas. We're going to be in San Francisco. That's the other big announcement.

00:12:59 Tim
Yep, exactly. [inaudible].

00:13:00 Juan
Because I think what happened a lot this week, we got a lot of steps here.

00:13:04 Tim
Yeah. We did. We did. Yeah, we had to go back and forth between the Forum and the Palace over here in Vegas, so it's about 15, 20 minutes.

00:13:11 Juan
Walk in 100 degrees.

00:13:13 Tim
It's a little hot outside, but we got our exercise, so that was good. Well, thanks, everyone. Cheers. Hope you all enjoy your week and-

00:13:20 Juan
We'll be back soon. We're still cooking up a lot of the stuff that we're doing for our next seasons. We're going to have some episodes coming out, so stay tuned. Cheers, everybody.

00:13:29 Speaker 3
And now a preview of an upcoming guest.

00:13:33 Juan
And I am finally super excited to meet you in person.

00:13:36 Ethan
I'm so excited as well.

00:13:37 Juan
Ethan Aaron from Portable. I mean, if you don't know who Ethan is, I think you've literally been living underneath a rock in the LinkedIn world, but it's great to meet you finally.

00:13:45 Ethan
Pleasure to meet you. I feel like we've been chasing each other around the country and been off by two days.

00:13:51 Juan
That is true.

00:13:52 Ethan
This is it. We're in Vegas.

00:13:52 Juan
Finally.

00:13:53 Ethan
We're doing it.

00:13:53 Juan
All right. So the deal is we're going to be chatting with folks and asking, "What the heck is going on in Snowflake? What are the trends? What are you seeing?" So go.

00:14:01 Ethan
I would say the big thing that I've been taking away from this year at Snowflake [inaudible] is a few things. You have different size companies. You have small companies. Still, every small company is still thinking about data. It's just, they're more cautious about, what does that look like? Because the market's crazy. A lot of medium-sized companies out there right now, let's say 50 to 1,000 people, they're trying to figure out what happened over the last three years in the data world, to their data team, to all their initiatives. And they're trying to figure out how to rebalance that. What are the things that are absolutely critical? What are the things that might not be as critical? How does their team get as much leverage as possible? And then the bigger companies that I'm talking to, the true enterprises, their priorities aren't changing. Their initiatives aren't changing. They still need data governance, data quality, observability, catalogs, all these things, discoverability. And they work on longer cycles than these smaller companies. But I think the new initiatives are ever... There's more risk. There's more uncertainty. So I think most people, the overwhelming trend this year is just, taking a step back, rebalancing, what truly matters? Can you save costs here? Can you get more leverage there? Last year was flashy. Last year was new this, new that, and it was the first conference coming out of COVID. But this year, I think it's more pragmatic.

00:15:20 Juan
I'm seeing this too. People are being very pragmatic about cost, right?

00:15:24 Ethan
Yep.

00:15:24 Juan
Reducing costs. I think there's announcements on that stuff. AI has been a big thing, but I think everybody's still like, "Okay, what are we actually going to go do with this?" And there's a lot of excitement, but at the same time, cautious. Out of all the announcements, what are the ones that you've been excited about or looking forward to?

00:15:40 Ethan
I think the AI piece, you can't ignore it, because it's going to fundamentally change everything. But when I think about how AI is going to impact data, there will be a very small number of companies that create entirely new frontiers with AI. It's going to happen. I think the biggest thing that we're going to see with AI, and I think data teams can provide leverage to themselves and to the organization, is using it to create leverage. It's not, "Wow, it's going to change the data landscape tomorrow." It's not going to sweepingly remove all the jobs. It's going to be someone sitting there being like, "Well, I'm doing this task manually 50 times. Can I..." A year ago, it might have been outsource it on Upwork or Fiverr or something like that. Now it's, "Can I write a script or use a Snowflake function to just get the answer I need automatically from AI?" Though, I think it's the micro wins from AI that are going to have a... It's not going to be a big bang moment, like AI changes everything tomorrow in the data world. I think it's going to be a lot of compounding micro wins from AI that I don't think we're going to see overnight. I think people are starting to add more features, more capabilities. I think in some scenarios, it makes a ton of sense. But I think the compounding of picking the right features around it is actually super, super powerful.

00:16:53 Juan
The honest no BS thing for me right now is that people are excited about the AI, but they're really acknowledging that they don't understand... What is something very valuable they need to go do with it? They're all over the place. The first thing that everybody's saying, "I want to go chat with the data." And that's the obvious low hanging fruit. But what's coming after that? And actually how much of that productivity gain am I going to get with it? And obviously, what's going to be the costs around this stuff too? And I think people are trying to figure out the large language models, small language models, and anyways. This is going to be a big topic of... Really, we're in discovery mode for the next year.

00:17:29 Ethan
Yeah. Totally. Yeah.

00:17:30 Juan
Anything else you want to add?

00:17:32 Ethan
I just think on all of this stuff, there's only so much efficiency you can get out of a team. If you have 10 people, that's all the efficiency you can get. They just do things more efficiently. I think where there's another big opportunity, whether it's AI or not, is just refocusing on, how do you find the highest value business problems? Chatting with data is great. But chatting with either a human being or with AI about, "Hey, what should be my priorities for the business?" I think is even more powerful. So I think making sure that as we think about any initiative, there's efficiency gains. Sure. If you have a three person data team, your efficiency probably doesn't matter that much, versus impact gains. And I think this year, you're seeing both, cost specifically on the efficiency side. And just, how do you create as much value as possible is-

00:18:23 Juan
So I think we got to realize and get out of our bubble and zoom out and realize it's not about the efficiency of this team, this data team. No, what is the actual value in the business? Again, show me the money. Follow the money.

00:18:35 Ethan
Yeah, 100%. So I think as long as everyone stays focused on that, you can find one more opportunity to create $500,000 in value for your business. Doesn't matter if you have two analysts instead of one. You created the value. If you create 10% efficiency on your two person data team, it's immaterial to your business. It's still important. It's still money. And for some businesses, that's critical. But I think there's a lot of money to still be made, value to still be created by data teams, and staying focused on that, on the money, is absolutely-

00:19:08 Juan
Know the business objectives and go towards that.

00:19:11 Ethan
Yep.

00:19:12 Juan
Pleasure to finally meet you, man.

00:19:13 Ethan
Pleasure. Yes.

00:19:13 Juan
All right. Looking forward to more events.

00:19:16 Ethan
Always.

00:19:16 Juan
We'll catch up in more happy hours around the world.

00:19:18 Ethan
Awesome. Love it. Yep.

00:19:18 Juan
All right. See ya.

00:19:19 Ethan
Pleasure. Thanks.
