Productivity is not performance with Santona Tuli

Tim [00:00:00] It's time once again for Catalog & Cocktails. It's your honest, no-BS, non-salesy conversation about enterprise data management with tasty beverages in hand. I am Tim Gasper. This is Juan Sequeda, my co-host. We're in Austin, Juan.

Juan [00:00:14] How are you doing, Tim?

Tim [00:00:15] We're doing well.

Juan [00:00:16] And this is so cool that we are finally live, in person with our guests because we aren't always live with our guests. I think last time was in Big Data London with Chris Tau.

Tim [00:00:26] Yes.

Juan [00:00:27] So we're in Austin right now. What's going on in Austin is that there's Data Day Texas, and one of the speakers, who's our guest today, is Santona Tuli, who's a director of data at Upsolver. How are you doing?

Santona [00:00:38] Hi, thanks for having me. Hi everyone.

Juan [00:00:40] I am super excited to finally get to do this and it's so cool to get to do this in person. It's Thursday, 8:30 PM. We usually do this Wednesday, middle of the week.

Tim [00:00:48] It's actually a proper drinking hour.

Juan [00:00:51] Which we don't have a cocktail today and also if whoever's watching us or watching the video here, we're in the bar that I have in my house, which has a lot of beer and wine.

Tim [00:01:03] Our wine.

Santona [00:01:04] It's a really good setup.

Juan [00:01:06] Anyway, so tell us, what are we drinking and what are we toasting for? Want to kick us off?

Santona [00:01:11] Yeah, I'm toasting to everyone being here for this conference. It's a great opportunity to run into everyone at least once a year because all the other conferences are a little bit like different people are at different ones, especially when two happen on the same day. There's one group going there and the other group going there, but the folks that I really like in data and enjoy spending time with are usually at this one. So toast to that.

Tim [00:01:36] No, that's awesome. I'll cheers to that as well. It's always fun when everybody gets together for Data Day.

Juan [00:01:41] Exactly. Cheers and cheers to the organizers. Thanks Lynn for always organizing such an awesome conference. So cheers. We got our warm-up question, so what is your favorite productivity hack?

Santona [00:01:56] My favorite productivity hack is probably walking my dog. It just clears my head, gets me up and moving and it's not just like, oh, I'm stuck. I'm going to go walk my dog. It's like she makes me get up and go, which means the time that I do have at the computer, I'm going at it and then I'll go and think about things and go at it in my head and also watch her chase squirrels and then come back and it's refreshing.

Tim [00:02:26] That's funny. I don't have a dog, but I try to go on walks whenever I can and now I'm thinking, man, maybe I got to get a dog. It makes it so you have to go out there regularly and clear your mind and think about things, right?

Santona [00:02:35] Yeah, it's like a Fitbit but better and you can't ignore it.

Juan [00:02:38] Well, I was going to say something, well, technical. Oh, something with GPT or some productivity tools, apps that you have. But now that you say that, I'm going to say I get my best ideas in the shower, organize my thoughts, so that's one. So I don't know, it's not that I'm going, "Oh, I'm confused or whatever. Let me go get into the shower and go do that", I think that would be weird, but now that I'm saying this out loud, I think I could do that. I'm going to try that out. How about you, Tim?

Tim [00:03:10] This is a hard question.

Juan [00:03:12] All right. You do something technical, what's your technical?

Tim [00:03:14] I'll do something a little more technical. For me, it's going to be I add everything to my to-do list, just everything. And that's just my way of being a lot more clear and present when I'm having conversations and interacting with people because I never have to worry about like, oh, I got to remember that thing or anything. It's all here.

Juan [00:03:32] What is that?

Tim [00:03:32] As long as I don't forget my phone.

Juan [00:03:34] There's an approach. There's a whole methodology for it, right?

Tim [00:03:37] Oh, yeah. Getting Things Done, GTD, right? Yeah.

Juan [00:03:40] Getting...

Tim [00:03:40] Yeah, getting shit done.

Juan [00:03:41] That one, right, just put everything on it and then you start organizing.

Tim [00:03:45] Yeah, exactly.

Santona [00:03:46] Do you ever get blocked by something and then that prevents you from going down the list?

Tim [00:03:51] Sometimes. Sometimes. But I've been doing to-do list stuff for so long now that it's almost second nature, and so people are always like, "What are you doing on your phone? You on social media or something like that?" I'm like, "No, I'm rearranging my to-do list. I'm just moving things up and down because I'm going to change the order of what I'm doing."

Juan [00:04:08] All right. Well, let's kick it off. Honest, no-BS, what you wrote in a LinkedIn post previously is that productivity is not performance. So honest, no-BS, what do you mean by that?

Santona [00:04:19] What do I mean by that? It's really the distinction between output versus outcome. So you can do a lot without doing a lot, if what you're doing isn't building up towards some bigger purpose, something that is well-defined and well-understood, where the value is clear. So an example of this would be building out a very detailed, intricate engineering infrastructure for something like answering a simple question, like maybe depending on the phase of the business and the data practice and so on and so forth. You might want to pause and take a single question and think, can I make it a process? Is there a version of this that's better as a process and productize and stuff? And it's good to put that thought into it, but definitely don't jump into, oh, I have this question and so I have to really build this out and it's going to be six months. And then the answer was just 42.

Juan [00:05:24] So thinking about this, I'm curious in your experience, because you have a very, very interesting background because we both have academic backgrounds and we get into this whole data space. How are you seeing the different data roles coming in and how are all these roles, there are definitions of are you being productive enough? How are you seeing this right now?

Santona [00:05:46] Yeah, I think data has always been a tough field to define roles, and there've always been so many different roles. By the way, I'm making eye contact with a friend that I brought along who you won't recognize, maybe he'll come say hi at some point.

Tim [00:06:05] A secret live audience.

Santona [00:06:07] There is a secret live audience. What was I saying?

Tim [00:06:12] Roles.

Juan [00:06:13] Roles.

Santona [00:06:13] Oh, data roles, yes.

Tim [00:06:14] I had to think about that for a second too.

Santona [00:06:18] And yeah, productivity is going to look different and performance or actually producing value is going to look different for different roles as well. So yeah, I've posted in the past or I shared my thoughts in the past around the distinctions between if you're doing data or ML work because that's the core of the business, like a Netflix or an Amazon where the business is really centered around building some predictive model and so on and so forth. Or if you're doing descriptive analytics for a company that otherwise the business is somewhere else, but you still want to have an understanding of the business, so the analytics function. And it's important to have these distinctions partly because not all skills are transferable. And this is not to say, this is not to gatekeep or draw boundaries, I think that any given individual can do both of those things and should learn. But it definitely helps to have at least some understanding of what the purpose of the role is beyond what the title of the role is. So I think one thing that bothers me, I think people often think of data analysts as someone who's futzing around and, I don't know, in Excel, isn't technical, but is answering this data question. I can't really get behind that definition of a data analyst. That's why I like the title 'Analytics Engineer', because that tells me a lot more about what kind of work you're doing. Like an ML engineer, you're doing ML. Analytics engineer, you're working for the purpose of analytics, but I don't really know what a data analyst is, if that makes sense.

Tim [00:07:58] Yeah, that's interesting. Well, and just to back up a little bit, I think when you mentioned about this distinction between roles that are at data or machine learning companies where it's their core or maybe their core services depend on really critical pieces that they're doing there versus a company where their core business is somewhere else and analytics is in service of that. Maybe it's descriptive analytics, maybe it's other things. Do you actually think that the roles themselves are a little bit different between these organizations?

Santona [00:08:27] Yeah.

Tim [00:08:28] Because one of the things I think that's interesting is you look at the history of innovation and a lot of times these very data as their core businesses build out new cutting-edge infrastructure or they come up with new machine learning techniques. I just think about the big data movement and Netflix and Yahoo and all these organizations really pushing the envelope of big data and then all companies are like, "Oh, we want to get into that too." So should we be thinking differently about the distinction between these two different kinds of businesses?

Santona [00:09:00] Maybe. So there's a couple of different factors. One is having those problems to solve. When you're at a certain scale, you have a unique set of problems to solve. And the other thing is having the resources to solve those problems. So again, if you're a big company, you're going to have those resources, but you never know where innovation comes from. So yes, a lot of the times the technological advancements that we've had, although the recent, I think the generative AI is half academic, half industry, I would say. But sorry, that's an aside. A lot of the times these advancements do come out of these bigger companies, but it's not always the case. And you really don't know where technology is going to come out of, the touchscreen came out of CERN, which no one was developing the touchscreen there, it's just how it worked out.

Juan [00:10:01] The web came out of CERN.

Tim [00:10:03] Yeah.

Juan [00:10:04] You never know where all this innovation...

Tim [00:10:06] You never know necessarily where the innovation's going to come from. And maybe in some cases it's coming from these data-first companies, maybe in some cases it's coming from other industries, but...

Santona [00:10:15] Yeah. And sometimes it's coming from tooling companies. So not everyone can build X, Y, Z tool within their company. So these smaller companies that still need some version of that solved and so there's a market need and tooling company comes in for...

Juan [00:10:40] Who innovates? And the other question is who should be innovating because if you're in a smaller company, I would argue that everybody should be innovating. I'd argue that. But then how do you measure the product teams like, oh, you're doing this work but I'm not seeing the immediate value right now. But like, oh, hold on, hold on, give me more time, there's still unknowns in here and if we have a breakthrough, this is going to be huge, but sometimes it's a risk if it takes. So who can take those types of risks to do innovations?

Santona [00:11:12] That's a very good question. Certainly, there has to be room for that innovation.

Tim [00:11:23] Hey, cheers.

Juan [00:11:24] We just popped in here, look at that. All right.

Santona [00:11:27] Welcome, welcome, welcome to my show.

Joe [00:11:30] Anytime, you're allowed to crash here.

Tim [00:11:32] I didn't think it'd be a party. I'll get...

Joe [00:11:34] So you're talking about innovation and who should do it?

Santona [00:11:37] Yeah.

Joe [00:11:37] What do you think?

Santona [00:11:39] I think that as individuals, I think that's where innovation comes from, right? Yeah. It's collaboration, it's working across boundaries. But I don't want to say just because you work at a smaller company that's really focused on solving a specific problem that you can't innovate the next technology, I think it certainly could happen. And if that's what you really want to do, you can also partner up with academic institutions or these bigger companies. So yeah, I don't think there should be boundaries around who should be innovating. The only thing that I think matters with innovation, and this is a lot like research, is you have to know where you're at. So a lot of reading and keeping yourself abreast of where we are today, because that's how you make the poke outside of that.

Joe [00:12:27] I've been thinking a lot about this in the context of, what was that? There was some law at 174 or something like that that Congress had passed about taxes back in 2017, like a tax cut bill. But what it means is now R&D expenses have to be amortized and not expensed. And so I wonder what this is going to do to...

Juan [00:12:48] Bootstrap companies.

Joe [00:12:49] Yeah. Right?

Juan [00:12:51] This is the topic we were talking about a while ago.

Joe [00:12:54] Right. Because before you could expense your R&D against your...

Juan [00:13:00] This could go back to your developers building software and stuff.

Joe [00:13:04] Yeah. So it screws you over on that one. And then I'm like, "Okay."

Santona [00:13:08] Why?

Joe [00:13:08] Why?

Santona [00:13:09] Yeah.

Joe [00:13:10] So it's an accounting thing. So now instead of expensing that payroll expense in one lump against your revenue, now you can only do I think, what is it, 20% per year or something like that?

Juan [00:13:22] Yeah.

Joe [00:13:22] Or whatever number it is.

Juan [00:13:23] So let's say that you are cashflow neutral, right?

Santona [00:13:26] Yeah.

Juan [00:13:26] Well, I made zero money so I pay no taxes, but because your expenses were your payroll, and they're like, "Well, no, you can't now put all your payroll as your expenses. You can only put 20% of it."

Santona [00:13:38] Oh.

Juan [00:13:38] So it's like, oh. So technically I have, well, more money so I have to go pay taxes on that, but I made no money.

Joe [00:13:43] But I wonder what it's going to do to innovation, especially for... Yeah.

Juan [00:13:48] But people want to do bootstrap and want to innovate in a different way. They don't want to take a lot of capital and stuff.

Joe [00:13:55] Oh, yeah. That could kneecap a lot of stuff because I think the promise of a lot of government investment into small businesses is maybe they'll do something interesting. I was at a conference last night in Utah where they're celebrating all the people involved in tech and AI and I'm like, yeah, okay, cool. That's good to see everybody here and I wonder who's going to be able to innovate under these kind of constraints because just this is business. So it's something out of left field where it's just a consideration. But I think absent of that, sure, innovation's one of those things that hopefully people are doing, but it depends on the type of business you're in and what the appetite for risk is. Because if you're small, the appetite for risk, on one hand you should be taking risky things, because that's what's going to make you great. On the other hand, you don't have a lot of runway.

Juan [00:14:44] So... Yeah, go ahead.

Santona [00:14:46] You could argue that smaller companies are doing their innovation anyway within what they're trying to bring to market, right?

Joe [00:14:54] Yeah.

Santona [00:14:55] Unlike Google's Moonshot or something, I'm literally putting everything at stake if I'm...

Juan [00:15:01] No, at that point as a startup, your definition is innovation because you're doing something nobody has done before. But if you think about the function of data in our organization, so what is the function of data? And I think this is a topic we've had in so many...

Joe [00:15:20] Is this one of those Bob questions in Office Space? Like, "What do you do here?"

Juan [00:15:25] Well, many people say, "Oh, think about 10 years ago, oh, data science and data's the new oil." It's like, oh, there are things that we're not doing with data we should, so we should be innovating because that could generate... So one could argue that a lot of innovation in companies should be coming from data teams, but then at the same time it's like, well, you don't understand the business, you don't do this thing. So it's like you see this disconnect. I don't know, this is what...

Tim [00:15:52] This is a blurry line I think for data organizations in terms of what you brought up at the beginning in terms of output versus outcomes, is working on innovation in a data organization, especially if you're not a data core company, is innovation a worthwhile place for people to be spending time on a data team?

Joe [00:16:17] I don't know, man.

Tim [00:16:19] Versus just crank out more BI reports?

Joe [00:16:21] I don't know. Look at successful businesses, I was talking to a friend of mine the other day and she was hanging out with a bunch of wealthy CEOs of companies and one of them owned a beer distribution company and she actually showed me pictures of all these yachts. Beer distribution company guy, hot dog stand owner guy who has the largest hot dog stand company in the US. And it's like when you look at companies like Chick-fil-A, I went there a few months ago, [inaudible]. Chicken sandwiches, I'd say from a technology standpoint, they're doing really innovative stuff. Crazy stuff.

Juan [00:16:56] A company like Chick-fil-A, I know that they own their real estate so they have an entire real estate arm and all the data that people do for real estate because they want to know exactly where exactly do I need to go put this place to go and then have drones and they figure all this shit out.

Joe [00:17:10] Oh, yeah. That's innovative.

Juan [00:17:12] Yeah.

Joe [00:17:12] They're using data to make those kinds of decisions so I would say businesses, if you can do it that way and just succeed to improve your business, that's the same as it ever was. But to do data for its own sake I would say is a bit weird because that's like is that really going to get you the benefit because you have to know what you're doing too. Right? So I don't know. As you say, outcomes and outputs, is that what you said?

Santona [00:17:41] Yeah.

Joe [00:17:41] "Well, what you really mean to say is..." Mansplain it to you.

Tim [00:17:45] So innovation can be outcome focused and there's a lot of companies that are like Chick-fil-A, that are doing very interesting things around data that's innovative, even though they might not be considered a company that's data at its core.

Santona [00:17:58] Right. Innovation can be a byproduct easily of outcome driven work. So that's one example. That's how things come out of places that you don't expect, and that's what I'd say. And then I would hope that even under constraints of limited runways and stuff, I would hope that part of enjoying your job is having some creative freedom. I don't think anyone really would be happy if you're like, "You have to check these three boxes and this is what you have to do every..."

Joe [00:18:34] That's a lot of people's jobs. It sucks, but that's... We all have jobs like that.

Santona [00:18:39] Right, yeah.

Juan [00:18:40] Compliance stuff.

Joe [00:18:41] Tim, just move that pile of boxes over there. Now move them back to the same spot and I might just do it all day.

Tim [00:18:51] And count them while you're at it.

Joe [00:18:54] Count them each time. But we joke, but that's a lot of work.

Tim [00:19:01] Yeah.

Santona [00:19:01] This is true.

Joe [00:19:01] It's a lot of people's jobs.

Santona [00:19:02] This is true, yeah.

Tim [00:19:05] Well, a lot of people in the data space too, I feel like maybe they're caught in a job. Maybe a lot of them have the title of Data Analyst and they don't feel like they're getting to do great stuff. Maybe they don't even feel like they're working on this stuff that's going to have the biggest outcome.

Santona [00:19:20] Exactly. And everyone's aspiring to be a data scientist, but the lines are so blurred anyway.

Juan [00:19:27] So this is interesting. People are driven, they want to do something different, fun, innovative, but they're like, "Ugh, I'm just moving boxes from here to here." And I'm like, well, there's this new cool technology that comes out, so I'm going to go test and play because I feel that by testing and playing with that stuff, I'm being innovative. But maybe you are, if you are doing things that are outcome driven, but if you're like, I've just got to move my boxes here and there, and then are you really? You're just maybe just wasting your company's time.

Joe [00:19:58] You got a pallet lift or a fork lift? That's innovation, right? Think about real innovation, whoever invented the shovel, I think Morgan Housel was writing about that in his new book, Same As Ever. That's a genius thing. People were digging stuff for years by hand or rocks and then a shovel comes along, that's pretty amazing. But you got to consider who was the first person that thought of doing that? That person was probably considered to be an idiot. Why are you wasting your time? Keep digging holes. What are you doing?

Juan [00:20:25] Always that same picture character like the wheel or you got a gun or whatever.

Joe [00:20:32] Innovations are those things where... Innovation's a weird thing too because it happens out of left field because it's a confluence of different ideas and experiences that come together for something new.

Santona [00:20:43] There's another thing I find interesting as far as innovation goes is there have been so many times in history that essentially the same idea was developed in completely separate locations. And especially if you look at the history of physics and math, there's something natural about the progression of innovation, that it's an unstoppable force and based on what we know today, we are going to get us to the next step. And there are different places where it can pop up. I think that's really cool.

Juan [00:21:20] Then the other thing that's my big pet peeve is when we think we're innovating, but we're not because we don't read, know our history and we don't build on the shoulders of giants.

Santona [00:21:32] Reinventing the wheel.

Joe [00:21:34] Give us some examples of this when you're talking about that. I find this is interesting, the false positives of being innovative. What does that look like?

Juan [00:21:42] I just think that if we look at the data space, I think just by not understanding a lot of the core, foundational principles, then we start thinking about all this stuff as like, oh, this is a brand new thing. We get excited about it. You've talked about this too, the dead 10 years or whatever, the dead decade that we didn't advance anything because we were just doing the same thing, just in a different look. A database, a data warehouse, it's all the same shit. Obviously there's differences in there, but conceptually the principles are the same. And then we just focus on the specific details and those details are different. But when we zoom out, the problem is still the fucking same problem we haven't solved. The talk I'm going to give this weekend, it's just a sneak peek, I'm going to just show a screenshot of a problem description, which I swear that everybody's going to say, "Yeah, yeah." Well, I just copied and pasted that from a paper from 30 fricking years ago. What the frick have we been doing for the last 30 years?

Tim [00:22:44] Evidently not solving that problem.

Juan [00:22:46] But we all say we have done that problem. I can't find my data. This customer is, I don't know this customer is the same customer, this...

Santona [00:22:54] One thing is when there is a need that surfaces everywhere, then you often have multiple people jumping on that idea of solving for that need, and somewhere down the line they forget to iterate and come back and see if that need still exists or if it's evolved or if they need to refine their idea or not. So I think that's how we end up with 101 BI tools, none of which are good. Right?

Tim [00:23:23] Yeah, that was one thought that came to my mind is how many ways of BI tools [inaudible] successful, but think of how many BI companies have been started.

Santona [00:23:33] Right. And successful is not the same as being good.

Tim [00:23:36] Yeah. True. Very true. And connected to this is it's interesting to see where a real novel innovation is going to happen, where you're really actually making progress. And for example, I think something that is an example of this maybe is you've got lots of database companies, lots of companies that are in the data warehouse space and things like that. And they're like, "Oh, my indexer is a little bit faster", and things like that. And another company could have figured this out. But then you have a company like Snowflake, comes in and says, "Oh, I'm going to take advantage of the fact that I can separate storage and compute and I'm going to create a novel approach to the cost economics of managing on-demand data warehousing." It's like, oh, well that was an interesting trick that they did there and it didn't have to be Snowflake. It could have been somebody else with...

Joe [00:24:25] Cloud Native.

Tim [00:24:25] It's interesting to see when these switches get flipped where now maybe we are actually making some progress.

Joe [00:24:31] Do you remember when Snowflake came on the scene and you first started meeting with them and you're like... I think the other innovative thing they did is they just understood how to talk to the customer in a really simple way. You've tried this product here, you tried this product here. It solves a lot of those pain points. I remember meeting with the salespeople, is it 2016-ish?

Santona [00:24:54] I thought you were going to say 20 years ago.

Joe [00:24:56] 20 years ago, [inaudible]. But no, I instantly recognized, they get it. They know how to talk to customers, they understand the enterprise, they understand the pain points, they understand. So it was as much of, I think a technological innovation as it was a sales and marketing innovation too, where they just knew exactly how to talk to the customers and solve their pain points. And then the sign-up process was super easy too. It's like, I don't know. You have a credit card? Yes. That's cool. So after two weeks you're trying this out, you can put that credit card in and then you can pay.

Juan [00:25:32] Well, I think it goes back to the outcome versus output. I think it's like you have a very clear this is what you're trying to achieve. This is the value of trying to go achieve that. This is what we're focused on. So that's the outcome perspective, I think. And I agree, the case is it's also marketing and sales, but what is that? That's just communication. That's all it is. And I think that goes back to a lot of the issues that we see today is like, well, we're not able to communicate what am I doing and why is it important? And then when you are communicating, you're communicating the output, oh, I did this. I generated so many dashboards and I generated so many pipelines and all these DBT models we did and oh, look at all this stuff that I have. I'm like, that was the output. What was the outcome?

Santona [00:26:14] Exactly. That is probably my current biggest pet peeve. I'll hear a sentence like, "I have 1,600 DAGs running in production." Why do you need 1,600? You're a BI team.

Joe [00:26:26] That's crazy because that's 1,600 points of technical debt that you stand to incur.

Santona [00:26:31] Right? Why are you proud of that?

Joe [00:26:33] Yeah, it's like I like kids, but if you replace the word 'DAGs' with 'children', would you be as impressed at yourself? "I have 1,600 kids." Slow down.

Tim [00:26:46] Is that the data engineering litmus test?

Joe [00:26:50] You're like the Wilt Chamberlain of data engineers at that point. Yeah.

Tim [00:26:54] Replace it with children and then how do you feel?

Joe [00:26:56] Yeah. Litmus test. Yep. Actually that's a good one. I like that. Yeah, or something like that. Things that are hard to take care of, require a lot of thought. Juan's not getting PTSD.

Juan [00:27:10] This is great, in our notes we bold anything that is T-shirt-worthy, and that's a good one.

Joe [00:27:16] I always come up with random crap with you guys.

Tim [00:27:20] Hey, the more pithy and weird sayings we can produce, the better. That's the outcome that we're driving here. When you mention about number of DAGs and things like that or dads.

Joe [00:27:33] Dads.

Tim [00:27:33] Dads, hopefully not thousands of dads. I don't know which one. I remember when DBT came out with their blog post where they first revealed how many DBT models that they had and it was 1,200 or something like that, some ridiculously high number. And I remember that it was a weird inflection point for folks along this whole output versus outcomes question of like, oh wow, that's so impressive. And then it was like, oh wait, is that impressive? Is that bad? Maybe that's bad. I don't know.

Joe [00:28:15] About what year was that?

Tim [00:28:16] I think it was maybe three years ago. Two years ago?

Juan [00:28:19] Probably less.

Tim [00:28:20] Less than that maybe. Okay. I know it was after they already changed their name to DBT Labs, so it must've been maybe a year and a half or two years ago. But yeah, I thought that was an interesting moment in thinking about as analytics engineers, are we being productive, are we impactful?

Santona [00:28:36] Yeah. Are we good at building metrics of the metrics that we've built for ourselves?

Joe [00:28:41] Are DAGs and models like the new lines of code?

Juan [00:28:45] And apparently more is what? Yeah.

Tim [00:28:48] That could be an impressive vanity metric to say I've created... Right?

Santona [00:28:52] Exactly. Exactly.

Juan [00:28:53] Again, but people have been doing this for so long, they've created their ETL pipelines in Informatica and stuff, and they have all this shit that goes on. And then they have their own code and it's not an ETL tool, but they did it in the stored procedures, they did all this stuff so that we've been doing this over and over again.

Joe [00:29:10] Oh, yeah. I don't know if you remember. Yeah, stored procedures. That's a hellscape in itself. Lots of probably impressive outcomes and outputs, but they're highly invisible typically, you look in a database, it was like, I don't know what's happening there. Oh, that's happening there.

Juan [00:29:27] I remember once working with a customer, we were like, okay, "So this stored procedure, what does this stored procedure do?" And I'm like, "Well, there's all these lines of code in it and well, let's look at the comments." And obviously the comments started 20 years ago, so you have 20 years of comments. And then they say, "Well, who should we ask? Who's the responsible person?" "Well, the last person who wrote the comment", they went in there and then this was a year ago or whatever, "And that person died from COVID." I'm like, "See? This is the stuff that you have to go through", and this is not new anymore. You know what? And now it goes back to my like, oh, we're innovating with new modern things. We're just replicating that same type of debt in this other new modern tool, whatever, you want to go do that. So this is why we need to focus on the principles.

Joe [00:30:15] How do we focus on the principles in terms of producing better outcomes?

Juan [00:30:21] So the question here is then the architectures that we have today, this is something I want to talk about, what are the current architectures? What are the standard types, approaches, principles of the architectures that we have today, and what do we need to innovate on that to make things simpler, better or whatever, to be able to drive better, faster outcomes.

Santona [00:30:45] Architectures, in terms of?

Juan [00:30:47] Not specifically in our data world right here.

Joe [00:30:53] Obviously the old trope, which we'll be talking about at Data Day Texas, and it's going to be funny because there's two opposing viewpoints on this. One is business value should matter in this equation for outcomes, and the other is business value is the last thing you should be thinking about. Which I think you'll find out who's going to talk about what at Data Day Texas. We're not going to spoil the surprise for you. But that's one litmus test perhaps to think about it. I don't know, but I don't think it's the only one. I can think of many. You're on the spot right now.

Santona [00:31:27] No. I'm trying to parse it and trying to figure out what angle to come at it from. So what comes to mind, because this has been top of mind for me, is the open table format. This technology that's coming, it's not exactly new, but it's catching on more now. And it comes to mind because I've been trying to organize this Chill Data Summit by the way. Shout out Chill Data Summit, New York City. It's February 6th, but it is...

Joe [00:31:59] We'll be there. We'll be there, by the way.

Santona [00:32:00] Yeah. Joe's going to be there. I don't know, Juan hasn't said anything yet.

Juan [00:32:07] inaudible so much. I don't even know where I'm going to be. I don't even know where my calendar is.

Santona [00:32:12] But yeah, that's a piece of innovation I think that's come out of a necessity, which is where sometimes some innovative... We talked earlier about how innovation comes from weird places.

Joe [00:32:25] Sorry, I did a beer burp. So outcomes...

Tim [00:32:31] This is one of the benefits of being on Zoom, virtual.

Joe [00:32:39] Yeah, I won't be invited back in person. Sorry, guys.

Juan [00:32:44] Please get back to...

Joe [00:32:46] She's like, " I can't even talk right now. I just have to go leave."

Tim [00:32:49] This is a test, by the way, to see how you react to inaudible tools. Yes.

Joe [00:32:53] Can you talk a little bit about what an open table format is?

Santona [00:32:57] Yeah, sure. So okay, let me try to do the short version. I was just making slides for this. So you've got databases and data warehouses, and as you've said, we've been toying around with how to make them more efficient or addressing whatever requirement comes up. And then we had the data lake that came out of, okay, you don't need this structured interface that a data warehouse produces to interact with your data for every use case. If you have a use case where you're serving a machine learning model, let's say an NLP model, you just need to store the vectorized documents or whatever. So a different set of requirements then, so you've got this data lake that's closer to just files and storage. So that's a data format, like Parquet is a data format, so serialized files and storage. And then on top of that you have some Hive metadata catalog or something, which is why I was asking you earlier... So that works, but it is pretty low level because you have the data level and then you have a catalog that doesn't have any real power to influence how the data are arranged, it just knows where things are. So when you're running a query, you still have to do a lot of thinking about, okay, how do I access this piece of information? Because a piece of information can be spread across multiple different files and multiple different parts of a file. So that query planning and logic, it just takes more work. Typically, the lower level you get, the more engineering-intensive it is. So that I feel has been the biggest issue with data lakes, and that's often why we've ended up with swamps instead of well-maintained lakes and stuff. So I think now we're entering a new era, where a table format, open or not, as opposed to a data format, is a specification. Just like Parquet is a specification for how data should be serialized and stored in these files, the table format is a specification of how the files and information need to be organized and how you contain that information so that you can trace along a path to get to the information you want. So read up on it, this is a two-second description. So just let me... Yeah. You've got this layer on top of the data basically, that's in between the data and the catalog, that is smarter about how the data is organized and how you can access it. So the query planning, it's a little bit like stored procedures actually. So the query planning is done ahead of time for you so that when the query engine comes in, it can get to it more efficiently.
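To make the "layer between the data and the catalog" idea concrete, here is a minimal sketch, not taken from any specific table format. It assumes a hypothetical manifest that records a minimum and maximum day for each data file, so a planner can skip files that cannot match a query. The names `DataFile`, `MANIFEST`, and `plan_scan`, and the S3 paths, are illustrative assumptions only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataFile:
    """One entry in a hypothetical table-format manifest."""
    path: str
    min_day: str  # smallest event day stored in the file (ISO date)
    max_day: str  # largest event day stored in the file (ISO date)


# Hypothetical manifest: the metadata layer tracks per-file statistics,
# not just where the files live.
MANIFEST = [
    DataFile("s3://bucket/events/part-000.parquet", "2024-01-01", "2024-01-07"),
    DataFile("s3://bucket/events/part-001.parquet", "2024-01-08", "2024-01-14"),
    DataFile("s3://bucket/events/part-002.parquet", "2024-01-15", "2024-01-21"),
]


def plan_scan(manifest, start_day, end_day):
    """Return only the files whose day range overlaps [start_day, end_day].

    ISO dates compare correctly as plain strings, so no parsing is needed.
    """
    return [
        f.path
        for f in manifest
        if f.min_day <= end_day and f.max_day >= start_day
    ]


files = plan_scan(MANIFEST, "2024-01-10", "2024-01-16")
print(files)
```

The point of the sketch is that the pruning decision uses metadata alone: no data file is opened until the planner has already narrowed the list.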

Joe [00:35:51] And I like this a lot. I've been noodling on this quite a bit. What kind of innovations and outcomes do you think are going to come about because of these table formats?

Santona [00:36:02] I think what we're going for here, table format is a technology that's enabling it, but I think what we're going for in terms of the need is a data lake that's more usable, that's more queryable.

Joe [00:36:15] But what's the problem with data lakes right now?

Santona [00:36:20] It's hard to access the information in a meaningful way. I've built a data lake and used it, but only for an ML use case. I would never really do a data lake if I was trying to answer 60 business questions, I would go to the warehouse because it's like I'm doing the transformations, I'm generating the tables and views that are going to be relevant. So it's like even if you think back on just denormalized versus normalized data, it's the same thing. How queryable, I'm trying to coin this thing, it's not going to work, but I'm trying to coin the term "queryability index". It would help if it was easier to say, but basically the more optimized you get in terms of how you're storing things, how compact things are, historically, the less queryable your data has gotten. But I think with this lakehouse and this table format layer on top of the data, you can bridge that. It's still Parquet files stored, but you have this layer over top of it that increases queryability.

Joe [00:37:25] Yeah. Yeah.

Juan [00:37:27] Can you give an example of how the data is and how you would access it without that queryability index you were talking about, and then with it? Having the queryability index makes it what, faster, better, easier or whatever?

Santona [00:37:46] Yeah, let me try. Okay, let me go back to the data lake that I had for an ML application. Here, I would vectorize all the documents and stuff, and then I would have different folders. It's not really folders because in S3 it's just paths, but you can think of it as folders of data, let's say from a certain day and then that's partitioned by the hour. So some way of organizing the data that doesn't have anything to do with the data really or very little to do with the data, and that's what the metadata catalogs without the table format do: they know which folder to open up to get to what data and so on and so forth. With the table format, you still know where the data are and what you'd have to open up to get to them. But in addition to that, one of the cool things in the table format is you can do a merge on read, which is instead of going and copying your data and making changes when updates happen or overwriting them, God forbid, what you're doing is you're not changing...

Joe [00:39:05] Always do that by the way.

Santona [00:39:08] Instead of changing the data, you're just logging the updates that need to happen to it. So at query time when the customer or when the user is trying to retrieve that data, it combines and merges both the data and this change log to see what the relevant transformations need to happen to the data to get to the current state. I almost think of it as a smart layer in between that preserves the optimization that comes from having a lake-like infrastructure that you can still access. You can write queries against it. You can say, "Select this and join this", depending on what your query engine is, and you can get your data.
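The merge-on-read idea described here can be sketched in a few lines. This is a toy illustration under stated assumptions, not how any particular table format implements it: rows are plain dicts keyed by a hypothetical `id` field, and the change log is an ordered list of `("upsert" | "delete", row)` pairs. The function name `merge_on_read` is also an assumption.

```python
def merge_on_read(base_rows, change_log):
    """Combine immutable base rows with a log of changes at read time.

    base_rows: list of dicts, each with a unique "id" key.
    change_log: ordered list of (op, row) pairs, op in {"upsert", "delete"}.
    """
    # Copy each row so the stored base data is never modified.
    current = {row["id"]: dict(row) for row in base_rows}
    for op, row in change_log:
        if op == "upsert":
            # Later log entries win over the base data and earlier entries.
            current[row["id"]] = {**current.get(row["id"], {}), **row}
        elif op == "delete":
            current.pop(row["id"], None)
        else:
            raise ValueError(f"unknown operation: {op}")
    return sorted(current.values(), key=lambda r: r["id"])


base = [
    {"id": 1, "status": "new"},
    {"id": 2, "status": "new"},
]
log = [
    ("upsert", {"id": 1, "status": "shipped"}),
    ("delete", {"id": 2}),
    ("upsert", {"id": 3, "status": "new"}),
]

result = merge_on_read(base, log)
print(result)
```

Writes stay cheap because they only append to the log; the cost of reconciling moves to read time, which is the trade-off the name refers to.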

Juan [00:40:00] At the end of the day, it's like you have the raw data, how it is, and you just put a layer, which is, like you say, it's basically a smart layer to understand how you connect and how can you combine this to improve your optimizations?

Santona [00:40:10] Exactly.

Juan [00:40:10] On how to query things. At the end of the day, one of the things that fascinates me in computer science is it's all about abstraction layers. So as a computer scientist, the way I was educated was like, yeah, you basically figure out what abstraction layer you enjoy.

Santona [00:40:27] Exactly.

Juan [00:40:27] Right? People like to go all the way down to the core, to the bits and everything, and then you get higher, higher, higher, and then to user interfaces and stuff, right?

Santona [00:40:36] Exactly.

Juan [00:40:36] And then you also become a compiler, like, oh, I like to work between these two.

Santona [00:40:41] Exactly. Right.

Juan [00:40:42] And so I'm comfortable in these two and I'm an expert in the compiling part. So I think this is always interesting for the data.

Santona [00:40:49] The other thing I want to say on table formats is it's been there. So let's say, how does Snowflake make your table when you do a transformation or something like that? There is logic that's happening there too, because a table is not a real manifestation of data. It's still in files somewhere and so it has this set of stored procedures or whatever it may be, its own version of a table format that retrieves the data in this smart way for you. But now that's where the openness comes in. We have these projects that are almost simultaneously coming out, have been coming out, that it's like, let's just break that. Let's bring that out. Let's make it open so that you can have your data in these optimized file formats, but you can still have that logic and the metadata management built in.

Joe [00:41:43] I think the other cool thing with the open table formats is it allows you the possibility of, I wouldn't say avoiding vendor lock-in but having more options for query engines, which especially in today's world is not a bad thing. We were talking about this in the podcast, I can't remember with who. I feel like one scenario is that Snowflake becomes... As you have more Iceberg, Hudi, Delta Lake and so forth, one scenario is everyone just becomes a better query engine and a different type of query engine. And so you just pick the query engine that you deserve.

Juan [00:42:24] Because it's a higher-level abstraction, where you're doing all that just to have better query optimization at the end. That is the outcome I want. I want to have very fast queries and then I don't care if it was this, all I care about is I got faster and that's what I was looking for.

Santona [00:42:37] Exactly.

Joe [00:42:37] Yeah. What was that? I think in Bob Martin's Clean Architecture, I think he was talking about interface separation and concerns or something like that. I can't remember which principle it was in SOLID, maybe it's open-closed, but either way, whatever. But the whole point is basically any higher-level abstractions don't need to care about the lower-level components and so forth. And I think that's very much where we've always been going and will continue to go. So with storage, it's an interesting one because I feel like that's been the crux, getting your data out. Google did an interesting thing two weeks ago, I think it was, where they got rid of the egress fees if you want to move your data out of Google, which I think was pretty dope. So hopefully egress fees go the way of cell phone minutes at some point. I think they're stupid, but there's a lot of money to be made on that. So you're going to keep doing that as long as you can until somebody throws a wrench into the whole prisoner's dilemma problem and everyone has to go do it. But back to the interoperability issue, it's a very interesting one because I feel like the world is going to move away from... Hopefully, one scenario is that this actually doesn't work out at all and that open table formats are a fad and go away or just don't get the traction they need. On the other hand, the other alternative is that they get a lot of traction and then companies are forced to compete on other stuff besides storage, which becomes, and always has been, really commoditized. There's obviously certain factors where storage, where you want that to be really awesome to work with for retrieval, depending on what you need to do. But that also depends on the query engine because if you study your databases, that's kind of how that works. Yeah. But then there's a question of whose optimizer is going to give you the best performance for the data that's in that open table format? And I think that's maybe an interesting question. Because now, does it become a matter of database optimizers being better? I don't know. But it is an interesting thought experiment. I think that might be more ongoing though.

Tim [00:44:46] It's hard to tell which of these trends are going to overtake each other because you think about what different vendors like Snowflake and the Databricks and the Microsofts of the world are doing around, just their data warehouses and improving query optimization and things like that, virtualization tools, open frameworks versus AI and what's happening with LLMs and it's hard to tell...

Joe [00:45:19] There's nothing happening there.

Tim [00:45:20] Right? It's a very weird moment in time right now I feel like for data people.

Joe [00:45:26] It's so confusing.

Tim [00:45:28] Because there's interesting things happening with open source. There's interesting things happening in the cloud and with proprietary technology, but ultimately we're in this weird moment right now where things feel relatively stable with the technologies and the approaches that we're used to. And yet there's this big wave here that we see coming in and we're like, what is this going to do? What's this going to change for us? Is this a distraction? Are we going to be really output oriented for the next few years here and figure out that we were spending a lot of time on something stupid? Or are we at the beginning of a productivity revolution as data people?

Joe [00:46:05] I think the answer is always yes to all of everything you just said.

Tim [00:46:10] It's great and it's stupid.

Joe [00:46:11] How many hype cycles have you been through? It's always the same. Of course it's going to have a lot of ridiculousness and a lot of outputs, but at the end of the day, there's a lot of value to large language models and generative AI. I use it all the time. I use it as somebody to talk to because I'm lonely.

Santona [00:46:31] Oh my God.

Joe [00:46:32] Yeah, I have nothing else going on in life.

Tim [00:46:34] "Your name is Bob."

Joe [00:46:37] Or Dash. But it is an interesting one, but I find that it captures a lot of blind spots in my thinking that I wasn't aware of because it's read a lot more books than I have, for one. And so that's...

Santona [00:46:50] It's better than a rubber duck. Yeah.

Joe [00:46:52] Yeah. I have four of them on my desk. In fact, I posted a picture the other day on my Instagram where it was like...

Santona [00:46:56] Four ChatGPTs.

Juan [00:46:57] Yeah, I was going to say, four...

Joe [00:47:00] That's like Warren Buffett and Charlie Munger, I got these rubber duckies of them back at the...

Juan [00:47:04] No, I thought you said you had four different LLMs that you've talked to.

Joe [00:47:08] Well, no, I do that too actually. I use Anthropic and I subscribe to OpenAI and then I also have Bard. It's interesting comparing the results of all of them.

Juan [00:47:19] Perplexity is the other one.

Joe [00:47:20] Oh, I have that on my phone. It's helpful, Perplexity. I use it all the time. It's dope.

Tim [00:47:25] It's fun.

Joe [00:47:26] I don't know. Let me ask you this, to bring it back to open table formats. Sorry, but we both know, so you work at Upsolver, why is Upsolver interested in open table formats? I'm just curious. What's the fascination?

Santona [00:47:41] Yeah. That's a good question. So we've always, maybe not always, but our product is a replacement for Spark. We have query execution, we have query planning, everything's built in and it's always been like the ETL, the transformation comes together with what we call SQLake, our data lake, really a lakehouse structure. And it's been a really strong value prop for our customers. But I think with the advancements, with the creation of these other table formats that are open in the lakehouse, we want to basically separate the transformations from the file formats and table formats, like the storage aspects. So again, this is what we were talking about a little while ago. We don't have to have the two combined because you can be modular, you can have this metadata layer and metadata management layer that is just about organizing your data and you just pick up that box and you move it here. And then one can also reach the things that, because we all speak the same language, so the standardization that comes with this open metadata management is worth more than...

Juan [00:49:02] But wait, but isn't the standard here already something like SQL and that's it?

Santona [00:49:06] Well, there's no planning in SQL.

Juan [00:49:12] Well, so this is where I talk about the pendulum swings, right? We're reinventing a lot of the wheels of just query optimizations that exist, that are very well studied in databases. And now, I think what happens is that we go to this cloud infrastructure and then we separate storage and compute and then we're like, "Well, we got to redo all the work that's been done over decades in query optimization", right? So then it's like this is... I don't know, the honest, no-BS for me is this is not an interesting problem because it conceptually has been solved. Now you got to go change, figure out how to adapt it to this thing and that thing. But conceptually, I don't think there's new optimizations, cost models or stuff that needs to be invented for this, but I would be happily proven wrong, please.

Santona [00:50:03] When you say "SQL", you're assuming data in a format, you're assuming a table essentially, because what you're talking about is an entity, a piece of information that means something as opposed to being raw lines in a file. So in order to get to that, that's what the promise is and that's what's harder in a data lake because...

Juan [00:50:31] Yeah, and I think this is tied to you is, oh, modeling or things like that. There's some things that effectively, there's models, there's semantics and stuff everywhere, but we just decide not to pay attention to it. So we're like, " Well, I want to avoid doing all this modeling things, so I'm going to have to go jump through all these hoops to do all these things so that I can optimize queries." But if you did invest in doing all this modeling stuff, you wouldn't have to go do this stuff because then...

Joe [00:51:00] I'm glad we have some spicy takes an hour into the podcast.

Juan [00:51:05] Wow, we're 50 minutes...

Joe [00:51:05] I'm going to let you guys fight it out for a second. I'm going to go grab another beer. Guys, you want another beer?

Santona [00:51:09] I'm good, thank you.

Joe [00:51:10] You good?

Juan [00:51:11] I got mine already handy.

Joe [00:51:12] You good, Tim?

Tim [00:51:13] I'm good for now. But I'll need another soon. I want to see what the next hour of this conversation is like.

Juan [00:51:21] Well, we can keep going. You got to keep on...

Tim [00:51:26] Well, eventually. But here, let's do one final question here before we do lightning round.

Santona [00:51:32] Okay.

Juan [00:51:32] So you don't want to...

Santona [00:51:33] Do you want me to answer one?

Juan [00:51:34] Yeah. No, she needs to answer it.

Tim [00:51:39] Oh.

Santona [00:51:40] I agree with you on the swinging of the pendulum metaphor, right? I think we do do that and I think there is a little bit of that happening. So in many ways the lakehouse is the pendulum swinging back towards the warehouse and the database. That's the first thing I'll say. I think of the lakehouse as a combination of lakes, databases and data warehouses. Because the other thing you get with these lakehouses, which is not just a table format, it's table format plus plus, right? With clever management and the query engine, is ACID...

Joe [00:52:08] Well, you get unstructured data as well.

Juan [00:52:13] I think that's what makes it harder, that's why you get different.

Joe [00:52:17] Yeah.

Santona [00:52:17] You go to what?

Joe [00:52:19] The lakehouse architecture too, it moves past the structured data world, or tabular data I should say. Right? When I think about data modeling, I published a post last week in my Practical Data Modeling Substack about how I'm thinking about data modeling and I feel like it's moved beyond the world of rows and columns at this point. Because the reality is people are combining different types of data together now. It's not just...

Santona [00:52:48] Yeah, exactly.

Joe [00:52:49] Yeah, that's what it is.

Santona [00:52:51] And one model doesn't serve everything, as you have new use cases and different use cases. That's another problem with what we think of, or I think, you can tell me if you think of it differently, but when I think of a data warehouse, it's often from the old days of an enterprise data warehouse, it's basically meant to be a model of your business. And so you go through and you figure out entities that are important to you and it's just so far abstracted from the actual data. Sometimes you have use cases that are somewhere in between.

Joe [00:53:21] Yeah, that's just it. But Bill Inmon, our good friend, wish he was here, hope he has a good recovery soon. But I think he's a big proponent of the lakehouse. God, he's written three, four books on it right now. But even in the '90s he had this realization that the amount of structured data in the world, that is tabular data, was really small compared to all the text data, and this is in the '90s. And it's funny, when you talk to him, he had the vision of the lakehouse back then because he's like, "Most corporate data sets even back then were text data, stuff people type, that's not in a table."

Santona [00:53:57] Right.

Joe [00:53:58] So that's why he worked in Textual ETL, still is working on it. That was genius. He laments to me and I'm sure to many other people he talks to, but it's like, he left that data warehouse world behind.

Juan [00:54:11] A long time ago.

Joe [00:54:11] A long time ago in the early '90s. He's just like, "That's great."

Tim [00:54:15] The world wasn't ready to move on. We wanted to swim in that for 50 years.

Joe [00:54:21] We're still swimming in it, dude. It's crazy. He'll still talk about it, the data warehouse. But he moved on mentally a long time ago. He was studying stuff like taxonomies, ontology, all this stuff that he was not familiar with at all. Stuff that you'd be awesome at, you'd be awesome at, but the realization is just objectively that most of the data is unstructured.

Juan [00:54:47] It goes back to, I think the approach to deal with this is to actually invest in the metadata and the semantics of what this stuff actually means. But I think what history has always shown us is that it's always been hard. It's expensive, it's manual. And I do genuinely believe, to bring back a little bit of inaudible is this is a godsend. Like, hey, this thing is actually going to help us to go...

Joe [00:55:16] Oh, I think it's going to help us a shit ton. Literally the idea of a transformer is to translate, literally that's why it was invented by Google was to translate languages and does a good job at it.

Tim [00:55:28] Yeah.

Joe [00:55:28] So yes, what happens when you have concepts that you're trying to translate between different things? It was interesting, I think Jeremy Stanley from Anomalo, we were talking on a podcast and he brought up the idea of relational embeddings, and I think that's an interesting concept too. Where you're applying the relational algebra concepts, but to embeddings, you'd have a vector database. That's a fascinating topic. I don't know what he's done with it since then, but it was pretty cool.

Juan [00:55:55] So to bring this back.

Joe [00:55:57] We're nerding out.

Juan [00:55:59] No, I do want to bring this back to you.

Joe [00:56:02] Look at the audience here. Look at who you have, this is...

Juan [00:56:04] It'll be silly to go through our inaudible.

Tim [00:56:07] You're nerds, right? We're all nerds, right?

Juan [00:56:07] I do want to take us to our lightning round question, but there's so much to take away. We were discussing, so what? Go back to this whole thing started with productivity. We're talking about outcomes. So what?

Tim [00:56:23] That's actually part of my question here is...

Juan [00:56:26] How are you going to justify...

Tim [00:56:28] It can be one table or it can be something else or open table. What's the better future we're trying to get to here? What are we trying to accomplish?

Juan [00:56:35] What is the outcome we want to achieve?

Tim [00:56:39] Is it as data people we feel like things are too hard and we want them to be easier? Is that the outcome we're looking for?

Santona [00:56:47] Faster. Stronger. Better. More optimized. Cheaper.

Joe [00:56:50] Cheaper.

Santona [00:56:51] Yeah.

Joe [00:56:51] I was just thinking of the iron triangle and which was...

Juan [00:56:57] Thinking about inaudible.

Joe [00:56:59] That's you. Yeah, that's it.

Santona [00:57:03] As Joe was saying, data takes various forms and you can't fit the same box to all kinds of data and that's why we have work. And again, I think of these things as all new because maybe it's a few years, but that's still new, right? Vector databases, graph databases, different data and search based, document based database and stuff. Different data deserves different treatment because the storage can be optimized and the query paths can be optimized in different ways and I think we should embrace that. I don't think it loses anything, but if you want to be grumpy and say, " So what?"

Juan [00:57:43] Well, this is a good point because nobody loses anything, but then you're probably leaving something on the table, so it ties back to innovation. We could be doing things that we are not doing, that we didn't know about, that push the barrier.

Santona [00:57:57] And there's always an opportunity.

Juan [00:57:59] Yeah.

Santona [00:57:59] But coming back to another... Sorry, I'll let you go. Coming back to another thing we were talking about earlier. If you're an analytics engineer, that's a well-defined thing. You don't have to use, I don't know, you don't have to use a vector database.

Joe [00:58:15] You could use CSVs.

Santona [00:58:15] Yeah, no one's forcing it upon...

Tim [00:58:20] I'm going to read that blog post. I'm an analytics engineer. I spend most of my time on CSVs.

Joe [00:58:27] That's just reality. You're saying out loud...

Juan [00:58:32] What everybody's doing, right? It's Excel.

Joe [00:58:35] It's like saying I use one ply toilet paper at my house because that's all I can afford.

Tim [00:58:39] You're saying it's not something you say out loud?

Joe [00:58:45] Yeah, thrifty.

Santona [00:58:48] We've established that Joe doesn't have any filters.

Joe [00:58:51] Truly don't. It's an interesting one, but I think it allows people the opportunity to do things easier. So I would say all this stuff is an enabler and so in that sense there's not much downside. It enables you to do better things. But it is what I've been writing about for a long time and screaming at the sky about is that we have all these wonderful tools. But I think the big crux, at least to me is that we don't have either the knowledge and the skills to use them to the fullest potential to execute on the outcomes that are desirable to a business. And part of that I think is like... And you'll hear about this at Data Day Texas, it is about how do you get a practitioner to focus on outcomes that matter? Again, I don't believe that we have, at least to solve classical analytics problems, especially, which we're still struggling with for some reason. Again, back to the 10 years of wasted time, which I do believe we did. Why is it we're still talking about the same stuff? Why is it that most companies are still struggling with BI, let alone doing AI? Why is it that most people are using CSVs for stuff, right? I had Dave Langer on my show the other day about Excel, why is it that Excel is still by far the most widely used data stack in the world? Now it has Python, that is a data stack. Everyone uses it, not Python, but they use Excel.

Juan [01:00:30] The achievement outcome.

Joe [01:00:31] This is all they that matters and Excel, so I would say that the world...

Tim [01:00:37] And needed a fancy calculator.

Joe [01:00:37] The world, if you divided it into basically a bar chart or pie chart, however you want to look at it in Excel, or donut chart, if you want to do that, the whole point is all the Excel out there, that's a missed opportunity that "data tooling companies" should focus on, because that means that's a workaround to answering a question and getting an outcome that tooling and the practitioners in "our data space" haven't been able to provide to those practitioners. That's one thesis I have.

Tim [01:01:06] I think that's really interesting because for all the amount of time that going back to output versus outcomes that we spend as data people on building out data infrastructure and things like that, the sales team is still crunching their reports in CSVs and with Excel.

Joe [01:01:26] That's the sad reality of it. I just want to go, I don't know, go work on a ranch or something at some point.

Juan [01:01:31] We're watching the next pendulum swing over here and then let's see what's going to happen.

Joe [01:01:38] Yeah, it's an interesting one, but at the same time there's a lot of innovation, but I think what's going to be harder is if you're a legacy or mature company that hasn't embraced data or digital stuff. It's easy if you start from scratch, it's like this is day one stuff. And after seeing all the cool blog posts you see from the companies that had to do this because they're digital or data native...

Juan [01:02:01] And I think the stuff we're talking about, data formats, comes from companies who are very digital first and stuff. These are the things that they're heading into. But then you see other companies who aren't and they're trying to get updated and they fall into that and they think that that's the coolest thing. Try to explain that to your executives who are not very digital first.

Joe [01:02:24] Oh. No, I was just talking to a friend of mine, she'll be speaking at Data Day Texas and she's one of the most, I would say, accomplished data executives in the world and she was lamenting that, yeah, it's one of those things where if you're a mature company, a lot of it's driven by gut feel still, right? And it's like it's all up here. What do I need this stuff for?

Juan [01:02:44] All right, well let's head into our lightning round because we still got more to go. We've got four more questions.

Joe [01:02:50] This is a long podcast for you.

Juan [01:02:51] Well this is officially the longest podcast now that we've done.

Joe [01:02:53] We can keep going if you like.

Juan [01:02:56] Because I know this is going to take a while here too. And actually I have to go back to do some stuff quickly, so hold on. All right, lightning round questions. Number one, so should everyone be thinking about outcomes or is it okay for some people to focus on outputs?

Santona [01:03:10] I think Joe brought up a good point. There is a lot of work that is just in some sense rote and that's okay. And you don't always have to get fulfillment from your work. Ideally... Most of us don't get fulfillment just from our work and so I think it's fine if, yeah, go work at a ranch and maybe that's perfect and that's what you get fulfillment out of and on the side you're reading Kafka, but real Kafka. Not...

Joe [01:03:41] Yeah, real authors.

Santona [01:03:44] Is Joe answering the same questions?

Juan [01:03:47] I'm writing this right. Read Kafka, but the real Kafka.

Joe [01:03:51] Franz Kafka.

Tim [01:03:52] Franz Kafka. Metamorphosis is a good book.

Juan [01:03:54] Oh, man. There are so many good T-shirts here.

Tim [01:03:57] All right, second question. Instead of yes/no, it's actually going to be a quick answer question. So what is the biggest underappreciated performance hack for data people?

Juan [01:04:13] Not having dogs. Aren't going on walks.

Santona [01:04:16] Biggest underappreciated performance hacks for data people. A quick visualization, I don't want to say "tool", so for me if it's tabulated, loaded into a Pandas dataframe and then column dot hist, that's it. I just want to see the data and I don't want to have to pull it into some fancy BI tool or something. Excel is not for me. It's hard for me, so I don't want to do it there so now spin up a Jupyter notebook. Do a dot hist.
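
For readers who want to try the "column dot hist" habit Santona describes, here is a minimal sketch. It assumes pandas and matplotlib are installed; the `quick_hist` helper, the `latency_ms` column name, and the sample values are hypothetical and not from the conversation.

```python
import pandas as pd


def quick_hist(df: pd.DataFrame, column: str, bins: int = 10):
    """Plot a histogram of one column and return the matplotlib Axes."""
    # Series.hist draws with matplotlib and returns the Axes it drew on.
    ax = df[column].hist(bins=bins)
    ax.set_title(column)
    return ax


# Hypothetical sample data standing in for whatever table was just loaded.
df = pd.DataFrame({"latency_ms": [12, 15, 11, 40, 13, 14, 95, 12, 16, 13]})
ax = quick_hist(df, "latency_ms", bins=5)
```

In a Jupyter notebook the chart typically renders inline as soon as the cell runs, which is the point of the hack: a look at the shape of the data without loading it into a BI tool.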

Tim [01:04:46] Quick viz in a notebook.

Joe [01:04:49] Yeah. I would say listening. It's a good hack.

Santona [01:04:52] Never heard of it.

Joe [01:04:58] What'd you say?

Tim [01:04:58] Listening.

Joe [01:05:05] I see what you're doing with that, but no. Yeah, and I'm actually going through this in a course I'm making right now, but one of the big things I'm trying to emphasize is real world skills. So going and talking to stakeholders and asking them, "Okay, so what is it you're trying to do? And diving into why are you trying to do it?"

Juan [01:05:28] Oh my gosh, this is the best. The first thing is you're trying to... First, what is the problem you're trying to go solve? What is the question you're trying to ask? Who's asking that question and why are they asking that question?

Joe [01:05:39] Yep. What does 'good' look like? What does the end... So it was funny, I was actually meeting with a client today and he's like, "So what do you want to ask me?" I was like, "I got two questions for you. What do you think is going well? What do you think could go better?" Right? That's it. That's it. And by asking open-ended questions like that and listening to them, open-ended question is meaning you don't get to a yes or no type answer either. It's like, "Hey, do you like cheese?" Yes. No.

Juan [01:06:10] This is what you should be very Socratic about.

Joe [01:06:12] Yeah, so I think that's an underrated skill, but...

Juan [01:06:18] All right, the next one. So I'm going to follow instead of a yes or no here. So we've talked a lot about the open table formats and stuff. There's a lot to unpack. What's the best recommendation for people to get up to speed quickly on this?

Santona [01:06:29] There's a good amount of blog posts out there that just...

Juan [01:06:34] Anyone in particular or people, because again, the issue is that there's a lot out there too.

Santona [01:06:38] Yeah, this is true. So Ryan Blue of Tabular, he was part of the team that built Iceberg in the first place out of Netflix. I've read a lot of his material around Iceberg. That's helpful. We are trying to, so if you go to the Upsolver blog, the last three or four have been around Iceberg as well. We're trying to really put out content that's just meant to educate on how it works. And same exists for, I don't want to leave out anyone, same exists for Hudi and stuff.

Tim [01:07:08] But no, it's dope. I got a question for you. Was CSV the first open table format?

Santona [01:07:15] No.

Tim [01:07:15] Okay.

Santona [01:07:15] Okay. Yes. I'll give you that. 100%.

Tim [01:07:22] I was like, "Huh." Okay. Anyway. All right.

Juan [01:07:29] Take it away.

Tim [01:07:30] Last question. Both of you might actually be interested in this one.

Joe [01:07:33] Probably not.

Tim [01:07:38] Imagine different data roles and this whole spectrum of output versus outcomes, are data engineers the role that is the most out of whack on this spectrum?

Santona [01:07:51] Okay. I am going to contradict myself from something I said earlier in the podcast and I've always had this contradiction within me. I like to do everything. I like each part of the stack, the pipeline, the process, whatever. So regardless of title or whatever, I really enjoy doing the data engineering and the modeling and all of that. So I'm not going to diss on any one role being out of whack or anything. I think that I see a lot of memes around data engineers holding back a train that's coming and the data scientist is helpless, I think that's way overblown.

Juan [01:08:30] Yes. Thank you for calling that out. Here's this big ship and I'm this little thing trying to...

Tim [01:08:36] It's like data engineer, hero worship kind of stuff.

Santona [01:08:38] Exactly. I think where it comes from is clearly you can't work with data if you don't know how to get it right. And that's something that companies and teams get stuck on. They want the data function, but it's like where do I begin and what are the tools? And so in that sense, yes, you need to be able to bring that in, but that doesn't have to be like this. Yeah.

Joe [01:08:59] So what do you mean by "out of whack"? Are they whack? Are you saying...

Tim [01:09:06] Let's put it this way. So imagine there's time in the day and you're spending 40 hours a week working. Are data engineers spending more time than other data roles on stuff that's not really making an impact?

Joe [01:09:24] I could say that it's probably the case for a lot of data roles. So I wrote an article about this called Data LARPs. So LARPs are live action role playing. In the summer, you may go through your local city park and see people dress up as knights, playing swords and that kind of stuff. And I'm actually writing about this right now, this will be an article that's on my Substack hopefully tomorrow or whatever. Just need to publish it, but it's a notion of a Potemkin data team. So Potemkin was somebody in Russia who was dating Catherine the Great I believe, or Catherine the Dead right now. But Potemkin, to impress her with how grand her land was in Russia, he would actually make fake towns and he would carry her through in a carriage. And there's people working, but behind it, it is a complete facade. There's nothing there, but this is a lot of data teams I would say in general. In respect of data in general, I'll get into the specifics of your nuance of the question that you just asked. But I feel like a lot of data teams in general are LARPing and/or in a Potemkin situation where they're actually doing a lot of what you described earlier, we're moving boxes back and forth and counting them and all this empty stuff and the outcomes aren't there. So I feel like data engineers...

Tim [01:10:56] It's less about the role and more about the situation the team is in.

Joe [01:11:00] Well, if you zoom out and understand where the entirety of a data team is, it wouldn't really matter what role you're in if the entire objective of the team is to do useless stuff or useful stuff. So I feel like the notion that it would be specific to data engineers, we talked about abstractions as well in the discussion. Juan brought up computer science abstractions and really, you can apply the same principles of SOLID actually to data teams in the sense where it should be a single, is it single responsibility principle? Something like that. But these same principles actually apply to teams in the sense where you should be working on things that matter. You should have stakeholders that are directly applicable to outcomes and often this doesn't happen. And if you zoom out even further, all this happens because of Conway's law. Conway's law basically says you'll design systems and architectures that mimic how you communicate as a company. And so the inescapable law of the universe is as long as your company is dysfunctional, nothing will save you or the team because it's a transitive property at that point. So yeah. Anyway, that's my answer.

Tim [01:12:17] I like that. That makes sense.

Joe [01:12:21] Kind of, I don't know what I just said, like Rain Man at a station, it's like...

Tim [01:12:27] I'll make one quick little hot take and then we can move into takeaways, which is that I feel like the answer to the question that I asked is yes, but it's also not fair because data engineers, I feel like the time to get to the next iteration is a lot slower than a data scientist and then even slower still than a lot of data analysts because I think there's a very fast turnaround time to, "I need your report on..." It's like, boom here's your report.

Joe [01:13:01] Interesting. Why would those cadences be separate?

Santona [01:13:05] Yeah, exactly. Exactly. And this is...

Tim [01:13:07] I feel like they often are though and maybe they shouldn't be.

Joe [01:13:09] They are.

Santona [01:13:09] They shouldn't be.

Joe [01:13:10] But they are, but that speaks to the nature of people probably working around bottlenecks of the data engineer.

Tim [01:13:15] Yeah, absolutely.

Joe [01:13:16] Yeah.

Tim [01:13:17] Yeah. And that bottleneck is especially problematic if they're working on the wrong thing because then the tent pole gets even longer.

Juan [01:13:29] Understanding the outcomes.

Joe [01:13:30] Don't talk to me like that. No, I'm just kidding.

Juan [01:13:34] All right. All right, Tim. All right, so takeaways. How are we going to do this?

Tim [01:13:39] All right, how much time do we have? I think we can do this in 30 minutes, right? All right, takeaways.

Joe [01:13:44] 30 days.

Tim [01:13:46] All right, so we started off with the honest, no-BS question of what do you mean by productivity is not performance and you really focused on... Yes, that is where we started. And we started with there's a big difference between outcomes and outputs and you can do a lot without doing a lot, which I think is probably the biggest takeaway that we can think about as we go through all of this: there's a lot of work you could do without actually making an impact. And so for example, a very complex infrastructure to solve a simple problem. We talked about roles and the distinction between companies and data people at companies where maybe data is the core of what that company does versus maybe more descriptive analytics and things like that in support of the core business being something else. And the way that data works there and what output and what outcomes is going to mean is going to be different. And so think about which organization are you in, are you more the data is the core or data is supporting? We also talked about data analysts. They aren't just some awkward person behind Excel.

Joe [01:14:57] Speak for yourself.

Tim [01:14:59] Data analysts, we love you. But that analytics engineer is actually maybe a better title because it talks more about the nature of the work that these people are doing to try to make an impact. And so I thought that was interesting. We spent a good chunk of time talking about innovation too, and everyone should be innovating, but it adds a little bit of complexity to the whole outputs and outcomes conversation. Data teams have the opportunity to do the innovation, but you got to know what you're doing, remember your outcomes. Innovation can be a byproduct of outcome driven work. I think that's important, that innovation isn't a distraction if done right. It can actually be in the service of the outcomes that you're trying to drive as an organization. When there's a need that surfaces everywhere, a bunch of people jump on the idea, but there often isn't enough scrutiny around the actual need itself and if it's actually the important thing to focus on. So we spent a good chunk of time for those who are listening and if you're just listening to the takeaways episode, you want to listen to an interesting segment. We talked a lot about Snowflake and BI tools and the state of the space and the dynamics there. So that's a very interesting segment you should check out. Juan, what about you? What are your big takeaways?

Juan [01:16:14] I think this is a T-shirt one, if you replace DAGs with children, how do you feel?

Joe [01:16:19] AI models...

Juan [01:16:22] Yeah.

Joe [01:16:22] Models. That's... Nevermind.

Juan [01:16:27] No, because we were talking about this and I think, you have hundreds and hundreds of dbt models of all these things you've done. Is that impressive or not, and so forth. But again, I bring this up because you tie it back to the output versus the outcomes, which I think this is the theme. Throwing outcomes with outputs, it's always about this. So we also talked a lot about how do we focus on the principles behind the data architecture for better outcomes. And one of them is like, hey, maybe it's like the principles of business value is at the center of everything. And we really dove into a lot of the whole open table formats. So talking about from data warehouses to data lakes and really how it's a specification of how files and information need to be organized so query planning can be done ahead of time. And you have this, what you call this queryable index. So it's like that smart layer that preserves the optimization. But it's really interesting how we got into this really technical discussion. But if we zoom out again, it's like well, outputs versus outcomes around this stuff. Is this a fad or not? Maybe no one cares about this stuff, but maybe it is a competitive advantage. Or it is a competitive advantage, but to what point? Because at the end of the day, people just want a better query optimizer. So how is that going to go? Are people going to care about that? The outcome here, I asked you is like, "What is the outcome?" You said, "Faster, stronger, better and cheaper." And you know what? That's actually something that, as simple as it is, but that's actually so true around that. And the final thing is LLMs, I use it all the time for someone to talk to. All right, how did we do?

Santona [01:17:57] And read Kafka but the real one.

Juan [01:17:58] Oh. That's another one that's in there.

Joe [01:17:59] Yeah.

Juan [01:18:02] All right.

Joe [01:18:03] Go team.

Juan [01:18:05] Take it away. Three final questions. What's your advice? Who should we invite next? What resources do you follow?

Santona [01:18:13] Be yourself, do the thing that you enjoy most. Be curious. Yeah, scrutinize the ask. Don't get lost in trying to implement what's being asked first. The first thing to do is figure out if the person who's asking knows what they're asking for and why they think they need the things that they're asking for. Who should you invite? I'm sure you've had every awesome person on here.

Joe [01:18:40] Actually I got one for you.

Santona [01:18:41] Yeah.

Joe [01:18:43] John Giles.

Tim [01:18:44] John Giles. All right.

Joe [01:18:46] From Australia. G-I...

Tim [01:18:48] Australia. All right, cool. That's a good one.

Santona [01:18:53] I was going to say Hala Nelson. I'm excited...

Joe [01:18:56] Hala would be awesome.

Santona [01:18:57] Yeah.

Joe [01:18:58] You could probably do an in-person. She's going to be here, right?

Santona [01:18:59] Yeah, she's going to be here for Data Day Texas.

Joe [01:19:02] She's amazing.

Tim [01:19:03] And then what about resources to follow?

Santona [01:19:08] Nowadays I just read as I am searching something, like I go down rabbit holes and then read everything on a topic.

Joe [01:19:16] You seem like the kind of person.

Santona [01:19:19] So it's less about, I have these five subscriptions that I'll go through and read. I don't do that. It's more like where it takes me. So yeah, I don't know. So Joe's standing next to me, so read Joe's book.

Joe [01:19:33] And my next book.

Tim [01:19:34] I heard you have a pretty good book.

Joe [01:19:34] It's okay. I don't know.

Tim [01:19:35] Some people read it and stuff.

Joe [01:19:38] Yeah, I already know what I want to read by us in it, but there's Practical Data Modeling, that Substack, I'll give a shameless plug for the new book I'm working on. A lot of it's on there already or not on there already, but it will be on there. Other resources? I always read The Information. It's a paid site, but it's got, I think, probably the best tech journalism there is. So it's worth checking out and read books. Books are awesome.

Santona [01:20:08] Yeah. Actually I do want to add...

Joe [01:20:10] Let me ask you, what books are you into right now?

Santona [01:20:13] Right now I've been slow reading a book on basically the history of the Ottoman Empire.

Joe [01:20:19] Oh, really? That's cool. Why would you read that?

Santona [01:20:25] It had a really cool graphic on it when I was...

Tim [01:20:29] Is this a newer book?

Santona [01:20:31] No, it's an older book. It's a physical book that I was browsing through a bookstore and I was very attracted by how it looked and then I opened it and, "Oh, yeah, I don't know that much about Ottoman histories."

Joe [01:20:43] Isn't it a bunch of horses with people on top of the horses going, is it that one?

Santona [01:20:49] That is the very, very SparkNotes version.

Joe [01:20:53] Because I think I have the same book at my house. I'm not sure.

Santona [01:20:57] Oh, the graphic you mean. I thought you were describing the Ottomans.

Joe [01:21:00] That was not the Ottomans, that could have been any empire until recently talking until the 1600s or 1700s. No, that was not what I was referring to.

Tim [01:21:10] If you both have this book then it must be pretty important. It might be something I got to check out too.

Joe [01:21:15] I can't remember. I'll have to go back to my library and look, I got 1,000 books at home and then probably another 2,000 on Kindle, but I don't know, I read a lot. That's all I do, I'm a dork.

Tim [01:21:26] Reading's good. And I think it's nice to...

Joe [01:21:28] It's terrible for you. Don't read, don't. You shouldn't read, you'll waste all your time. You don't get anything done.

Tim [01:21:33] Everybody needs to read. And it's actually good to read stuff that's not always the same. I find myself sometimes reading too many data books and business books and sometimes it's like, all right, that's enough. Let's go read about the Ottoman Empire. It's different.

Santona [01:21:49] Exactly.

Joe [01:21:49] That's a good business book though.

Santona [01:21:52] Yes. How to Run an Efficient Empire. How To Make The People That You're Ruling Over Not Hate You. That's basically the story of the Ottomans.

Joe [01:22:02] It's a business book.

Santona [01:22:03] It's a good lesson. Yeah.

Tim [01:22:04] That's it for at least a few hundred years at least. Right? Several hundred years.

Santona [01:22:08] Yeah. More successfully than anything before or after, pretty much.

Tim [01:22:12] That's true.

Santona [01:22:13] One thing I've been dying to say is, if folks are listening to this conversation who are earlier in or newer to data science and looking for actually what to read and what skills to pick up on, I try to say this, statistics is still an important part of data work and it's really hard. Find a good...

Joe [01:22:32] What do you mean? That was a pun.

Santona [01:22:32] Oh, if you have to explain your pun...

Joe [01:22:41] It'll make sense in a second.

Santona [01:22:45] Yeah. And it's not really easy to find good literature on statistics.

Joe [01:22:51] Oh, it's bad.

Santona [01:22:51] Yeah. Especially in data science books, if you know what I mean. Stuff that's come out as the hype over data careers went out, so I learned my statistics from physics books like Possible Dynamics or something like that. I'm not advising that, but yeah, find a good solid...

Joe [01:23:12] Stats, I would say that's just a good rubric for how to think about the world too, that, and probability. If any of you...

Santona [01:23:20] Two sides of the same coin.

Joe [01:23:21] Two sides.

Tim [01:23:22] Do you guys know of a...

Santona [01:23:23] It's the same point.

Joe [01:23:24] There's a pun there, and I was like, okay, you set that up, that's great. Go with it.

Tim [01:23:28] Do you guys know of a good stats book that you read and you're like, "Wow, that was..."

Joe [01:23:32] So the one I cut my teeth on was Sheldon Ross's books, but these are pretty old but so am I. What he had was intro probability then probability models or something like that and those are good and then stats. There's a lot of stats books out there. I can't remember which one I cut my teeth on. I have a lot of them, but they're mathy. I don't think you should learn the mathy stuff. You got to realize a lot of this is like you're not using calculators or spreadsheets, you're just doing proofs, which is different than prove the central limit theorem and it's like, cool, I'll do that. That's not how to learn stats. At least for most people.

Tim [01:24:15] You'd recommend a more practical approach.

Santona [01:24:18] Matt Humphrey and I were chatting about this a year ago, there is an open need I think for a good statistics for business book.

Joe [01:24:26] Yeah. Josh Starmer, I think he's got some of the best YouTube videos. Josh was, he worked in biology or biosciences...

Tim [01:24:34] Did you say Josh Stormer?

Joe [01:24:35] Starmer, Starmer, not Stormer, with an A. But he's got millions of YouTube subscribers to his videos. He does this thing called StatQuest, I think that's pretty awesome too. He started by making videos for the nine people in his lab and put them on YouTube. No, he started off doing labs in person and he is like, "I'm going to put this on YouTube", for nine people and then his videos took off.

Tim [01:25:05] Yeah.

Joe [01:25:06] Because he's a really good instructor. Josh, he's just a gentle human being, great guy and a great teacher. And then I would say look at YouTube too. Books I think actually are terrible to learn from. Khan Academy actually, I think has a really good elementary stats.

Tim [01:25:23] That's true. Yeah.

Joe [01:25:24] I have my kids doing Khan Academy all the time. I would start there. I actually would avoid reading books for stats. I think to your point, most of it is pretty crap and the other part is in math in general, as you know, you want to keep doing problems and solving. That's how you get the intuition of how to solve problems. In a book, it's hard because you just be like, "Okay, I'll go with the extra items, you give me half the answers of the odd problems in the back." It's dumb. I have to go buy the study guide for the rest of it, which is another $1,000 or whatever it costs these days.

Santona [01:25:55] And if you don't have the structure of actually going to class like you do in college, you're not going to...

Joe [01:26:00] No. Dude, it sucks. Khan Academy is dope because they gamify it. My kids, I have them do Khan Academy every day because I'm like...

Tim [01:26:06] Badges and completions and stuff like that.

Santona [01:26:08] Do we need a disclaimer?

Joe [01:26:10] No, I don't work for them. The kids, there's a dopamine effect with it where it's like they want to keep going. One of my younger kids, he's good, but he's getting better.

Tim [01:26:27] I thought you were going to say something more positive there.

Joe [01:26:29] Well, he knows, but when he does Khan Academy, he's stoked. He just wants to keep going, like, "Can we go do something else now? Because I want to keep going." And that's how you learn.

Santona [01:26:38] Exactly.

Joe [01:26:38] Right? Just keep pushing.

Tim [01:26:43] This resonates with me a lot because actually when we started talking about stats books, I can't think of really great stats books and the only thing I can always come back to is I had an amazing high school statistics AP teacher. He just was fantastic. He made it fun. He made statistics fun.

Santona [01:27:00] Same with me and physics.

Tim [01:27:01] That probably put me on my data trajectory, and so maybe stats is just one of those things where, yeah, you can read a book and there's books out there that I'm sure that are good, but if you can find a course, a video, something like that, then it's going to be more fun and easier to learn.

Joe [01:27:15] Well, math is poorly taught, dude. Okay. When you were in elementary school, did you feel like you got a really good math education?

Santona [01:27:24] So I went to school in Bangladesh and yes, I did.

Joe [01:27:26] Okay.

Santona [01:27:28] No. Yeah.

Joe [01:27:31] There is...

Santona [01:27:31] Just an anecdote. Right? When I came here for college, it became very evident to me, the education that I had received and the way that it was taught to me was so different.

Joe [01:27:39] So different.

Santona [01:27:40] Sorry, go ahead.

Joe [01:27:40] No, the US sucks. I'm just going to say it, the way we teach math, common core, I'm even more confused because I help my kids with their homework and I have no idea how this way of teaching it makes any sense, but it is what teachers... One teacher, we went to parent-teacher conferences and reviews and I felt like I was at a therapy session because the teacher's like, "I just can't teach the way I want to teach." I'm like, "Yeah, it sucks. I know."

Tim [01:28:07] Yeah. That's tough right now.

Joe [01:28:09] It is. So you got to take advantage, there's a lot of good resources online. Khan's awesome. Sal Khan, I think deserves a Nobel Prize for everything he's done.

Tim [01:28:19] And you still don't work for him, right?

Joe [01:28:21] I don't work for him. I literally don't. No, I'm just saying...

Tim [01:28:25] Actually they have really good resources. Yeah.

Joe [01:28:27] Yeah.

Tim [01:28:27] Yeah. Well, how did we do guys? Was this good?

Joe [01:28:31] I don't know. What was the outputs of the outcomes?

Tim [01:28:33] I don't know. I think we just produced a lot of really good tidbits.

Santona [01:28:38] Sure did.

Tim [01:28:39] Santona, Joe.

Joe [01:28:41] Thank you.

Tim [01:28:42] This was fun.

Santona [01:28:43] Thank you.

Tim [01:28:43] Thanks. Appreciate you all joining.

Joe [01:28:45] [inaudible] here too. I don't know where he went, but...

Tim [01:28:48] I think he had to duck out for a second here.

Joe [01:28:50] Yeah. Cheers.

Santona [01:28:52] Cheers.

Tim [01:28:53] Cheers, Joe. Glad to have you here.

Joe [01:28:54] My empty beer can, yeah.

Tim [01:28:56] Cheers, everyone. Appreciate you joining us today. Hey, next week, Jessica Talisman is going to join us from Without Information Architecture, and thank you to data.world that lets us do this. We get to drink and hang out and talk data and then go to Data Day Texas.

Joe [01:29:14] Yeah, see you at Data Day Texas, it's going to be awesome.

Tim [01:29:16] Yep.

Joe [01:29:16] Yeah.

Tim [01:29:17] Cheers y'all.

Joe [01:29:17] Thank you.

Santona [01:29:18] Bye.
