About this episode

We all strive to be data-driven. And yet we all instinctively know that we’re not very good at it. In fact, if you believe any number of recent surveys, it seems we may actually be getting worse at data. One reason for this is a misalignment on what success looks like. How do you define and measure it? What is the actual value of your data?

This week, Tim and Juan join Lars Albertsson, founder of Scling, to talk about data productivity and balancing a technology vs product-driven approach to data.

Special Guests:

Lars Albertsson

Founder, Scling

This episode features
  • Companies that have truly unlocked data value and how they did it
  • Common missteps along the data maturity journey
  • What have you been practicing… but not actually getting any better at?
Key takeaways
  • Empowered innovation is key to data value
  • Democratization happened by accident with a very flat structure
  • Immutability ended up being an important best practice to ensure repeatability

Episode Transcript

Tim Gasper:
It’s Wednesday and it’s time once again for Catalog and Cocktails, where we get to talk about enterprise data management in an honest, no-BS way. My name is Tim Gasper. I’m a longtime data nerd and product guy, joined by Juan Sequeda. Hey Juan.

Juan Sequeda:
Hey Tim, how are you doing? I’m Juan Sequeda, the principal scientist at data.world. I’m live from almost New York, from New York at the airport. And if you can hear a lot of noise it’s because this is live. When we talk about no-BS, we’re doing this live and there’s a bunch of announcements going on right now. So how bad does it actually sound right now?

Tim Gasper:
It actually sounds all right. It does sound all right.

Juan Sequeda:
So, as everybody’s realizing, I actually travel around the world, wherever I’m going, with my microphone and make this real. So last week I was in Amsterdam and literally I got off a plane coming in from London and I’m now in New York. And always excited too, it’s Wednesday, middle of the week. Let’s go have this conversation, no-BS [crosstalk 00:01:00].

Tim Gasper:
We don’t miss it, right? We don’t miss it.

Juan Sequeda:
No, no, no. And today I’m super excited, as always on our Wednesdays. Our guest today is Lars Albertsson. He is the founder of Scling, you got to get this right, Scling.

Lars Albertsson:
Scling.

Juan Sequeda:
Scling, all right. Well, here’s the thing. Lars is somebody who’s been very well known in the data circles for a long time, right? Lars started working at companies like Google early, in the mid-2000s, 2007 I think, doing the first work on Google Hangouts. He was at Spotify early on, as a data engineer when data engineering was actually not a thing. So you’ve really been ahead of the curve around things. And if you go talk to people in the circle and you say, Lars, people know who Lars is. And I am super excited to have this honest, no-BS conversation about data and value, and how do we get value out of data and productivity and all that stuff. So, Lars, how are you today? How are you doing?

Lars Albertsson:
I’m doing good. It’s quite late here. It’s 11:00 PM in Sweden, so I’m slightly tired. I had a data strategy workshop for a management team today and that’s exhausting for me, particularly when it’s online. But I’m doing good.

Juan Sequeda:
Awesome. So hey, let’s kick it off. So what are we drinking and what are we toasting for? Lars, you kick it off.

Lars Albertsson:
Yeah. Well, since it’s late, I have a very big cup of coffee to keep me awake. That’s my typical drink of the day. Strong, Swedish coffee.

Juan Sequeda:
Anything in particular you want to toast for?

Lars Albertsson:
Well, I would like to toast for the fact that in two weeks we’ll be lifting all of the pandemic regulations and limitations that we’ve had here in Sweden for quite some time. The vaccination rate of the population is now fairly high. So we’re hoping this will work out, so I think that’s worth a toast.

Tim Gasper:
Definitely.

Juan Sequeda:
Definitely. How about you, Tim?

Tim Gasper:
I am drinking a mezcal margarita. I bought some mezcal a few weeks back and I’m trying to find creative ways to use it. And this tastes very delicious, I like this approach and I will also cheers to your restrictions starting to come off, Lars. I look forward to when things get a little bit more normal in the US. We like to play the yo-yo kind of thing here, it’s fun to jump back and forth, except not really.

Juan Sequeda:
Well, I’m drinking a Fernet and Coke. Fernet is something I don’t have at home, and it’s like the traditional drink in Argentina. And I’m toasting for two things. One is traveling. I mean, I’m back on the road, it was great. For two weeks I was in Amsterdam, Paris, London, Edinburgh. Met with a lot of people, a lot of friends. Met with a lot of listeners, actually, so shout out to Mark Kitson, whom I got to see in London. So definitely I’m excited about travel and I’m also super excited about our summit at data.world.

Juan Sequeda:
So on September 29th, in two weeks, it’s a free virtual event, we have awesome presenters and we have an awesome agenda. We have, for example, Zhamak Dehghani, Dean Allemang, Barr Moses, Doug Laney. We’re going to talk about data mesh, DataOps, data product management, data governance, knowledge graphs. So if you really like Catalog and Cocktails, you’re going to really like our summit. I think we keep it that honest, no-BS way too. But hey, let’s turn the salesy thing off, a hundred percent, and let’s kick off with the warm-up question we have today: what have you been practicing but not actually getting any better at, Lars?

Lars Albertsson:
Professionally, I keep practicing with Git every day. I use it on a regular basis and I can never understand it. It’s hopeless. I’m a complete newbie, I goof things up whenever I go out of the very narrow path I use, I just make a mess out of things. It’s a completely incomprehensible tool to me.

Juan Sequeda:
How about you, Tim?

Tim Gasper:
For me, also something a little technical here, learning JavaScript. I’ve been trying to learn it for probably six or seven years now. And I start to learn it a little bit, I practice it a little bit and then I get dizzy and then about six months later, I pick it up again and I do a little bit, and then I get dizzy. So I keep practicing it, but I’m treading water.

Lars Albertsson:
[crosstalk 00:05:40] how many times I’ve tried to learn front-end stuff, that’s a hopeless area for me as well.

Tim Gasper:
It takes concerted effort, right? I feel like I need to take a sabbatical where it’s like, “Tim goes and learns how to code sabbatical.”

Juan Sequeda:
Well, I’m going to do non-technical. For me, it’s in the gym. I’ve been trying to get better at my front squats and it’s still not there. I have to get my wrists up and all that, and I’m still struggling with that. So been practicing-

Tim Gasper:
Squats are hard in general, man.

Juan Sequeda:
I mean, back squats, I’m actually pretty good. It’s the front squats, it really changes a lot. Anyways, let’s kick it off now to our discussion of today. And so Lars, look, kick it off with an honest, no-BS question. So we always talk about data being the new oil, and it kind of seems obvious that the goal is to extract value out of data. But it seems that the entire data industry is just focused on technology aspects instead of the value. So honest, no-BS, why aren’t we able to extract value out of our data, or are we, or who is able to? Let’s kick it off with that.

Lars Albertsson:
Well, I think we are easily distracted by technology, and very rarely does the technology matter. In some cases it does, but in most cases, it doesn’t. So there’s a huge gap in the capability to extract value from data, and if you look at the data leading companies of the world, where I’ve had the fortune to work at a couple of them, they are decades ahead essentially in getting value out of the data, if you compare to the crowd and to the traditional enterprises. And that gap doesn’t seem to close, it just keeps being wide. First with big data and analytics and product development tools like A/B testing and so forth, and nowadays with machine learning.

Lars Albertsson:
And as I spent some time in some of the leading companies, I worked for Google, I worked for Spotify and that’s sort of where I saw how much value you can get. And then I’ve spent a number of years, various considerations helping non-leaders get value from their data. And it’s never about the technology, that’s never blocking them, right? It’s always the ways of working, the ways you organize, the collaboration patterns, the rituals. Many of the companies have tons of rituals that they cling to and they can get rid of.

Lars Albertsson:
And then I’ve seen, there’s one pattern that I’d like to, if you don’t mind, I’d like to take a little detour and describe. And if you zoom out here and look at all of the things that we humans do in terms of refining something from something being crude into something valuable, like cooking or forestry or whatever, in history, it always starts with a manual process, with like manual tools, with axes and saws or cooking over the fire and so forth. And then it progresses to a mechanized state, where you do the same things but you now have machines in your hand, right? You have a chainsaw, you have an electric stove or electric kitchen utensils and so forth.

Lars Albertsson:
And then from there, it goes to an industrial stage where you sort of do the same things, but now you’ve made it into an industry rather than just one chef cooking. You have McDonald’s, which is a whole machinery of food getting cooked. And although you might argue that it’s not as pleasant as the manual or mechanized way, you cannot argue with the efficiency, right? The industrial level always beats all the previous variations in terms of efficiency.

Lars Albertsson:
And you can see it in transportation, like you started with a horse cart and then you have lorries and now we have FedEx, where it’s like an industry. It’s a service rather than… except the lorries are like the same, but it’s an industry. You cannot beat IKEA at making furniture, for example. And on the industrial level, you work with a process, that’s where you improve, not with the actual craft, with the process. And with data we have moved on from the manual tools, the pocket calculator and so forth, or the Excel sheet.

Lars Albertsson:
And we’ve gone to the mechanized level where the data warehouse is the most powerful tool, but it’s still a tool in our hands that we control with our hands and steer, we run a query and so forth. Well, only a few companies have moved to the industrialized level now, where you work with a process instead, and all of the engineering effort is put into making the automated processes better. So you don’t assess your quality with a query, you assess your quality by building yet another pipeline that measures the quality and puts it out on graph for maps.
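Editor's sketch: the idea of measuring quality with "yet another pipeline" rather than an ad hoc query could look roughly like the Python below. This is only an illustration under stated assumptions. The `plays` dataset, the `user_id` field, and the `QualityReport` and `measure_quality` names are hypothetical and do not come from the conversation.

```python
from dataclasses import dataclass
from typing import Mapping, Sequence


@dataclass(frozen=True)
class QualityReport:
    """One immutable quality record per dataset per day (hypothetical shape)."""
    dataset: str
    date: str
    row_count: int
    missing_user_fraction: float


def measure_quality(dataset: str, date: str,
                    rows: Sequence[Mapping[str, object]]) -> QualityReport:
    """Pipeline step: turn one day's dataset into a quality-metrics record."""
    # Count rows where the (assumed) user_id field is absent or empty.
    missing = sum(1 for row in rows if not row.get("user_id"))
    fraction = missing / len(rows) if rows else 0.0
    return QualityReport(dataset, date, len(rows), fraction)


# Hypothetical daily partition of a "plays" dataset.
plays = [
    {"user_id": "u1", "track": "t1"},
    {"user_id": None, "track": "t2"},
    {"user_id": "u3", "track": "t3"},
    {"user_id": "u4", "track": "t4"},
]

# In a real setup a scheduler would run this step daily and a dashboard
# would plot the resulting records; here we just print one of them.
report = measure_quality("plays", "2021-09-15", plays)
print(report)
```

The point of the sketch is that the quality measurement is itself a dataset produced by an automated step, so it can be scheduled, graphed, and improved like any other pipeline output.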

Lars Albertsson:
And the US is ahead of Europe here, they have more companies that do this. We only have, like, fewer than a hundred companies at this level in Europe, just a few in Scandinavia. And the amount of value you can get out of data is, at the industrial level, several orders of magnitude higher. And you can quantify this if you approximate business value by counting datasets produced. Each dataset that you produce, whether it’s a report or a search index or a recommendation index, has some kind of business value, otherwise you wouldn’t produce it.

Lars Albertsson:
And if you look at the typical enterprises, the banks and telcos, they produce on the order of a hundred or a thousand data sets per day, whereas Spotify produces a few hundred thousand per day, and Google produces a few billion per day. And that number tells you something about how differently they work with the data. I don’t know if this rant made sense or not, but it-

Juan Sequeda:
It does. I’m smiling. I am nodding and smiling. Yes.

Tim Gasper:
It does. I mean, this topic of industrialization of the data process is compelling, and I think you made a good comment where you’re kind of like, “Well, I don’t even know if you want to say that like McDonald’s is like the elegant experience here, or certainly the luxurious experience. But you can’t argue with its efficiency, they’re churning out more hamburgers.” Or whatever kind of analogy you want to extend there, right? You mentioned data sets, the production of data sets, is this a good thing? Is the industrialization a good thing, is that where we want to go?

Tim Gasper:
Because I wonder a little bit: Google and Netflix and Spotify, for these companies data is very much a part of the services and capabilities that they provide. Is that kind of the point? Does every industry need to be thinking about how the data is the hamburger as opposed to something else? I know I hit a few things there, what do you think, Lars?

Lars Albertsson:
Data’s not a goal in itself. We’re refining data, even though sometimes you meet executives that think that having machine learning is a goal in itself. But the data and the tools that we use, machine learning or not, are means to create business value or user value or value for the world in one way or another. And data is raw material, right, it exists. And we can choose to have a process that refines it or not, if we see the business value.

Lars Albertsson:
In organizations that are product driven, you typically have little waste in terms of creating data things that are not valuable, because if you organize around being driven by business value pull, then your activities upstream towards the data sources will be prioritized by the amount of business value that you generate. Spotify is really good at this, at always doing work that is perceived to provide business or user value in one way or another.

Juan Sequeda:
I want to go follow up on this analogy, which I’m really liking: the manual, then what I’d call the automated, and then the industrial. So let’s continue with the McDonald’s and the hamburger. So manual is, “Hey, I’m going to go buy the ground beef and make my burger at home.” And then I’m like, “Well, no, instead of me buying the ground beef and making the patty at home, I can go buy a bunch of patties already, and I can just put them on the grill and make a bunch at the same time.” But then if your goal is to go buy hamburgers, just go to McDonald’s.

Juan Sequeda:
But at the end of the day, the goal is what? To provide food, to take hunger away. And McDonald’s wants to go do that fast and cheap, right? And I made my hamburger manually because I also was hungry, right? So kind of what I’m going through here is, how do we treat data in this aspect? Is our goal to be able to go industrialize it because we want to be able to go do things fast and cheap? Let’s follow up on this analogy a little bit about manual, automated, and industrial when it comes to data. Because at the end of the day we talk about, I need to get value out of my data, but what is that value? What is the equivalent of, “I’m hungry and I want to go eat fast and cheap”?

Lars Albertsson:
So the equivalent of, “I’m hungry right now. So I need to flip a burger on the stove,” is, “I need this data now to make a decision, so I’ll pull up my spreadsheet.” That’s sort of the most primitive tool that we have nowadays. Whereas the industrialized version is: people in the organization tend to need this kind of data on a regular basis, so we will glue together, from existing components, an A/B testing framework so that we can, without doing the data warehouse query or the spreadsheet query each time, have the right decision, or the information for the decision, presented in front of us as soon as we have thrown out the A/B test to users.

Lars Albertsson:
So that’s, I would say, the analogy between being hungry, wanting the data now, and working on the process to improve, not for me right now but for the next person, and next person, and next person that wants the data. So it’s to some degree automation, but it’s also automation beyond just rescheduling the same query, it’s automation where you continue to iterate and improve on the process. So whenever something goes wrong, you add a bit more process to make sure that your data quality is measured or whatever. And here we come into sort of the DataOps practices, which is essentially the equivalent of lean but in a data factory setting.

Tim Gasper:
So we’re talking a little bit about the process and the evolution and the maturity. Let’s talk a little bit more about the hamburgers themselves, the content that’s being cooked here, right? People say things like, “Hey, we want to be more data driven.” But as you just mentioned, a lot of times we’re trying to do these use cases, we’re trying to drive value with the data in more specific ways, but we say things like be more data driven. What does that really mean? What does it mean to be data driven and what are we actually trying to cook here that is valuable?

Lars Albertsson:
Well, data is used in three major ways to sort of enhance your business. One is being data informed, which is that you manage to get the data that you need for your human decisions, right? Business insights, product insights, and so forth, so that you make better decisions at a high or low level in the company. And the second one is sort of data fed products, where the data is part of the product that you provide. And this can be top lists, if you’re a media company, or it can be reports assembled and sent off to partners because you have signed a contract that you are supposed to provide analytics to your partners and so forth. Where the logic is straightforward, but data is part of the outcome.

Lars Albertsson:
And then you have machine learning, where your logic is not complete, but you need data to sort of refine the logic because you assess that it will give a better result than if we humans create all of the logic ourselves. So these are the major categories, and “data driven” as a concept encompasses, I guess, all of them. It’s a bit fuzzy.

Tim Gasper:
How much are we doing these different things here to drive more revenue, decrease costs, all the typical stuff, right?

Juan Sequeda:
You said these three things, to repeat them. One, value is about being informed for human decisions. Second is data fed products. So the data is actually part of your product. You have the top lists of something, the most popular things, and reports are being sent to partners. And then you said machine learning as a value. I would argue that the machine learning is the one that’s actually helping to go do either of those first two things. It’s not a value per se, it’s part of the technology. The value at the end of the day, I mean, if we’re honest about it, for any capitalist organization, anything we do with value is about how we make money and save money. And productivity is how we can, at the end of the day, make sure we’re wasting less time, because time is money. But I mean, isn’t that, at the end of the day, what we need to go do with data?

Lars Albertsson:
Yeah. Machine learning is not value per se, it’s just one group of features or one type of technology that you can use that is fed with data. And in theory, whatever you do with machine learning, you could do with plain coding if you were smart enough to write the exact [inaudible 00:21:46] write code. And this is how we looked at artificial intelligence back in the ’80s, right? The AI at the time were expert systems, where you got a bunch of experts and you translated their knowledge into if statements, essentially, and then we called that AI.

Lars Albertsson:
And nowadays we’ve sort of hit the limit of those types of systems, and we figured out a way to leave some of the logic open and then train by examples, essentially. So we create better logic than we can write in practice as humans. And the value of that better logic can be translated to user value or business value.
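Editor's sketch: the contrast between an expert system (knowledge written down as if statements) and logic that is left open and fitted from examples can be illustrated with a deliberately tiny Python example. The numbers, the threshold of 5.0, and the function names are made up for illustration. Real machine learning systems fit far richer models than a single threshold.

```python
from typing import List, Tuple


def expert_rule(value: float) -> bool:
    """Expert-system style: a human hard-codes the logic as an if statement."""
    if value > 5.0:  # threshold picked by a hypothetical domain expert
        return True
    return False


def learn_threshold(examples: List[Tuple[float, bool]]) -> float:
    """Leave the threshold open and pick the one that best fits the examples."""
    candidates = sorted(value for value, _ in examples)

    def errors(threshold: float) -> int:
        # Number of examples the rule "value > threshold" gets wrong.
        return sum((value > threshold) != label for value, label in examples)

    return min(candidates, key=errors)


# Hypothetical labeled examples: (measurement, desired answer).
examples = [(1.0, False), (2.0, False), (3.0, True), (4.0, True), (6.0, True)]

threshold = learn_threshold(examples)


def learned_rule(value: float) -> bool:
    """Same shape of logic as expert_rule, but the constant came from data."""
    return value > threshold


expert_errors = sum(expert_rule(v) != label for v, label in examples)
learned_errors = sum(learned_rule(v) != label for v, label in examples)
print(threshold, expert_errors, learned_errors)
```

On this toy data the fitted threshold makes fewer mistakes than the hand-written one, which is the sense in which training by examples can yield "better logic than we can write in practice".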

Juan Sequeda:
So one of the things I want to go pick your brain is given your background and something we discussed before was kind of different approaches to how to drive value. Is it technology driven, kind of that experience from Google that you’ve had, or is it more product driven, that experience you had from Spotify? How would you compare? I mean, when we were chatting before it was like this technology driven approach, this product driven approach. I would love, if you can give us a little bit more of insights about this, how do you get value out?

Lars Albertsson:
Yeah, it’s different types of company cultures and there’s no single culture that is better than the other, but they affect the way that companies work and the resulting products quite a lot. So Spotify is very clearly a product driven company where all the activities stem from, “We want to make the product better in this or that way.” By the way, there is an excellent podcast on product management at Spotify. It’s a wonderful podcast, almost as good as this one, you should really go and check it out, it’s of course, available on Spotify.

Tim Gasper:
I’ll find that, looking forward to that.

Lars Albertsson:
Whereas Google, and I used to work for a brief while at Sun Microsystems as well, are both examples of technology driven companies, where the strong force within the company is people sort of fiddling with technology, essentially, to see what can be done without necessarily a clear connection to products. I mean, Google is doing quantum computers these days, it’s almost basic research, it’s hard to tie that to products. And Sun was doing all sorts of crazy things and somehow some turned out really, really well. And that will make you a technology leader in many aspects, but not necessarily deliver what the customers want or what the customers request.

Lars Albertsson:
And we see that with Google’s journey with the cloud, where they are competing with AWS, which is very customer driven and delivers what the customer wants, whereas Google has this idea that they know how to build these systems, and sure they do, so they provide the technology for customers to build the systems the way that Google thinks they should be built. But that doesn’t always rhyme with what the customers want, depending on your customers. It suits us really well.

Lars Albertsson:
We’re hosted in Google Cloud because we want to care about our niche and we want Google to take care of the rest. So we take whatever they supply in terms of security and monitoring and so forth, we don’t want to cut that out. But it results in very different types of companies [crosstalk 00:25:52].

Tim Gasper:
How would you say they’re defining value in that case? Because obviously there’s some end results, right? But how are they defining the data value kind of driving those end results?

Juan Sequeda:
Because it seems to me that if you are technology driven, it’s like, “Hey, we got a lot of cool things. Let’s go put a lot of cool technology, a lot of cool people who are smart, and they’re going to go do something and something’s going to come out of it. And I don’t know what’s going to come out of it, but something is.” So almost think about it, going back to your AI, back in the ’80s, ’90s expert systems, that’s like a forward chaining approach, “Let’s go see what’s going to happen.” But if you’re going to go flip around to the product way, it’s, “No, this is a very specific need, I need to go develop this product. How do we go do that?” So that we focus on that and we have a very clear value.

Juan Sequeda:
Now, look at companies like Google, they generate a tremendous amount of value, but people did not say, “Well, users are asking for this particular thing, let’s go do that.” No, they didn’t do that. That’s not, I guess, their culture, but they generated so much value, it took more time. They obviously have a lot more money and time and people to go do that, but when you’re product driven, you don’t have that luxury. So I kind of almost see that as a backward chaining approach, right? You start from the goal and ask, how do I get to that goal? So you have a very specific goal, which is tied to value, but in the other approach, it’s like, “I don’t know, but we’re going to get there.” But not everybody can have that mindset.

Lars Albertsson:
I mean, Google was technology driven from the start. I mean, they had this idea that, they figured out how to do search, right? And then they did tremendous search, which was a lot better than [inaudible 00:27:45] and what was out there. But they couldn’t figure out how to make money out of it. So they were very close to sort of giving up and just showing banners, which is what everybody else did. Until somebody happened to figure out this [inaudible 00:28:01] model that they use for the early searches, which was yet another technical invention that sort of nailed it, and now they have an endless amount of money.

Lars Albertsson:
So I guess it’s a higher risk journey to be technology driven, but it also perhaps makes you capable of taking the leaps. I mean, we’ve seen, even though Sun is not with us anymore, they also were a company that made these significant leaps. Back in the ’90s, they had this machine that was originally developed at Cray, but they threw it out. It was named Sun E10K, I think. And you know what you could do? You could actually, in the running virtual machine, it was virtual machines with hardware support. So you could split the machine up into virtual machines.

Lars Albertsson:
And you can move resources, memory, and CPU and bandwidth and so forth between these virtual machines. So that if your database was running and it needed more CPU, you could give it some more CPU without taking it down. 25 years later, I can’t do this in the cloud, right? So they have made some, you can make some significant leaps. I don’t think we’ll see product driven companies like Spotify or rarely these product driven companies like Spotify make these kinds of technical leaps. On the other hand, they have a very tight connection to the users and measure what they want. I saw this interesting invention in how to build products at Spotify when they developed the running feature. I think it has now been discontinued.

Lars Albertsson:
But they didn’t at all know, when you’re out running, what do you want? Do you want music that has a constant pace, do you want music that follows the pace of your steps, or do you want music that drives you to run faster? They didn’t know. And they had to, for the first time, incorporate the measurement feedback loop on how to design the product into the development process. And I’ve never seen anybody else do that.

Lars Albertsson:
And that resulted in a product whose characteristics were such that nobody, when they started, was actually able to design what eventually became the end product. And likewise, companies like Amazon perhaps don’t do as interesting technical leaps as Google does, but they did the business leap of defining the cloud, which wasn’t a technical thing. That thing was already in place in other places, but they formulated how to sell the thing. So the different companies have different strengths and different outcomes. I don’t know if that rant answered your question.

Tim Gasper:
I think, [crosstalk 00:31:32] yeah, go ahead, Juan.

Juan Sequeda:
No, I mean, it’s not just a question. I think we’re all here ranting and I got a bunch of more comments and thoughts right now, but hey Tim, you go ahead.

Tim Gasper:
No, that’s okay. We’re learning together. This has got a bunch of ideas and thoughts going in my head too. The most recent thing that’s popping up on my head is around, we talked about Google and them being more of a technology driven company, Spotify being a little bit more of a product driven company. Are there more categories of companies that we should be thinking about in terms of data value and how they might think about it, or can what we learn and glean from Google and Spotify and the conversation we just had for the past five minutes here, can that extend now to non-technology companies, other companies, bigger companies, smaller companies, whatever they are.

Lars Albertsson:
I think there are other categories, I don’t know if it touches data that much. I would say Amazon is customer driven. They respond to customer requests and that’s why they have such a huge flow of products, because it’s always somebody asking for things. And one could argue that Oracle is sales driven, but I have less insight there. I know a lot of my old Sun friends, they did not stay for long, because it was such a different-

Juan Sequeda:
Wait, but everybody needs to be sales driven. Google exists because they need to make money, they go make sales. Amazon is customer driven because they need to make sales, right? Isn’t everybody sales driven?

Lars Albertsson:
That’s not necessarily what drives you, right?

Tim Gasper:
Everyone does sales, but it doesn’t mean they’re sales driven, right?

Lars Albertsson:
And I mean, Google’s main revenue is still ads, right? That’s not sales, that’s just something that, well, you can argue that it’s sales, never mind.

Juan Sequeda:
I mean, I think the honest, no-BS for me out of all of this is that our ultimate goal is always, how can we make money, how can we save money? With everything we go talk about, like, oh, we’re talking about being technology driven on the value, product driven on the value, it all boils down to these two things. Please, I mean, I don’t know, somebody contradict me.

Lars Albertsson:
Yeah. I mean, I don’t think I ever worked for a company that is actually driven by making money. Oh, well, I was for a while in FinTech and that’s a very money-oriented business. But at Sun I was building hardware. We were going to throw the next machine out the door, that was the regular cycle. You get machines out the door. And likewise at Google I was working on Google Talk Video, which was the name of the first generation of video conferencing. And it was not a monetized product. All the time when I was there, there was no discussion of money ever affecting our choices.

Lars Albertsson:
And yes, Google has the money to have that luxury to worry about that later, but it wasn’t driving us, it wasn’t influencing us in any way. And likewise, money, or monetizing things, was never an influence on how we built things at Spotify. It was, if we generate the user value then the money will eventually come. And Spotify also had the insight that music is sort of a winner takes it all business, that you’re either big or you’re nobody. And so gaining market share was always seen as more important than making money in the foreseeable future.

Tim Gasper:
That triggers some thoughts for me, which is, I wonder if there’s a little bit of a bubble around companies like Google and Amazon and Spotify that allows them to be a little bit more in that position. And whether you consider that bubble to be a good thing or a bad thing, or a cause of their success or a consequence of their success, right? That being in that bubble almost makes it easier for you to then create this data driven culture and invest in data value and things like that. And the companies that are struggling with sales or have to be more money driven, are they in a worse position to be able to be creating value with data because they’re just trying to get that policy claim figured out for that one customer?

Lars Albertsson:
Perhaps. I mean, it’s better to be rich and healthy than sick and poor, I guess is the conclusion here. And sure, that gives you more breathing space to be long term focused and so forth. But not-

Juan Sequeda:
I was going to say, Jeff, on the chat just says, what you were doing is really R&D at that time, right? So you are doing more of the preparation of what could something go do, at the end of the day, you, I mean, you Lars were not focused on how do you make money, but your boss or your boss’s boss was, right? [crosstalk 00:36:58]

Tim Gasper:
Somebody in the organization was.

Juan Sequeda:
Somebody was, right? So I think, I mean, the honest, no-BS about this is that, I mean, can we just go say, “I’m going to go make a business to go provide value of something that is not about making money and saving money?” I mean, that’s the question I have. Because that’s like, if you think about it from startups perspective, like companies growing, but now I’m thinking about it, “Wait, I’m a very established company. I’m a hundred years old. I’m older than a Google doing things, and I got a lot of money. I can take risks.”

Juan Sequeda:
Now, those are the folks who probably say, “Look, I should probably be investing in something that the value is not immediate, it’s later on. So I’m going to go invest in doing this R&D type of stuff,” but they’re not that technology type of company but they probably have the opportunity to take that risk. So I’m just trying to figure out who are the types of companies where this can work, this whole technology versus the product driven in value?

Lars Albertsson:
Well, I mean, there’s so much venture capital these days. At least here in Stockholm, we have the most venture capital per capita in Europe. So there is plenty of risk that you can take with other people’s money and not your own revenue, right? And there’s I think it’s Kent Beck that has this formulation about, what is it? Explore, expand, and extract the three phases of product and company development, where in the first phase of exploration, you try things out in order to find out what might succeed. And once you find something, then you grow it in terms of users or market share or something.

Lars Albertsson:
Then you’re in the expand phase. And then the third phase is the extract phase, where you have saturated what you can do, but you are now profitable with your product. And in these three phases you need to work in different ways; there are different activities and different types of people or organization that are appropriate in each of these phases. But nowadays there’s a lot of acceptance for constantly exploring new things in the exploration phase, no matter whether it’s funded by your revenue from old products or somebody else’s VC money.

Tim Gasper:
I think there’s one more topic that’s very related to what we’re talking about today, that it would be good to hit a little bit before we start to kind of close things off and hit our lightning round and things like that. Early on today, you talked about the rituals that people have to do, that you see in more mature data organizations that are creating this data value. What are some trends that you see in good data teams, whether it’s some of the practices that they’re doing, or the shape and form of them, right? Have you noticed that successful organizations really driving a lot of data value have highly centralized, capable teams, or highly decentralized teams? What trends are you seeing, or what recommendations do you have, around what’s making these data teams more mature in creating value?

Lars Albertsson:
Yeah. So centralized versus decentralized, that depends on the level of maturity where you’re at. If you’re early, you’re better off starting centralized, because the homogeneity is important and it’s easier to make it homogeneous if you have centralized technology and centralized teams and so forth. And that was the blessing in disguise from Hadoop: those companies that adopted Hadoop early were forced to do things in a centralized way, and then it became very homogeneous by accident. And some of the principles that were also brought upon us by accident by Hadoop came from its limitations.

Lars Albertsson:
It was so incapable in comparison with, like, mature databases. For example, there was almost no indexing. You couldn’t mutate things, so we ended up with immutable data sets and we had to transform them to new data sets whenever we wanted to change things. And it turns out that you were thereby forced into these functional architecture principles, which is essentially functional programming taken to an architectural level. Like, you never mutate your data structures, you never mutate your tables, and instead you build these series of pipelines that transform things.

Lars Albertsson:
Turns out that that forced us into a way of working which is the foundation of the success of the data leaders. The immutability makes it easy to share things, right? Because if it’s immutable, once it’s out there, anybody can freely use it without disrupting your operations. Whereas if you have a mutable database or a mutable structure [inaudible 00:42:14] a REST API, you have to synchronize and go and talk to these people. So by making data immutable, you don’t have to synchronize and therefore you can innovate faster.
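Editor's illustration: the idea described above (never mutate a data set, only derive new ones through pipeline steps) can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not how Hadoop or Spotify implemented it; the `Dataset` class, the `transform` helper, and the `(user, plays)` record layout are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# Hypothetical record layout for the example: (user_id, play_count).
Record = Tuple[str, int]
Step = Callable[[Tuple[Record, ...]], Tuple[Record, ...]]


@dataclass(frozen=True)
class Dataset:
    """An immutable, named snapshot of records (illustrative only)."""
    name: str
    records: Tuple[Record, ...]


def transform(source: Dataset, name: str, step: Step) -> Dataset:
    """Derive a new dataset from `source`; the source is never modified."""
    return Dataset(name=name, records=step(source.records))


def drop_inactive(records: Tuple[Record, ...]) -> Tuple[Record, ...]:
    # Keep only users with at least one play.
    return tuple(r for r in records if r[1] > 0)


def double_counts(records: Tuple[Record, ...]) -> Tuple[Record, ...]:
    # Stand-in for any downstream computation on the shared data.
    return tuple((user, plays * 2) for user, plays in records)


# A small pipeline: each stage reads one dataset and writes a new one.
raw = Dataset("plays_raw", (("ann", 3), ("bob", 0), ("cat", 5)))
active = transform(raw, "plays_active", drop_inactive)
doubled = transform(active, "plays_doubled", double_counts)
```

Because every stage only reads its input, a second team could build its own step on top of `active` without coordinating with whoever produced it, and rerunning a step on the same input gives the same output.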

Lars Albertsson:
And it was also the case with Hadoop that it was so painful to run a cluster that you only wanted one cluster. So all the data was in one place in the same format, and that turns out to bring friction down for innovating. And since the security model was crappy, everybody had read access to everything, more or less. So you got this democratization almost by accident, whereas if we were going to base things on your Oracle database or whatever, then you have all of these tools and OBS to prevent people from accessing your data.

Lars Albertsson:
So these principles of, like, immutability and democratization and homogeneous environments are some fundamental factors of success. And those of us that adopted Hadoop early got them by accident. Now, what happened later was that the Hadoop vendors and all of the other data vendors, including the cloud providers, went out and talked to the enterprises rather than the overgrown tech startups, because the enterprises have lots of money, so you want to sell to them. The conversation often goes like, “Okay, we have this new technology and you can do fantastic things.”

Lars Albertsson:
“All right. But we have our old ways of working where we assume that data is mutable and we can do transactions and so forth, so can you please implement these features as well?” And they did. So the late adopters of sort of big data technology were never forced into these successful patterns of working, and all these successful patterns of working can be summarized as a data factory, which is sort of the foundation of industrial data processing. So I think that’s [crosstalk 00:44:20].

Tim Gasper:
Go ahead, Lars. And then I’ll add.

Lars Albertsson:
I think that’s why we saw so many failed big data projects in the 2015 era, right? All of these companies adopted the technology, they tried to push the technology into their old ways of working, and then you just had the worst of both worlds, whether your vendors helped with implementing transactions and SQL support or not. The real value lay not in the technology or the new shiny things, but in the ways that you work, and the way that you enable sharing of data throughout the organization and low-friction innovation on top of your data.

Tim Gasper:
Right. I’ve never heard this take articulated exactly this way, on sort of the impact of Hadoop and sort of early Hadoop versus later Hadoop, right? I think that’s very fascinating because, for those who are listening who are familiar with sort of the big data landscape and how Hadoop became such a big splash, like, oh, the advent of big data, and this is the Oracle that is going to bring together all our data and provide all these insights, right? Those early adopters of Hadoop had very little to work with.

Tim Gasper:
I mean, you had to deal with these awkward block sizes and everything was using Pig and MapReduce and it was hard. And your take here is interesting, in that it kind of forced you to implement the earliest versions of really good sort of DataOps practices in teams: to be thoughtful, to be inclusive, to be process oriented, and to make sure that you’re treating data as a first-class citizen. It was sort of almost accidental, and then over time Hadoop became more and more like a data warehouse, but it was like a bad data warehouse, right? So it’s interesting to see that comparison. That’s very interesting.

Juan Sequeda:
This is a very awesome takeaway, and kind of that analogy that you’ve done about what we’ve learned from Hadoop. I need to go back and listen to this myself. What you just said, I’m in awe right now because it just makes so much sense. For me, really, this was, without even thinking about it, the basis of what we’re doing now on defining teams. Everybody’s now talking about DataOps and data observability, and all this stuff is like, “Wait, we were doing this by accident 10 years ago. And if you really study all the lessons learned from that, that’s going to build that strong foundation about teams.”

Juan Sequeda:
And as you mentioned, I always ask organizations like, “Are you centralized, are you decentralized? What’s your organization, what’s your culture?” And people sometimes they know, “Oh, we’re definitely decentralized.” Or sometimes they have a disagreement about that. It’s like, “Hey, you guys don’t even agree, this is a good thing.” And I’m talking a lot about this friction, I want to have intellectual friction because the friction generates that energy to know, “Oh, this is where people are interested in going, doing stuff. There’s probably something interesting.”

Juan Sequeda:
And somebody in management thought, “Oh, on the right hand side, there should be something we should go do. There’s no friction there nothing’s happening. Nobody cares about that. Why not?” Anyways, this has been a really awesome discussion, Lars. Thank you. Thank you so much. And I think we’re ready to go to our lightning round. Right, Tim?

Tim Gasper:
Let’s do it.

Juan Sequeda:
All right.

Tim Gasper:
Juan, do you want to kick this off here?

Juan Sequeda:
All right. We’ll do this. So should data value be defined by data leadership? Yes or no?

Lars Albertsson:
No. I have an excellent example here. Back in 2013, when I joined Spotify, one of the first things we did was to make an effort to democratize data. It was just a few teams that were capable of using Hadoop and innovating with data. And we set out the goal to democratize it for any team with a developer essentially. And that was a transformation to what is today known as DataOps, but the word didn’t exist at that time. And we managed to push down the friction of creating new pipelines so that a beginner could do it in less than a day and you could correct errors in like 15 minutes or half an hour or something, even in production.

Lars Albertsson:
And that brought down the friction significantly and the number of jobs just skyrocketed afterwards. 18 months later, a team of engineers, about three engineers, took a hack week and then another week or two and they built Discover Weekly, which is now one of the most popular features of Spotify. Arguably the most successful machine learning feature ever built in Europe. And I became really proud when I heard their presentation, because they said that we could do this, not because the company had decided at the board level or at management level that yes, we should make an effort to spend half a year and 20 engineers, but because the company had enabled bottom-up innovation, enabled us with all of the data and the ways to build pipelines and the [inaudible 00:49:47] clusters to serve playlists and so forth.

Lars Albertsson:
And therefore we could do it with very little effort, just because we thought it was a good thing. And Daniel Ek, the CEO, he said, “I didn’t see the beauty of it, if it was up to me, I would’ve killed the project, but Spotify doesn’t work that way. So I just didn’t give them any more resources, but then they launched anyway and in a year they had 40 million active users for the product.” So he was clearly wrong in his definition of data value and if it was up to the leadership, we never would’ve seen that product.

Juan Sequeda:
I have to say, I was not expecting a very strong no, and you just gave a really excellent, I love this. Tim, you go.

Tim Gasper:
That is great. Oh my gosh. I’m taking notes. Okay. Can you assign measurable monetary value to data value?

Lars Albertsson:
No.

Juan Sequeda:
All right, I’ll go next. We didn’t talk about Data Mesh or DataOps, which there’s so much stuff going around that, but will the growing interest in Data Mesh and DataOps, will that be one of the biggest drivers of increasing data value?

Lars Albertsson:
DataOps, yes. Data Mesh, no.

Juan Sequeda:
Ooh, that is good. Zhamak, if you’re listening, I’d love to get your answer to this one.

Tim Gasper:
The [crosstalk 00:51:20] it takes. That’s what the lightning round’s all about. All right.

Juan Sequeda:
Yes. I love this. Love this. [crosstalk 00:51:23].

Tim Gasper:
Right. Last question. All right. All companies should aspire to generate data value like Google and Spotify are generating data value. Yes or no?

Lars Albertsson:
No, we don’t need a world full of McDonald’s.

Juan Sequeda:
I love this. I love this. You just equated Google and Spotify to McDonald’s.

Lars Albertsson:
I mean, they’re innovating on an industrial level, I think there’s room for the local butcher or tailor or whatever and we should have all kinds of companies.

Tim Gasper:
And maybe Google and Spotify used to be that. And then they grew into the McDonald’s that we see today, right? You’re like, “Man, I don’t know about that.”

Lars Albertsson:
Perhaps. But I think if you’re just being driven by data, you also become, in a sense, one-dimensional. There’s a wonderful rant about that called, like, the 47 Shades of Blue or something, by a UX designer at Google. And he grew completely mad because whenever he wanted to make things coherent in terms of UX design, there were all of these people that needed to run A/B tests to figure out exactly the right shade of blue on this particular button or whatever. And that’s a good example of how to misuse data where it shouldn’t be. Whereas if you look at companies like Apple, they realized that UX design is a human thing primarily and not primarily a data thing.

Juan Sequeda:
That’s an interesting point, because at some point you realize, where do you draw the line, right? You want to be data driven and, well, is everything data driven, how much of it? It’s like quantitative versus qualitative research, right? I mean, it explains different things.

Lars Albertsson:
And that’s why we need all sorts of companies too, that draw the line in different places, right?

Tim Gasper:
Right.

Juan Sequeda:
I love this, Tim, TTT. Tim takes it away with the takeaways.

Tim Gasper:
Here we go. Takeaway time. So we hit a lot of fun things today. Two things that really stood out were, actually, your comment in the lightning round around bottom-up, empowered innovation sort of being key to data value, perhaps even more key than top-down dictation and priorities and roadmaps and things like that. And I think that’s important because I think that a lot of times we think about the leaders that we put in places, of course, being very important catalysts, right? Especially if you think of the increasing role or evolving role of the chief data officer as one example of this, right?

Tim Gasper:
But in the end, what we’re trying to do is empower data culture, and DataOps and the right folks actually bringing the innovation and the understanding, the folks on the ground who really see what’s going on, who can then bring the right innovations to market. And that is what it could mean to really be data driven and do that in a positive way. So that’s one big takeaway for me. And the second, is this whole story that you told around, Hadoop, it being sort of this blessing in disguise, especially in the early days where it forced us to think about things like immutability, things like democratizing data, because you couldn’t actually put security around anything.

Tim Gasper:
So it was like, “Hey, welcome to the data lake. Here we go.” And not that we want to repeat that necessarily, right, we don’t want to repeat the technology challenges there, but we should think about the cultural and the process and the team things that we did there that actually made that good practice. And let’s not repeat history, let’s not do bad things all over again. Let’s go back to the good practices there, and let’s find ways to make them become even more adopted.

Juan Sequeda:
Yeah. Let’s not reinvent the wheel, as I always say.

Tim Gasper:
Right.

Juan Sequeda:
So my couple of takeaways are the analogy of the manual, the automated, and the industrial, right? The hamburger. We need companies who do more of the manual, who do more of the automation. Also the industrial; the Googles and Spotifys are the McDonald’s of that. The industrial kind, they beat the others in the sense that your goal is to be completely efficient, but we need everything on that entire spectrum. On the definition of value, you want data to inform human decisions, and you want the value it provides in products, from which you can go generate new things, like in the Spotify example, top lists or reports to partners and so forth.

Juan Sequeda:
And this is another topic that we’re seeing over and over again, and I discuss it a lot with customers and colleagues and stuff: this balance between centralization and decentralization. It depends on your maturity. You probably want to start centralized if you’re just kind of starting out. At the end of the day, you want to lower your friction so you can go innovate more. I love that one. Lars, let’s throw it back to you. Two questions: one, what’s your advice about data, life, anything, and second, who should we invite next?

Lars Albertsson:
I think my advice, and I have a two-fold piece of advice, applies both to data engineering in general and perhaps life in general. And that is, keep things very simple, as simple as you can get away with, and prepare and design for failure. So think about the cases where things go bad and try to cater for them, or try to plan for them. And then a good data engineering architecture will fall out of those priorities, and perhaps a good life plan as well. And who should you invite? I think you should invite [inaudible 00:57:13] from Spotify. She is an amazing conference speaker, and she knows more about practical management of data quality aspects and reliable data than anybody else that I know. [inaudible 00:57:35].

Juan Sequeda:
We will definitely be asking you to make that introduction. Actually, I think in November, I’m looking it up here, November 3rd, we’ll have Erik Bernhardsson, who used to be at Spotify. So he’ll be our guest and we’ll be talking about more data stuff with Erik. And I told him that you were going to be on the show. He’s like, “Oh, you guys, you have an excellent guest with Lars.” And so yes, this is all turning out awesome.

Lars Albertsson:
He has a big part in Spotify’s success because he and Elias Freider created Luigi, which is the workflow orchestrator in use. And they really nailed some of the concepts, and Luigi made us tie the workflows together and collaborate between teams. It was one of those technologies that actually affect how you work and that’s why it was important. But it took them a number of tries before they got it right. But that piece of technology has had a major impact on Spotify’s success.

Juan Sequeda:
So with that, we’re going to have a lot of really cool guests coming up; I think we’re already booked for the rest of the year. Next week we have Jans Aasman, who’s the CEO of Franz Inc., makers of AllegroGraph. We’re going to talk about data modeling and data-centric architectures and knowledge graphs. And don’t forget, September 29th is our data.world summit. It’s free, it’s virtual, with so many of the topics that we’ve discussed today. So don’t miss it. Lars, thank you so much and cheers, we appreciate it.

Tim Gasper:
Cheers. Great conversation.

Lars Albertsson:
Great fun to be on the show and thank you so much for inviting me.
