Speaker 1: This is Catalog & Cocktails, presented by data.world.
Tim Gasper: Hello, everyone. Welcome once again to Catalog & Cocktails. It's an honest, no BS, non-salesy conversation about enterprise data management with tasty beverages in hand. Presented by data.world, coming to you from Austin, Texas, and somewhere else in the world. I'm Tim Gasper, longtime product guy and data nerd at data.world. Joined by co-host, Juan.
Juan Sequeda: Hey, Tim. I'm Juan Sequeda. I'm the principal scientist at data.world and as always, it's a pleasure. Wednesday, middle of the week, end of the day, and today there are so many things that we're going to talk about and it's a really exciting day. First of all, we're live from New York, actually from Roosevelt Island, from the Knowledge Graph Conference. And we're here with Katariina Kari, who's the lead ontologist from IKEA. Oh, IKEA B.V. Systems, I heard.
Katariina Kari: Yeah, Inter IKEA Systems B.V.
Juan Sequeda: Katariina, it is a pleasure to have you here as a guest because when the pandemic started, I started reaching out to a bunch of people, just kind of, hey, this is an opportunity to connect. And we connected a couple years ago and since then we just have on our calendar a monthly chat where we connect to talk about anything, whatever's going on. And I'm really excited that we finally have you here as a guest.
Katariina Kari: Yes, I've really loved all our chats and I really loved how you pushed me to write that blog post about the three layers of the Knowledge Graph that was like-
Juan Sequeda: Which we'll get into today for sure.
Katariina Kari: Yeah. Yeah. That was one of the nice moments where, Katariina, you have all these things in your mind, you should write about this. People should know about how you're thinking and all these thoughts you have, and you just encouraged me because sometimes you just have those thoughts but you don't think that they might be valuable. And sometimes I just walk around and I'm thinking everybody probably knows this stuff already. Why should I highlight it?
Juan Sequeda: Yeah. We're going to dive into that. But hey, before we go into our discussion, let's tell and toast. So what are we drinking? What are we toasting for? Tim, what are you drinking in Austin today?
Tim Gasper: So I am drinking something that I invented here. I'm going to call it the Cafe Red because it's based on a similar cocktail called Cafe Rojo, but I modified it a little bit. So it's got coffee liqueur, raspberry gum syrup, Grand Marnier and rum. So very interesting drink. A little sweet but very tasty.
Juan Sequeda: Wow, you went really cocktail today.
Tim Gasper: Trying to be true to things. Sometimes we cheat and we just have some whiskey or something like that.
Juan Sequeda: We didn't have a chance to go get the cocktails, but we are having the Coney Island Merman New York IPA. Never had it actually.
Tim Gasper: Nice. Is it pretty good?
Juan Sequeda: Yeah.
Katariina Kari: I'm not usually a fan of IPAs but this has a nice balance and taste, and that's good. It's nice.
Tim Gasper: Awesome.
Juan Sequeda: So cheers. But wait, we're going to cheers. Hold on. Today we're going to go cheers for, it's the start of our celebration of three years. The first episode of Catalog & Cocktails was May 13th, 2020, almost three years ago. This is amazing. Thank you so much.
Tim Gasper: Three years of amazing conversation and great cocktails.
Katariina Kari: Yeah.
Juan Sequeda: Here's to that. We'll be celebrating this week. Now, I just completely forgot about it.
Katariina Kari: Such an honor to join in your celebrations. Thank you.
Juan Sequeda: So all right, we got a warmup question today. So you work for IKEA. So IKEA is known for its very easy assembly instructions that end up being difficult for people. So what is something else in life that should be simple that you struggle with?
Katariina Kari: I'm running a team and building technology. It should be easy. We have all this amazing knowledge. We've gone to university for this, we should know that. But why is it every time so difficult to run the team and build technology with them?
Juan Sequeda: I think the team thing, I would answer that as because humans are complicated period.
Katariina Kari: But then the technology is complicated too. That's also difficult. It's both the humans are complicated then the tech is complicated.
Tim Gasper: And then both of them together it multiplies, right?
Katariina Kari: Exactly. Yeah.
Juan Sequeda: How about you, Tim?
Tim Gasper: I'll switch it into personal life and I'll say something that seems like it should be simple but that ends up being so hard, keeping the house clean and organized.
Katariina Kari: We have great products for that by the way.
Tim Gasper: I have some of them in my house.
Juan Sequeda: Oh, okay. I was going to say it's something about travel. Sometimes travel I think should just be easier to go book. But I also think that probably a lot of my travel is a little bit more complicated than normal.
Tim Gasper: Yeah, you do have some complicated logistics like, "I got to be in New York. And then I got to be here." And somehow you figure it out.
Juan Sequeda: Well, I also enjoy it too. But anyways. Okay, so let's kick it off. We got so much to discuss today. All right, so we are here at the Knowledge Graph Conference. So honest, no BS. When should users and organizations consider Knowledge Graphs? Or is it really, nah, AI, large language models, GPT, this can actually do it all now?
Katariina Kari: That's a big question, Juan. That's a really big one. But I would say the moment you find that you have a lot of data and you're working with it and it becomes harder and more difficult. So when you get to this situation, you have all this data, it should be simple. But I think that's a moment when Knowledge Graphs do make sense because it helps you organize your data, helps you to have a better grip on it.
Juan Sequeda: So one of the things we were talking about earlier was this balance: there are humans and machines, and are the machines controlling us, or are we controlling the machines, and what is this balance? And humans are active, humans are passive. You had some great commentary about this.
Katariina Kari: Yes. So you kind of asked me what is this foremost idea I want to bring to the world? And it's definitely that as humans we shouldn't have technology run us. So we are creating technology but many times we're somehow becoming really passive with it. Let's say large language models that are developed and you write into ChatGPT. In a way it's a lazy approach because we are just letting this machine read all our data and we're super passive about it. And then it kind of runs the show because we are now just using it to search for information. And so, we should take charge of the fact that it is spouting bullshit, that it's being eloquent but it's spouting bullshit. We should take charge of that. We should say, well, that's not correct. That's not okay. That's actually not good technology. That's really crappy technology. And I think there are a lot of people who said, "I asked ChatGPT this and they told me that but it's not right." And then they're like, "But it must be because it's technology." I'm like what? No, no. Technology cannot run us. It's a little bit like how, as humans, we all got mobile phones, now we're all running around with mobile phones. It's this very recent development in our human history. And some of those mobile phones are running us. People have notifications flaring up all the time, bing, bing, bing, bing. And you have those red badges on your applications, and they're just distracting you and they're distracting everyone who has that. What I do with my mobile phone, I always take off all the notifications. If I load a new app, I immediately go into the notifications because I don't want that app to dominate my life. I want that app to do that thing I want it to do and that's it. Shut up. You're working for me. I'm not working for you, and you are not working me and manipulating me. So that's something I think, this passiveness humans have about technology. Oh, I guess it's just like that. Oh yeah, I guess that's it. 
And it's a complacency. It's not even complacency, it's just like non... What would you call it? People don't think they can affect it. Maybe it's even that. Maybe they are a bit passive.
Tim Gasper: There's nothing I can do. It is what it is kind of a perspective.
Katariina Kari: It is what it is. There's nothing I can do, but I think there's a lot we can do, or just by being really demanding. So I really appreciate humanists who are going on [inaudible] and saying, no, this is not right. But they're usually very destructive again. They're saying just don't have any technology.
Juan Sequeda: It goes to the extremes sometimes.
Katariina Kari: It goes to the extreme and that's not right.
Tim Gasper: The answer isn't throw your phone away.
Katariina Kari: No, it's just take charge of the phone. Take charge of the applications.
Tim Gasper: I love the perspective here that you're talking about around shifting from a passive approach to an active one. And I get the feeling that it's a little bit different advice when you're talking to the consumer of the technology, versus if you're talking to the vendor or the organization or somebody who's trying to harness the technology for some kind of a product or service. Can you maybe talk a little bit more about how might a consumer take a more active approach around the technology, especially around AI for example, versus how might an organization or a company take more of an active approach around technology like AI?
Katariina Kari: I don't see a big difference there because as consumers we are using technology to lead better lives or have more time for something. And the same thing is for companies. Companies use technology to save on time, because time is money, or to save on effort or do something better. So I don't really see a difference there. But then for a consumer who's just using an application that's made for them, maybe the repertoire of all the things they can do to control it is more limited. So companies can be more demanding because they're usually investing so much money that they can already have this vendor relationship and say that these are the things I want. But otherwise I don't really see huge differences there.
Juan Sequeda: So bring this back to Knowledge Graphs, and AI, and large language models, and stuff. Just if we look at the machine learning and the creative models. Right now, if you just focus on just go use the data and go learn from that and that's all you do. And the humans are not putting their hands in there. And that's like the passive approach. Let's go do this, right?
Katariina Kari: Yes.
Juan Sequeda: But a human, the user or organization, can take the active approach and say, well, I'm going to define the stuff that is critical to us: the facts, the context, the rules of the game. And that's how I'm being active in providing that, and the combination of these two things together is what we should strive for.
Katariina Kari: Yeah. And now in this context, it's such a hot topic, I don't want to really avoid it, but in the context of using large language models, so companies use large language models and consumers are using ChatGPT. Both can take charge and be more active users of that technology because companies can actually use Knowledge Graphs to build structured knowledge of their domain and say that these are the facts inside our company, they're human created. And then they can give the large language models... actually they can copy-paste RDF to those machines and have them... They were spouting bullshit before; now they are still eloquent but actually giving out facts, because they've been infused with facts, so they've been given the right context. But consumers can do this too. In ChatGPT, when you're having a conversation, you actually have quite a lot of power as a consumer, just a private citizen, to put information inside there, be like, no, this is the fact and this is the other fact, and that's incorrect. And now please give me the answer. Or now regarding this, please formulate it again and then you actually get the correct answer. So if consumers want to create a correct text or something. So this infusing, not just being a passive receiver of the large language model, and being like, you know what? I'm going to train you, I'm going to teach you this stuff. And there's even technology out there that enables you to do that. There was a really good talk here, if I can refer to that?
Juan Sequeda: Yeah.
Katariina Kari: That was saying exactly the same thing. So Andrea Volpini from WordLift. So WordLift helps companies to do SEO. And he was talking about how people aren't necessarily searching Google more, but they're searching in ChatGPT, which is kind of a huge shift in consumer behavior. And he was saying that the competitive edge for companies when they start also embracing large language model technology isn't necessarily in the kind of model or the amount of servers they run those models on. But it's actually in creating a well-created knowledge base for their company and infusing the large language models with this knowledge, and that's going to bring them their competitive edge.
Juan Sequeda: Something I've been seeing in the market, and this has just been the last one or two months now with all the craze of GPT, is this cool versus useful. So if folks are focused just on the coolness, it's because they're playing around with GPT and that's it. Oh, yeah, it's really cool to come up with the recipe and go send the ingredients to Instacart, and all that stuff. This was one of the presentations that I was at, that came up [inaudible] a couple weeks ago. That's cool. That's cool. But how is this useful for my organization? I think to bring it into your organization you need to be active, to be able to say, okay, here are the things that I care about. This is the context to make sure. And another awesome... One of my favorite talks at TED was from Yejin Choi, she's a professor in Washington, a MacArthur Genius Fellow. And she's like, "Well, these large language models, they're really great at some things but they're also really stupid at some other things." And I think the combination here is to have common sense Knowledge Graphs to be able to provide all that context. And I think the best analogy I get there is that you can't get to the moon by making the tallest building a little bit taller every day.
Tim Gasper: And I really like, Katariina, your comments about passing the facts to the LLM because I think most folks haven't really put two and two together on this yet. Where they just assume that ChatGPT and LLMs and things like that are much more of just a... It's just a language paradigm they think. They think it's just all about language. But some of the best applications of this technology are really when you give it good context and you say, consider that my name is this, consider that this is how our business operates. Consider that this is the framework I want you to think in. Okay, now here's my question and now let me pass that to you. And Knowledge Graphs are actually one of the best ways to represent facts to a system.
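To make the idea of passing facts to an LLM concrete, here is a minimal Python sketch of building a prompt that puts human-curated facts in front of a question. It is only an illustration of the pattern described in the conversation, not anything IKEA or data.world is known to run: the function name `build_prompt`, the example triples, and the wording of the instructions are all assumptions, and no real LLM API is called.

```python
# Hypothetical helper: turn (subject, predicate, object) facts into prompt context.
def build_prompt(facts, question):
    """Prepend human-curated facts to a question before it is sent to an LLM."""
    lines = ["Consider the following facts as true:"]
    # One bullet per fact, written as a plain sentence.
    lines += [f"- {s} {p} {o}." for (s, p, o) in facts]
    lines.append("")
    lines.append(f"Using only these facts, answer: {question}")
    return "\n".join(lines)


# Made-up example facts, standing in for triples exported from a Knowledge Graph.
facts = [
    ("Sofa A", "is made of", "wood"),
    ("Sofa A", "is suited for", "relaxing"),
]

prompt = build_prompt(facts, "What is Sofa A made of?")
print(prompt)
```

The resulting string would then be pasted into a chat or sent through whatever model interface is in use; the point is only that the facts come from people, and the model is asked to stay within them.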
Katariina Kari: There's also a humanistic point of view here as well, which is that if you just base all the output of technology on data, so that's a passive approach. It's just look at my data, look at how I behave and act like that or give me the answers. Then it probably will just give you the ugly truth of what you're currently at. Where we as humans, we're racist, we have all these social problems, and we're doing all these bad things as humans. But we do have ideals. And those ideals improve us or give us a goal to strive toward and become better as human beings. These ideals, I think, are in a way shared as well. And I see that in infusing large language models, taking the active role and giving it facts, you can also give it ideals. It's like, I know our data is racist, but please don't be racist and this is how you're not racist. So we can also tell it to be more the ideal of what we want to be. It's like, don't do as I do, do as I say.
Tim Gasper: Right. That's super interesting. I mean, in the same way that you can tell it to say something in the voice of John Lennon or something like that. You're basically asking, I want you to be within this framing.
Juan Sequeda: Yeah, no, I love how we're actually getting very philosophical on this. But these are the things that we start thinking about, otherwise it's like yeah, just throwing stuff at the wall and seeing what sticks and just thinking that that's the way life is. And this is the point in time... We're in a paradigm shift and an inflection point. And this is where I think the leaders in the room are the ones who need to be critical in thinking about this stuff right now. So ask, are you a leader or are you a follower? If you're a follower, you're just going to follow, you're going to be passive around this stuff. And fine, not everybody's going to have to be a leader being active. But if you want to go stand out and lead, think about being active around this. And I think our message here is... I mean we're very biased. We come from this community, but because we genuinely believe that this is the right thing to go do, thinking about knowledge. I'm interested in shifting gears a little bit into your experience and what you're seeing within IKEA. How are the folks, the teams that you're working with, seeing this combination of, oh, we got machine learning and you got Knowledge Graphs? How is this combination happening internally?
Katariina Kari: Yeah, it's really interesting. So before IKEA, I used to work for Europe's biggest fashion e-commerce platform, Zalando. And I think on my first day at my job, this person walked up to me and said, "Knowledge Graphs, I think machine learning does it better." So that was the general attitude. And Zalando was really big on data science and machine learning. So I felt like the odd one in the crowd every time I was talking about this and I was kind of always dismissed. But now in IKEA and also towards the end of my career I noticed a shift. I noticed a shift in data science. People are like, you have that structured knowledge, I want it. That would be good. That would make my machine learning thing better if you have that structured knowledge. That's what I'm getting in IKEA. We started the moment ChatGPT came live and people were like, oh this is the solution. Actually the team that was responsible for bringing large language models into IKEA said, oh, we have a Knowledge Graph team, great. Bring us the structured knowledge. That shift has happened now and I've really seen it. So in IKEA, we're all talking about the hybrid approach and even management comes to us and says, "You're now being made redundant because of large language models." They're testing us to see if we're saying no, we are not, because we need structured knowledge. They're like, yeah, that's the right answer. So even management knows this now in IKEA, or at least in Inter IKEA Systems really.
Juan Sequeda: Yeah, this is a fascinating thing to go see because we are hitting an inflection point around this. I mean, the technologies are coming together. But it's all about getting the people, the cultural change. I mean, this has always been a theme of three years, Tim, of us talking: what is the general theme of everything? Who gets those people around us?
Tim Gasper: People and change management.
Katariina Kari: And I'm usually comparing this paradigm shift in Knowledge Graphs to the paradigm shift that happened in software development 10 or so years ago when DevOps came. So DevOps was a huge shift in how we build and operate software because it introduced that whole continuous integration, continuous delivery idea. And it was very foreign to how it was done before. And I actually have friends in Finland who are also at the frontier of DevOps and they said that they just had to tackle it use case by use case. It's exactly what we're doing now as well. Use case by use case, we're trying to bring forth where a Knowledge Graph would make sense. They just had to do a lot of work, show, be persistent, and just keep on going. And then at some point it did break through. And there were just some who first adopted it, others were still being traditional. And then slowly and slowly it kind of grew and became the standard approach.
Tim Gasper: This is exciting.
Katariina Kari: And it always helps when Google does it, right?
Tim Gasper: True story.
Juan Sequeda: So we wanted to get into one of the things you talked about, the pyramid. This is something we've been talking about. So you had a talk today and you talked about your pyramid layers. Can you expand on this? What you wrote, you can say it now.
Katariina Kari: Yeah, so the pyramid, which, I mean, really, is it a pyramid? It looks more like a triangle on paper. But it's the three layers of a Knowledge Graph. I remember I talked to you about it and you're like, "Katariina, you need to write about this thing." And that's how I wrote my blog post about it as well. So if you want to read more about what it is, but the idea is... And it's not my idea, this comes from Dave McComb from Semantic Arts, is to divide the Knowledge Graph in terms of size into three layers. The top, the tip, the small bit, is the ontology. So that's the class definition. You could call it the schema, the properties, the data model that's on top, and it runs in the hundreds. It's like hundreds of class and property definitions to map out a certain domain. And then the middle layer, which is medium large, in the thousands, is your collection of taxonomies. Another term for it is controlled vocabulary or categories. So I also say that if you look at the Knowledge Graph, they are the instances that have the highest PageRank. Because all the other instances are pointing towards them. An example from IKEA, we have products. And every product has a material. And we have thousands, tens of thousands of products. But the selection of materials is in the tens, it's like 20 or so. So from the tens of thousands of products, each one has one or two materials. So you do the math and that material node in the Knowledge Graph has a high PageRank. These are very special instances and it's good to predefine them, create them, and have them in your taxonomies. And then the last layer, the biggest one, which is in the millions, is the actual data. So this will be like every single product we sell in IKEA, connected to its categorization, with its data attributes, and with its connection to other products, and so on. 
So those three layers really help me at least to explain the large graph to the stakeholders and say that we have a little bit of schema going on here. Then we need expert teams to... The experts on materials to create the taxonomy for the materials and the expert on customer experience to create the taxonomy on activities like sleeping. This bed is good for sleeping, this pillow is good for sleeping. And then we have that big amount of connected data of all the products and everything that goes with that, designers and so on. And that should never be done manually. That should be generated, calculated from a data source or transformed from somewhere else. So those three layers, I always use it as educational material for non-tech people who have never heard about Knowledge Graphs to explain what the Knowledge Graph is about, what is inside of it, and what makes it.
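As a rough illustration of the three layers, the following Python sketch models a tiny ontology, a pair of taxonomies, and a generated data layer as plain tuples. Everything in it is an assumption made for the example: the class and property names (`madeOf`, `suitedFor`), the product rows, and the use of a simple in-degree count as a crude stand-in for PageRank. It is not how the IKEA graph is actually stored.

```python
from collections import Counter

# Top layer (hundreds in practice): centrally defined classes and properties.
ONTOLOGY = {
    "classes": {"Product", "Material", "Activity"},
    "properties": {
        "madeOf": ("Product", "Material"),
        "suitedFor": ("Product", "Activity"),
    },
}

# Middle layer (thousands in practice): taxonomies curated by domain experts.
TAXONOMIES = {
    "Material": {"wood", "metal", "cotton"},
    "Activity": {"sleeping", "storing"},
}

# Hypothetical rows from a source system; the bottom layer is generated from these.
SOURCE_ROWS = [
    {"id": "bed-1", "material": "wood", "activity": "sleeping"},
    {"id": "bed-2", "material": "metal", "activity": "sleeping"},
    {"id": "pillow-1", "material": "cotton", "activity": "sleeping"},
    {"id": "shelf-1", "material": "wood", "activity": "storing"},
]


def generate_data_graph(rows):
    """Bottom layer (millions in practice): triples generated, never hand-written."""
    triples = []
    for row in rows:
        # Only values the experts have defined in the taxonomies are allowed.
        if row["material"] not in TAXONOMIES["Material"]:
            raise ValueError(f"unknown material: {row['material']}")
        if row["activity"] not in TAXONOMIES["Activity"]:
            raise ValueError(f"unknown activity: {row['activity']}")
        triples.append((row["id"], "madeOf", row["material"]))
        triples.append((row["id"], "suitedFor", row["activity"]))
    return triples


data_graph = generate_data_graph(SOURCE_ROWS)

# In-degree: how many products point at each taxonomy term.
in_degree = Counter(obj for _, _, obj in data_graph)
print(in_degree.most_common())
```

Even in this toy graph the taxonomy terms collect the incoming links, which is the intuition behind saying those instances have the highest PageRank.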
Tim Gasper: So there's also sort of a... To make sure I'm understanding correctly, there's like a fourth aspect of the three things, which is the data, the source data, where everything's kind of coming from.
Katariina Kari: Yes.
Tim Gasper: And then the Knowledge Graph itself is the three layers you mentioned. It's the concepts, the categories which are more empowered and then the data. But in the three layers that data is the connected data, the sort of Knowledge Graph facts.
Katariina Kari: Effectively that. Or the data graph. Data graph is another term for it.
Juan Sequeda: And one of the things when we were discussing this stuff a while ago, I mean we probably started talking about this over a year ago, and it was kind of at the height of data mesh. And I think you remember, it's like, this is an interesting aspect. Connecting it together, the first two layers, the hundreds and the thousands, this is technically something that can, and you'd probably argue should, be centralized. And that bottom layer is something that, depending where you are, can be decentralized or not.
Katariina Kari: This is the thing. Now you can use the three layers to reflect it against many things. In terms of authorship, the first layer, the top layer is centrally defined.
Juan Sequeda: By the way, as you're going through this, can you give some examples specifically to...
Katariina Kari: Yeah.
Juan Sequeda: But everybody who is listening has probably bought something at IKEA so they can't make it [inaudible].
Katariina Kari: So on the top layer we centrally define that we sell products. Product is a class. We have material, they are made out of a material so we can all agree on material, and they are meant for certain activities so we can talk about activities. So that's a very little ontology that we define. Now on the second layer, we have the psychologist who knows so much about our customers and the specialists, they define what those activities are. There's actually a lady in IKEA, one lady who defines the 12 activities that we are designing our products for. And then we have the team that's responsible for the expertise on the materials. So they define the set of different kinds of materials like wood or metal, et cetera. But the products, their instances are not on that middle layer. So they're not there. But the authorship is decentralized for that second layer because it needs to be distributed across experts in the organization.
Juan Sequeda: But there's not multiple definitions of activities.
Katariina Kari: No.
Juan Sequeda: There is one...
Katariina Kari: There's one. There's one.
Juan Sequeda: ...Set of activities that they follow.
Katariina Kari: Exactly. There is a central thing coming too soon. But the authorship is good to know, that for the first one it's centralized and the second one is decentralized. And for the third one, the authorship, a human should not be responsible for, it's more like created through mappings or through recipes and how that is created, because it's in millions and nobody manually writes down every million thing. But in terms of storage, the top layer and the middle layer, they are centrally stored. Because we need to have a central source of truth. A source of truth, like one place from where you can find all the taxonomies, all the definitions from those different expert teams, but they should all be in the same place. In our case, it's in GitHub, in one repository in GitHub, as TTL files, as RDF files. And then the bottom layer can be central, but doesn't have to be. Can be decentralized, can be virtualized, doesn't have to be in a graph database. Can be in a relational database because you can just virtualize the access to it through the Knowledge Graph. And that way you can do this data mesh approach with it because it could sit with the teams owning the data.
Juan Sequeda: I really, really love this kind of partition of these layers because last night, I was just having dinner with some folks and we got to the same topic of how many concepts are there within an organization? It was like, yeah, we were just throwing numbers out. I think about in retail, e-commerce, it's like everybody thinks that they're different. But there are these core concepts that are the same and these are going to be in the hundreds, max. You keep expanding. Then yes, the way you define something specific or other taxonomies may be specific to you, but that probably goes down to the thousands and so forth. So I think this is a great way of figuring out also what are you focusing on? Are you more at a higher level that needs to cover many domains or getting more specific? So I really, really, really like this approach. Okay. So another topic I wanted to talk to you about, you brought this up in your talk today, is: let's connect this with the business. How is defining these ontologies and the knowledge helping IKEA, for example, make more money, save more money?
Katariina Kari: So really the way it helps IKEA the best is, I don't know if many people who shop at IKEA realize that it's a very thought-out experience. Every single detail in the store, someone's gone there and put it there in a special place. And it looks the same if you've gone to an IKEA in Europe, or in the US, or in Japan. Yes, there are some local variations, but it's always a very set experience, always kind of the same experience many times. And the discipline, I find, is almost theatrical, and it's what we call the IKEA magic. And it means that there's a lot of human thought and human touch put into it. Now it's really hard to translate this magic into ikea.com because the way it works today is, we have our coworkers in IKEA stores reading PDFs with general instructions. And then they are very intelligent humans. They are these really good, real intelligence models that then can translate these PDFs and put the things together and it creates this really uniform experience throughout the stores and throughout the world. Online, this is really hard because now we have machines and applications doing the thing and so we need to bring the magic of IKEA online. One thing is that we're selling accessories with furniture. So that's one mechanism, a very established mechanism. So if you look at a sofa and they're on the floor, there are like 25 or so sofas on display. The interior designers working in IKEA match them with accessories, like really nice cushions or throws or a laptop tray or a lint roller to get all the dust off or the cat or dog hairs off. And so online what we do is we talk to the interior designers who have this mental model of how they match accessories with furniture. We translate that into the Knowledge Graph. We say that, okay, it's like indoor sofas should be matched with indoor cushions and throws of certain size, and these care maintenance products. So then we translate that into rules, because the mental model is logical. 
So we translate that into rules using product attributes. Not actual products. We're not matching actual products with others because that's not feasible. We have tens of thousands of products, we shouldn't do that. So we create these general rules from which we then can calculate all of the thousands of sofas that we actually are selling in IKEA with the tens of throws and tens of cushions and the lint roller. And then we combine them automatically and generate these accessory to furniture connections on that bottom layer in the data graph. That's one application we're working on. I mean there's a lot, but this is the one we're working on currently. One before, our first one that we went live with, was how we can upsell products. So customers are looking at a product and we always guarantee the best price, but sometimes we also have other equivalent products where, for a little bit more money, you get really good quality and really thought-out designs. So we used this upsell information and that was actually missing in the data. So the data had one part of it but it didn't really know what to upsell to, so that we are curating in the large graph as well.
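The rule-based matching described here can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not IKEA's implementation: the attribute names (`type`, `setting`), the two rules, the link name `hasAccessory`, and the product IDs are all invented. The point it shows is that the rules talk about attributes, and the product-to-product links are then generated rather than authored by hand.

```python
# Made-up product records; only their attributes are used by the rules.
PRODUCTS = [
    {"id": "sofa-1", "type": "sofa", "setting": "indoor"},
    {"id": "sofa-2", "type": "sofa", "setting": "outdoor"},
    {"id": "cushion-1", "type": "cushion", "setting": "indoor"},
    {"id": "cushion-2", "type": "cushion", "setting": "outdoor"},
    {"id": "throw-1", "type": "throw", "setting": "indoor"},
    {"id": "lint-roller-1", "type": "care", "setting": "any"},
]

# Each rule pairs a furniture pattern with an accessory pattern.
# A pattern maps an attribute name to the set of values it accepts.
RULES = [
    # Indoor sofas go with indoor cushions and throws.
    ({"type": {"sofa"}, "setting": {"indoor"}},
     {"type": {"cushion", "throw"}, "setting": {"indoor"}}),
    # Any sofa goes with care and maintenance products.
    ({"type": {"sofa"}}, {"type": {"care"}}),
]


def matches(product, pattern):
    """True if every attribute in the pattern has an accepted value."""
    return all(product.get(attr) in allowed for attr, allowed in pattern.items())


def generate_links(products, rules):
    """Apply attribute-level rules to produce product-level links."""
    links = set()
    for furniture_pattern, accessory_pattern in rules:
        for furniture in products:
            if not matches(furniture, furniture_pattern):
                continue
            for accessory in products:
                if matches(accessory, accessory_pattern):
                    links.add((furniture["id"], "hasAccessory", accessory["id"]))
    return sorted(links)


links = generate_links(PRODUCTS, RULES)
for link in links:
    print(link)
```

Adding a new sofa or cushion to the product list requires no new rule; rerunning the generation picks it up, which is what keeps the approach feasible at tens of thousands of products.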
Tim Gasper: Interesting. Interesting. I know one other use case that you talked to Juan about in the past is around interior designers and being able to get some of the knowledge out of them and into a more structured format. Can you talk a little bit more about that example as well?
Katariina Kari: Yeah, so it's exactly that accessory-to-furniture knowledge, the mental model that the interior designers hold. So that's the first thing where we're trying to pick the brains of the interior designers, at least with this little mechanism of matching accessories with furniture. But there's more that we could do because we can also... Like the next one that we're tackling is product similarity. So with interior designers, somebody comes in and it's like, " There's this white sofa I saw online, do you have it?" It's like, " No, sorry, that's been long out of stock. I'm sorry, we stopped selling that last season. I'm really sorry. But why don't I interest you in this one, which is very similar?" And the interior designer knows why it's similar. Not just like, oh, white couch here, white couches are over there. But oh, that one. Yes, I know the designer. And yeah, the style was, okay, this is beige, but it really drives the same thing and it kind of gives it a nice new look. So that information, that's like the magic and quality you're getting in IKEA-
Juan Sequeda: Couldn't you just argue, why do we need all this knowledge? Let's just go look at machine learning, throw all this sales data at it, create models out of that. Why isn't that enough?
Katariina Kari: Yes, you could do this with computer vision, visual similarity. It doesn't really pick up on the style nuances. So the statistics mostly don't get to that level. It doesn't get to that level of style similarity because there are certain details that are always slightly different but make it the same style. So it will get to the shape, it will get to the color. And the other thing it doesn't reach when you do computer vision is price level. So price level is another one. If you're operating in a certain budget, we want to match that with a similarity, and I don't think computer vision can. We haven't seen it being able to match that. Plus it currently has a hard time recognizing these products. This is faster. This is just faster than training the model sometimes. And that's why we are doing that approach, because we already have that knowledge. And it is just vector similarity, it's graph embedding similarity, there are algorithms there with which we can already do this inside the graph, because the products and their information are in the graph. And it's just easier to do it with Knowledge Graphs.
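Note: the combination described above, vector similarity plus a price-level constraint, can be illustrated with a small sketch. It assumes each product already has an embedding vector (for instance one derived from its attributes in the graph); the vectors, product names, and price levels below are invented for illustration, and `most_similar` is a hypothetical helper rather than any library's API.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical embeddings, e.g. learned from style and designer attributes in the graph.
embeddings = {
    "white sofa (discontinued)": [0.9, 0.1, 0.8],
    "beige sofa": [0.85, 0.15, 0.75],
    "white premium sofa": [0.9, 0.1, 0.8],
    "white armchair": [0.2, 0.9, 0.1],
}

# Hypothetical price levels, stored as structured knowledge alongside the products.
price_level = {
    "white sofa (discontinued)": "mid",
    "beige sofa": "mid",
    "white premium sofa": "high",
    "white armchair": "mid",
}


def most_similar(query):
    """Return the closest product by cosine similarity within the same price level."""
    candidates = [
        name for name in embeddings
        if name != query and price_level[name] == price_level[query]
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda name: cosine(embeddings[query], embeddings[name]))


print(most_similar("white sofa (discontinued)"))
```

In this toy data the premium sofa has the same vector as the discontinued one, yet it is filtered out because it sits at a different price level, which is the kind of constraint a purely visual model would not see.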
Juan Sequeda: One of the interesting things that you've been saying in these last examples, and this may be just specific to IKEA, but I think it resonates with different types of organizations, is this magic, this experience that you want to go offer. And so, you could argue that if you're truly customer focused and there is an essence of the organization that you want to transmit to your customers all the time, then it's that personal touchy feeling that you want to be able to... The knowledge that's being active around it. If it's all about like, oh, just leave it all mechanic. I'm just going to sell things, go in and out, I don't care where you are, you just... Then yeah, maybe that's not it. But maybe your competitive edge, the differentiator is that magic. Yeah, how do you deal with that? I think this is what I'm thinking.
Tim Gasper: One thing I want to ask that's related to this, it goes a little bit technology and abstract oriented, but I'm very curious about your opinion, Katariina. So when the AI movement really started to pick up steam in recent times, sort of the recent set of iterations around it, first the excitement was centered a lot around the algorithm design. It was algorithm-centric, architecture-centric AI, and deep learning, et cetera, et cetera. Computer vision, exciting enhancements. And then we've kind of entered the second phase of that, which I think Andrew Ng, whether he came up with it or not, really pushed, around data-centric AI. Oh, we've kind of maxed out. I mean there's always going to be algorithmic innovation, but really it's all about the data. You've got to have the right data. Are we moving now into a third paradigm of realization here, which goes to the comment you made about how fast you can train the model? That yes, you could throw a billion pictures of IKEA products at a computer vision product along with some unstructured data, some tagging. And that's an approach and you're going to get somewhere, something useful will come out of it, but also a lot of things will not be useful with that. Are we moving into a third paradigm now? And I'm going to get into the bad habit that I often do with Juan here and start coining things, but let's call it knowledge-centric AI. Actually, what we're doing is we're designing knowledge, and sure, there are a lot of practical reasons why you do that. But one very practical reason for data teams to do that is actually to accelerate the time to train the model, make the model more accurate, make the model smaller. Anyways, I'm kind of ranting here. I'm curious about your thoughts on what I'm saying so far.
Katariina Kari: I mean, I don't know. We are all converts. So we'll be saying yes, knowledge, that's the way it should be. But it's a use case by use case thing. And there's also one more aspect. Instead of training a model to do one thing, let's rather capture the why, the rule that drives it. Take the knowledge-driven approach and capture this why. Because there's one more reason for it. It's not just that it's smaller and you can make it sometimes faster than teaching it. But it's also because somebody else can reuse it. So somebody else, who has a completely different use case, is happy that you organized that part of the knowledge for them in a machine-readable format, because now they can come and reuse it. This is actually super funny. So I said that we could do these product similarity recommendations with computer vision. Well, one fun twist to the story is once we started doing this, these accessories, these pillows go well with sofas, these throws, the computer vision guys came to us and said, you have structured information that a pillow can usually be found next to a sofa. Give that to us. We need it because our machine learning model that's trying to recognize products out of pictures needs that. Because now it's trying to guess from all the products. But your information, that things that actually go well with each other are usually found together with each other, helps us to bring down the level of calculation and make it more accurate.
Juan Sequeda: So I really like how you framed this, Tim, about an algorithm-centric AI, data-centric, and then a knowledge-centric one. And so my first thought was maybe this is a maturity curve. Like oh yes, you need to get into the knowledge, and knowledge is better than just the data, better than the algorithm. And I don't know if it's better, but I do probably see this as a spectrum of exclusivity. Or maybe not even exclusivity, it's more like accuracy or something. So for some use cases you're maybe fine with just the algorithmic approach, and for some other use cases, with data you can get better. And then for other use cases, you need knowledge. But there may be use cases like, oh, with the data it's enough. Actually investing more into the knowledge is not going to get me better, improve my accuracy, whatever. So the ROI on doing that is not worth it for what I want to go do. I could just accomplish it with data, without the knowledge. But I think in this particular example of the pillows, I can imagine in your case, oh, they kind of hit a wall. We can't improve this anymore and it's actually not enough. We really need to go improve this to make this better, to make this actually usable. And they're like, oh, there's this other knowledge thing that could help make it better. But for other scenarios they probably didn't need it.
Katariina Kari: Exactly. But this is the paradigm shift we're talking about. This is that we just need a better model for data. So it's not enough for the data to lie around, and that's again the passive approach, but it's good for somebody to do a bit of con for the data. Yeah, it's actually really good because then it's not only that I understand it, but others will understand it and the machine will understand it better. And you said, Tim, you want to keep your house clean. There's a reason why you want to keep it clean. And this is about keeping your data clean and organized, put into nice boxes from where you can pick them up and find them better.
Juan Sequeda: So before we head out to our lightning round, there's one more topic. And it's on a topic that we've discussed over last year and so forth, which is upskilling the subject matter experts to be these knowledge savvy ontologists. We upskilled the interior designers. And we've talked about this for a while, you've been very kind of on this mission. What is this? How's it going?
Katariina Kari: So last year, I was talking about how we have this grand vision. We are going to take IKEA experts and we are going to teach them about ontologies and taxonomies. And I was telling you about all these games that we are playing for them to learn. This is a class, this is an instance. Just a lot of different thought work that we put into it. And that was because we were working on this hypothesis. We're going to take these IKEA experts that have almost 20 years of IKEA interior designer knowledge. And then they're going to be writing RDF in the end and they're going to be supercharged, fantastic IKEA Knowledge Graph inputters and authors. And since then we discovered and realized that it's not feasible for them to become full-fledged. So we first thought that that's going to happen. We'll just tell them about, oh-
Juan Sequeda: But why did you think that was something possible?
Katariina Kari: Well, we thought that that was possible. We just assumed it because my boss, Adam Garastej, who actually started the whole IKEA Knowledge Graph project and hired me, he's like that, because he started as an interior designer at IKEA. He then later went into UX design. There, he saw all these problems with data and learned about Knowledge Graphs and realized, that's what I need to drive better experiences. And now he can read TTL, he writes SPARQL, and he thought, yeah, we're going to have other IKEA people who are going to become like me. Then we realized that it's kind of hard. It's really hard to be both. To know so much about this humanistic side and to then also be able to translate it to computers and think like an ontologist, ultimately either you're a special person like Adam or you really need a degree in IT to be that. So we kind of changed our strategy a little bit and now we're seeing them more as tool-enabled, Knowledge Graph aware domain experts. So we're now paying much more attention to getting them the right kind of tooling. They're still participating in my SPARQL course that I keep for our other developers and they're kind of learning about it. And I think one of them is actually now going to a Udemy course to learn about RDF and SPARQL. With time she might actually learn this. But the other said, " You know what, I'm not interested in learning to become an ontologist. I just understand what we're doing here and I want to keep on doing it. But give me the tools with which I can do this." And we just saw that it's [inaudible] time spent here.
Juan Sequeda: So in a way it's like, oh, we saw this unicorn, which is your boss. Like oh, we can do more of that. But in reality, that was harder.
Katariina Kari: He wanted to clone himself and then he realized that's not necessarily possible. Maybe he'll realize that he's actually special person, like a unicorn.
Tim Gasper: I have one more quick question before we go to our lightning round and stuff, which is, as you look to the future within IKEA, is there a use case or something that you're most excited about that is maybe futuristic, or pushing the envelope on things, or that you think is very interesting?
Katariina Kari: Yes. Cookieless personalization on ikea.com. That's my pet peeve currently. That's what I'm super excited about. Ethical personalization. Transparent, non-creepy recommendations, things that say, " You clicked on this and this, this is the kind of idea we're having about you. Is this correct? And I noticed that you were looking at baby beds, but you're not looking at them anymore. Should I forget this?" So, like, a personalization that's actually decent and yeah, I always say no creepy.
Juan Sequeda: I like that. That's the honest, no BS there. Yeah, all this personalization that's freaking creepy.
Katariina Kari: Yeah. Or just stupid. Most of the personalization is just stupid. I booked the flights already. Shut up. I don't need to book them again.
Tim Gasper: Yeah. Yeah. Or Amazon telling me, " Are you ready to buy that vacuum cleaner?" I bought one last week.
Katariina Kari: Sure.
Tim Gasper: Come on.
Juan Sequeda: All right. Time flies when you're having fun. We got to get to a couple more things. I know [inaudible] too. All right. AI minute, one minute to rant about anything you want about AI. Ready, set, go.
Katariina Kari: Okay. AI is just another tool. Don't be so emotional about it. It's not going to kill you unless you let it kill you. So forget about Terminator, you're your own Terminator in your passiveness. Just see AI as another tool. Also, AI art. It's just like photography. It is just another tool. We had fine arts with painters, and then we had cameras and photography, and it was like, " That's not art." And now they're doing AI art and it's like, " That's not art." Well, if there's a human behind the AI art, it's going to be art.
Juan Sequeda: All right. Less than a minute.
Katariina Kari: 40 seconds.
Juan Sequeda: Perfect. You got your thing about AI. All right. Lightning round presented by data.world. Let me kick it off. So is the rise of large language models, like ChatGPT, causing an inflection point for Knowledge Graphs?
Katariina Kari: What's inflection point?
Juan Sequeda: I mean, it's a tipping... We're tipping, right? Things are changing because-
Tim Gasper: Hockey stick.
Katariina Kari: It's not. No, it just kind of gives us more fire under the ass to say, now you really need us and this is actually necessary. Maybe it's becoming so fast and so ridiculous that we now need that grounding that we get from Knowledge Graphs. So in that sense, but I don't think it's a tipping point for us. It's been a long development to get here.
Juan Sequeda: All right. Tim, you go next.
Tim Gasper: Maybe if LLMs are helping us go to the moon, sometimes we need a little gravity to keep us settled.
Katariina Kari: Gravity gets us back from the moon as well. So we can get home.
Tim Gasper: That's true. We have to make sure we can come back too. Second question is, was it easy to teach folks about Knowledge Graph at a company like IKEA?
Katariina Kari: Yes, because in IKEA, the humanistic side is so important. So when I told them that this is the non-creepy AI, this is the one where humans have a say, and that machines kind of know a little bit more about humans, they were super adaptable to it. They didn't like machine learning. They were like, that's horrible. They were getting the Terminator creeps and they're like, " I don't want it." But when I told them, hey, we have this friendly, transparent AI where the human takes control over everything, and that's the Knowledge Graph, they actually really took to it.
Juan Sequeda: It's humanistic. This is interesting, an interesting pattern. They're like companies who really care about people.
Katariina Kari: Yeah.
Juan Sequeda: Next question. Will AI development teams learn and build Knowledge Graph expertise on their own? Or do they need to hire externally?
Katariina Kari: I think everyone who invests in Knowledge Graphs finds themselves needing to hire an ontologist. Developers can learn SPARQL and they can learn RDF as a standard. But they do have a hard time with this ontological thinking where you need to design for description logic. So they do need those people.
Tim Gasper: Wow, I just took a note on that. I think that's very interesting, that developers can learn SPARQL, but you need an ontologist for the description logic. I think that's a really good takeaway. All right, last lightning round question for you. Will data teams be the ones to drive this knowledge capture and knowledge design, or should a different team be doing that?
Katariina Kari: No, it should be the business. It should be the management who's saying, I don't understand our data. I cannot ask this question, or this question takes 20 days to be answered. I want it to be answered in 20 seconds. So definitely it needs to be management who drives the [inaudible].
Juan Sequeda: All right. I was actually thinking you were going to say the data team. Okay. Good.
Katariina Kari: They're too low in the food chain.
Juan Sequeda: More honest, no BS coming out.
Tim Gasper: Fewer data engineers asking for Knowledge Graphs. More CEOs asking for Knowledge Graphs.
Juan Sequeda: There you go. All right, we got a lot of notes. Tim, take us away. Takeaway time.
Tim Gasper: Takeaway time. So we started off by talking about how humans shouldn't be run by technology. We shouldn't be passive, we should take control. We should have agency, we should be active with technology, and things like ChatGPT provide a new opportunity for us to be kind of lazy and let it run us instead of us doing the work to make it smarter, to make it better, to make it more responsible, for it to do better things for us. You talked about mobile phones as an example. You don't have to throw away the mobile phone in order to get control of things, but you do have to take active action. We should take charge of the fact that AI is spouting BS and saying things that aren't facts. So we need to take charge and we have to accept that that's crappy. That can't be something that we accept. It has to be something we address. And we don't have to take the attitude of it is what it is. We talked about whether there is a difference between companies taking an active approach with AI and with technology versus consumers taking an active approach. And you mentioned that it's not necessarily that different. Even though maybe consumers have a little less agency, like in the phone example, you can only configure the notifications based on what it lets you configure. But you do still have some control. As companies, maybe we have a greater realm of control. But ultimately, we still have to choose to take that action and choose to take that active role. In the context of large language models, everyone can take more action as users of the technology. For example, organizations could be creating a Knowledge Graph to create structured knowledge and passing that on to the AI. Even consumers can do something like this, where you can tell ChatGPT some facts, some context, either as part of your prompt or part of your session. And now it's going to have more grounding, right? 
It's going to spout less BS, less hallucinations. So I think this is a new skill that we're all learning around how to curate structured knowledge and use it to get better results, get better answers, get faster answers. Whether you think of it as a Knowledge Graph or not, that's essentially what you're doing. You're building a little tiny Knowledge Graph in the context of that right there. Last, before I switch it over to you, Juan, we talked a little bit about machine learning versus AI. You really emphasize the importance of this humanistic factor that when you take more of a structured approach to knowledge, you're really, you're creating facts and you're giving AI facts. And in the past folks may have told you machine learning does it better than KG or something like that. But there's a shift happening where folks are saying, oh, structured knowledge would actually make my model better. And there's a paradigm shift starting to happen. Similar to what happened with DevOps maybe 10 years ago around building, deploying, and managing software. That same thing is starting to happen now with machine learning, KG, and AI. Juan, over to you.
Juan Sequeda: Well, so much to go. So we went over the whole three layers of the Knowledge Graph: the hundreds of concepts, the thousands of categories, the millions of data. The concepts are going to be like, for example, for IKEA, it's like, oh, a product, that's a class. It's a concept. They're made of materials and they have activities. That's that first layer. It could be hundreds of those. And then we go into the second layer, which is going to be the thousands, where now we get into like, okay, so we are going to go do activities. Well, there are 12 activities actually. In IKEA's case, it's just literally one person who's in charge centrally to go define those things. And then there are the types of materials. There's wood, and metal, and so forth. Those first two layers should be stored centrally, so everyone can find them. And that third layer can go to the millions. This could be central, this could be decentralized, it could be in a graph, it could be in a database, it could be virtual, whatever it is. And this is where I'm starting to connect all the different data back to those two first layers. And this should be automatic. Nobody's going to be manually curating any of these millions of things. So I really like these three layers. And you have a blog post about that. Highly recommend folks go take a look at it. Let's connect this to business value. How is this providing value? Making more money, saving more money? I think it's really interesting for a company like IKEA that you have this whole human side, the IKEA magic, where when you go inside a physical store, things are presented in a certain way because it's been thought out. Now, how do you translate that magic into the digital, the .com? Well, it's not there and you want to get it there. So providing all that expertise in the form of the Knowledge Graph, that's how you're achieving it. 
Because at the end, you want to go sell more accessories with the furniture, make it easier to go shop. And I think mining, kind of understanding how interior designers think about it, helps to go drive upsells, product similarities and so forth. At the end, people are already doing things with computer vision, but it's actually not as good for managing prices. So this helps for some sort of computer vision. We had this discussion around going from an algorithm-centric AI, to data-centric, to knowledge-centric. And it seems like, yeah, you really need to understand the use cases, and depending on the use cases, you need to invest more in this. And then we kind of wrapped up with, hey, upskilling subject matter experts. It kind of seems like a great idea in theory. And maybe there are some possibilities that you could upskill them, but those are probably going to be unicorns. So really what you've kind of adopted is that these subject matter experts need to be tool enabled and Knowledge Graph aware. And finally, what are you excited about next? Non-creepy, cookieless digital personalization. All right. How did we do? Anything we missed?
Katariina Kari: Well done. I felt like my life flashed before my eyes.
Juan Sequeda: Well, this was all you. So hey, let's wrap this up. Three final questions quickly. What's your advice? Who should we invite next? What resources do you follow?
Katariina Kari: My advice is go and look at other fields. Go look at what's happening in the arts. How are they talking about... No, not knowledge, they're not talking about... But how they're talking about these phenomena. Pay attention to the emotional reaction to the technology we're building, and understand it a little bit better to play with that. Because even though technologists know more about technology, they will still have those emotional reactions. So look at how art is discussing artificial intelligence. I don't mean watch Terminator. There are also other things that are really nicely working with it. And then what was the second one?
Juan Sequeda: Who should we invite next?
Katariina Kari: Who should you invite next? There's a lot of really great names here. I'm not sure if they've been here already. Just I think you should really talk to... I've been really impressed by learning more about WordLift. And Andrea Volpini, I think you should talk to next.
Juan Sequeda: Yeah, we have not had him. No.
Katariina Kari: And what they're doing, WordLift looks really promising. They're really on it to do SEO, but now not only for people searching on Google, but for people searching on ChatGPT. So that's super interesting. That's one of the talks that really impressed me. And then third one?
Juan Sequeda: What resources do you follow?
Katariina Kari: So I read newspapers. I really love any sites that talk about culture. So my most favorite Saturday pastime is to read the Financial Times Weekend Life and Arts section. So the Life and Arts section of FT Weekend, because it always has this beautiful, very well written journalism about literature, world events, especially interesting culture, things happening in Africa. They cover that really well. There's always a person that they have lunch with. And sometimes I dream that maybe, who knows? FT will have lunch with me. I'm not that person. But I used to dream a few years ago that I would be on Catalog & Cocktails, and here I am. So maybe next time in a few years, in 20 years, I'll be at lunch with the FT. That's my dream. And for me, it's like my women's magazine. I relax when I read about culture, when I read about just somebody having really thought-out reflections on things that are happening currently.
Tim Gasper: I love that recommendation. People sometimes get so steeped in technology, and business, and stuff like that. To look at it from the arts angle is such a good contrast and a great way to get a wider view.
Katariina Kari: One of the best tech comments I found on FT Life and Arts.
Juan Sequeda: Well, there we go. Well, Katariina, this was fantastic. We went through so much and we learned so much about your experience and perspective, applying this at IKEA. Just a quick reminder, next week we have Maddy Want, who is the VP of Data at Fanatics Betting and Gaming. And she's also the author of Precisely: Working with Precision Systems in a World of Data. We have tons of digital transformation stories to go through, so it's actually going to be hard to pick her favorite ones. But really excited about that. With that, Katariina, thank you so much. Thank you, Tim. Thanks, data.world, who lets us do this every Wednesday. And we're off to this really nice rooftop bar right now to go have a drink and look out over the Manhattan skyline.
Katariina Kari: Yes.
Tim Gasper: Y'all enjoy. Cheers.
Juan Sequeda: Cheers everyone.
Speaker 1: This is Catalog & Cocktails. A special thanks to data.world for supporting the show. Karli Burgoff for producing, John Williams and Brian Jacob for the show music. And thank you to the entire Catalog & Cocktails fanbase.
Speaker 1: Don't forget to subscribe, rate, and review wherever you listen to your podcast.