About this episode

Is your data infrastructure designed for the next decade of data work? Probably not, but are you at least heading in the right direction? It’s an existential question that often reveals fundamental flaws in the system. 

Join Tim, Juan and special guest Mammad Zadeh, former VP of Engineering at Intuit, for a deep dive on data tools, leadership, and change management.

Special Guests:

Mammad Zadeh

Engineering Executive

This episode features
  • A look at the rocky road ahead for CDOs
  • How to effectively scale data teams
  • What’s your crisis comfort food?
Key takeaways
  • Data now finds itself at the front, as a product, no longer at the back door.
  • Over-centralization (such as all the data engineers sitting in one silo responsible for everything) — it doesn’t scale.
  • Data leaders need to zoom out from just the technical deep view and become more business-oriented.

Transcript

Tim Gasper:
Catalog and Cocktails and hello to everyone who’s actually joining us from our data.world summit, which was an event that happened today. We’re happy to have you here and continue the fun with some cocktails in hand, some tasty beverages. This is a weekly live hangout and honest, no BS, non salesy conversation about enterprise data management. I’m Tim Gasper, longtime data nerd and product guy joined by Juan Sequeda.

Juan Sequeda:
Hey, I’m Juan Sequeda. I’m the principal scientist here at data.world and it’s Wednesday. It’s middle of the week. End of the day. We started out the day with our summit and we were having coffee and now we moved to a cocktail, so it is always a pleasure to kind of take a break and chat about data and today we have a very special guest and I want to give a little bit of a backstory. Data mesh is such a hot topic. Back in episode 44 … Can’t believe this is episode, I don’t know, 55, almost 60, something like that.

Tim Gasper:
Yeah, something like that.

Juan Sequeda:
Anyway, that was back in April. We had Zhamak Dehghani as a guest and actually she was a keynote today at our summit. And we always ask our guests who we should invite next, and Zhamak said we should invite somebody who is her coach, who is her mentor and, in her words, a very quiet but a very wise man who has seen a lot. And that is our guest today and that is Mammad Zadeh. He is the former VP of Engineering at Intuit, he’s previously been at LinkedIn where he did Kafka. He has been at Netflix, at Yahoo. He has seen so many things and I am so excited to spend this afternoon with you, Mammad, and talking about data and everything that you’ve been seeing. So cheers. Great to have you here.

Mammad Zadeh:
Cheers. Thank you so much and great to be here with you, Juan and Tim, thanks for the invite. And it’s great to … Well, cheers.

Juan Sequeda:
Cheers. So let’s start out with what are we drinking and what are we toasting for?

Mammad Zadeh:
I don’t know what you’re drinking, but I’m drinking … It’s a little early, so I’m drinking a little bit of Amaro here, so cheers.

Juan Sequeda:
And you actually mentioned it was your homemade.

Mammad Zadeh:
Yeah. It’s a little side thing I do, so that’s what I’m drinking. I don’t know what you’re drinking.

Juan Sequeda:
Well, I’m having a mix of a bourbon and I’m back on my passion fruit. I have a bunch of passion fruit water and syrup so I’m mixing it up with Waterloo watermelon water, and it’s really refreshing and some bitters in it. So that’s my drink right now.

Tim Gasper:
That’s cool.

Mammad Zadeh:
I just planted a passion fruit plant this weekend so I’m hoping I’ll get some fruit.

Tim Gasper:
Oh, that’s awesome.

Mammad Zadeh:
Yeah.

Tim Gasper:
Yeah. I am drinking a sidecar. I’m trying a new category here, getting into the cognac a little bit. So doing cognac, orange liqueur, and lemon. And I know a lot of people argue about what are the right ratios for a sidecar. So I’m doing two ounces of cognac, one ounce of orange liqueur and half an ounce of lemon for those that are nerding out about sidecars.

Mammad Zadeh:
That sounds about right.

Juan Sequeda:
That’s a very boozy drink, and I’m going to cheer to the fact that we just finished our summit and we had a successful event. We had fantastic speakers, from Zhamak to Doug Laney talking about infonomics and, I mean, Barr Moses talking about data ops, and so many different topics.

Mammad Zadeh:
Excellent.

Juan Sequeda:
So cheers to that. How about you? What are you toasting for?

Mammad Zadeh:
Well, I’m always toasting to health, happiness and prosperity, but I just found out that we won the game a few minutes ago so cheers to Man United fans.

Juan Sequeda:
Cheers to that. The Champions League is going on. So we have our warmup question. Today we’re going to be talking about the data existential crisis, so when you’re in an existential crisis of your own, what’s your comfort food?

Mammad Zadeh:
Ice cream.

Juan Sequeda:
What type of ice cream?

Mammad Zadeh:
Plain vanilla. It’s got to just … Just gobble it down. That’s the comfort.

Tim Gasper:
Nice.

Juan Sequeda:
How about you, Tim?

Tim Gasper:
Well, I’ll say I am a vanilla fan as well, either Ben and Jerry’s or sometimes Tillamook if I can get my hands on it, but in terms of comfort food, I would say Dolsot Bibimbap. So I’m half Korean. I love my … I love a little bit of Bibimbap, which is Korean beef, rice, and veggies with fried egg and a little spicy sauce and if you can do it in a hot stone bowl, even better. So that’s my comfort food.

Juan Sequeda:
Wow. I’m going to go I like pizza and I will confess that I will eat Domino’s thin crust pepperoni and sausage. That’s my very specific comfort food, so.

Mammad Zadeh:
Very nice. Very nice.

Juan Sequeda:
All right. Well, we’ve talked enough about food. Let’s talk about data, and so actually earlier today I was chatting with Zhamak and she mentioned to me that you have bottled up a lot about the dysfunction of data in the enterprise. So let’s kick it off with an honest, no BS question. What’s wrong with enterprise data today?

Mammad Zadeh:
I don’t know if I’ve bottled it up or not, but I mean, for those of us who’ve been on … working on that stuff, I think it’s … We’ve picked up, over the years, bits and pieces of what works and what doesn’t. I’m not necessarily … I wasn’t going to use big words like crisis, existential crisis, or anything like that. I think I just see us on a continuum. I see that we’ve learned a lot in the last two, three decades and data has grown exponentially. The value that we want to get out of data has also grown. In fact, it’s become … I think the main thing that I see is that data has found its way into the mainstream of the product offerings, and I guess in the old days, that wasn’t the case.

Mammad Zadeh:
In the old days, it was kind of a side thing. It was kind of analytics as a back office thing, but those days are over and we now have features and capabilities within the product offerings that directly are a result of the analytics work and the machine learning and all that stuff that’s happening. So the traditional ways that we have organized ourselves around data, which usually has been moving to a centralized organization to deal with the complexities of data, aren’t really addressing the needs of today. So I think that’s where we are. If it’s a crisis or not, maybe. I think we need to find a solution, so that’s kind of where my head is.

Juan Sequeda:
All right. So let’s kind of go back a little bit. I like how you’re establishing that we’ve had … Before data was more kind of the back door. Let’s just make sure that it’s keeping the basic stuff running about analytics and BI reporting, and now we’re getting into this world where data is kind of like the front matter, right? It is this notion of treating data as a product and that’s a switch, right, because before we just need just kind of the typical old legacy tools, right? We need our data warehouse so we need our ETL, our BI tools, and that was it. But now we’re taking it to the next level and that transition is when we’re realizing, “Wait. The way we’re doing things beforehand, it’s not working for the stuff we want to go do now,” and that’s something that we’re being … That we were seeing that’s happening in the what? Last five, 10 years or?

Mammad Zadeh:
Yeah. Yeah, I think that’s about right.

Juan Sequeda:
Okay. So what is it that we’re … What are like the top couple things that you’re seeing that we must either stop doing this old thing that’s not working for this new thing or we need to go update it? Are there things that we just stop doing and something new we need to go do or something we need to go transform? How are you seeing this?

Mammad Zadeh:
Yeah, I think fundamentally, and this goes back to I think the heart of what the promise maybe that data mesh is offering, which is decentralizing or breaking that centralized organizational structure that we normally see, especially in bigger companies where you have like a bunch of data engineers sitting in a central team that are responsible for collecting the data, making sure the data is good, all that, and getting it to the hands of the consumers who want to do something with the data.

Mammad Zadeh:
I think the problem with that model is, first of all, it’s not scalable. The accountability is not quite in the right place and frankly, the subject matter experts are really within the engineering domains where the main product offerings are happening and I mean, it’s very common these days to see, I would say, data related or data product related features on the backlogs of the main product offering.

Mammad Zadeh:
So we’re seeing that sort of traditional dichotomy, that traditional chasm, that existed between the operational world and the analytics world that’s kind of blurring up and the old structures aren’t addressing that. So I think that’s kind of where my head is and what are the better ways to organize ourselves and put the accountability where it needs to be?

Tim Gasper:
Yeah-

Mammad Zadeh:
That is, to move it to the domain engineer, move a lot of that responsibility to the domain engineering teams but then that leaves a gap. That leaves us with the question of, “Do we have the right tools? Do we have the right infrastructure? And do we have the tools that the average engineer on the product offering side would be able to do the kind of things that we expect them to do?” And I think the answer to that is no and that’s where we also need to make progress on kind of, I would say, eliminating that super extra specialization of skills that are required to do some of the stuff that we do on the data side.

Tim Gasper:
Yeah. That’s interesting. Obviously, as you get into this topic, it’s starting to set up a little bit around, “Hey, maybe this solution is the data mesh,” and that kind of thing but just before we start to dive into that, just focused on this sort of dysfunction that you’ve identified here around over centralization and some of the issues that can come with that, right? Scale issues, accountability issues, subject matter expertise issues. Has this always been a problem in your view? Is this just like enterprise data used to be hidden in the back office and now it’s in the front office, it’s in the spotlight or has it been exacerbated not just by that, but like the advent of AI, the advent of self-service analytics? Is it the combination of both of these trends that is forcing this more into view?

Mammad Zadeh:
I think so, and again, I’m not necessarily saying that what we’ve done in the past was wrong. All I’m saying is that looking ahead, it’s not going to work anymore. I firmly believe that we’re on an evolutionary path and we learn and go. Maybe back in the day, it did make sense to quickly get a bunch of people that know how the stuff works to get something going, but the problem with that approach is that it’s not a durable approach, it’s not a scalable approach. It might get you out of hot water for a while but … The other thing that I kind of want to throw in here is that this is an engineering problem. Like any other engineering problem, you have a problem, you need to think about the right architecture and you need to put that in practice and I think we kind of maybe were a little lazy back in the day. We were just focusing on a few important infrastructure level tooling and kind of not really taking the bigger view of the experience and-

Tim Gasper:
Thinking small, thinking technical, that kind of thing?

Mammad Zadeh:
Well, thinking nerdy. Yeah. Thinking deep, as opposed to what is the enterprise problem here and what is the right solution and what do we need to build in order to make that happen? I think we need to invert that a little bit, because today it’s all about the tools and the things that we already have. The big databases, the query engines, the stuff that you need, and they’re very important to have, but I think we need to take a fresh look at what is the right solution, big picture? What is the right experience for the engineers? What’s the right experience for the consumers? I call them producers and consumers of data. And then see if we have the right tools or not and what are the missing pieces in the infrastructure tool chest that we have and start working on them, and then that brings us into interoperability. Do we have to go down one path with one company or is there actually a way to pick and choose and have interoperable bits and pieces of infrastructure?

Juan Sequeda:
So you … A couple minutes ago, you were saying that this was an engineering problem, but you’ve just described more of like the social aspects about it.

Mammad Zadeh:
Yeah.

Juan Sequeda:
So … And I completely agree that we focus so much on the technical nerdy stuff and I think that’s always a problem is that we jump into the technical stuff. We dive into the … We’re at the thousand foot level and we forget the big picture about it and we define success from a technical point of view instead of understanding who are the consumers and so forth, but is it really … I’m going back to the thing you said, it’s an engineering problem. I don’t think it’s an engineering problem.

Mammad Zadeh:
So this is where Conway comes into the discussion, right? Conway’s law is alive and well here. How we organize ourselves is a big part of how we architect and design things, and they go hand in hand. So, yeah, you identified that link, which is great.

Juan Sequeda:
Okay. So I do want to go … I want to talk about organizations and people and structure and teams and stuff like that. I want to talk about tools, but before we get there, we need to go talk about data mesh here. I know that you’ve been part of seeing the birth of this in supporting Zhamak and other folks about it. What is your definition of data mesh and how do we get there and who actually needs to get there? Does everybody need to get there or not, and kind of open the floor there.

Mammad Zadeh:
Yeah. I mean, I’m sure if you ask Zhamak the same question, she would have a much better, more eloquent answer to what is data mesh, but what excites me, or what really interests me about it, is that central to the discussion about data mesh is the notion of decentralized ownership of data, or I should say decentralized accountability of data, where we’re trying to say that if you are an engineering organization that today has a whole bunch … A set of microservices or whatnot, that is offering your functional capabilities on the operational side of the business, it is your responsibility also to make sure that the consumers of the data that you produce can get ahold of what they need with the right SLAs, quality assurances, trustworthiness, security, all the aspects that go along with data … I mean, data and functionality should be treated the same by the domain engineering teams and that’s, to me, what I think is needed, and if data mesh is the vehicle to get us there, then I’m on board.

Juan Sequeda:
Data and functionality should be treated the same. I think that’s a beautiful sentence right there. I think that’s something that we miss, right? We think about data as the bits by itself, but you got to go do something with those bits-

Mammad Zadeh:
And yeah, I mean, maybe I should have said there’s … And that’s really the key here, that … There’s new functionality that we’re coming up with based on data. When we’re building ML models, we are essentially introducing a new service, sort of, a new function, a new feature to the product. So all of that engineering rigor that goes into building the functionality of the product applies here as well and I think it does … Today, it kind of doesn’t. In some places, maybe it does, and some companies probably are doing it better and some aren’t, but again, I go back to two, three decades ago, right? Used to be, “Hey, I’m an analyst. I need to run some queries on your database.” There was only one database, right, and then we engineers would tell them like, “Yeah, well, you can’t do it during the day so why don’t you come back at the night when nobody’s doing something, and then you can run a bunch of queries.” Those days are over. We are now much, much more integrated. Data is much more integrated into the mainstream of our products than it used to be.

Juan Sequeda:
Yeah. I think that’s one of the issues that we have there, is that that’s the mindset that we need to say, “It’s over,” right? Before, it was like, “Oh, you can only run things on the weekend. You can only do things at night.” This is over, and if you still live like this, you got to … You can’t live like that anymore. Period. And so many companies still live like this. So on that aspect, what is your … What are your recommendations? How do you start getting over that?

Mammad Zadeh:
You mean getting over which aspect of it, specifically?

Juan Sequeda:
Well, I mean … Well, that, I mean … Well, if that’s my problem right now, like if, and I encounter this all the time, right? “Yeah. We can only … We cannot upload the data during the week. It has to be on Sunday night because it’s going to break things,” okay?

Mammad Zadeh:
Well, you know what we did … I mean, if you look at the solution that we came up with, right, over the years was to create that chasm between the two worlds, right? We created a … “There’s the operational side. Don’t touch it. You don’t have any business going there, but here’s your other playground. You can copy data, you can collect data from this side, bring it over here and then do whatever you want to do with it. Just leave us alone,” and that got us going for maybe a decade or so, which was fun, right? We had Cloud, we had all kinds of new nifty toys to scale the separation of infrastructure from the application. All that stuff was great, but we were still operating, and we still are to this day, largely operating in two different worlds.

Mammad Zadeh:
What I’m saying is there are forces, natural forces, rightly that are saying, “Look, there’s a lot of stuff that you guys are doing on the analytics side that needs to come back into the product. How do we do that?” But then as soon as you begin to think about those things, it becomes an engineering problem, right? You need continuous delivery. You need to be able to build things reliably. You need … You can’t have a script on your laptop somewhere that nobody knows about anymore, right? So to me, some of the movement, and data mesh might be one of many, I don’t know, but these are some of the discussions that are happening in these areas that are saying, “Okay, how do we do that? How do we create a data product? What does a data product even mean? Is it on the analytics side?” I mean, I was having a conversation with Zhamak not that long ago, arguing why can’t the data product actually serve information to the operational side, because back in the day, and I’ve been … I’ve built these systems before to move data back and forth between the two sides. Kafka was built primarily to be able to be that vehicle, and it’s not just one way, it’s both ways, right?

Mammad Zadeh:
So, there is a need. The question is what’s the right approach? Is copying and having sort of some kind of a highway between the two worlds the best way to do it? If you talk to the virtualization people, they might have a different answer altogether about that.

Tim Gasper:
Yeah. You have different religions depending on who you’re kind of talking to here, right, and I know some of what you were talking about there was more on the technology side, right, but obviously another part of this is more on the people side and the team organization side and the responsibility and accountability side, and as you think about people … This sandbox, as you’ve kind of noted, has been built up where there can be more self-service but also a lot more chaos. Is this separation of sort of data people and business people, has that been a good thing or has that been a bad thing? Is it a mixed thing? And as you start thinking about things like data mesh, do we have to change the way that we organize these things and organize these divides?

Mammad Zadeh:
Yeah. I think so. I think that goes to the crux of the issue here. Are we organized for success going forward for the next decade and I would argue that I don’t think we can achieve success by throwing everything into a central organization and say, “Okay, this is too complicated. This is all data related. Nobody understands the stuff anyway so we’ll just hire a bunch of really smart people, put them in a central organization and say you guys deal with anything that has to do with data.”

Mammad Zadeh:
That’s what I’ve been arguing all along, so far, that I think we need to think about a different organizational pattern where … That we actually think about notions like data products, like real products, that are part of what engineering teams offer, and then what’s left at the center becomes essentially the platform and infrastructure related tooling and experiences. Data mesh, for example, requires, or it argues, that we should have a mesh of data products that are interrelated, right, and there’s a way to organize these things, or there’s a way to discover them, there’s a way to understand them, there’s a way to be sure of the trustworthiness of the data product.

Mammad Zadeh:
So there are elements that are centrally governed, but really, the autonomy is the key here within the data product so that we can put the accountability within the right domain. So I think that structure, to me, is a more appealing structure. It’s a more scalable structure. I think we’ve seen it work in other aspects of what gets produced on the engineering cycle. I don’t see why we can’t … An example that I bring, usually, I don’t know if it’s a great example or not, but if you go back 10 years or so, we were kind of in a similar nascent situation with mobile technologies, right, and back in the day we all thought, “Why don’t we have a mobile team,” right? I mean, who has a mobile team anymore, unless you’re building infrastructure for it. I mean, everybody is doing mobile. It became … It’s become part of our DNA. I think data is going through a similar journey in that sense where we can’t just leave it on the side. It has to get … It has to come in and become part of the mainstream engineering work we do.

Juan Sequeda:
One thing you said is elements … There are elements that do need to be centrally governed. So let me go back. I completely agree with you that kind of the issue that we’re in is we have always a centralized structure. There’s a data team for the organization. That’s not going to go scale. We need to figure out how to go decentralize that. I think the issue is to figure out what is that balance. There is some centralization that needs to go on. What is that? What should be centrally governed? My perspective, for example, is that we need to go centralize a lot of the core metadata models, like the core schemas, and I think this is something I learned from Intuit was that the fixed, the flexible, and the custom, and so there are things that are fixed on the models that you should just go use. You should go … You can go extend them. So I think that’s something that should be governed centrally, but I’m curious to learn from you what should be centralized-

Mammad Zadeh:
Yeah. That may be one. I would say we should use a little bit of caution not to go down sort of the old path of trying to specify every last schema to the detail centrally. That’s … and I know that’s not what you said.

Juan Sequeda:
Yeah.

Mammad Zadeh:
But that is a slippery slope that-

Juan Sequeda:
Agreed.

Mammad Zadeh:
Yeah. There should be a warning sign when we go on that road, but yeah, you’re right. I mean, maybe there are core entities within an enterprise, maybe a half a dozen of them, I mean hopefully they’re not like a lot, that everybody needs to agree on. Like your identity might be one of those things, right? But there are other things as well. Like I think a lot of the stuff that we need to do around compliance, around security, around … There might be different security postures for different organizations because of many different things, and compliance in certain industries is also very important, especially, let’s say, in the financial industries or other stuff.

Mammad Zadeh:
So there might be things that we have to do, and we can’t just leave it to anyone. As you said, it’s one of those fixed things, not flexible or free things, and I think that any design we come up with for data mesh should and will incorporate a mechanism to be able to accomplish that within reason.

Tim Gasper:
What’s the role of the CDO in all of this? Are they driving the implementation of the data mesh? Are they sort of an obstacle to the whole thing?

Mammad Zadeh:
Yeah. I mean, that’s an … That’s an interesting-

Juan Sequeda:
Yeah, they are an obstacle or?

Mammad Zadeh:
No, I mean, it’s an interesting [inaudible 00:30:30] because I think, and Juan and I were talking and joking about this not that long ago. I sometimes for effect and provocation say things like, “I’m not sure if we’re going to have any CDOs in 10 years.” Whether that’s true or not is kind of irrelevant. For me, it’s mostly to make a point, and the point is if the CDOs today are going to help break that monolith and get us to a better organizational structure, then I would say that’s what they need to do. I mean, that’s their job. And by the way, this is not just around data engineering. I mean, I think a lot of what I just talked about applies equally to data science and what’s happening on that side of the world. I fundamentally believe any work that is happening on machine learning and stuff will have to be coordinated very closely with the engineering teams that are … So it naturally … Also, I think we will see a trend that that monolith will also go away and we’ll see data scientists … Actually, what we will see is the true democratization of data and data science, in my opinion, which is where just engineers will be able to build models.

Tim Gasper:
A truly empowered model.

Mammad Zadeh:
Yeah. At least the basic ones, not the fancy stuff that you have to go do science, but there are plenty of canned approaches that might get us 80% there and there’s no reason why engineers shouldn’t be able to do that. Maybe they’re just lacking some tooling.

Mammad Zadeh:
So anyway, if a CDO is thinking in terms of the next 10 years and trying to figure out how to organize a structure, the teams, then I think they’re doing a great job. After 10 years, what are they going to do? That’s an interesting question. I don’t know. I mean, I again sometimes half jokingly say, “Do we have a chief mobile officer?” I mean, I don’t think we do and to some extent, I’m not sure if we’re going to have a CDO in 10 years or do we need to have a CDO.

Tim Gasper:
Sometimes I hear titles like the chief AI officer and things like that and I’m like, “Oh, that’s not going to exist in 10 years,” right?

Mammad Zadeh:
Yeah.

Tim Gasper:
But then sometimes I think CDO and I’m like, “Well, I mean, maybe, right? Maybe that will exist,” but maybe, to your point, maybe the role evolves a little bit, right?

Mammad Zadeh:
Maybe … It has to. It definitely has to evolve, I think.

Juan Sequeda:
And how does this, even the connection with the CIO then, too? A lot of this infrastructure is now going to the Cloud, and they’re like-

Mammad Zadeh:
I don’t know. You’re asking the wrong person about all these titles. I’m just an engineer.

Juan Sequeda:
Well, I mean, one thing is a title, but what we also see is, what is everybody doing within their organization? At the end of the day, the data we have within an organization, we need to … We’re in this new era. We need to treat it as a product. We need to … I love what you’re saying, that we need to make sure that we know who’s consuming it, they know how to go use it, how that functionality is tied to it, if I can trust it. What are the SLAs like? This is something that kind of sounds obvious today now, but it’s … There is this big, I think, gap of why aren’t we getting there, which leads me to kind of move to the next topic, which is about tooling and technology. I’m very curious. What technology needs to be invented to be able to accomplish all of this, of treating data as a product, or do you think that most of the technology is just there? We just have to repurpose it. Where’s your head at on this?

Mammad Zadeh:
I think we have a lot of good technology, but I’m not sure if everything is here yet. Again, like … Tim, you’re a product guy. How do you build a product? I mean, this is no different. You start from the problem. You go to your personas and the people that you want to build a product for. In this case, we’re talking about engineers, the developers as the producers of these data products and data scientists and others as the consumers and we really will go through whether the experiences today that they are going through accomplishing what they need to do is optimal for what they need to do and I think you will agree that at least the state of our technology today is such that if I go to any engineering team right now and say, “Hey, guess what? Christmas is early. You now own all this data product stuff that this other team used to own now.” They’ll go like, “I can’t do it. I don’t know how to do it.” The tools are either not straightforward enough for them to be able to use, or it doesn’t integrate well with the other set of tools that they have on the operational side.

Mammad Zadeh:
So I think there are missing … Definitely missing pieces, and especially when you talk about data mesh and introducing new concepts like the connectivity between the data products, obviously the tools to do that aren’t quite there. Maybe Zhamak is working on them. I don’t know. I know my team, my old ex-team at Intuit, is also doing some work around that. So, yeah, I think there are definitely still areas that we need to work on.

Tim Gasper:
Are there certain tools that exist today that you think play an important role here? Around governance, around testing, that you think are key, versus just holes that exist?

Mammad Zadeh:
Yeah. I don’t know. Probably the answer is that there are probably flavors of solutions for problems kind of sprinkled here and there, but that’s also another problem, I think, where there’s a basic lack of interoperability between some of these things. For example, let’s take … You mentioned governance, access … Let’s take access control, for example, right, to data.

Tim Gasper:
Mm-hmm (affirmative).

Mammad Zadeh:
There are solutions for that, but once you pick one solution, you kind of have … You have to go down a certain path, either use just this tool or that tool. It is very difficult if you’re a multi-vendor company to have the tools agree and work on the same set of principles, low level principles, like in this case access control mechanisms. And again, companies are trying to do that. Databricks has done some work on that, AWS has done some work on that, but they’re not the same, so.

Tim Gasper:
Yeah, no. That makes sense. And yeah, just to keep on getting specific here, because we’re just kind of exploring this particular topic, which is like … So for access control, for example, you’re kind of saying that like you may use Amazon but maybe you also use Azure and maybe you also use a virtualization tool and maybe you have Okta for your identity management and so now you’re stuck in this situation where you’re trying to choose, like which tool are you going to use for access control and how much do you centralize and the pros and cons of that? Is that kind of where you’re going as you talk about that?

Mammad Zadeh:
Yeah. I mean, I guess kind of. Where I’m going is that some of the decisions that we make very early on the infrastructure side, low level infrastructure side, take us completely in one direction and I’m not sure I like that. I would’ve … And maybe this is a very romantic idea and it will never happen but if we had some level of agreement on lower level infrastructure elements, that would help interoperability a lot between tools so that we can pick and choose what we feel is the right choice for each organization. Today, that decision making process isn’t like that. It’s more like, “Do we pick Vendor A or do we pick Vendor B,” and then it becomes-

Juan Sequeda:
Are you arguing, do we need to have more interoperability standards for this?

Mammad Zadeh:
I think so. I think … Again, I don’t know how achievable that is. As I said, it might be a romantic notion, but I think it’s worth an effort.

Juan Sequeda:
So what are your thoughts? From the entire semantic web W3C specs, [inaudible 00:40:20] metadata, RDF, and all this stuff on schemas, ontologies. I mean, that’s my background, right, and I truly believe that this is how we are going to be … This is how we can interoperate data and knowledge at scale. I mean, that’s what the web is itself, right? The web has URIs, right? HTTP, like all … These are the standards that we have there and that was the whole … That is the whole goal of bringing out all these standards to data and using the whole web as the infrastructure. I believe that that’s the way how to go do that and-

Mammad Zadeh:
And you might, because that is your area and that’s where your expertise is. You might … I’m sure you can identify much better than I can what are those foundational elements or elemental things that you believe need to be standardized. My background is more infrastructure so I tend to think about some of those aspects and I think at each layer, there might be a few of these things that we could think and talk about standardization that will ultimately help in interoperability as a whole.

Mammad Zadeh:
I don’t want to necessarily get into the specifics of vendors, you know?

Juan Sequeda:
Yeah, no, no.

Mammad Zadeh:
Vendors and stuff, but-

Juan Sequeda:
We’re totally on board. I mean, this is the non salesy podcast on purpose, right? We’re not pushing anything here.

Mammad Zadeh:
Yeah.

Juan Sequeda:
But I like to get pushed more into the categories of things, right, so what are the categories of tool? A warehouse is probably involved, right? You have to go move data around so some sort of ETL or some sort of … I mean-

Mammad Zadeh:
Maybe. Yes, maybe, but see, I think this is exactly kind of where that inversion needs to happen in our discussion that instead of … Instead of the starting point being the tools, like warehouse or this or that, the starting point of the conversation needs to sort of be what do we need and do we have it? The warehouses, I have no opinion on whether … I can envision a data product, some data product might decide that it is best for them to go use a warehouse technology to build whatever it is, as long as they adhere to the rest of the rules of the game, which is the autonomy of data products within the engineering organizations, the easy way to discover, understand the data, be accountable for the quality of the data. As long as they adhere to those principles, the basic principles here, I … You know, vi or Emacs, I don’t know. I mean-

Juan Sequeda:
This is a great … I’m having this really nice aha moment at this moment. You just said autonomy, discoverability, and be accountable. If you adhere to those principles, which in a way is kind of aligned with the data mesh principles, you don’t care how you’re implementing that. You don’t care what technology you do. Just be auto … Enable people to be autonomous, domains to be autonomous, enable those data products to be discoverable and yes, let them be accountable. I want to have data product owners and managers.

Mammad Zadeh:
Yes. So everything … Like everything else is, it’s within bound, within reason, right? We want to make sure that we don’t, again, gravitate back to the old habits as well, right? So sometimes the tooling may accidentally again sort of move us back there. So I don’t know. There’s a certain amount of discipline involved in adhering to the rules and I also think there’s probably some things that the tools can do to protect us from accidentally sort of violating those rules and I’m hoping that over the next several years, we will see that these are actually the concerns also of the toolmakers. I think the toolmakers, if you … My advice to them would be, I think the persona of your customers are changing and you need to go figure that out.

Juan Sequeda:
That’s a very interesting insight right there. The per … Yeah. For existing vendor … Vendors, who’ve been around for so long, right. They’ve been selling their tools to a different specific type of persona.

Mammad Zadeh:
Yeah. They’re selling to the specialists, right? They’re selling to people who deeply understand maybe the inner workings of some of this stuff. I think if we want to democratize the way that we’ve been discussing so far, which is eliminate the hyper specialization that we see today in our field and allow engineers, I’m not putting any qualification on them, to be able to do most of the stuff, then yeah. Then the personas change. The experience has to fit much more nicely with what the engineers are doing on the operational side. So I’m … See, I came from that side to the analytics side. I wasn’t born in the analytics world, so that’s kind of what I’m saying.

Juan Sequeda:
Yeah. Well, one final take … Yesterday, I think yesterday, Matt Turck comes out, right, with the data landscape and this year it’s the machine learning, AI and data landscape. It’s this gigantic PDF picture with so many different logos.

Mammad Zadeh:
Yeah. I’ve seen that. Yeah.

Juan Sequeda:
I mean, I look at that stuff and I’m like … Then we have the conversation, like right now, I’m like, “Wait. Our goal is to go create data products that we know that consumers are going to go in and the teams are autonomous. They can go build them, right, and they have accountability. They’re discoverable.” And I’m like, “Do we need all those tools that are in that landscape?” And I’m just … I just … Kind of overwhelmed. I don’t even know what to think.

Mammad Zadeh:
The answer is probably no, but I haven’t looked in detail of exactly what all of they are. I just saw … Like, you can’t even read that, right? I mean, it’s-

Juan Sequeda:
I know.

Mammad Zadeh:
It’s just-

Tim Gasper:
You need a magnifying glass.

Mammad Zadeh:
Yeah.

Juan Sequeda:
Yeah. Oh, well. Yeah. So this is always an interesting … I “look forward” to it every year, but just to see how overwhelmed I’m going to get. I just … What’s going to happen next year. But anyways.

Juan Sequeda:
Hey Mammad, before we started, you said, “Hey, can we even talk about this for an hour?” Look, we’re almost 15 minutes-

Mammad Zadeh:
Yeah. Well, great job, guys, keeping the conversation going. I don’t know. Looks like some people are bored. “No CDOs in 10 years? Yawn.” Okay. That was their only takeaway.

Juan Sequeda:
All right. Well, let’s get into our lightning round here. So we got some questions we’ve been writing, Tim and I. Let me go start first. So is the data mesh concept something that’s going to have lasting relevance? Yes or no?

Mammad Zadeh:
Yes.

Juan Sequeda:
That was an easy one.

Mammad Zadeh:
I mean, I’m just saying yes or no. I’m not explaining so.

Juan Sequeda:
Well, I mean, if you want … Anything quickly you want to get out there?

Mammad Zadeh:
No, I think we talked about it plenty.

Tim Gasper:
So next one, is the state of data better today than it was 10 years ago?

Mammad Zadeh:
Yes. I mean, it’s one of those things it’s like, yeah, but it’s not great.

Tim Gasper:
Better, but not.

Mammad Zadeh:
Yeah.

Tim Gasper:
Let’s not be satisfied.

Mammad Zadeh:
Yeah.

Juan Sequeda:
All right. Will CIOs take on more of the CDO responsibilities in the near future?

Mammad Zadeh:
I have no … I mean, I don’t know what’s … Look, I don’t know what half of these titles mean, so.

Juan Sequeda:
That’s your honest, no BS answer.

Mammad Zadeh:
Yeah. I don’t know. I don’t know.

Tim Gasper:
Mammad, we don’t know either. We’re 55 episodes in and we’re still trying to figure it out.

Juan Sequeda:
I think we should go … We should invite a bunch of CIOs going forward and see what happens.

Tim Gasper:
Yeah. Help us define … Let’s create the job description together. All right. Last lightning round question for you. So if data’s going through a similar journey maybe as mobile did, particularly like the CDOs versus the chief mobile officer, that kind of thing, do you think that most people in the org someday are going to be data people? Like is this divide of data people and business people going to start to go away?

Mammad Zadeh:
No. Most people … I think everyone will be an engineering person, right? That dichotomy, that distinction, that this is data and something else is not, I think just is going to be less and less and less and less.

Juan Sequeda:
Nice. Well, it’s our TTT time. So Tim, take us away with some takeaways. A lot of good stuff in here.

Tim Gasper:
Oh goodness. The takeaway section is always hard for our episode, because there’s so much that we cover and you just said, “I think everyone will become engineering people,” and that just like triggered 10 questions in my mind, but I’m going to hold that and maybe I’ll follow up with you on that.

Tim Gasper:
So takeaways. So I really liked as we got into the dysfunctions, you talked about sort of the over centralization and some of the challenges that happened there around sort of accountability, around scale, around SMEs and domains and things and you had mentioned that throwing people together to try to solve the problem isn’t really a durable approach and I couldn’t agree more with that and as you were talking, I know you didn’t explicitly say this, but you mentioned these along the way. There seemed to be some like mega trends here that you’re kind of pointing to. One of them is around the movement from the back office to the front office that you talked about. There’s sort of this overall sort of AI ML self-service, this sort of attempt to democratize and also do more intelligent things. But also you said some things that really went around sort of data being more seen as business value, like data people and leaders needing to now zoom out, not just from that deep technical view but actually needing to become more broad and business oriented and that’s interesting and it seems to be a lot of the trigger around us trying to create new approaches and create more value from our data. So I thought that was a good setup.

Mammad Zadeh:
Yeah, I agree with that. I just need to maybe mention one thing because I know my data scientist friends are going to fault me. What I’m saying about the trend here, especially around the democratization aspect, isn’t in any way to suggest that the … To diminish the value that the data scientists are bringing to the game. In fact, what I’m trying to do is to make sure that they can focus on what really matters and some of the stuff, some of the lower hanging fruit that can be democratized, should be and the focus for data science should move really to a higher level that requires that level of expertise and specialization that they bring and not … So if there are things we can do with tooling, let’s do it and let’s free them up to go do bigger, better things.

Juan Sequeda:
You want to give them a product that they understand it. They can go … They can find it, they understand it. They can go run with it.

Mammad Zadeh:
Yeah. I mean, go-

Juan Sequeda:
Do their data science things.

Mammad Zadeh:
Yeah. Come up with new algorithms, come up with the science behind some of the stuff that doesn’t exist, right?

Juan Sequeda:
Yeah.

Mammad Zadeh:
I would love for them to focus on that stuff.

Juan Sequeda:
So I got several here. I’m only going to go through a few. Definitely we can’t … We won’t be successful throwing things into a centralized organization so they can manage the data, so we need to have this different organizational pattern, right? Think about data products and really think about them as real products, right, that the engineering teams are there to go offer. At the center you want to have the platform and infrastructure tooling, and when we think about the data mesh, the whole mesh aspect is that all these data products are connected such that they can then be discovered and people can understand them.

Juan Sequeda:
There is this balance that I’ve always talked about and I like that we’re … Seems like we’re on the same page of understanding how you want to be decentralized, but there are some things that you want to centrally govern, right, and talk about some of the core entities and I love how you were specific. You were saying half a dozen. They should be really small because you want to be careful about going down that slippery slope to start governing everything, right? Other things to be governed at central is identity, compliance, and security.

Juan Sequeda:
I’m loving these … Adhere to the principles of being autonomous, discoverability, and be accountable, and really at the end of the day, I don’t really care how much the technology plays the role. It’s really just make sure that you can accomplish these three things, but be careful. Don’t go back to your old habits.

Juan Sequeda:
And then I know we talked about titles and stuff, but if the CDO or actually whoever, if their job is to make sure you’re breaking that monolith to have a better structure, then yeah. Keep doing that job because whatever title you have, we need to have the people who are breaking that monolith and understanding how we can be autonomous, discover data products such that they’re also accountable. That’s our summary.

Mammad Zadeh:
Wow. You’re smart.

Juan Sequeda:
I … No, you’re smart.

Tim Gasper:
We’re just good note takers.

Juan Sequeda:
I’m listening and taking notes. Mammad, let me throw it back to you to wrap up here. One, what’s your advice, and second, who should we invite next?

Mammad Zadeh:
My advice? Have fun doing whatever you’re doing. That’s really what matters. So that’s my advice to everyone.

Mammad Zadeh:
Who should you invite next? I would love to see some of the heavy hitters of data infrastructure come on your show and tell us what they think the future is bringing. You know, execs from AWS, GCP, Azure, Databricks, whoever. I don’t … I mean, if you want names, I’ll give you names offline, but I would love to hear sort of some of their outlook and what they’re planning.

Juan Sequeda:
I would definitely like to go have them on the show and we’ll reach out to them and we appreciate any introductions you can make.

Mammad Zadeh:
Cool.

Juan Sequeda:
Mammad, thank you so much. This was a fantastic discussion. So many takeaways, and we just had a handful and we got a whole two pages of notes here between Tim and I, so thank you. Thank you. Thank you.

Mammad Zadeh:
Oh, thank you. Thank you so much. It was fun.

Tim Gasper:
Awesome.

Juan Sequeda:
Cheers.

Tim Gasper:
Cheers.

Juan Sequeda:
Have a great Wednesday.

Mammad Zadeh:
Cheers to you all. Bye.
