About this episode

You can’t get from your home to the grocery simply by owning a car. You have to actually drive the vehicle to get to a place that delivers value. Sounds obvious, right? But we don’t instinctively think this way when it comes to data. We focus so much on tools, processes, and architectures, but we don’t talk enough about actually using the data.

This week, Tim and Juan are joined by Mike Ferguson, managing director of Intelligent Business Strategies Limited, to look at why we’re not using data nearly as effectively as we think and what can be done about it.

Special Guests:

Mike Ferguson

Managing Director, Intelligent Business Strategies Limited

This episode features
  • Data usage do’s and don’ts
  • How to get your company aligned on effective data use and measurement
  • What’s the most expensive thing you bought that you’ve never used?
Key takeaways
  • Data complexity is a huge issue
  • Connect the data products to an actual use case
  • Centralized vs. decentralized: the alternative is a federated model

Episode Transcript

Tim Gasper:
For Catalog and Cocktails it’s your honest, no BS, non-salesy conversation about enterprise data management with tasty beverages in hand. I am Tim Gasper, data nerd and product guy at data.world joined by Juan.

Juan Sequeda:
Hey Tim. I’m Juan Sequeda, principal scientist here at data.world. And as always, it is a pleasure to take a pause in the middle of the week, end of the day and chat about data and have some cocktails around that-

Tim Gasper:
Your drink is a similar color to mine.

Juan Sequeda:
Yeah. And I just saw that you also had an interesting shirt on today.

Tim Gasper:
I do. This is my honest no BS t-shirt.

Juan Sequeda:
Well, I always … I don’t know if people have realized this, but I always wear a Seinfeld t-shirt. I think I’ve worn one for almost every single episode, and it’s just a tradition I started, so here we go.

Tim Gasper:
I love it.

Juan Sequeda:
And today we have an awesome guest. This guest today is somebody who has so much experience in the IT, I think almost 40 years of experience in IT. I’m so excited to have Mike Ferguson. He’s the Managing Director of Intelligent Business Strategies, and if you actually just look up his name and just see his website and look at everybody who recommends him, it is outstanding. It’s like amazing. Everybody you’ve worked with throughout so many decades. And I’m just so excited to be able to kind of just have this conversation and pick your brain and learn so much from you. So Mike, cheers, great to have you.

Tim Gasper:
Welcome Mike.

Mike Ferguson:
Cheers. Thank you, guys. A real pleasure to be here.

Juan Sequeda:
Awesome. Hey Mike, what are we drinking today and what are we toasting for?

Mike Ferguson:
I’m drinking a margarita and I’m toasting to the future and travel because I haven’t been on a plane in 18 months and I’m really getting itchy feet. Normally my life would be, I don’t know, about 35 return flights a year, so I would step on a plane about 70 times. I haven’t been on a plane in over 18 months. It’s … yeah, it’s crazy. So I’m looking forward to travel again.

Juan Sequeda:
So am I.

Tim Gasper:
I bet Juan, you probably are feeling the same way. I know you’ve gotten a little bit in, but not as much as you used to, right?

Juan Sequeda:
Yeah. Well, I’m drinking a cucumber Japanese gin and tonic with a strawberry cucumber soda.

Mike Ferguson:
There you go.

Juan Sequeda:
There’s my fancy drink today, and actually cucumber’s in here. And I’m going to toast for two things. One is we have our summit. Our summit is coming up in a month. And I’m super excited about that because we have just so many different topics and great speakers coming up. We have Zhamak Dehghani talking about Data Mesh, Barr Moses from Monte Carlo Data, Doug Laney, we have a bunch of panels and everything so that’s going to be fantastic. And the second is I’m toasting for travel, and I’ve been traveling a little bit. I’m actually getting on a plane and I’ll be in your neck of the woods starting tomorrow. So I’m going to be in Europe for the next two weeks.

Mike Ferguson:
There you go.

Juan Sequeda:
I’m in Paris, Amsterdam, London, Edinburgh. So if you’re listening around, let me know. I’ll be over there.

Mike Ferguson:
I’ll do my best.

Juan Sequeda:
Hey Tim, what about you?

Tim Gasper:
I am drinking a … it’s called a rum and smoke, and it’s got simple syrup, lime, rum, mezcal, and a splash of peaty scotch, sort of a smoky drink but very delicious, almost like a margarita but smoky. And I will cheers to the summit, which is going to be great. I will also cheers to travel. Man, I have not traveled for just as long as you, Mike, and it is weird. And then third of all, I’ll cheers to making your own t-shirts. It’s kind of cool. I did it on printful.com. Check it out. Maybe we’ll set up our merch store at some point. Carly, our producer, is going to help us do that at some point.

Juan Sequeda:
We should do that. We should do that. So we got our warm up question today. What’s the most expensive thing you bought but that you’ve never used? Mike.

Mike Ferguson:
So I think that’s going to have to be my daughter’s car. I mean, she was quite happy to get it, but I never got to use it obviously, and now I can never use it because she’s now sold it so. So that’s it, just paid out and it’s gone again, but nevermind.

Tim Gasper:
You didn’t buy her like a corvette or anything like that, right? It wasn’t like a really fancy car, right?

Mike Ferguson:
No, no. It wasn’t a particularly fancy car. I think it was the basic Ford. But nevertheless, bank of mom and dad managed to fund her first set of wheels, and yeah, they’re gone again. So I guess that was … it was probably a good investment.

Juan Sequeda:
All right. It was expensive but you didn’t use it-

Mike Ferguson:
Expensive unused investment on my part but you know.

Juan Sequeda:
How about you Tim?

Tim Gasper:
When I was thinking about this, it took me a while but I remembered. I bought a Sony DSLR camera. And there was a period of time where I was like, “You know what? I need a fancy camera.” And I got it with the nice telescoping lens and all that. It was probably about a thousand dollars overall. I think I took about 50 pictures with it over seven years, so I definitely barely used it. But my oldest son is using it now. He’s doing some photography so that’s good. It’s getting a new life.

Juan Sequeda:
Cars, picture … cameras. Mine is actually on the transportation. I have a bike. I think it’s a nice bike. I don’t even remember how much it cost, but it was not cheap. It’s still in my garage, barely use it.

Juan Sequeda:
Well, talking about usage and what you use and what you don’t use, let’s dive into our topic today. Mike, honest no BS here. With all this data that we are generating, are we really underutilizing it? Because when we kind of prepped for this you were like, “Well, data usage is the problem that we’re not using it.” I’m like, “Really? Are we not using it?” Like where are we right now with this?

Mike Ferguson:
I think the last decade’s been about development in my opinion and it’s been real focused about all these new technologies. It’s been a complete frenzy around data and analytics quite honestly and the speed at which technologies have leapfrogged each other have been crazy. One minute it’s one thing and next minute it’s the next thing and the next thing and the next thing. And there’s kind of trash all over the floor, and the kind of development organizations have been kind of trying out almost everything and anything.

Mike Ferguson:
But I think it’s definitely the case for my clients at the executive level, corporate level if you like, that they’re looking back at this and they’re kind of saying, “Okay. We’ve invested a lot of money here.” And there’s no shortage of that in the sense that the appetite is absolutely still there in the boardroom. But I think they want to, let’s say, put it to better work. And I don’t think they’re getting as much bang for their buck as they would like. One of my clients said to me recently, “I want this to be the decade of use, not the decade of development.”

Mike Ferguson:
So I think there’s for that reason some top-down pressure from executives now who are financing all of this to say, “Okay, enough’s enough. We kind of want to not just experiment here. We really want to industrialize this.” And I think for that reason, yeah, I don’t think they feel they’ve got enough use out of the data.

Mike Ferguson:
And I think a lot of it’s just down to the fact that there’s been, I guess, what I would probably describe as a more fractured set of activities going on across organizations. I mean, I would say these days almost every department of anyone of my clients is screaming out for data and analytics, from sales and marketing through operations all the way to HR. I mean, they all want it.

Mike Ferguson:
So in that regard I guess you could almost say that data analytics is central to pretty well every business process, every part of the business. But I don’t think that a cohesive integrated use of all of that is going on within organizations yet. It’s not fully brought together so that people are pulling on the same rope.

Mike Ferguson:
So what we I think are now seeing, at least for my clients certainly, and obviously I spent more of my time since I’m based here in UK working across Europe, I have clients all over the world but nevertheless, and I have a lot of clients in Europe and I think from a European perspective at least that they want a bit more, let’s say, collective alignment around common business strategy rather than everybody flying off doing their own thing.

Mike Ferguson:
When new technologies emerge, it’s not surprising people leap on it and whatnot and they want innovation. But I kind of think now that at least the top-down pressure is, okay, can we industrialize this, because we really want to make maximum value out of this data and out of these models that are being developed.

Mike Ferguson:
And I think as well, they want it simplified. There’s too much complexity around. And I think particularly around the data landscape, I think data complexity has really become a huge issue for a lot of organizations who may even be global and have a global presence. I mean, just understanding what’s out there, the amount of data redundancy that’s out there. There are copies of data all over the place. I think the number of data stores, the number of clouds that are in use, the whole data landscape, it’s kind of becoming almost a roadblock in the way of progress unless they can get that under control, if you see what I mean.

Tim Gasper:
Yeah. I mean, it reminds me of, you see the diagram that people always talk about, of the MarTech Stack and all the things that are going on in the marketing landscape. That used to be sort of the thing that people always would show before, like how complex things are. It feels like the data landscape has similarly become just as fractured, if not more fractured, when you think about the data stack and things like that.

Tim Gasper:
You had a few things there in what you were talking about. Particularly we started this with this idea of data use. To you, what does it mean, what does data use mean? Like what does it mean to be using your data and why are people getting this feeling that they’re not using it? Is it just like there’s so-

Mike Ferguson:
Well, I don’t think it’s-

Tim Gasper:
… much data now that they’re not taking advantage of it all? Or is it really like literally they’re not using it?

Mike Ferguson:
No. I think it’s a bit of both in all honesty. I think it’s not just so much the data that they’re not using. It’s the fact that they’re not getting enough analytical models deployed. It’s the fact that they don’t even know what models are out there and who’s developing what because there are multiple data science teams perhaps scattered around the … There’s no catalog, if you want to call it that. I don’t mean a data catalog. There’s no catalog of all the rules that have been developed that are available. No one to say, well, where do these get deployed, what processes are they … in, business outcomes, which like if we’re trying to reduce fraud, then what collection of things are involved in doing that?

Mike Ferguson:
Companies can’t see that. The executives can’t see that. I mean, they know that there may be some real-time models being developed to stop fraudulent transactions. They might know that there’s some kind of activity going on, maybe with a graph database project or something like that. But seeing this whole collection of things that are being developed, what data, what BI reports, what dashboards, what predictive models, what prescriptive models, where they are being used, how they all work together for the goal of reducing fraud, I mean, that’s what they can’t see.

Mike Ferguson:
And because of that, they don’t have a strong enough feel for whether the collective effectiveness of all of that is working to the maximum.

Juan Sequeda:
So it seems to me that this past decade has been kind of the decade of a science experiment. It’s a data science experiment. We’ve been talking about, well, you have all this big data stuff right and all the NoSQL that kind of kicked off. We have a lot of stuff to go do. And then like, “Okay, now we got a lot of data. Let’s go do stuff with data.” But we kind of had no plan to what to go do with so much data. So we were like throwing money all over the place. And, “Hey, you go do this thing, you go to this thing.” And guess what? We kind of figured things out, but it’s all ad hoc, we’re reinventing the wheel all the time.

Juan Sequeda:
That means that at the end of the day, kind of the progress or the value that we’re showing is kind of spread across so many parts that if we would have thought about it from the beginning kind of with a plan let’s go accomplish A and B and these things that combine, that would have been better. But at the same time, we kind of were just, we were flying the plane and making the plane at the same time, so we kind of didn’t know it. Probably now is the time to sit back and say, hey, like-

Mike Ferguson:
Yeah. I mean I think what executives are saying is can we harness this? I mean you’re not trying to stop it. But what we want to do is harness it and kind of direct it. And also, we want to remove the kind of reinvention. We want to stop the repetitive rework. Probably the biggest repetitive task going on right now, as far as I can see in most of my clients, is data integration. Everybody’s integrating data. Everybody.

Juan Sequeda:
That’s the problem that we’ve always had forever and ever, that problem-

Mike Ferguson:
Yeah, and I-

Juan Sequeda:
[crosstalk 00:15:24] that data integration started. You have the relational databases coming out, the first systems late ’70s, early ’80s. You have ethernet, the networks coming out. And then on the split second afterwards there was relational databases on a network and somebody says, “I want to go query those two databases at the same time,” and we’ve been integrating data from the beginning. And here we are.

Juan Sequeda:
We’re driving ourselves insane to the point that we keep doing the same things over and over again expecting different results. This is Einstein’s definition of insanity. Something needs to change here.

Mike Ferguson:
Yeah, I think it does, in the sense that, and I think we’ve seen organizationally I think is one of the major issues to solve here. I mean, I think that, and I think culture as well is another major issue to solve. But I think organizationally we’ve gone from centralized kind of IT. Then we went kind of distributed where it’s kind of like everybody’s doing their own thing in various teams. And kind of what I’m seeing now, at least with some of my clients is a federated model where we’re kind of saying, “Okay, we have these teams out here, but we now want to harness this thing,” and kind of put something over the top of it that’s going to organize it, and align all of these activities with business strategy. Because obviously data and analytics have to align with helping the business achieve its goals and improving business outcomes.

Mike Ferguson:
So I think organization has a big role to play here. And obviously for some companies, that’s a major problem if you’re very heavily distributed already. Some big global companies are that way. That’s a massive challenge to be able to try and harness something like that. But nevertheless, I think, you could always say centralized is kind of monolithic to some extent. Decentralized is very fractured. And I kind of think federated is kind of middle ground.

Mike Ferguson:
And I sense executives want the data problem fixed. What I mean by that is they want the foundation to be trusted, and they want the reinvention to be minimized and they want that whole … I mean, if you’re going to be … An executive said to me, “Look, if we’re going to be data driven, we really need to organize to be data driven.”

Mike Ferguson:
There’s a supply chain of data coming in here. And we need to get good at delivering reusable trusted data that is also compliant, we don’t break any laws and privacy and whatnot. But at the same time, we got to get good at this and up the reuse in order to be able to produce, let’s say, trusted data once and reuse it everywhere. I don’t mean one monolithic, big database. I mean reusable data products. A good example in structured data would be master data and transactions. I mean products, customers, orders, shipments, payments, returns. These are business understandable data sets that could be used in multiple different analytical projects and probably are used in multiple analytical projects.

Mike Ferguson:
So why is it that we have some team off writing Python to do data integration, some other team using self-service data prep tools, some other team using an ETL tool, another team using a script or something else. And before you know it, you’ve got a myriad of different technologies in place trying to integrate data. And everyone assumes it’s going to be understandable when there’s no industry standard for metadata.

Mike Ferguson:
These tools don’t talk to each other. If I’m looking at what I’m doing in my tool, I can’t see that somebody else has reinvented this in another tool, I don’t even know. And that’s the problem, I think. I think there are too many technologies, and I think there’s a desire, if you like, to slim that down and try and get this more industrialized, which I think is why we’re seeing kind of a rise in things like data ops and the whole component-based development around pipelines in order to get reusable components, speed up the development of these, and speed up the testing and deployment of those kinds of pipelines. But I think it’s got to get done in order to just get round the problem that we have [inaudible 00:20:47] all over the place.

Juan Sequeda:
You have described the problems that I personally have seen over and over again. Obviously, you’ve seen them for decades. A couple things I want to make. The first comment here is what concerns me a lot, and I’m seeing this and we’ve talked about this complexity of the data landscape, is this whole modern data stack that we’re seeing now, all these kind of data integration tools that are in the cloud. I have the impression too that it’s more also kind of for a younger, cool, hip generation of companies, and they’re not seeing the problems that you’re describing and they’re kind of like, “Well, that’s the modern stack today.” And then a couple years later they’re going to be stuck with more of this logic and all these issues but stuck in some cloud tool, whatever.

Juan Sequeda:
And this really scares me because we’re just going to … I mean before, at least you needed to know your Python or your SQL or whatever. But now it’s embedded in so many different applications and SaaS applications, you don’t know how to get that stuff out of there. So that’s something that really, really concerns me about it. So that was one comment I wanted to make.

Juan Sequeda:
And the second comment is, you were talking about centralization and decentralization and talking about data products. So this seems a lot like this whole conversation we’ve been having around data mesh. I’m also curious to hear your thoughts about data mesh. But one word that you said, and I kind of want to push more here and let’s get an obs on this, is on this federated model. You said that, “Well, wait, one thing is to be completely centralized, the other thing is to be decentralized, but a balance between that is to be federated.”

Juan Sequeda:
The centralized I get. We’re going to put everything in one place. Decentralized, go do [crosstalk 00:22:31]

Mike Ferguson:
All right. By centralized I mean obviously we’ve seen centralized IT. I think everybody knows that that’s gone away and we’re out there with distributed teams all over the place. And I think it’s fair to say that the folks out there in a lot of business units these days are pretty IT savvy. And yet the problem really is that if you look at it from a corporate level, I think people are saying, “This isn’t … We’re reinventing. We’re still reinventing.” We’ve got different business units buying their own technologies, doing their own thing, potentially making progress within the context of that business unit. But if this person goes and develops or needs customer data for a particular machine learning model and some other team in some other part of the business needs customer data, they may go off and produce their own version of customer data, and so inadvertently end up with inconsistency because of it. And I guess the question is, is there a way to get away from the continuous reinvention of course?

Mike Ferguson:
I think the concern from a federated point of view is, can you organize? Is there a central program office so that there are these teams out there working but they kind of know which piece of the jigsaw puzzle they’re building? So we have a central program office. We know what the corporate objectives are. We know what projects are underway, who’s building what, which data products are we producing that are going to contribute to reducing fraud, which BI reports, which dashboards or stories are we going to use to do that, which predictive models and who’s building those in order to contribute to reducing fraud, which graph databases are being built in order to look for fraud rings and things like that to contribute towards reducing fraud.

Mike Ferguson:
So there’s an organized set of projects that are underway in order to achieve a common goal which is in this case reduce fraud or improve customer engagement or optimize some part of your operations. But if you’re going to go do that, then all projects got to know that they’re kind of all on this bigger team, they’re all working on their bit. But I think also, when it comes to data, if multiple project teams within that context need customer data, why build it multiple times? Why can’t I produce it, have it ready to go, and then if people need subsets of that, then fine? If people need to take customer data and combine it with other data, then fine. But don’t take me back to zero every time in order to work on raw data for my project when I could expedite the whole project because of the fact that I’ve got a lot of ready-made stuff that I can potentially use.

Mike Ferguson:
And I think for me, that’s the piece that’s missing, is that there’s good work being done around organizations but there’s no front of the jigsaw box that says here’s what we’re building and you guys are doing this piece and you guys, which is kind of why I sort of see people like CIOs and CDOs kind of trying to get into that federated layer, if you like, organizational layer, and be able to kind of coordinate multiple project teams and align it with business strategy.

Mike Ferguson:
Because at the end of the day if somebody, some executive who is accountable for a particular strategic business goal says, “What exactly have we done to help reduce fraud,” you’d really like to sort of say, “Well yeah, we built these models, we built these reports, we’re using this data. Here’s the collection or the family of things that are underway or have been built to be able to do that.” Or what have we done to improve customer engagement or optimize the supply chain, so that you can see that.

Mike Ferguson:
And I think for me, that’s the big piece that I think they don’t see right now. I mean, they know there’s a lot of stuff underway, and they’re funding it in different areas, and it’s happening on different technologies. But it’s very siloed, it’s not very well integrated, and I think a lot of executives are saying, “Come on. We can pull this together. We can industrialize it and turn this thing into a more well-oiled machine than it currently is.”

Tim Gasper:
Right. No, that makes a lot of sense. And obviously there’s an important role that’s necessary for there to be leadership and for there to be some sort of a central point of view. I love that you said that there is no front of the jigsaw box. That’s obviously a powerful metaphor here where decentralization can really go wrong and where obviously the CIO, the CDO, and the leadership in the organization, even if they’re pushing responsibility out to the spokes where it’s bring your own data, it’s bring your own domain knowledge. But it’s not bring your own policy. It’s not bring unlimited amounts of infrastructure.

Mike Ferguson:
And you can go build a piece of your jigsaw, but if it doesn’t fit with the next piece, your piece is useless. You know what I mean? I mean, you can’t put-

Tim Gasper:
You’re just taking a bunch of puzzles and you’re just … My kids do this. They pile it all into the same box. It’s 10 different puzzles all jumbled together. It doesn’t make sense.

Mike Ferguson:
And they’re-

Juan Sequeda:
You build your own piece and other people don’t know that I have some missing part. Where is it? Well, you’re not talking to the person who’s actually done that piece, and then you’re like, “Oh, this is broken. We need to go buy the … We need to go buy another jigsaw puzzle or stuff like that.” People are not communicating. I think that’s interesting for me-

Mike Ferguson:
Yeah. I think there’s definitely a communication issue. I think there’s definitely not enough sight of business processes and business strategy and business objectives.

Mike Ferguson:
I meet some super smart-

Tim Gasper:
What about metrics, Mike? Is metrics part of this as well?

Mike Ferguson:
Totally.

Tim Gasper:
How do you measure proper data usage or data governance? How does that play here?

Mike Ferguson:
Oh, I think that plays a lot. You’ve got the old cliché, you can’t manage what you can’t measure. And certainly, you’ve got to have a way to be able to measure how you’re governing data. But I think at the same time, there are also other key measures out there which are true business outcomes. I mean, are you improving on EBITDA, or are there any key needles here that different business executives are accountable for, where you’re really trying to move those needles one way or the other in order to be able to reduce the risk, or optimize some part of your business operations, or drive new revenue either through cross-sell, upsell, or gaining new customers through better personalization and those kinds of things.

Mike Ferguson:
I mean, I think it has to be tied to business objectives. And a lot of the time, well, you know the tech business as well as I do. I mean, it’s just been wave after wave after wave of technology. And I think that at times we get lost in the weeds looking at all the tech and forget about the outcomes. I think we’re getting reminders from the boardroom that it’s the outcome. Like we’re a bank, we’re an insurance company, we’re a retailer, whatever business you’re in, they want to see real improvements as a result of using this as opposed to let’s ditch this library and go get this library and aren’t these algorithms cool.

Mike Ferguson:
I mean, I think, for example one executive was with me on a Zoom meeting recently and they kind of said, “God, I learned about natural language generation when I talked to the BI team.” And I said, “Oh yeah,” because you get text to describe your visualizations. He said, “I wish somebody would give me that for data science so I could understand what they’re talking about.” I mean, it just seemed to me that they’re trying to understand the business impact of what they’re delivering is very, very important, and there’s a lot of super smart people in data and analytics. But I think the whole communication thing to the masses of other people out there that have to use this is still a big barrier to give people the confidence to use it to drive value with it.

Juan Sequeda:
Mike, I can imagine people listening to this saying, “BS. My organization, we definitely use our data. We don’t have those problems.” What is kind of the litmus test? It’s not a binary thing. We use the data. We don’t use the data. I think it’s a whole spectrum of it. What is a test to understand how much you are using your data? If I were to make up these numbers, you’re using it 50%. You’re leaving half of the opportunity on the table. Or no, you’re actually doing very well. Like you’re 80%. Or you really suck at this. Or you’re just starting … What would you suggest given your experiences? These are the things that you should kind of ask yourself to go figure out where you are on this spectrum of data usage.

Mike Ferguson:
I think you need some kind of maturity model in order to be able to help you work that out. But I think as well the contribution. Are you measuring contribution to business outcomes from anything that you’re producing? If people … I mean, don’t get me wrong. There is lots of good stuff happening with data analytics without a doubt. It’s just that I think it’s not integrated enough. There are still a lot of silos, and I think there’s an opportunity to, as I said before, industrialize it. And I think that’s what people want to see, is can we produce reusable data that can be used for multiple analytical projects without them having to go and reinvent again. Are there ready-made data products that they can use?

Mike Ferguson:
And then the second thing is if I’ve got analytics, is there a catalog of what analytics are available, what reports, what dashboards are available? It’s classified according to business outcomes. So I can kind of see which models are in play and where, which processes are they being used in. And I think this is … it’s the business deployment side that I don’t think people can see well enough.

Mike Ferguson:
So in that sense I think there’s some technology pieces missing. As I said, I think there’s a lot of duplication of technologies, particularly around data integration I think, and not enough reuse of stuff, which I think … So I think some rationalization is needed. But I genuinely think the ultimate measure of success is how much you’re contributing to a business outcome.

Mike Ferguson:
And so therefore, if you go and build these models and deploy them for a purpose, for example to improve customer engagement or to reduce fraud or to optimize the supply chain, then the question is, is it resulting in greater automation and reduced cost in the supply chain, do we have less waste in there? Are there measurable outcomes that allow you to judge whether what you’re deploying is effective or not?

Juan Sequeda:
So I’m interpreting this as the following. I think this is actually a great test, is go look at the business outcomes that you recently had, that people are celebrating. Oh, we’ve done this. This is amazing. We’re all celebrating for that. Then go trace it back to say how much did existing data analytics projects contribute to that particular event that we’re all celebrating.

Juan Sequeda:
We increased client engagement or whatever, the retention and stuff. It’s like, okay. So what happened? What were the events that occurred to go do that? There’s obviously things that are not just data, but then we can go trace it back. It’s like somebody’s going to say, “Well, this event happened because we had this data.” So I think it would be really nice to go make that test, and be honest with yourself. And if you’re not finding it, guess what? You’ve got all this data. You’ll be doing stuff with it. So I think that [crosstalk 00:36:56]

Mike Ferguson:
No, I agree. I mean, for example, if I build these pipelines to produce this data and I have these reusable data sets and then I build these models or I take these reusable data sets and put them in a data warehouse or something and then I produce these reports, is anybody labeling all of these things to say that this data and these pipelines and these models and these reports and these dashboards are all labeled with this objective that says reduce fraud or it says optimize supply chain. So that I can then say anything tagged with reducing fraud, could you tell me what have we got. And then the question is, is the combination of all of the stuff that we’ve built, is it actually doing it or not?

Juan Sequeda:
Yeah. We need to have actually … Your organization needs to have a business glossary basically of the use cases that they deal with. And those use cases need to be connected directly or tagged to the actual data products that are being generated. So you can actually search for, well, if I’m using these use cases, what are the data products that have been used?

Mike Ferguson:
Yeah. And you may find, you may find that a data product like a customer or something, that customer data could be used in multiple projects which could be associated with multiple different outcomes. But nobody’s labeling it.
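The labeling idea in this exchange can be sketched in a few lines of Python. This is a minimal illustration only: the asset names, the objective labels, and the `assets_for` and `unlabeled` helpers are all hypothetical and are not taken from any particular catalog product.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Asset:
    """A catalog entry: a pipeline, dataset, model, report, or dashboard."""
    name: str
    kind: str
    # Strategic objectives this asset is meant to serve (hypothetical labels).
    objectives: set[str] = field(default_factory=set)


# Hypothetical catalog entries; names and tags are illustrative only.
CATALOG = [
    Asset("customer", "data product", {"reduce fraud", "improve customer engagement"}),
    Asset("payments_pipeline", "pipeline", {"reduce fraud"}),
    Asset("fraud_scoring_model", "model", {"reduce fraud"}),
    Asset("shipment_delay_dashboard", "dashboard", {"optimize supply chain"}),
    Asset("legacy_extract", "dataset"),  # nobody labeled this one
]


def assets_for(objective: str, catalog: list[Asset]) -> list[str]:
    """Return the names of every asset tagged with the given objective."""
    return [a.name for a in catalog if objective in a.objectives]


def unlabeled(catalog: list[Asset]) -> list[str]:
    """Return the names of assets that carry no objective at all."""
    return [a.name for a in catalog if not a.objectives]


print(assets_for("reduce fraud", CATALOG))
print(unlabeled(CATALOG))
```

Note that one asset, the `customer` data product here, can carry several objectives at once, which matches the point that the same data may serve multiple outcomes.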

Mike Ferguson:
We all talk about data catalogs, and it shows me all the raw data that’s out there. But I want to know about all the data we produced. I want to know about all the analytical models that we produced. I want to know about all the reports, all these things. And I want it tagged or hung on a hook which … And the hook is this is the strategic objective in my business strategy. And the business strategy was set by the board and it’s got these executives who are accountable for these goals, and these are the measures that we’re using to tell us whether we’re on track to achieving those things. And then I can label all this stuff and say, “Well, all of this is to do with this objective. So therefore, is it working?” And if it’s not, then okay. Then, we have to say, “Some of this stuff is working, some of it’s not. So let’s fix the stuff that’s not.”

Mike Ferguson:
But I think if we can’t tie it back to business strategy, then how on earth are you going to know whether what you’re developing is effective. And also, it’s not just what you’re developing. It’s who’s using it, how much of it’s getting used. You can go build this stuff, but if it’s never used, you got to measure the usage of it.

Mike Ferguson:
In terms of that I mean, I think there’s a number of ways to be able to do that. I mean, how many times do you execute the model? How many fraudulent transactions did it stop in flight so that you can measure these things and understand its contribution? And I don’t see that right now. I don’t see a way in which to be able to measure the contribution to the business outcomes of all of the stuff being built. What I see is all kinds of projects underway with very intelligent folks and lots of data. But still I think, I see organizations that get their management information pack sent to the board in Excel. Excel. Come on!
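The two measures mentioned here, how many times a model is executed and how many fraudulent transactions it stops in flight, can be counted with very little machinery. The sketch below is only an illustration under stated assumptions: `score` is a stand-in rule, not a real fraud model, and the `UsageStats` and `check_transaction` names, the threshold, and the amounts are all made up.

```python
from dataclasses import dataclass


@dataclass
class UsageStats:
    executions: int = 0  # how many times the model was run
    blocked: int = 0     # how many transactions it stopped in flight


def score(amount: float) -> float:
    """Stand-in for a fraud model: purely illustrative, flags large amounts."""
    return 0.9 if amount > 10_000 else 0.1


def check_transaction(amount: float, stats: UsageStats, threshold: float = 0.5) -> bool:
    """Run the model on one transaction, record usage, return True if blocked."""
    stats.executions += 1
    is_blocked = score(amount) >= threshold
    if is_blocked:
        stats.blocked += 1
    return is_blocked


stats = UsageStats()
for amount in [25.0, 18_000.0, 300.0, 52_000.0]:
    check_transaction(amount, stats)

print(stats)
```

Counters like these only show usage; tying them to a business outcome still needs the objective labels discussed earlier.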

Tim Gasper:
Who’s driving this though? One of the things … I love everything that I’m hearing. I’m so excited about all of this. And like for me I’m like, “Data literacy, data culture.” Like, “Data usage.” We’re not talking about data usage enough in our organizations. But who’s driving it? Is it the job of the Chief Data Officer to be like, “Everybody, pay attention. Here’s the metrics. Here’s … I have the buy-in from the executive team.” Is it their job? Whose job is it to make sure we drive better data usage?

Mike Ferguson:
That’s a great point actually. I mean, I think it’s more than just a CDO. I mean, certainly I would have said … I happen to be chairman of a huge conference in Europe called Big Data London which gets about 10,000 delegates, and I would say about three years ago we did a survey and CDO was like the cool job. And I would say in those days, they were reporting into the CEO. But I think three years down the line, a lot of CDOs that I know, in Europe at least, and I must admit I don’t know the US as well, are not reporting to CEOs; they’re reporting back into CIOs. And so whereas they were on a peer level with the CIO in the past, I don’t necessarily think they are now, certainly in larger organizations. I don’t think that that’s necessarily the case anymore. But nevertheless-

Tim Gasper:
Is that a regression or is that a good thing?

Mike Ferguson:
No, I just think it’s just that the CIO has become just a more senior executive. I mean, I think it’s an okay thing as long as somebody takes responsibility for alignment with business strategy. And I think frankly, the CDO still is the right person to do that, to ensure that.

Mike Ferguson:
But then there’s another problem here which is if budget is out there with different departments and they’re all off doing their own thing, there’s got to be some dotted line into a CDO that kind of says how are these things tying into business strategies so that even line of business executives know that.

Mike Ferguson:
And some of my clients, I survey every year for the show. What I am seeing now is collective responsibility for this broadening across executive management. Rather than just the CDO, it’s now multiple executives taking up responsibility for this. And that’s a good sign in my opinion, because it’s not just one person at the table that’s accountable for this anymore. It’s the fact that this is spreading and people realize that this is a good thing that they’re taking up responsibility for trying to make this happen.

Mike Ferguson:
But I genuinely think that there’s still a lot more can be done to align it all with business strategy. And so yeah. I mean, all I’m saying is that I think there’s an organizational element to this, and I think there’s perhaps some rationalization element to this, and I think it’s then, okay, okay, if we can get that under control, then is there a way in which we can speed up the development process and get these artifacts built, get these models built, get these out there. And then can we tag them so that we know what they’re supposed to be trying to achieve. And then can we measure the effectiveness of them in order to see whether they’re actually doing that or not.

Juan Sequeda:
This is reminding me of … I’ve been watching the content of this guy called Gregor Hohpe. He has this book called The Software Architect … Sorry, The Software Architect Elevator. And I saw one of his talks, and it really, it was eye-opening for me. So understand kind of the role of IT. If IT reports to the CFO, it’s a cost center. Their focus is to reduce costs. But if they report to the COO, it’s about return on investment. But if they report to the CEO, it’s about the speed of innovation. So you have to really understand where your data, your IT organization sits, because according to that, it’s going to change. So that was one thing that reminded me.

Juan Sequeda:
And then kind of just to wrap up here because we need to go to our lightning round is, you brought up the usage and cataloging. And this reminds me of another conversation I’ve had before with one of our loyal listeners, with Mark Kitson. And he told me once, I think he was in a chat once, “If a company has a data catalog, that already indicates that they’re trying to at least make use or track the usage.” So if a company does not have a data catalog, they’re way behind. And actually, if you’re interested in acquiring a company, first ask if they have a data catalog and go get it, because if they don’t, it’s probably a company not worth buying because their data is going to be a big mess.

Mike Ferguson:
Well, I kind of think that there’s two uses of data catalog here. I mean, there’s one which is to just know what data you’ve got out there. And I think the problem that I’m finding with my large clients is that you’re not going to get a couple of thousand metadata entries in the data catalog. You’re getting billions of entries in a data catalog. And that’s a big concern. Because if you harvest a lot of data and connect and scan a lot of data and bring in the metadata about what exists out there, having billions of metadata entries isn’t particularly useful.

Mike Ferguson:
So there’s another problem here which is how do you map what you’ve got out there to something that’s meaningful from a business perspective? And that is things like customers and products and payments and returns and shipments and orders and these kinds of things. And I think the whole data concept model is a big thing. Getting business glossaries jump started with concept models, mapping and trying to use AI in order to automate as much of the mapping of disparate data that’s out there to common definitions speeds that whole thing up a lot.

Mike Ferguson:
And I genuinely believe that we are not too far away now from being able to automate as much … a lot of all of that mapping in order to be able to then say, “Okay, we know what data’s out there. We know what common definitions are here. We know what the mappings are.” Then, can we generate some of the pipelines to be able to produce this stuff rather than have it all done by humans?

Mike Ferguson:
But nevertheless, I think the other use of data catalog, if you like, is to explain what ready-to-go data exists, what ready-made trusted data exists, not the raw data. If I built these pipelines and produced this stuff, then what exists there? And here you’re seeing various terminology out there for this like data marketplaces or data exchanges or those kinds of things where it’s ready-made. It’s not just all the raw stuff. It’s ready-made.

Juan Sequeda:
I call this … Actually, I’ve been talking to a colleague at Indeed, Gary if he’s listening. He gave me this whole view of there’s two lenses to the catalog. One of them is for the lens of that raw data for the audience of that are going to be your data engineers, want to know that-

Mike Ferguson:
Yeah, they’re the producers.

Juan Sequeda:
Then the other lens of the data products and the consumers of that are going to be your data scientists, your data analysts. There are different views. It may be the same tool but just different audiences, different tools and stuff like that. So totally, totally agree.

Mike Ferguson:
Yeah, I agree. I mean, I think there’s information producers or data producers, there’s data consumers. And I think there’s two uses of the catalog in that regard. But I also think that there needs to be a catalog for all the things you’re building like the models, the reports, all of that, so that I can see that. And so therefore, the question really is, is this marketplace or whatever you call it, the second use case of the data catalog, is that more than just a data catalog? It tells you about all the models that have been built. It tells you which ones are out there and running.

Mike Ferguson:
If I start getting out into IoT and edge-based analytics, you could be talking about something like an oil and gas company with tens of thousands of models deployed. You’ve got to know about that in order to be able to automate it and manage it and everything. So I think all of this needs to get opened up and used, but it comes back to the end game. The end game is are they effective? Are they contributing? We’ve got to anchor it back to a business strategy and goals and strategic outcomes that help us measure that or not. It’s not just about …

Mike Ferguson:
And of course, the other aspect to this with machine learning which is kind of another wave which we may get to talk about, may not, is this whole wave of automation is about to lift off and beginning to, but we’re at the beginning of that curve. I mean, that thing is just going to really lift off, and then we can see real significant contribution to effectiveness in a business.

Tim Gasper:
But that’s actually a great segue because actually the first lightning round question we’d like to ask you is around this exact topic. And obviously this is a yes or no answer. And feel free to give a quick little bit of context here. But the more AI and automation in analytics, the better that our data usage will get, yes or no?

Mike Ferguson:
Could you repeat the question?

Tim Gasper:
Yeah, sure. The more AI and automation in data and analytics, the better our data usage will get.

Mike Ferguson:
As long as they’ve been trained on the right data, I would hope so. But obviously we have to eliminate bias and all kinds of other things around that. But I think as long as those models are trained, then potentially yes. I mean, that’s the hope at least. But you can’t just assume it’s going to happen. You’ve got to measure it to make sure it is happening.

Juan Sequeda:
Got it. So next one. Are data literacy and data culture the biggest deterrents, obstacles to data usage? Yes or no?

Mike Ferguson:
I would think yes. I mean, I think data literacy is about confidence, isn’t it? I mean, it’s that not enough people have enough confidence to take advantage of data. And I think, we need to get more of that out there. So I think data literacy, data culture is this hidden thing, it’s kind of … And then the question again is what’s data culture about? Data culture is about how do you persuade your organization to have mass confidence and change?

Mike Ferguson:
So I think that’s why I believe that measuring contribution is so important. Because if you put up a business strategy up here and here’s our strategic goals, here’s what we’re trying to achieve and here are our outcomes, what you’re doing is giving all these project teams something to aim at.

Juan Sequeda:
Yeah, that’s right.

Mike Ferguson:
So in a way what they’re looking at, is they’re going after it hungry. And every time those needles move and they see, that’s a buzz. And that’s in my mind what’s going to change the culture. They have to see the contribution of the data and what they’re building in order to be able to change a culture. Because then it’s not … Then it’s the people themselves that change the culture.

Juan Sequeda:
So final last one. Will an industry standard on metadata emerge in the next five years?

Mike Ferguson:
Frankly there is one right now, but no one wants to use it. And that’s the problem for me. There’s the-

Juan Sequeda:
Which is which one-

Mike Ferguson:
… Linux Foundation Egeria Project.

Juan Sequeda:
Egeria, okay.

Mike Ferguson:
Which has been going for about three, three and a half years now. The problem is the lack of vendors who want to pick it up and use it. In my opinion this is not a technical barrier. It’s a business decision. Vendors are deciding not to go that way. And for that reason, there won’t be a business … there won’t be a standard that’s being used to the maximum obviously, because I think it conflicts …

Mike Ferguson:
I’d love to see the masses go down the Egeria road. There’s only two ways I’m going to see that happen, because bearing in mind that, I think, this is the fourth attempt in my lifetime, in my career, at metadata standards. All three previous have failed. This is the fourth one. And I think the problem here is, yeah, small vendors are afraid to do it. If the bigger vendors did it, by that I mean the hyperscalers … IBM is doing it. But if I took somebody like Microsoft and Google and AWS, if they went down that road … there’s six major vendors I would say in our industry today. I think if four of them went, yeah, the industry would go.

Mike Ferguson:
The only other way it could happen is if let’s say I get 100 banks or 100 retailers or whatever join Egeria and then just say to the vendors, “Look, we’re not going to buy anything from you until you guys do this.”

Tim Gasper:
Yeah, there’s not enough pressure yet, right?

Mike Ferguson:
Then it’ll happen. There’s not enough pressure from the customer side I think, because quite honestly a lot of customers don’t know about Egeria yet. It’s still not very well known out there. But yeah-

Tim Gasper:
Or the benefits of open metadata either, right?

Juan Sequeda:
Yeah.

Mike Ferguson:
Or the benefits, absolutely.

Tim Gasper:
They kind of know but they’re not really feeling the pain yet, right?

Juan Sequeda:
All right. So we’ve got two minutes left because we have to wrap this up. Tim, quick takeaways. I’ll go mine and then we’ll throw it back to Mike here. So quick, what are your top two takeaways-

Tim Gasper:
Sounds good. So Tim’s takeaways: data complexity is a huge issue. I think you really addressed that, Mike. And there’s pressure for organizations to industrialize, to make this process better, from the people, from the technology, and from the overall process perspective. That’s why we’re seeing data product management, data mesh movements, all these things. We have to tie this to measurement, not just of sort of our governance and our data processes, but of business outcomes. So I really love sort of your talk track there.

Tim Gasper:
What about you, Juan?

Juan Sequeda:
And for me it’s like understanding that this last decade has been kind of like understanding things in an ad hoc measure, an ad hoc way. We need to really focus and connect it to the business. We need to … The test here is to go understand how it’s tied to a business strategy, go figure out how your data analytics product is tied to a business strategy. You have to be explicit about it. Otherwise, you’re not using it. That’s the way to go do that.

Juan Sequeda:
And I love this whole notion of there is no front of the jigsaw puzzle. That is what we’re building. Like we need to know what that front of that jigsaw puzzle is. I love, I love that thing.

Juan Sequeda:
Mike, throwing it back to you. One minute. What’s your advice and who should we invite next?

Mike Ferguson:
Okay. My advice is nothing necessarily to do with what we’ve been talking. It’s a life thing really. For me, my advice is, surround yourself with others who are better than you, and learn from them. And also, I have this thing in my head which is, always if in doubt, separate it out. That means if you don’t understand a problem, dismantle the problem and solve bits of it until you solve the whole problem.

Juan Sequeda:
Love that. Love that. Who should we invite next?

Mike Ferguson:
My nomination for the hot seat next time around would be Donald Farmer at TreeHive if you haven’t interviewed Donald yet. I can put you in touch with him. Donald’s great. He’s a Scotsman living in Seattle. And he has a great tree house, I mean a huge tree house out in his backyard which you can walk up to. There are steps, and it winds around the tree all the way up there. That’s his office.

Juan Sequeda:
Awesome.

Tim Gasper:
That sounds awesome.

Juan Sequeda:
Mike, thank you so much for this awesome conversation.

Mike Ferguson:
Okay.

Juan Sequeda:
There are so many brilliant points that we made here. Thank you. Thank you. Thank you. And have a good one.

Tim Gasper:
This has been awesome.

Mike Ferguson:
Juan, I’m honored, honored to be invited. Thank you very much for the time guys. Really appreciate it.

Juan Sequeda:
Cheers.

Tim Gasper:
Cheers Mike.

Mike Ferguson:
Cheers. Cheers.
