data.world CEO and Co-Founder Brett Hurt is sharing his thoughts on the theme of “People + Data” on a weekly basis in advance of our September 22 data.world fall summit.
The current chill sweeping through the economy is not your parents' downturn. Nor should your response be, as you navigate it with the data tools we’ve been discussing in this series.
Earlier in this series, we discussed why and how we need to get smarter about “smart” technology. In my second post, I shared a few lessons from Coremetrics, where we used data to help clients survive and thrive in the dot-com bust. And in the third, I shared my experience helping 3M and other iconic companies use data engineering and analytics to align ever more closely with their customers during the Great Recession.
So today my aim is to explore why we need to reimagine our conventional assumptions about best data practices as we journey through the uncharted territory of this unconventional downturn.
To my point on unprecedented economics, the last time we saw a comparable combination of high inflation and low unemployment was at least seven decades ago. Hovering around 8%, today’s inflation is the highest in 40 years. But in this split-screen moment, the job market, with 3.5% unemployment at this writing, is stronger than at any time in the last 50 years.
Sure, there are flocks of black swans out there, from war-driven rises in energy costs to climate extremes, from stimulus hangovers to supply chain woes, and more. But a further head-scratcher for conventional economic thinking is that our current business headwinds are almost a mirror image of the short but deep recession driven by the start of the pandemic in 2020.
Venturing into Uncharted Territory
Elsewhere, I’ve written extensively about navigating the pandemic at data.world. It was a difficult time of trauma, reorganization, and fear, to be sure. But ultimately, our company and most in the digital space prospered on surging valuations as Americans stayed home, eCommerce boomed, and life moved online thanks to the technologies of the 21st century. The service sector, meanwhile, all but collapsed and unemployment skyrocketed in the brick-and-mortar sphere to the highest levels since we began tracking joblessness in the 1940s, as airlines, energy companies, and the automotive sectors were hammered.
Now though, as vaccines have rolled out, as we’ve learned to cope more or less with a calmer “endemic” virus, and as bills have come due, the two sides of the economy – broadly “old” and “new” – have almost traded places. Tech giants from Apple to Facebook have imposed hiring freezes, the NASDAQ has been mostly southbound since January, and Peloton, with its bikes and digital exercise library, thinks a quarterly loss of $1.4 billion is “substantial progress.” Yet against this backdrop, my hometown airport in Austin had its busiest travel summer in history, manufacturing is surging, energy stocks are the new belle of Wall Street, and car dealerships, despite chip shortage-driven backlogs, can’t keep new cars on the lot.
Interesting times indeed, as the apocryphal cliché goes. That’s why we need to reorchestrate data storage, use, and organization in interesting new ways that depart from the conventional standards of central control – an approach we call “data mesh.” I wrote about this in an earlier post, “Data is from Mars. Data Science is from Venus.” But to phrase it more prosaically, we need to level the data playing field, which is what the concept is all about.
The Principles of Data Mesh
Technical descriptions of data mesh abound, and our own Principal Scientist Juan Sequeda and VP of Product Tim Gasper have explored its dimensions in detail with Zhamak Dehghani, who authored the concept. But the broad point is that the collection and use of data have evolved in ways that have most often made data governance the province of an institution’s upper IT echelon. A centralized data command made sense in an earlier era, before the “cloud,” the internet of things (IoT), and data volumes growing at rates that defy exaggeration. Not only have we siloed data in incompatible lakes and warehouses, we’ve also segregated domain expertise within our institutions from data expertise. No wonder that in 2021 some 97 percent of data engineers said they were suffering burnout and more than 70 percent said they hoped to quit within the year.
What we need is not a new foundational layer to the data stack, but a new approach to collective and collaborative thinking about data, its uses, and, most critically at this moment, its transformative power in every business. Data mesh, of course, requires some reconfiguration of technical infrastructure. But it’s far less about technology than about what my colleague and Co-founder Jon Loyens calls a “socio-technical transformation,” which really boils down to four principles:
- Domain ownership: Your sales team, your finance team, your creative team, your marketing team, etc. should own and control the data they create. This requires some training and familiarization, but it’s ultimately about empowering these teams with resources at their fingertips, not in the hands of a distant team to which they must submit a ticket for every query.
- Data as a product: Imagine the way you search for products on Google or through an eCommerce platform. Data tools need to be similarly concrete. When data becomes a product of the team creating it, this empowers that team not just to be accountable, but to control its quality, representation, and utility. Data sets become comparable to the finance team’s P&L, the sales team’s conversion funnel, or the marketing team’s promotions calendar.
- Self-serve: Building on these principles of ownership and product, data resources need to be accessible on a self-serve basis, on transparent platforms, for the teams that need them for quick decision-making. No more exclusive reliance on data science chefs and waitstaff to serve up the facts of the business to order, inevitably leading to the “data breadlines” described in the seminal book Winning With Data. The new model is a self-service data kitchen. Or multiple kitchens.
- Federation: A “federated” structure is the data mesh concept that brings this all together. Decentralizing data is not an invitation for various divisions to hoard information. Rather, with data experts at the ready to guide and counsel, this new model envisions distinct domain-specific “data platforms” linked together in a common, open, accessible choreography in which the entire enterprise shares. (A brief code sketch of these four principles follows this list.)
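To make these four principles a bit more concrete, here is a minimal, hypothetical sketch in Python. It is not data.world’s implementation or any vendor’s API; the DataProduct and FederatedCatalog names are illustrative assumptions, meant only to show how domain ownership, data as a product, self-serve discovery, and federation might fit together:

```python
from dataclasses import dataclass, field

@dataclass
class DataProduct:
    """A domain-owned data product: the team that creates the data owns it."""
    name: str
    domain: str                # owning team, e.g. "sales" or "finance"
    owner_email: str           # accountable contact inside that domain
    description: str           # product-style documentation for discoverability
    schema: dict               # column name -> type, published as a contract
    quality_checks: list = field(default_factory=list)

class FederatedCatalog:
    """A shared, self-serve catalog: domains publish, everyone discovers.

    Governance is federated: no central gatekeeper approves each request.
    """
    def __init__(self):
        self._products = {}

    def publish(self, product: DataProduct) -> None:
        # Domain teams register their own products (domain ownership).
        self._products[f"{product.domain}/{product.name}"] = product

    def search(self, keyword: str) -> list:
        # Any team can find and use data products on its own (self-serve).
        kw = keyword.lower()
        return [p for p in self._products.values()
                if kw in p.name.lower() or kw in p.description.lower()]

# Usage: the sales domain publishes; marketing discovers without filing a ticket.
catalog = FederatedCatalog()
catalog.publish(DataProduct(
    name="conversion_funnel",
    domain="sales",
    owner_email="sales-data@example.com",
    description="Daily lead-to-close conversion rates by region",
    schema={"date": "date", "region": "string", "conversion_rate": "float"},
))
for product in catalog.search("conversion"):
    print(product.domain, product.name, "-", product.description)
```

The design point of this sketch is that no central approval step sits between publishing and discovery: each domain registers and maintains its own products under shared conventions, and any other team can find and use them on its own.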
This is the roadmap for the challenges ahead. And it’s all about People + Data. Next week we’ll discuss data lineage, historical veracity, and our new Eureka Explorer tool, which turns this roadmap into a GPS system delivering resilience to your business no matter what black swan comes next.