Season 6 of "Catalog & Cocktails" has now wrapped, and this season was a treasure trove of insights from the data world. Our mandate has always been to have “honest, no-BS conversations about data.” This year generated some of the best, most interesting, no-fluff, straight-shooting conversations to date. With a lineup of data virtuosos that included Andrew Jones, Ethan Aaron, Alexa Westlake, Chris Tabb, Wendy Turner-Williams, Simone Steel, Doris Lee, Krystin Kim, Aaron Wilkerson, Kat Greenbook, Tom Redman, Ari Kaplan, Jon Cooke, Mike Dillinger, Dean Allemang, and Joe Reis, we covered it all. Here’s our summary of the season's juiciest insights.
On business value
Ethan Aaron, CEO of Portable, outlined the importance of focusing on high-value, low-effort data projects. He described how a data team's work falls into four buckets:
1. Low value, high effort work
2. Low value, low effort work
3. High value, high effort work
4. High value, low effort work
That fourth bucket? That’s where they should focus.
Aaron Wilkerson, Sr. Manager of Data Strategy & Governance at Carhartt, emphasized the need for data teams to communicate how technology investments impact top-line growth, bottom-line growth, and risk mitigation.
Meanwhile, Alexa Westlake, senior data analyst at Okta, went as far as to say that data that doesn’t drive results is useless. To her, teams have to create a culture of joint ownership of success metrics. But don’t get too impatient for overnight results: you won’t even see outcomes each quarter. We’re building for the long term. In this culture, strong leaders have empathy and self-awareness, and they push their teams to feel comfortable asking the hard questions.
Chris Tabb, COO and co-founder of LEIT DATA, talked to us straight from Big Data London. We commiserated about how businesses sometimes don’t understand what the data team does. When you ask “What does HR do?”, everyone knows they manage employee benefits, hiring, and programs. When you ask “What does Sales do?”, everyone knows they sell the product or service. But ask “What does the data team do?” and you’ll see some head-scratching. Data teams are a service to help the other groups improve their business performance; they are rarely a function in and of themselves.
Overall, data is essential in driving business strategy, and it’s imperative that data teams can closely align with business objectives. We don’t see that changing any time soon!
On data fluency and culture
Wendy Turner-Williams, a former CDO and CEO of AssociationAI, stressed the importance of understanding what data culture truly means. It's about high-quality business decisions that align with business strategies. Data literacy is not just about knowing SQL; it's about understanding the business and asking the right questions.
Simone Steel, a Chief Data and Analytics Officer, delighted us with a chat on data sustainability. Because sustainability isn’t just about natural resources, she posed certain important questions, like:
Are we training professionals appropriately?
Do we have enough of them?
Is it sustainable to manage businesses with all the legacy debt and flaws we have?
Is it sustainable to keep consuming more?
We covered making data science more accessible with Doris Lee, the co-founder of Ponder (acquired by Snowflake). What does it mean to be a data scientist today? The line between business analysts, data scientists, and machine learning engineers is blurring ever more. In the future, will everyone be a data scientist, no matter their role or title?
Collaboration was front and center with Krystin Kim, Senior Director of Decision Science at Post Holdings. And data catalogs are the key to that collaboration. With catalogs, you end up with a treasure trove of individual use cases. Then you can stumble on ideas, you can amplify others, and that’s the true value. In other words, “The ‘what’, the ‘why’, and the ‘wow!’”
The story continued with Kat Greenbook, in a conversation about how we effectively communicate our data insights. The “three act” structure of storytelling originates in Greek theater: “And,” “But,” and “Therefore.” Those three words describe the story arc of data. While we tend to get stuck in the “And” phase, we win when we move on to “But” as the contrast and then drive the story home by concluding with “Therefore…” If a dashboard is a data story, are we all Aristotles of data 🤔? One thing is for certain: a good data story can have a big business impact (and too many visuals = “chart vomit”).
On various technical data concepts
We talked modeling and semantics with author Joe Reis, who lamented that the art and philosophy behind data modeling have been lost. In the age of AI, we need to reconnect with these fundamentals if we’re going to fully realize its potential. We also covered how in data, you can always mask over debt with another query…but it ain’t smart!
We agreed with Andrew Jones of GoCardless that data contracts are all the rage. What wonderful benefits unlock when we shift responsibility to the left? Those who generate the data are responsible for it. So if a software engineer creates data, the contract for it should live in places where they already expect to work, like GitHub.
Data Doc Tom Redman stopped by to diagnose what’s broken and why we go in data circles. The doctor prescribed one remedy to a glaring malady: the vast majority of data management is being done by people without “data” in their titles. They work in marketing, ops, and finance. They’re doing the work of confirming the data, but they’re untrained, unsupported, and just trying to get through the day. Let’s get these people some training! Plus, let’s get technical debt under control. Doctor’s orders.
We journeyed to the lakehouse with Ari Kaplan, the Head of Evangelism at Databricks. The lakehouse is a single approach to governance across both data modes: structured and unstructured. It’s critical that data teams talk to each other. Pull up a chair to the bonfire, roast a marshmallow, make some s’mores, and enjoy the collaboration at the lakehouse!
On AI
With Jon Cooke, founder of Dataception, we explored the impact of generative AI and data products on business value. He pointed out that very few companies nail software product management, so they’re unlikely to ever truly nail data product management. That means: set expectations accordingly! And start with the business problem first (if you start with data first, you’ll lose people).
We spoke with Mike Dillinger about how the “LLM is the API for humans.” While LLMs are great for language, Knowledge Graphs are required for reasoning and coherence. And currently, we’re underusing Knowledge Graphs, because they’re not just for Retrieval Augmented Generation (RAG). They help with model training in the first place. In other words, they add “adult supervision” to the process of LLM responses.
And we would be remiss not to include our chat with Dean Allemang, co-author of the 2023 data benchmark report on LLM accuracy and Knowledge Graphs. Enterprises must treat business context and semantics as first-class citizens. In this conversation, we made the case for why.
Final thoughts
This season’s guests offered a multifaceted view of the data landscape. We reconfirmed the importance of aligning data strategies with business goals. We talked about data culture and data fluency: the good, the bad, the not-data-driven. We celebrated the potential of emerging technologies in AI. As we continue to navigate the complex world of data, we’re grateful for the experts who are helping to steer this ship into the future. See you in Season 7!