And That’s a Wrap – Takeaways from Catalog and Cocktails Season Three

by | Jul 1, 2022 | 2022, data architecture, Data catalogs, Data community, data value, Data-driven cultures

Thursday’s season three finale of the Catalog and Cocktails podcast capped Tim and Juan’s most popular season yet, with the hosts toasting their 95th episode with … Champagne? (Not a cocktail, we know. But, hey, it was a special occasion.)

The episode served as a retrospective of the third season, with Tim and Juan revisiting takeaways from past episodes and awarding “Best Quote” to Sanjeev Mohan for his terrible and excellent pun, “I never metadata I didn’t like.” Ouch.

Looking back at the season as a whole, the hosts identified six major themes that emerged, and highlighted key points and topics addressed by their roster of brilliant guests.

Those themes were:

  • Modern Data Stack
  • Semantic Layer/Metrics/Knowledge
  • Leadership
  • Business Value
  • Data Governance
  • Data Mesh

In the lists below, we highlight only a small fraction of the excellent insights shared by the podcast’s guests. If you want to hear more on any of these topics, you can easily click through to the episodes.

Modern Data Stack

  • Sarah Catanzaro of Amplify Partners spoke to the pendulum of IT, and how we’ve gone from a focus on analysis to a focus on data monitoring and infrastructure.
  • Emily Hawkins of Drizly walked us through the critical components of a modern data stack.
  • Arjun Narayan of Materialize spoke to streaming vs. batch data processing, and how extract and load will increasingly take place in real time.
  • Bob Muglia, former CEO of Snowflake, gave a crisp and succinct definition of the Modern Data Stack: analytics delivered on SaaS and the public cloud (for scale and cost), using SQL for modeling. He cautioned, however, that SQL is not God’s gift to languages.
  • Chad Sanderson of Convoy bemoaned building the modern data stack on a messy, overflowing data lake — aka data swamp.
  • Sarah Krasnik of Versionable.io preached the value of open-source data tools, presuming your business has the resources to maintain them.
  • Sanjeev Mohan of SanjMo pointed to the micro-segmentation of the data industry… and how our main concern should be future-proofing our MDS.
    Juan Sequeda and Catalog and Cocktails guest Sanjeev Mohan

Semantic Layer/Metrics/Knowledge

  • Benn Stancil proposed that a metrics layer should be added to your data architecture to ensure metrics are defined uniformly across your organization.
  • Karen Lopez of InfoAdvisors suggested careful consideration of your established best practices in order to figure out what tradeoffs you’re making. And don’t make exceptions as you model, because those exceptions might become the rule.
  • Bob Muglia predicted that knowledge graphs are the future of data modeling and gave another crisp and succinct definition, this time for metrics: a function applied to relationships between business entities.
  • Chad Sanderson explained why “the knowledge layer” should represent the real world as closely as possible, and how the way FAANG companies govern their data might not be for everyone.
  • Nick Handel, CEO of Transform, pointed out that self-service data analysis isn’t possible without a semantic layer.
  • Francois Scharffe, Knowledge Graph Conference Chair, said that automated decision making will be dependent on knowledge graphs.
  • Ergest Xheblati, Lead Data Architect at EverQuote, stressed the importance of business literacy on data teams, and shared that he believes the two most important skills for data engineers are empathy and curiosity.

Leadership

  • Sarah Catanzaro said that simply putting data in the boardroom isn’t enough; your organization needs a data specialist in the room, too.
  • Dora Boussias of Stryker focused on clear communication and the importance of leadership setting the direction as you architect your data governance.
  • Luke Slotwinski, VP, Data & Analytics at Prologis, suggested organizations need to be continually asking “Why? Why? Why?”, and that executives should prioritize data literacy to add business value.
  • Steve Perry of Genius Sports said data professionals of all levels should lean into imposter syndrome and realize that nobody is really an expert.
  • Our own CEO and Co-Founder Brett Hurt shared his experience leading in an uncertain economy, and how data helps organizations be resilient in challenging times.
  • Ciaran Dynes, Chief Product Officer at Matillion, talked about the importance of surfacing information bias and creating a diverse culture within your organization.

Business Value

  • Peter Kapur, Head of Data Strategy & Governance, Data Quality, Product Management, and MDM at Waste Management, cautioned against a techno-centric approach to data management, and suggested a focus on the data’s value to your business.
  • Patricia Thaine, CEO at Private AI, suggested that privacy is a competitive advantage in the industry because customers appreciate it.
  • Sarah Krasnik hit on the importance of businesses establishing a “north star” to help pinpoint what data is most valuable.

Juan Sequeda and Catalog and Cocktails guest Luke Slotwinski

Data Governance

  • Laura Madsen of Moxy Analytics reiterated that the goal of agile data governance is iterative improvement, not immediate perfection.
  • Bob Muglia explained that governance is the bridge between the modern data stack and the future of knowledge graphs, and that data catalog/governance should be the first application in your data stack.
  • Nong Li, CEO at Okera, pointed out that establishing an effective data policy allows users to get data faster while also diminishing risk.
  • Shane Gibson of AgileData.io opined that agile data governance is all about identifying beneficial patterns and then experimenting to improve them.

Data Mesh

  • Dora Boussias reminded us that the principles of data mesh have existed for years; what’s new is combining them. She also confirmed that centralized governance is a bad best practice.
  • Omar Khawaja, Head of BI at Roche, emphasized that establishing a data mesh does not start with technology, but with people and process; technology is the enabler. He also said the definition of federated computational governance was “going nuts… in an organized manner.”
  • Shane Gibson shared his thoughts on the importance of documenting your data supply chain.
  • Chad Sanderson hit on the importance of establishing contracts and SLAs between your data engineers and data consumers to ensure each understands what the final product will be.

A knowledge-packed season comes to an end!

But have no fear; Catalog and Cocktails will return! The first episode of season four will be recorded August 24, live at the Gartner Data & Analytics Summit in Orlando, Florida!

We hope to see you there!

 
