In case you missed it…
On Thursday, April 7, we held the spring 2022 data.world summit, our live virtual event on data mesh, open data, knowledge-first, and so much more. Along with main-stage presentations like “Disrupting Data Governance,” from Moxy Analytics CEO Laura Madsen, we hosted a number of sessions focused on three breakout talk tracks:
- Knowledge First - A track for data leaders who know change is needed and want to make the leap from data-driven to knowledge first.
- Practitioner’s Paradise - A more technical discussion for practitioners seeking expert advice on data strategy and catalog implementation.
- Open Data For the Win - A showcase for the transformative power of open data in society, public projects, journalism, and the enterprise.
Here, we’re recapping the Knowledge First track.
Translating AI from Concept to Reality: Five Keys to Implementing AI for Knowledge, Content, and Data
Lulit Tesfaye, Partner and Division Director of Data and Information Management at Enterprise Knowledge, LLC, kicked off our discussion of “knowledge first” with her talk, “Translating AI from Concept to Reality: Five Keys to Implementing AI for Knowledge, Content, and Data.”
As Lulit began her presentation, she explained that, “In order for any advanced solution or machine to deliver value from data or information, it needs to first understand context, knowledge, and how a person would describe, use, or make decisions from that data.”
She also shared that Enterprise Knowledge customers are adopting AI to solve some of the oldest challenges in information management, including discoverability and the ability to use data.
Lulit went on to share the secret sauce for enterprise AI success, the foundational principles of knowledge management: “people, process, content, culture, and technology.”
Lulit’s five foundational principles for a successful implementation of enterprise AI are:
- The problem or use case - A clear statement of the problem AI will solve and a definition of its business value, one that reflects stakeholder interest.
- People and domain knowledge - Ready access to people and domain knowledge and the supporting organizational functions.
- Data organization and connectivity - Connecting data across multiple systems enables traceable analysis, allowing you to follow information across different sources, locations, and data types, and across time.
- Data enrichment - Enrich your content and data with domain knowledge and context via extraction of topics and text for taxonomy enrichment, auto-tagging, and classification of key concepts in your data. Engage SMEs and knowledge engineers to set the groundwork for human-in-the-loop development and for getting knowledge into a standard, machine-readable format.
- An explainable solution - Combining the principles of semantic modeling with knowledge management and AI, a rich semantic layer captures key business facts and solutions and is understandable, reusable, and interoperable.
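To make the data enrichment principle a little more concrete, here is a minimal sketch of taxonomy-based auto-tagging. It is our own illustration, not something shown in Lulit's talk: the `TAXONOMY` contents and the `auto_tag` function are hypothetical, and a real implementation would likely rely on a managed taxonomy and more robust text extraction.

```python
import re

# Hypothetical taxonomy: each concept maps to terms an SME might list for it.
TAXONOMY = {
    "customer": ["customer", "client", "account holder"],
    "invoice": ["invoice", "bill"],
}


def auto_tag(text, taxonomy):
    """Return the sorted taxonomy concepts whose terms appear in the text."""
    lowered = text.lower()
    tags = set()
    for concept, terms in taxonomy.items():
        for term in terms:
            # Whole-word match, so "bill" does not match "billing".
            if re.search(r"\b" + re.escape(term) + r"\b", lowered):
                tags.add(concept)
                break
    return sorted(tags)


print(auto_tag("The client disputed the bill.", TAXONOMY))
```

Keeping the taxonomy as plain data is one way to leave room for the human-in-the-loop step: SMEs can review and extend the term lists without touching the matching code.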
“A key success factor here is to understand that enterprise AI really goes beyond the data and platform teams,” she said. “It requires collaboration between data producers, consumers and the knowledge owners in order to provide consistent and reliable flow of data.”
And the data shared across your organization, said Lulit, is N.E.R.D-y: New, Essential, Reliable, and Dynamic.
Data and Knowledge Therapy
Next, it was time for a little therapy… a conversation about “Data and Knowledge Therapy,” between data.world Principal Scientist Juan Sequeda and Mohammed Aaser, Chief Data Officer (CDO) at Domo.
“We spend so much time talking, it's ended up being this therapy session that we have,” laughed Juan, explaining the pair’s friendship.
“Yeah,” agreed Mohammed. “I feel like we need one of those special chairs that someone can lie down on and then they can tell you all their problems. Except this time it can be data problems.”
They then dived into their conversation, with Juan posing the question, “How do we scale data usage and data access in an organization?”
Mohammed’s response: “I think we're going to start to see the emergence of a common knowledge layer.”
Juan and Mohammed also discussed how to avoid the redundant efforts that result from divergent definitions of data types, and how to instead reuse data analysis that’s already been performed, avoiding the scourge of re-engineering.
“When you have different teams, they sometimes create different (data) definitions. They re-engineer different things. And then you get this proliferation problem, right?” said Mohammed. “Now the question in my mind is, ‘How do you truly create a common layer of this knowledge layer, where you're capturing all of the key elements, whether it's the sales data, the service history, etc.’ And I think the holy grail is, instead of people having to re-engineer every single time, we actually have this dynamic layer that handles the data integration for us.”
Juan agreed, highlighting the importance of establishing organization-wide data definitions: “I think ‘the knowledge layer,’ ‘the semantic layer,’ ‘the ontology,’ all these words that we see, it's really just understanding what are those core business entities and how those business entities are related to each other.”
The pair also touched on the idea of “data as a product,” with Juan explaining that data product owners need to think about, “Who's responsible? What is this thing? What isn't it? What are the inputs? What are the outputs? What are the contracts and expectations around this? Who are the downstream consumers? How is this related to other things? We need to start bringing in that product thinking.”
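One way to picture that product thinking is as a simple record that has a slot for each of Juan's questions. The sketch below is our own illustration under that assumption: the `DataProduct` class, its field names, and the `unanswered` helper are hypothetical and were not presented in the session.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DataProduct:
    """Hypothetical descriptor with one field per product-thinking question."""
    name: str
    owner: str                    # Who's responsible?
    description: str              # What is this thing?
    out_of_scope: str             # What isn't it?
    inputs: List[str] = field(default_factory=list)
    outputs: List[str] = field(default_factory=list)
    expectations: List[str] = field(default_factory=list)  # contracts
    consumers: List[str] = field(default_factory=list)     # downstream users
    related_to: List[str] = field(default_factory=list)

    def unanswered(self) -> List[str]:
        """Return the names of fields that are still empty."""
        return [key for key, value in vars(self).items() if not value]


orders = DataProduct(
    name="orders",
    owner="sales-ops",
    description="One row per confirmed order",
    out_of_scope="",
)
print(orders.unanswered())
```

A check like `unanswered()` simply surfaces which questions a product owner has not yet addressed.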
The question of top-down versus bottom-up governance also entered the discussion, with Mohammed saying, “You have to start thinking about, how do you design your data teams to be able to truly take advantage?” and Juan agreeing, “You really need to find the balance between centralization and decentralization that best works for your organization, and for your culture. There is no one-size-fits-all.”
Juan and Mohammed ended their conversation acknowledging that the vast majority of businesspeople might not know much about data science or analysis, but can still benefit immensely from its use. “We’re in a little bit of a bubble where we talk about all these great new tools,” admitted Mohammed. “But the true diffusion of this innovation to the entire market is still extremely, extremely, extremely low.”
Summarized Juan, “I would say we need to go from this world of ‘data first’ to a ‘knowledge first’ world. But what I really mean is that a ‘knowledge first world’ is people.”
If you’re interested in hearing more from these two industry savants, Juan and Mohammed have teamed up to produce “Data and Knowledge Therapy” on LinkedIn, where they discuss all things data and knowledge, and separate hype or idealistic new technologies from those that actually deliver impact.
Designing and Building Enterprise Knowledge Graphs
Juan returned for another session, this time with Ora Lassila, Principal Graph Technologist at Amazon Web Services. Juan and Ora co-authored the book “Designing and Building Enterprise Knowledge Graphs,” a topic they discussed in depth during their conversation.
“One of the things that we've been talking about for so long is how just in the enterprise data is so messed up,” began Juan. “Frankly, we kind of need some sort of a reboot. We should be thinking about semantics, and knowledge, and meaning.”
“I often see people spending too much time on figuring out syntax, and they completely forget about semantics,” agreed Ora. “Syntax is a solved problem; it only distracts you from solving the real problems. Forget syntax, it doesn’t matter. Interpretation is the key.”
“We live in this world of ‘data first,’” continued Juan. “But why, why do we need all this data? In reality, we need to be able to interpret this. ‘What is this? What is the problem that I'm trying to solve? And what is the data that I need for that?’ Something needs to change. And for me, that change is that we need to start thinking about knowledge, meaning, and semantics.”
Ora pointed out that in order for people to gain knowledge, they need access to data. “But access doesn't mean just the physical bits; if you just give them the physical bits, that's the easy part. Without explaining what the data means, you have done really no service to these people. So ‘accessible data’ really means the physical bits plus semantics. And semantics means, ‘How do information systems interpret that data? How do they make use of that data in a meaningful manner?’”
Agreed Juan, “The people who are actually making decisions that are going to increase value to the organization, honestly, they don't care about how the data is moving. That’s just the means to the end. And the end is having questions answered and being able to make decisions.”
“I want people to realize that when we talk about a data catalog, it's actually more than just cataloging data. We need to be cataloging the actual knowledge,” he continued. “I want us to realize that we don't need to focus so much on tools, because the tools will come and go. But the really important concepts that are within the organization, they will always be there.”
“Until now, we've just focused on technology, and the expectation that we just need more technology and more data is driving us insane. What we need — the paradigm shift here — is to focus on the social aspects, the people, the process, and also on the knowledge. This is what we need to enter a knowledge first world.”
In case you missed it…
Missed the data.world spring summit? No sweat! You can watch every session on demand.