Are you ready to revolutionize your data strategy and unlock the full potential of AI in your organization?
The AI Context Engine™ is a SaaS product comprising an ecosystem of Enterprise AI Agents that enables generative AI applications built on Large Language Models (LLMs) to generate more accurate, explainable answers from structured data, such as data housed in a Snowflake data warehouse. The first key capability helps teams build applications that answer natural language questions against structured data by producing a query that uses the context of your business and shows its work or “evidence” for the query and answer produced.
The AI Context Engine offers greater accuracy, explainability, and governance, providing answers that are more reliable. This is possible because the AI Context Engine is built on a data catalog constructed as a knowledge graph, utilizing semantic technologies such as RDF, OWL, and SPARQL. Whatever your organization stores in the data catalog, whether data assets, decisions, processes, or anything else important to your business, is mapped and modeled in the knowledge graph in machine-readable form, so LLMs can capture the same context and knowledge about your data that your teams possess.
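To make the knowledge-graph idea concrete, here is a minimal Python sketch of catalog knowledge stored as subject-predicate-object triples and looked up by pattern, loosely in the spirit of RDF and SPARQL. It is an illustration only, not data.world's data model or API: the `CATALOG` contents, the `ex:` predicates, the table names, and the helper functions are all hypothetical.

```python
from typing import List, Optional, Tuple

Triple = Tuple[str, str, str]

# Hypothetical catalog facts as (subject, predicate, object) triples.
CATALOG: List[Triple] = [
    ("concept:Customer", "rdf:type", "owl:Class"),
    ("concept:Customer", "rdfs:label", "Customer"),
    ("concept:Customer", "ex:mappedToTable", "SALES.PUBLIC.CUSTOMERS"),
    ("concept:Order", "rdf:type", "owl:Class"),
    ("concept:Order", "rdfs:label", "Order"),
    ("concept:Order", "ex:mappedToTable", "SALES.PUBLIC.ORDERS"),
    ("concept:Order", "ex:placedBy", "concept:Customer"),
]


def match(
    triples: List[Triple],
    s: Optional[str] = None,
    p: Optional[str] = None,
    o: Optional[str] = None,
) -> List[Triple]:
    """Return the triples that fit the pattern; None acts as a wildcard."""
    return [
        t
        for t in triples
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    ]


def tables_for_label(triples: List[Triple], label: str) -> List[str]:
    """Resolve a business term to the tables its concept is mapped to."""
    concepts = [s for s, _, _ in match(triples, p="rdfs:label", o=label)]
    return [
        o
        for concept in concepts
        for _, _, o in match(triples, s=concept, p="ex:mappedToTable")
    ]
```

With this toy graph, a question that mentions "Order" can be resolved to a physical table through the glossary label instead of by guessing at column names, which is the kind of context a raw database schema does not carry.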
The AI Context Engine works as a layer between the data.world data catalog and your preferred user interface for GenAI, whether you’re using Slack and Teams or building a custom application or platform, so your teams can chat with your structured data (see our Slack and Teams SDKs to get started).
API First: The AI Context Engine is an API-first product, meant to be integrated into the tools users already work with and to deliver value there.
Ecosystem of Agents: The AI Context Engine is not a single application. It is a growing ecosystem of agents that will continue to gain new capabilities such as assisting in the process of building out the semantic layer that powers the AI Context Engine product and even recommending and helping to manage governed and approved answers to ensure compliance. The foundation of this functionality enables the product to innovate and meet customers’ needs and adapt to advances in GenAI more quickly.
Can Scale with Generative AI: The AI Context Engine is LLM-independent. data.world uses a self-hosted, open-source model designed to ensure privacy and security, but the AI Context Engine is not limited to a single LLM. As generative AI continues to evolve at a pace faster than any technology before it, the AI Context Engine is designed to take advantage of newer models.
Built On a Knowledge Graph Foundation: The AI Context Engine is built on the data.world governance and catalog platform, currently the only knowledge-graph-driven data catalog built on semantic standards.
One of the biggest differentiators of the AI Context Engine is trust, specifically greater accuracy, explainability and governance. This is only possible through knowledge graphs, as our research papers have proven and the industry has validated.
Increase accuracy of AI-powered insights.
For structured data, GenAI that’s backed by a knowledge graph is 4.2x more accurate than if it relies solely on the underlying database.
Enhance trust in AI with explainability.
Gain visibility into an LLM’s “black box” by viewing the business concepts, glossary terms, queries, mappings and data sources used to generate each response.
Streamline and scale governance for GenAI.
Identify, approve, and reuse popular questions and answers to streamline analysis while reducing AI risks.
The catalog is where the "context" for the AI Context Engine comes from. The function of the AI Context Engine is to answer questions using your data, and it does that by consuming your metadata. To chat with your data, with trust, at scale, you need the AI Context Engine and its ability to read the catalog and use it as the roadmap to the data. "Explainability" means that we direct you back to the catalog for the audit trail on your answers.
To use the AI Context Engine, you need the data catalog platform, although it is possible to get started with the AI Context Engine with a very minimal data catalog setup. The minimum setup is a single virtualized data source and enough of a business glossary to define key terms the system needs to answer questions about a single domain.
Knowledge Tokens (KTs) are the billing units for the AI Context Engine. They are used to measure and bill AI Context Engine usage, with each action consuming a certain number of KTs based on the complexity and resource requirements of that action. The AI Context Engine operates on a consumption-based SaaS model similar to Snowflake, Google Cloud, or Amazon Web Services, meaning clients pre-purchase Knowledge Tokens and consume them as they use the AI Context Engine. This model offers flexibility, scalability, and predictability.
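The pre-purchase-and-consume model can be pictured as a simple ledger. The Python sketch below is an assumption-laden illustration, not data.world's billing implementation: the action names and per-action costs in `ACTION_COSTS` are invented for the example, since actual Knowledge Token pricing is not stated here.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

# Hypothetical per-action costs in Knowledge Tokens (not real pricing).
ACTION_COSTS = {"simple_question": 1, "complex_question": 5}


class InsufficientTokens(Exception):
    """Raised when an action costs more than the remaining balance."""


@dataclass
class TokenLedger:
    balance: int  # pre-purchased Knowledge Tokens still available
    history: List[Tuple[str, int]] = field(default_factory=list)

    def consume(self, action: str) -> int:
        """Deduct the cost of an action and return the remaining balance."""
        cost = ACTION_COSTS[action]
        if cost > self.balance:
            raise InsufficientTokens(
                f"{action} needs {cost} KTs, only {self.balance} left"
            )
        self.balance -= cost
        self.history.append((action, cost))
        return self.balance
```

Because every action is recorded with its cost, the same ledger that enforces the balance also provides the usage history needed to forecast future purchases.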
Data security and privacy are top priorities for data.world and the AI Context Engine. We have security measures and policies in place designed to protect our customers’ data. For detailed information on our security protocols and privacy safeguards, please refer to our dedicated security and privacy documents.
The AI Context Engine leverages security tokens that can either be assigned to a service account or a named user. All permissions assigned to the account or user are then automatically enforced by the AI Context Engine through the existing data catalog security.
Privacy Policy
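The token-based enforcement described above can be sketched as a lookup from a security token to the account or user it belongs to, followed by a filter on what that principal may read. This Python sketch is hypothetical throughout: the token strings, the `Principal` type, and the dataset names are invented for illustration and do not reflect data.world's actual security model or API.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List


@dataclass(frozen=True)
class Principal:
    name: str
    kind: str  # "service_account" or "user"
    readable_datasets: FrozenSet[str]


# Hypothetical token registry: each token resolves to exactly one principal.
TOKENS: Dict[str, Principal] = {
    "tok-svc-123": Principal(
        "analytics-bot", "service_account", frozenset({"sales"})
    ),
    "tok-usr-456": Principal("jane", "user", frozenset({"sales", "hr"})),
}


def authorized_datasets(token: str, requested: List[str]) -> List[str]:
    """Keep only the requested datasets the token's principal may read."""
    principal = TOKENS.get(token)
    if principal is None:
        raise PermissionError("unknown token")
    return [d for d in requested if d in principal.readable_datasets]
```

The point of the design is that the question-answering layer adds no permissions of its own: whatever the catalog already allows for the account or user is all a request made with that token can reach.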
The AI Context Engine is a general-purpose technology applicable across industries. Any company with structured data can benefit from the AI Context Engine. This includes sectors like finance, healthcare, retail, manufacturing, technology, education, government, and energy. data.world’s goal is to understand your specific data challenges and goals and then demonstrate how the AI Context Engine's ability to unlock insights from structured data can address those unique needs.
The implementation time for the AI Context Engine varies based on the complexity of the use case and the readiness of the customer's data infrastructure. The AI Context Engine emphasizes a "start small and grow" approach, allowing for quicker initial implementation and iterative expansion. Initial setup for a basic implementation can typically be completed in a matter of weeks or months (depending on the complexity of the implementation), focusing on establishing a strong foundation with core concepts. Additional concepts and relationships can be added over time, as the understanding of the business language deepens and new requirements emerge.
data.world offers professional implementation services, directly or through our partners, to help organizations adopt our platform efficiently. These services are designed to help clients integrate data.world into their existing data ecosystems, enabling agile data governance, improved data discovery, and more effective data collaboration across teams. The professional services team, ours or a partner’s, works closely with each organization to customize the implementation, training, and support to meet the organization’s unique needs and objectives.
The primary requirement for implementing the AI Context Engine is the implementation of the data.world data catalog platform, which may be done before or during the AI Context Engine implementation. This involves collecting and cataloging the assets and business concepts in the customer’s environment to create a foundational knowledge graph that the AI Context Engine will build upon. The AI Context Engine is designed to work with a variety of data sources and systems, and specific connectors or integrations may be required depending on the customer’s existing data infrastructure.
The AI Context Engine offers integration capabilities at two levels: via the data.world data catalog, which provides access to a well-organized data foundation, and directly through the AI Context Engine API, designed for use with any programming language and optimized for advanced AI/ML scenarios. The AI Context Engine also provides ready-to-use starter kits for common communication platforms like Slack, Microsoft Teams, and Streamlit, offering quick user experiences with minimal configuration.
Data virtualization is not strictly required to benefit from the AI Context Engine, but it is included (limited to certain tiers of connectors) and is often used for data questions. Data virtualization and federated query features are part of the AI Context Engine's capabilities, making it easier to run SQL queries against source data and retrieve results.
The AI Context Engine is highly customizable. Because it is based on a knowledge graph, and its “schema” or ontology is defined in a highly expressive language (OWL), nearly any complex business concept or relationship can be modeled. Additionally, the R2RML mappings allow data to be reshaped to fit the true language of the business, which is represented as Concepts, Attributes, and Relationships.
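As a rough picture of what such a mapping does, the Python sketch below turns a relational row with terse column names into triples that use business-friendly Concept and Attribute names. It mirrors the general shape of an R2RML triples map (a subject template plus column-to-predicate pairs) but is not R2RML syntax or data.world's mapping format. The `MAPPING` structure, column names, and `row_to_triples` helper are hypothetical.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical mapping: how one source table is reshaped into a Concept.
MAPPING: Dict[str, Any] = {
    "subject_template": "customer/{CUST_ID}",  # builds the row's identifier
    "concept": "Customer",                     # business Concept for each row
    "attributes": {                            # source column -> Attribute
        "CUST_NM": "name",
        "RGN_CD": "region",
    },
}


def row_to_triples(
    row: Dict[str, Any], mapping: Dict[str, Any]
) -> List[Tuple[str, str, str]]:
    """Convert one relational row into (subject, predicate, object) triples."""
    subject = mapping["subject_template"].format(**row)
    triples = [(subject, "type", mapping["concept"])]
    for column, attribute in mapping["attributes"].items():
        value = row.get(column)
        if value is not None:  # NULL columns produce no triple
            triples.append((subject, attribute, str(value)))
    return triples
```

The source schema stays untouched. Only the mapping changes when the business vocabulary does, which is why new Concepts and Attributes can be added incrementally.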
Yes. It can do this using data.world views. data.world views are aware of the current user using a function we have available to developers. Using this, the appropriate masking can be applied using the data.world security model and then be directly mapped using the R2RML. When the AI Context Engine queries the data, the rules of the view will be applied.
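The view-based masking described above can be pictured with a short Python sketch. It is an illustration only, not data.world's view syntax or the developer function mentioned: `masked_view`, `MASK`, and the user and column names are all hypothetical.

```python
from typing import Any, Dict, List, Set

MASK = "***"


def masked_view(
    rows: List[Dict[str, Any]],
    current_user: str,
    unmasked_users: Set[str],
    sensitive_columns: Set[str],
) -> List[Dict[str, Any]]:
    """Return the rows as the given user is allowed to see them."""
    if current_user in unmasked_users:
        return [dict(row) for row in rows]  # copies, values left intact
    return [
        {col: (MASK if col in sensitive_columns else val)
         for col, val in row.items()}
        for row in rows
    ]
```

Because masking happens inside the view, anything that queries through the view, including an automated question-answering layer, receives only the values the current user is permitted to see.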
Yes.
We encourage human review of the AI Context Engine™’s output as the “human in the loop,” but the product does not currently support a “human in the loop” feedback cycle to improve answers.
The AI Context Engine builds enhanced trust through:
Increase accuracy of AI-powered insights.
For structured data, GenAI that’s backed by a knowledge graph is 4.2x more accurate than if it relies solely on the underlying database.
Enhance trust in AI with explainability.
Gain visibility into an LLM’s “black box” by viewing the queries and data sources used to generate each response.
Streamline and scale governance for GenAI.
Identify, approve, and reuse popular questions and answers to streamline analysis while reducing AI risks.
While the AI Context Engine provides powerful insights, over-reliance on any AI system can pose risks such as reduced critical thinking, misinterpretation of results, overlooking data quality issues, confirmation bias, and neglecting non-quantifiable factors. To mitigate these risks, it is recommended to regularly train on the AI Context Engine’s capabilities and limitations, encourage a culture of critical thinking, implement processes for human validation of key decisions, and continuously monitor and improve data quality.
The AI Context Engine is dependent on the quality of the underlying source data. It can also be used to expose quality issues by asking probing questions. We do not train our models on any of our customers’ data; a generally trained model powers the AI Context Engine.
Significant changes in data structure or business rules can impact the AI Context Engine's performance, but the system is designed to be adaptable. Ontology and mapping updates can reflect new data structures or business rules. The AI Context Engine maintains version control of its ontology and models, allowing for easy rollback if needed.
The AI Context Engine coupled with the data.world data catalog can revolutionize decision-making cultures by democratizing data access, enhancing speed and agility, fostering cross-departmental collaboration, encouraging a culture of curiosity and continuous learning, reducing bias through objective data-driven insights, promoting proactive problem-solving, and fostering continuous improvement.
The AI Context Engine primarily focuses on structured data. However, since it is an API, it can be integrated with other AI solutions that focus on unstructured data such as Snowflake Cortex Search and Document.ai.
The AI Context Engine enhances accuracy and reliability through its knowledge graph, which provides evidence and governance for questions and responses. This interconnected system results in greater explainability and trustworthiness in the generated content.
Data privacy measures are managed by the source systems themselves and are outside of data.world. The AI Context Engine leverages these existing measures for data security and privacy.
The AI Context Engine uses a self-hosted LLM running on US-based servers. Prompts are processed in the US AI environment, which is a stateless service ensuring no long-term data storage. Plans to extend connectivity to EU-based production environments are in progress to support European customers.
Single-tenant customers will have secure connectivity set up between their environment and the AI environment. Their prompts and responses will traverse this boundary. Dedicated LLMs are not feasible due to cost, so a shared AI environment is used.