AI Context Engine™ FAQ

Introduction to AI Context Engine™

What is the AI Context Engine™?

The AI Context Engine™ is a SaaS product comprising an ecosystem of Enterprise AI Agents that enables generative AI applications built on Large Language Models (LLMs) to generate more accurate, explainable answers from structured data, such as data housed in a Snowflake data warehouse. The first key capability helps teams build applications that answer natural language questions against structured data by producing a query that uses the context of your business and shows its work or “evidence” for the query and answer produced.

The AI Context Engine offers greater accuracy, explainability, and governance, which makes its answers more reliable. This is possible because the AI Context Engine is built on a data catalog constructed as a knowledge graph, using semantic technologies such as RDF, OWL, and SPARQL. Whatever your organization stores in the data catalog, whether data assets, decisions, processes, or anything else important to your business, is mapped and modeled in the knowledge graph and is machine-readable, so LLMs can capture the same context and knowledge about your data that your teams possess.

The AI Context Engine works as a layer between the data.world data catalog and your preferred user interface for GenAI, whether you’re using Slack or Teams or building a custom application or platform, so your teams can chat with your structured data (see our Slack and Teams SDKs to get started).

How does the AI Context Engine™ differ from other AI-powered data solutions in the market?

API First: The AI Context Engine is an API-first product, meant to be integrated into the tools users already work with and to deliver value there.
Ecosystem of Agents: The AI Context Engine is not a single application. It is a growing ecosystem of agents that will continue to gain new capabilities such as assisting in the process of building out the semantic layer that powers the AI Context Engine product and even recommending and helping to manage governed and approved answers to ensure compliance. The foundation of this functionality enables the product to innovate and meet customers’ needs and adapt to advances in GenAI more quickly.
Can Scale with Generative AI: The AI Context Engine is LLM-independent. data.world uses a self-hosted, open-source model designed to ensure privacy and security. Because the AI Context Engine is not limited to a single LLM, it is designed to take advantage of new models as generative AI continues to evolve.
Built on a Knowledge Graph Foundation: The AI Context Engine is built on the data.world governance and catalog platform, currently the only knowledge-graph-driven data catalog built on semantic standards.

One of the biggest differentiators of the AI Context Engine is trust, specifically greater accuracy, explainability and governance. This is only possible through knowledge graphs, as our research papers have proven and the industry has validated.

What are the key benefits of using the AI Context Engine™ for our customers?

Increase accuracy of AI-powered insights.
- For structured data, GenAI that’s backed by a knowledge graph is 4.2x more accurate than if it relies solely on the underlying database.
Enhance trust in AI with explainability.
- Gain visibility into an LLM’s “black box” by viewing the business concepts, glossary terms, queries, mappings and data sources used to generate each response.
Streamline and scale governance for GenAI.
- Identify, approve, and reuse popular questions and answers to streamline analysis while reducing AI risks.

The AI Context Engine™ and the data.world Catalog

How does the AI Context Engine™ interact with the catalog?

The catalog is where the "context" for the AI Context Engine comes from. The AI Context Engine answers questions using your data, and it does that by consuming your metadata. To chat with your data, with trust and at scale, you need the AI Context Engine and its ability to read the catalog and use it as the roadmap to the data. "Explainability" means that we direct you back to the catalog for the audit trail on your answers.

Is the data.world Catalog required to use the AI Context Engine™?

To use the AI Context Engine, you need the data catalog platform, although it is possible to get started with the AI Context Engine with a very minimal data catalog setup. The minimum setup is a single virtualized data source and enough of a business glossary to define key terms the system needs to answer questions about a single domain.

AI Context Engine™ technical details and implementation

Can you explain "Knowledge Tokens" in simple terms?

Knowledge Tokens (KTs) are the billing units for the AI Context Engine. They measure and bill for the AI Context Engine usage, with each action consuming a certain number of KTs based on the complexity and resource requirements of that action. The AI Context Engine operates on a consumption-based SaaS model similar to Snowflake, Google Cloud, or Amazon Web Services, meaning clients pre-purchase Knowledge Tokens and consume them as they use the AI Context Engine. This model offers flexibility, scalability, and predictability.
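The pre-purchase-and-consume model described above can be pictured as a simple ledger. The following is only a minimal sketch: the action names and per-action costs are hypothetical placeholders, since this FAQ does not state actual Knowledge Token rates, and `KnowledgeTokenLedger` is an illustrative name rather than part of any data.world product.

```python
from dataclasses import dataclass, field

# Hypothetical per-action KT costs, for illustration only.
# Real rates are set by data.world and are not stated in this FAQ.
ACTION_COSTS = {
    "simple_question": 1,
    "complex_question": 5,
}


@dataclass
class KnowledgeTokenLedger:
    """Tracks a pre-purchased KT balance as actions consume it."""

    balance: int
    history: list = field(default_factory=list)

    def consume(self, action: str) -> int:
        """Deduct the cost of one action and return the remaining balance."""
        cost = ACTION_COSTS[action]
        if cost > self.balance:
            raise ValueError(f"insufficient Knowledge Tokens for {action!r}")
        self.balance -= cost
        self.history.append((action, cost))
        return self.balance


ledger = KnowledgeTokenLedger(balance=100)  # tokens purchased up front
ledger.consume("simple_question")
ledger.consume("complex_question")
print(ledger.balance)  # 94
```

More complex actions draw down the balance faster, which is the sense in which usage is metered by complexity and resource requirements.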

How does the AI Context Engine™ handle data security and privacy?

Data security and privacy are top priorities for data.world and the AI Context Engine. We have security measures and policies in place designed to protect our customers’ data. For detailed information on our security protocols and privacy safeguards, please refer to our dedicated security and privacy documents.

The AI Context Engine leverages security tokens that can either be assigned to a service account or a named user. All permissions assigned to the account or user are then automatically enforced by the AI Context Engine through the existing data catalog security.

Privacy Policy

data.world Security Page

What kind of companies or industries would benefit most from the AI Context Engine™?

The AI Context Engine is a general-purpose technology applicable across industries. Any company with structured data can benefit from the AI Context Engine. This includes sectors like finance, healthcare, retail, manufacturing, technology, education, government, and energy. data.world’s goal is to understand your specific data challenges and goals and then demonstrate how the AI Context Engine's ability to unlock insights from structured data can address those unique needs.

How long does it typically take to implement the AI Context Engine™?

The implementation time for the AI Context Engine varies based on the complexity of the use case and the readiness of the customer's data infrastructure. The AI Context Engine emphasizes a "start small and grow" approach, allowing for quicker initial implementation and iterative expansion. Initial setup for a basic implementation can typically be completed in weeks to months, focusing on establishing a strong foundation with core concepts. Additional concepts and relationships can be added over time as the understanding of the business language deepens and new requirements emerge.

data.world offers professional implementation services, directly or through our partners, to help organizations adopt our platform efficiently. These services help clients integrate data.world into their existing data ecosystems to enable agile data governance, improved data discovery, and more effective data collaboration across teams. The professional services teams, whether ours or our partners’, work closely with each organization to customize the implementation, training, and support to meet the organization’s unique needs and objectives.

What technical requirements does a company need to have in place to use the AI Context Engine™?

The primary requirement for implementing the AI Context Engine is the data.world data catalog platform, which may be implemented before or during the AI Context Engine implementation. This involves collecting and cataloging the assets and business concepts in the customer’s environment to create a foundational knowledge graph that the AI Context Engine will build upon. The AI Context Engine is designed to work with a variety of data sources and systems, and specific connectors or integrations may be required depending on the customer’s existing data infrastructure.

How does the AI Context Engine™ integrate with existing data systems and tools?

The AI Context Engine offers integration capabilities at two levels: via the data.world data catalog, which provides access to a well-organized data foundation, and directly through the AI Context Engine API, designed for use with any programming language and optimized for advanced AI/ML scenarios. The AI Context Engine also provides ready-to-use starter kits for common platforms like Slack, Microsoft Teams, and Streamlit, offering quick user experiences with minimal configuration.
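To give a feel for what calling an HTTP API with a security token might look like, here is a minimal sketch in Python. Everything specific in it is an assumption made for illustration: the URL, the `question` payload field, the bearer-token header scheme, and the function name `build_question_request` are placeholders, not the documented AI Context Engine API. Consult the API documentation for the real endpoint and request shape. The request is only constructed, not sent.

```python
import json
import urllib.request

# Placeholder values: the URL and payload field are assumptions for
# illustration, not the documented AI Context Engine API.
API_URL = "https://example.com/ai-context-engine/ask"
API_TOKEN = "YOUR_SERVICE_ACCOUNT_TOKEN"


def build_question_request(question: str, token: str = API_TOKEN) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying a natural language question."""
    body = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            # Assumed header scheme; the token identifies a service account
            # or named user whose catalog permissions would then apply.
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


request = build_question_request("What were total sales last quarter?")
print(request.get_method(), request.full_url)

# Sending is left out on purpose; with a real endpoint it would be:
# with urllib.request.urlopen(request) as response:
#     answer = json.load(response)
```

Because the interaction is plain HTTP with a JSON body, the same pattern carries over to any language with an HTTP client.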

Is data virtualization required to use the AI Context Engine™?

Data virtualization is not strictly required to benefit from the AI Context Engine, but it is included (limited to certain tiers of connectors) and is often used for data questions. Data virtualization and federated query features are part of the AI Context Engine's capabilities, making it easier to run SQL queries against source data and retrieve results.

How customizable is the AI Context Engine™?

The AI Context Engine is highly customizable. Because it is based on a knowledge graph and its “schema,” or ontology, is defined in a highly expressive language (OWL), nearly any complex business concept or relationship can be modeled. Additionally, the R2RML mappings allow data to be reshaped to fit the true language of the business, which is represented as Concepts, Attributes, and Relationships.
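The idea of reshaping source data into Concepts, Attributes, and Relationships can be illustrated with a small Python sketch. This is a simplified stand-in for what a mapping does, not R2RML syntax and not data.world's implementation; the table columns, the `CUSTOMER_MAPPING` structure, and the `row_to_triples` helper are all hypothetical.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]

# Hypothetical mapping from cryptic column names to business vocabulary.
CUSTOMER_MAPPING = {
    "concept": "Customer",
    "subject_template": "customer/{cust_id}",
    # column -> attribute name in the business language
    "attributes": {"cust_nm": "name", "cust_seg_cd": "segment"},
    # column -> (relationship name, template for the related entity)
    "relationships": {"acct_mgr_id": ("managedBy", "employee/{acct_mgr_id}")},
}


def row_to_triples(row: Dict[str, str], mapping: dict) -> List[Triple]:
    """Turn one table row into (subject, predicate, object) triples."""
    subject = mapping["subject_template"].format(**row)
    triples = [(subject, "type", mapping["concept"])]
    for column, attribute in mapping["attributes"].items():
        triples.append((subject, attribute, row[column]))
    for column, (relationship, target_template) in mapping["relationships"].items():
        triples.append((subject, relationship, target_template.format(**row)))
    return triples


row = {"cust_id": "42", "cust_nm": "Acme Corp", "cust_seg_cd": "ENT", "acct_mgr_id": "7"}
for triple in row_to_triples(row, CUSTOMER_MAPPING):
    print(triple)
```

The point of the sketch is that the physical column names (`cust_nm`, `cust_seg_cd`) disappear from the output, leaving only business terms that a question-answering layer can reason over.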

Can the AI Context Engine™ support masked fields?

Yes, using data.world views. These views are aware of the current user through a function available to developers. With it, the appropriate masking can be applied using the data.world security model and then mapped directly using R2RML. When the AI Context Engine queries the data, the rules of the view are applied.
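Conceptually, a user-aware masking view behaves like the Python sketch below. It is an illustration of the idea only: the policy table, role names, and the `apply_masking_view` function are hypothetical, and the actual data.world function that exposes the current user is not named in this FAQ.

```python
from typing import Dict, Set

# Hypothetical policy: column -> role required to see the unmasked value.
MASKING_POLICY = {"ssn": "pii_reader", "salary": "hr_analyst"}


def apply_masking_view(row: Dict[str, str], user_roles: Set[str]) -> Dict[str, str]:
    """Return the row as the current user is allowed to see it."""
    masked = {}
    for column, value in row.items():
        required_role = MASKING_POLICY.get(column)
        if required_role is not None and required_role not in user_roles:
            masked[column] = "****"  # user lacks the role, so hide the value
        else:
            masked[column] = value
    return masked


row = {"name": "Ada", "ssn": "123-45-6789", "salary": "120000"}
print(apply_masking_view(row, {"hr_analyst"}))
```

Because the masking is applied at the view, any query issued on behalf of that user only ever sees the masked values.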

Can the AI Context Engine™ support nested data such as BigQuery Nested Arrays?

Yes.

Does the AI Context Engine™ support “human in the loop” enhancement?

We encourage human review of the AI Context Engine™’s output as the “human in the loop,” but the product does not currently support a “human in the loop” feedback cycle to improve answers.

How does the AI Context Engine™ foster a sense of trust and confidence in AI-driven insights?

The AI Context Engine builds trust in three ways:

Increase accuracy of AI-powered insights.
- For structured data, GenAI that’s backed by a knowledge graph is 4.2x more accurate than if it relies solely on the underlying database.
Enhance trust in AI with explainability.
- Gain visibility into an LLM’s “black box” by viewing the queries and data sources used to generate each response.
Streamline and scale governance for GenAI.
- Identify, approve, and reuse popular questions and answers to streamline analysis while reducing AI risks.

What are the potential risks of over-reliance on AI-generated insights from the AI Context Engine™?

While the AI Context Engine provides powerful insights, over-reliance on any AI system can pose risks such as reduced critical thinking, misinterpretation of results, overlooked data quality issues, confirmation bias, and neglect of non-quantifiable factors. To mitigate these risks, we recommend regularly training users on the AI Context Engine’s capabilities and limitations, encouraging a culture of critical thinking, implementing processes for human validation of key decisions, and continuously monitoring and improving data quality.

How does the AI Context Engine™ handle potential biases in the underlying data?

The AI Context Engine is dependent on the quality of the underlying source data. It can also be used to expose quality issues by asking probing questions. We do not train our models on any of our customers’ data and use a generally trained model to power the AI Context Engine.

Long-term success and adaptability with the AI Context Engine™

What are the implications if a customer's data structure or business rules change significantly post-implementation?

Significant changes in data structure or business rules can impact the AI Context Engine's performance, but the system is designed to be adaptable. Ontology and mapping updates can reflect new data structures or business rules. The AI Context Engine maintains version control of its ontology and models, allowing for easy rollback if needed.

How can the AI Context Engine™ potentially transform a company's decision-making culture?

The AI Context Engine coupled with the data.world data catalog can revolutionize decision-making cultures by democratizing data access, enhancing speed and agility, fostering cross-departmental collaboration, encouraging a culture of curiosity and continuous learning, reducing bias through objective data-driven insights, promoting proactive problem-solving, and fostering continuous improvement.

Technical performance and features for the AI Context Engine™

Does the AI Context Engine™ interact with unstructured data?

The AI Context Engine primarily focuses on structured data. However, because it is accessed through an API, it can be integrated with other AI solutions that focus on unstructured data, such as Snowflake Cortex Search and Document.ai.

How does the AI Context Engine™ enhance the accuracy and reliability of the content it generates?

The AI Context Engine enhances accuracy and reliability through its knowledge graph, which provides evidence and governance for questions and responses. This interconnected system results in greater explainability and trustworthiness in the generated content.

What are the data privacy measures in place for the AI Context Engine™ and data virtualization?

Data privacy measures are managed by the source systems themselves and are outside of data.world. The AI Context Engine leverages these existing measures for data security and privacy.

Environmental considerations and deployment for the AI Context Engine™

Where does the AI Context Engine™ host its LLM?

The AI Context Engine uses a self-hosted LLM running on US-based servers. Prompts are processed in the US AI environment, which is a stateless service ensuring no long-term data storage. Plans to extend connectivity to EU-based production environments are in progress to support European customers.

How does the AI Context Engine™ handle single-tenant customers?

Single-tenant customers will have secure connectivity set up between their environment and the AI environment. Their prompts and responses will traverse this boundary. Dedicated LLMs are not feasible due to cost, so a shared AI environment is used.
