# Knowledge Base Guide

> Learn how to configure and optimize your Knowledge Base for better camelAI performance

<Warning>
  **camelAI Legacy Product** — This documentation covers camelAI's embedded analytics offering, which is no longer being actively developed. We are migrating existing customers to the new camelAI platform. For the current product, visit [camelAI](https://camelai.dev).
</Warning>

The Knowledge Base is a critical feature that enhances camelAI's ability to understand and analyze your data accurately. It provides context-specific information that helps camelAI deliver consistent, relevant insights tailored to your business domain.

## What is the Knowledge Base?

The knowledge base is a text area where you define important context about your data, business logic, and terminology. This information helps camelAI:

* Maintain consistent metric definitions across all queries
* Navigate complex schemas by understanding which tables to prioritize or avoid
* Interpret ambiguous column names and relationships
* Apply proper data formatting and display preferences
* Handle time periods and date calculations correctly
* Understand locale-specific requirements (currency, language, regional formats)

## Persistent vs Session-Specific Knowledge Base Entries

camelAI supports two types of knowledge base entries, each designed for a different use case:

### Persistent (Stateful) Entries

Persistent entries are created through the `/api/v1/knowledge-base/` API endpoint and are tied to your connection IDs. These entries:

* Persist across all iframes that use the associated connection ID
* Apply globally to all users and sessions
* Are ideal for context that applies universally across your organization

Use persistent entries for:

* Dataset descriptions and schema information
* Company-wide terminology and metric definitions
* Standard table relationships and joins
* Data quality notes that affect all users
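
A minimal sketch of how a persistent entry might be created from Python. Only the `/api/v1/knowledge-base/` path comes from this guide. The host and bearer-token header are assumed to match the iframe example on this page, and the JSON field names `connection_id` and `content` are hypothetical placeholders, so check the API reference for the actual request schema.

```python
API_BASE = "https://api.camelai.com"  # assumed: same host as the iframe example


def persistent_entry_request(token, connection_id, content):
    """Build keyword arguments for requests.post(); nothing is sent here.

    The JSON field names "connection_id" and "content" are hypothetical
    placeholders, not the documented request schema.
    """
    return {
        "url": f"{API_BASE}/api/v1/knowledge-base/",
        "headers": {"Authorization": f"Bearer {token}"},
        "json": {"connection_id": connection_id, "content": content},
    }


kwargs = persistent_entry_request(
    token="<token>",
    connection_id="<connection-id>",
    content="This dataset is a replica of our production e-commerce database.",
)
# To send it (requires the requests package): requests.post(**kwargs, timeout=10)
```

Building the arguments separately from sending them keeps the hypothetical field names in one place, which makes them easy to correct once you have confirmed the schema.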

### Session-Specific (Stateless) Entries

Session-specific entries are provided directly in the iframe creation request via the `knowledge_base_entries` parameter. These entries:

* Only apply to that specific iframe instance
* Do not persist beyond the iframe's lifecycle
* Work alongside any persistent entries you've already created

Use session-specific entries for:

* User-specific instructions (e.g., "This user prefers non-technical explanations")
* Organization-specific context when serving multiple tenants
* Temporary overrides or custom behavior for specific sessions
* Locale preferences (e.g., "Please respond in Spanish")

### Example: Using Session-Specific Entries

When creating an iframe, you can include temporary knowledge base entries that apply only to that session:

```python
import requests

payload = {
    "uid": "<string>",
    "srcs": ["<string>"],
    "ttl": 900,
    # Session-specific entries: these apply only to this iframe instance.
    "knowledge_base_entries": [
        "This user is new to the tool and has never used SQL before. Please keep answers non-technical.",
        "Please respond in Spanish."
    ],
    "model": "gpt-5",
    "response_mode": "full",
    "show_sidebar": True
}

response = requests.post(
    "https://api.camelai.com/api/v1/iframe/create",
    headers={
        "Authorization": "Bearer <token>",
        "Content-Type": "application/json"
    },
    json=payload
)
# Fail loudly on a 4xx/5xx response instead of continuing silently.
response.raise_for_status()
```

These session-specific entries complement, rather than replace, any persistent knowledge base entries associated with your connection IDs.
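
When you serve many users or tenants, the session-specific list is usually assembled per request. The sketch below shows one way to do that. The helper names and their arguments are illustrative, not part of camelAI. Only the payload keys `uid`, `srcs`, `ttl`, and `knowledge_base_entries` come from the example above.

```python
def build_session_entries(language=None, non_technical=False, tenant_notes=()):
    """Assemble the per-session knowledge base entries for one iframe."""
    entries = []
    if non_technical:
        entries.append("This user prefers non-technical explanations.")
    if language:
        entries.append(f"Please respond in {language}.")
    # Tenant-specific context goes last, one focused entry per note.
    entries.extend(tenant_notes)
    return entries


def build_iframe_payload(uid, srcs, entries, ttl=900):
    """Return the JSON body for the iframe creation request."""
    return {
        "uid": uid,
        "srcs": list(srcs),
        "ttl": ttl,
        "knowledge_base_entries": list(entries),
    }


entries = build_session_entries(
    language="Spanish",
    non_technical=True,
    tenant_notes=["Only discuss data belonging to the Acme tenant."],
)
payload = build_iframe_payload("<string>", ["<string>"], entries)
```

The resulting `payload` can be passed as `json=payload` to the same `requests.post` call shown above.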

## Best Practices

### 1. Always Include a Dataset Description

Every knowledge base should start with a clear description of what your dataset represents. This foundational context helps camelAI understand the overall purpose and structure of your data.

<CodeGroup>
  ```text Example
  This dataset is a replica of our production e-commerce database. 
  It contains customer orders, product inventory, and shipping information 
  from our online retail platform serving the US market.
  ```
</CodeGroup>

### 2. Specify Standard Schemas

If your data follows a well-known schema or is a replica of a standard system, explicitly state this. camelAI can leverage its understanding of common schemas to provide better insights.

<Note>
  **Examples of standard schemas to mention:**

  * "This PostgreSQL database mirrors our Salesforce CRM data structure"
  * "Our MySQL database follows the Shopify schema for e-commerce data"
  * "This dataset implements the FHIR standard for healthcare records"
  * "Our analytics tables follow the Google Analytics 4 event schema"
</Note>

### 3. Define Company-Specific Terminology

Document any terms that have specific meanings within your organization, especially when they differ from industry standards or could be ambiguous.

<Warning>
  We recommend splitting terminology into multiple focused entries, one term per entry, to improve RAG performance.
</Warning>

<CodeGroup>
  ```text Example
  Entry 1: "Active User": A user who has logged in within the last 30 days AND completed at least one transaction
  Entry 2: "LTV" (Lifetime Value): The sum of total_purchases + subscription_revenue + addon_revenue columns
  Entry 3: "Churn": When a customer has no activity for 90+ days (not the standard 30-day definition)
  Entry 4: "Region": Refers to our custom sales territories, not geographic regions (see regions_mapping table)
  ```
</CodeGroup>
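
If your glossary already lives in code or a spreadsheet, generating one entry per term is straightforward. This is a minimal sketch. The `GLOSSARY` dictionary and `glossary_to_entries` helper are illustrative names, and the definitions are copied from the example above.

```python
GLOSSARY = {
    "Active User": (
        "A user who has logged in within the last 30 days "
        "AND completed at least one transaction"
    ),
    "Churn": (
        "When a customer has no activity for 90+ days "
        "(not the standard 30-day definition)"
    ),
}


def glossary_to_entries(glossary):
    """Turn each term into its own focused knowledge base entry."""
    return [f'"{term}": {definition}' for term, definition in glossary.items()]


entries = glossary_to_entries(GLOSSARY)
```

Each string in `entries` can then be submitted as a separate knowledge base entry instead of one combined glossary block.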

### 4. Clarify Complex Relationships

Help camelAI navigate joins and relationships by explaining non-obvious connections between tables.

<CodeGroup>
  ```text Example
  Table Relationships:
  - orders.user_id links to users.id (primary relationship)
  - orders.promo_code links to both promotions.code AND partner_promotions.code
  - product_variants should be used instead of products table for inventory queries
  - Always join transactions through transaction_items, never directly to orders
  ```
</CodeGroup>

### 5. Specify Data Preferences

Include preferences for how data should be formatted, calculated, or displayed.

<CodeGroup>
  ```text Example
  Data Handling Preferences:
  - When calculating percentages, round to 1 decimal place
  - Week starts on Monday for all weekly aggregations
  - Fiscal year begins April 1st
  ```
</CodeGroup>
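
Preferences like these work best when they can be stated as exact rules. The sketch below spells out the three example preferences in Python so you can spot-check results against them. The function names are illustrative and not part of camelAI.

```python
from datetime import date, timedelta


def percent(part, whole):
    """Percentage rounded to 1 decimal place."""
    return round(100 * part / whole, 1)


def week_start(day):
    """Monday of the week containing `day` (date.weekday() is 0 on Monday)."""
    return day - timedelta(days=day.weekday())


def fiscal_year_start(day):
    """First day of the fiscal year containing `day`, for an April 1st start."""
    start_year = day.year if day.month >= 4 else day.year - 1
    return date(start_year, 4, 1)
```

For example, `fiscal_year_start(date(2024, 3, 31))` falls in the fiscal year that began on April 1st, 2023, while April 1st, 2024 opens the next one. If your organization labels fiscal years by their ending calendar year, say so explicitly in the entry.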

### 6. Document Data Quality Issues

Be transparent about known data limitations or quality issues to prevent misleading analyses.

<CodeGroup>
  ```text Example
  Data Quality Notes:
  - Revenue data before March 2022 may be incomplete due to migration
  - The user_demographics table has ~15% missing values for the age field
  - Avoid using the legacy_orders table - use orders_v2 instead
  - Product categories were restructured in June 2023; use category_mapping for historical comparisons
  ```
</CodeGroup>

## Structuring Knowledge Base Entries

### Use Multiple Focused Entries

Because the Knowledge Base is retrieved through RAG (retrieval-augmented generation), several small, focused entries perform better than one large entry.

<Tabs>
  <Tab title="✅ Good Practice">
    ```text
    Entry 1: "Customer segments: Premium (>$1000/year), Standard ($100-999), Basic (<$100)"
    Entry 2: "Subscription tiers: Starter ($29/mo), Professional ($99/mo), Enterprise (custom)"
    Entry 3: "Geographic regions: NA (US/Canada), EU (European Union), APAC (Asia-Pacific)"
    ```
  </Tab>

  <Tab title="❌ Poor Practice">
    ```text
    Customer and Business Context: Our customer segments include Premium customers who spend over $1000 per year, Standard customers who spend $100-999 annually,
    and Basic customers under $100. We also have subscription tiers with Starter at $29/month, Professional at $99/month, and Enterprise with custom pricing. 
    Our geographic regions cover NA which includes US and Canada, EU covering the European Union, and APAC for Asia-Pacific. Additionally, our fiscal year 
    starts April 1st, we calculate LTV as total_purchases plus subscription_revenue plus addon_revenue, and active users must have logged in within 30 days AND
    completed a transaction. Our churn definition is 90+ days of inactivity, and regions refer to sales territories not geographic areas.
    ```
  </Tab>
</Tabs>
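
To see why focused entries tend to retrieve better, consider the toy illustration below. It is not camelAI's retrieval implementation: a simple word-overlap (Jaccard) score stands in for whatever similarity measure the real system uses. The intuition carries over, though. Unrelated text in a large entry dilutes its match against a specific question.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric words in `text`."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def overlap_score(query, entry):
    """Jaccard similarity between the word sets of a query and an entry."""
    q, e = tokens(query), tokens(entry)
    union = q | e
    return len(q & e) / len(union) if union else 0.0


focused = [
    "Customer segments: Premium (>$1000/year), Standard ($100-999), Basic (<$100)",
    "Subscription tiers: Starter ($29/mo), Professional ($99/mo), Enterprise (custom)",
    "Geographic regions: NA (US/Canada), EU (European Union), APAC (Asia-Pacific)",
]
# The same content merged into one large entry.
monolithic = " ".join(focused)

query = "Which subscription tiers do we offer?"
best_focused = max(focused, key=lambda entry: overlap_score(query, entry))

# The focused subscription entry matches the question more strongly
# than the merged entry that contains the very same sentence.
focused_wins = overlap_score(query, best_focused) > overlap_score(query, monolithic)
```

Here `best_focused` is the subscription-tiers entry, and `focused_wins` is `True` because the merged entry carries two unrelated topics that lower its score.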

## Managing Your Knowledge Base

You can create, read, update, and delete knowledge base entries through the API or the developer console. Changes take effect immediately for all new conversations.
