> ## Documentation Index
> Fetch the complete documentation index at: https://camelai.com/docs/llms.txt
> Use this file to discover all available pages before exploring further.

# Reference Query Guide

> Learn how to create and use Reference Queries to teach camelAI your business metrics and SQL patterns

<Warning>
  **camelAI Legacy Product** — This documentation covers camelAI's embedded analytics offering, which is no longer being actively developed. We are migrating existing customers to the new camelAI platform. For the current product, visit [camelAI](https://camelai.dev).
</Warning>

Reference Queries are pre-defined SQL queries that teach camelAI how to calculate your business metrics correctly. They serve as building blocks that camelAI can reference, adapt, and combine to answer complex questions about your data.

## What are Reference Queries?

Reference Queries are SQL queries stored in a vectorized database that camelAI searches when processing user questions. Unlike the Knowledge Base which stores textual context, Reference Queries provide actual SQL implementations of your business logic.

## Persistent vs Session-Specific Reference Queries

CamelAI supports two types of reference queries, each designed for different use cases:

### Persistent (Stateful) Queries

Persistent queries are created through the `/api/v1/reference-queries/` API endpoint and are tied to your connection IDs. These queries:

* Persist across all iframes that use the associated connection ID
* Apply globally to all users and sessions
* Are ideal for metrics and patterns that apply universally across your organization

Use persistent queries for:

* Standard business metrics and KPIs
* Dashboard queries used by all users
* Common table joins and relationships
* Shared calculation patterns

### Session-Specific (Stateless) Queries

Session-specific queries are provided directly in the iframe creation request via the `reference_queries` parameter. These queries:

* Only apply to that specific iframe instance
* Do not persist beyond the iframe's lifecycle
* Work alongside any persistent queries you've already created

Use session-specific queries for:

* User-specific data filters (e.g., queries scoped to a specific organization or user)
* Temporary reference data for specific sessions
* Custom calculations for individual iframes
* Multi-tenant scenarios where each tenant needs unique query patterns

### Example: Using Session-Specific Reference Queries

When creating an iframe, you can include temporary reference queries that apply only to that session:

```python theme={null}
import requests

payload = {
    "uid": "<string>",
    "srcs": ["<string>"],
    "ttl": 900,
    "reference_queries": [
        {
            "title": "Recent Companies",
            "query": "SELECT * FROM companies ORDER BY batch DESC LIMIT 500;"
        },
        {
            "title": "Company Status Summary",
            "query": "SELECT status, COUNT(*) AS company_count FROM companies GROUP BY status ORDER BY company_count DESC;"
        }
    ],
    "model": "gpt-5",
    "response_mode": "full",
    "show_sidebar": True
}

response = requests.post(
    "https://api.camelai.com/api/v1/iframe/create",
    headers={
        "Authorization": "Bearer <token>",
        "Content-Type": "application/json"
    },
    json=payload
)
```

These session-specific queries complement (not replace) any persistent reference queries associated with your connection IDs.

## Why Use Reference Queries?

Reference Queries help camelAI:

<CardGroup cols={2}>
  <Card title="Standardize Metrics" icon="chart-line">
    Ensure consistent metric calculations across all analyses
  </Card>

  <Card title="Navigate Schema" icon="sitemap">
    Handle complex schema relationships through example joins
  </Card>

  <Card title="Replicate Logic" icon="copy">
    Maintain consistency with existing dashboard logic
  </Card>

  <Card title="Business Rules" icon="rules">
    Handle intricate business rules that require specific SQL patterns
  </Card>
</CardGroup>

* **Accelerate query generation** by providing proven templates

## When to Use Reference Queries vs Knowledge Base

<Tabs>
  <Tab title="Reference Queries">
    **Use Reference Queries for:**

    * Complex SQL logic with multiple joins
    * Specific metric calculations
    * Dashboard query replication
    * Table relationship examples
  </Tab>

  <Tab title="Knowledge Base">
    **Use Knowledge Base for:**

    * General context and definitions
    * Information that can't be expressed as queries
    * Critical business rules that must always be considered
    * Data formatting preferences
  </Tab>
</Tabs>

## Best Practices

### 1. Use SQL Comments for Context

Comments in your SQL queries are included when we vectorize them, improving search accuracy:

```sql theme={null}
-- Title: "Monthly Recurring Revenue (MRR)"
SELECT 
    DATE_TRUNC('month', subscription_date) as month,
    -- Convert all subscription types to monthly equivalent
    SUM(CASE 
        WHEN billing_period = 'monthly' THEN amount
        WHEN billing_period = 'yearly' THEN amount / 12
    END) as mrr
FROM subscriptions
WHERE status = 'active'  -- Only active subscriptions count toward MRR
GROUP BY 1;
```

<Info>
  The comments help camelAI understand not just what the query does, but why certain logic is applied.
</Info>

### 2. Replicate Dashboard Queries

If your users already have dashboards, add those queries as references to ensure consistency:

```sql theme={null}
-- Title: "Customer Churn Rate"
SELECT 
    COUNT(CASE WHEN last_order < CURRENT_DATE - 90 THEN 1 END) / 
    COUNT(*) * 100 as churn_rate
FROM customers;
```

### 3. Document Complex Joins

Show camelAI how your tables relate through example queries:

```sql theme={null}
-- Title: "Customer Order Summary"
SELECT c.email, COUNT(o.order_id) as order_count, SUM(o.total) as lifetime_value
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed'
GROUP BY c.email;
```

### 4. Include Business Logic

Capture complex business rules in your reference queries:

```sql theme={null}
-- Title: "Active Users"
-- Active = logged in within 30 days AND has valid subscription
SELECT COUNT(DISTINCT user_id) as active_users
FROM users
WHERE last_login >= CURRENT_DATE - 30
  AND subscription_status = 'active';
```

### 5. Title your Reference Queries

Titles of queries are optional fields, but we strongly recommend creating a descriptive, to the point title. It strongly improves RAG performance.

<Tip>
  Use language your users will use to ask questions.
</Tip>

**Example:** If your dashboard shows "Customer Acquisition Cost", title your reference query exactly that way, not "CAC Calculation" or "Marketing Efficiency Query".

**Good Titles:**

* "Weekly Active Users"
* "Customer Churn Rate"
* "Average Order Value by Region"
* "Product Performance Metrics"

**Poor Titles:**

* "Query\_1"
* "complicated\_join\_v2"
* "SELECT statement for users"

## Ideas for Getting Started

<Steps>
  <Step title="Start with Core Metrics">
    Add queries for your 5-10 most important metrics or user questions
  </Step>

  <Step title="Add Dashboard Queries">
    Include all queries powering existing dashboards
  </Step>

  <Step title="Document Relationships">
    Add example joins between commonly used tables
  </Step>

  <Step title="Include Edge Cases">
    Add queries handling special business logic
  </Step>

  <Step title="Iterate Based on Usage">
    Monitor what users ask and add missing patterns
  </Step>
</Steps>
