Best Practices
Reference Query Guide
Learn how to create and use Reference Queries to teach camelAI your business metrics and SQL patterns
Reference Queries are pre-defined SQL queries that teach camelAI how to calculate your business metrics correctly. They serve as building blocks that camelAI can reference, adapt, and combine to answer complex questions about your data.
What are Reference Queries?
Reference Queries are SQL queries stored in a vectorized database that camelAI searches when processing user questions. Unlike the Knowledge Base which stores textual context, Reference Queries provide actual SQL implementations of your business logic.
Persistent vs Session-Specific Reference Queries
CamelAI supports two types of reference queries, each designed for different use cases:
Persistent (Stateful) Queries
Persistent queries are created through the /api/v1/reference-queries/ API endpoint and are tied to your connection IDs. These queries:
- Persist across all iframes that use the associated connection ID
- Apply globally to all users and sessions
- Are ideal for metrics and patterns that apply universally across your organization
Use persistent queries for:
- Standard business metrics and KPIs
- Dashboard queries used by all users
- Common table joins and relationships
- Shared calculation patterns
Session-Specific (Stateless) Queries
Session-specific queries are provided directly in the iframe creation request via the reference_queries parameter. These queries:
- Only apply to that specific iframe instance
- Do not persist beyond the iframe's lifecycle
- Work alongside any persistent queries you've already created
Use session-specific queries for:
- User-specific data filters (e.g., queries scoped to a specific organization or user)
- Temporary reference data for specific sessions
- Custom calculations for individual iframes
- Multi-tenant scenarios where each tenant needs unique query patterns
Example: Using Session-Specific Reference Queries
When creating an iframe, you can include temporary reference queries that apply only to that session:
import requests
payload = {
"uid": "<string>",
"srcs": ["<string>"],
"ttl": 900,
"reference_queries": [
{
"title": "Recent Companies",
"query": "SELECT * FROM companies ORDER BY batch DESC LIMIT 500;"
},
{
"title": "Company Status Summary",
"query": "SELECT status, COUNT(*) AS company_count FROM companies GROUP BY status ORDER BY company_count DESC;"
}
],
"model": "gpt-5",
"response_mode": "full",
"show_sidebar": True
}
response = requests.post(
"https://api.camelai.com/api/v1/iframe/create",
headers={
"Authorization": "Bearer <token>",
"Content-Type": "application/json"
},
json=payload
)
These session-specific queries complement (not replace) any persistent reference queries associated with your connection IDs.
Why Use Reference Queries?
Reference Queries help camelAI:
Ensure consistent metric calculations across all analyses
Handle complex schema relationships through example joins
Maintain consistency with existing dashboard logic
Handle intricate business rules that require specific SQL patterns
- Accelerate query generation by providing proven templates
When to Use Reference Queries vs Knowledge Base
Reference Queries
Use Reference Queries for:
- Complex SQL logic with multiple joins
- Specific metric calculations
- Dashboard query replication
- Table relationship examples
Knowledge Base
Use Knowledge Base for:
- General context and definitions
- Information that can't be expressed as queries
- Critical business rules that must always be considered
- Data formatting preferences
Best Practices
1. Use SQL Comments for Context
Comments in your SQL queries are included when we vectorize them, improving search accuracy:
-- Title: "Monthly Recurring Revenue (MRR)"
SELECT
DATE_TRUNC('month', subscription_date) as month,
-- Convert all subscription types to monthly equivalent
SUM(CASE
WHEN billing_period = 'monthly' THEN amount
WHEN billing_period = 'yearly' THEN amount / 12
END) as mrr
FROM subscriptions
WHERE status = 'active' -- Only active subscriptions count toward MRR
GROUP BY 1;
2. Replicate Dashboard Queries
If your users already have dashboards, add those queries as references to ensure consistency:
-- Title: "Customer Churn Rate"
SELECT
COUNT(CASE WHEN last_order < CURRENT_DATE - 90 THEN 1 END) /
COUNT(*) * 100 as churn_rate
FROM customers;
3. Document Complex Joins
Show camelAI how your tables relate through example queries:
-- Title: "Customer Order Summary"
SELECT c.email, COUNT(o.order_id) as order_count, SUM(o.total) as lifetime_value
FROM customers c
JOIN orders o ON c.customer_id = o.customer_id
WHERE o.status = 'completed'
GROUP BY c.email;
4. Include Business Logic
Capture complex business rules in your reference queries:
-- Title: "Active Users"
-- Active = logged in within 30 days AND has valid subscription
SELECT COUNT(DISTINCT user_id) as active_users
FROM users
WHERE last_login >= CURRENT_DATE - 30
AND subscription_status = 'active';
5. Title your Reference Queries
Titles of queries are optional fields, but we strongly recommend creating a descriptive, to the point title. It strongly improves RAG performance.
Example: If your dashboard shows "Customer Acquisition Cost", title your reference query exactly that way, not "CAC Calculation" or "Marketing Efficiency Query".
Good Titles:
- "Weekly Active Users"
- "Customer Churn Rate"
- "Average Order Value by Region"
- "Product Performance Metrics"
Poor Titles:
- "Query_1"
- "complicated_join_v2"
- "SELECT statement for users"
Ideas for Getting Started
- Start with Core Metrics
Add queries for your 5-10 most important metrics or user questions
- Add Dashboard Queries
Include all queries powering existing dashboards
- Document Relationships
Add example joins between commonly used tables
- Include Edge Cases
Add queries handling special business logic
- Iterate Based on Usage
Monitor what users ask and add missing patterns
