Prerequisites
Before you begin, you’ll need:
- An existing dbt project with well-structured data models
- A Buster account (get started at buster.so)
If you use SQLMesh, AtScale, or Cube, please contact us for personalized assistance. If you don’t use any data modeling tool, you’ll need to follow our manual onboarding process.
Step 1: Install the CLI
brew tap buster-so/buster
brew install buster
Step 2: Authenticate with Buster
Before you can use Buster, you’ll need to authenticate with your Buster account:
This command will prompt you for an API key. You can find it in the Buster platform.
If you are using Buster locally, you can use the --local flag to authenticate with your local Buster instance.
Step 3: Initialize Your Project & Connect Data Source
Let’s initialize your Buster project and connect to your data source in one step using the buster init command:
This interactive command:
- Asks you to select your data warehouse type (Postgres, BigQuery, Snowflake, etc.)
- Prompts for connection details such as host, port, and credentials
- Tests the connection to ensure everything works
- Detects dbt project configurations if present
  - Recognizes your dbt project structure
  - Finds model paths automatically
  - Discovers your dbt catalog for semantic model generation
- Creates a buster.yml file with your project configuration:
projects:
  - data_source_name: demo_db
    schema: analytics
    database: buster
    model_paths:
      - models
    semantic_model_paths:
      - models
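As a rough illustration of the configuration shape shown above, the following Python sketch checks a parsed buster.yml for the fields that appear in the example. It is not part of the Buster CLI. The function name check_buster_config is hypothetical, and treating data_source_name, schema, and database as required is an assumption made for this example. The config is written as a Python dict (what a YAML parser would return) so the sketch has no third-party dependencies.

```python
# Hypothetical helper: sanity-check the shape of a parsed buster.yml.
# Which keys are mandatory is an assumption based on the example above.
REQUIRED_KEYS = ("data_source_name", "schema", "database")
LIST_KEYS = ("model_paths", "semantic_model_paths")


def check_buster_config(config: dict) -> list:
    """Return a list of human-readable problems; an empty list means the shape looks fine."""
    projects = config.get("projects")
    if not isinstance(projects, list) or not projects:
        return ["'projects' must be a non-empty list"]

    problems = []
    for i, project in enumerate(projects):
        if not isinstance(project, dict):
            problems.append(f"projects[{i}] must be a mapping")
            continue
        for key in REQUIRED_KEYS:
            value = project.get(key)
            if not isinstance(value, str) or not value:
                problems.append(f"projects[{i}].{key} is missing or empty")
        for key in LIST_KEYS:
            paths = project.get(key, [])
            if not isinstance(paths, list) or not all(isinstance(p, str) for p in paths):
                problems.append(f"projects[{i}].{key} must be a list of strings")
    return problems


# The example configuration from above, as a YAML parser would load it.
config = {
    "projects": [
        {
            "data_source_name": "demo_db",
            "schema": "analytics",
            "database": "buster",
            "model_paths": ["models"],
            "semantic_model_paths": ["models"],
        }
    ]
}

print(check_buster_config(config))  # [] for the example configuration
```

A check like this can catch indentation mistakes in a hand-edited file, since mis-indented YAML usually shows up as a missing or wrongly typed key.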
The command handles both project initialization and data source onboarding in a single workflow.
See our Data Sources guide for specific database connection instructions and our Init Command documentation for more details on the setup process.
Step 4: (Optional) Copy the AI Agent Documentation
This documentation provides comprehensive guidance for AI agents (like Claude, GPT, Cursor, etc.) when working with Buster’s semantic layer and configuration files. If you plan to use AI agents to help you build or manage your Buster project, you may find it useful to familiarize yourself with these guidelines.
You can find the AI Agent Documentation here: AI Agent Documentation.
Step 5: Create Semantic Models
The buster init command (from Step 3) can generate your initial set of semantic models. However, if you skipped that part or need to generate base semantic models for new dbt models added after initialization, you can use the buster generate command. This command analyzes your dbt project’s catalog to create the foundational semantic model files.
Buster analyzes your SQL models (like the examples below for models/orders.sql and models/customers.sql) and creates semantic model files for each one:
-- Example model: models/orders.sql
WITH base_orders AS (
    SELECT * FROM {{ ref('stg_orders') }} -- Assuming stg_orders has order_id, customer_id, order_date, amount
),
customers AS (
    SELECT id AS customer_id, user_created_at FROM {{ ref('customers') }} -- Use customers model
)
SELECT
    bo.order_id,
    bo.customer_id,
    bo.order_date,
    bo.amount AS order_amount,
    (bo.order_date <= DATE_ADD(c.user_created_at, INTERVAL 30 DAY)) AS ordered_within_30_days_of_signup -- Use DATE_ADD if needed
FROM
    base_orders bo
LEFT JOIN
    customers c ON bo.customer_id = c.customer_id
-- Example model: models/customers.sql
SELECT
    id,                                                 -- Primary key for customers
    name,                                               -- Customer's full name
    email,                                              -- Customer's email address
    user_created_at,                                    -- Timestamp of account creation
    EXTRACT(YEAR FROM user_created_at) AS signup_year,  -- Year of account creation
    country                                             -- Customer's country
FROM
    {{ ref('stg_customers') }} -- Assumes stg_customers provides these base columns
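To make the derived column in models/orders.sql concrete, here is a small Python sketch of the same comparison. It is an illustration only, not something Buster generates. The function name ordered_within_30_days is hypothetical, and returning None for an order with no matching customer mirrors the way a SQL comparison against a NULL from the LEFT JOIN evaluates to NULL.

```python
from datetime import datetime, timedelta
from typing import Optional


def ordered_within_30_days(
    order_date: datetime, user_created_at: Optional[datetime]
) -> Optional[bool]:
    """Mirror of: order_date <= DATE_ADD(user_created_at, INTERVAL 30 DAY)."""
    if user_created_at is None:
        # The LEFT JOIN found no customer row, so the SQL expression is NULL.
        return None
    return order_date <= user_created_at + timedelta(days=30)


signup = datetime(2023, 1, 1)

# The boundary is inclusive: an order exactly 30 days after signup counts.
print(ordered_within_30_days(datetime(2023, 1, 31), signup))  # True
print(ordered_within_30_days(datetime(2023, 2, 1), signup))   # False
print(ordered_within_30_days(datetime(2023, 1, 15), None))    # None
```

The third case is worth keeping in mind when describing this column in a semantic model: the flag has three possible values, not two, unless every order is guaranteed to have a customer.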
These semantic model files add business context to your SQL models. Buster will create a separate YAML file for each model (for example, models/customers.yml, shown below, based on models/customers.sql):
# models/customers.yml
name: customers
description: >
  Represents individual customer entities. Contains contact information, registration details,
  location, and derived metrics like lifetime value. This is a central model for
  understanding customer attributes and behavior segments. Essential for joining customer
  data with transactions, support interactions, etc.
dimensions:
  - name: id
    description: >
      The unique identifier for a customer, typically corresponds to the primary key
      in the source customer table. Essential for joining with event models like orders.
    type: string
  - name: name
    description: The full name of the customer. Used for display and personalization.
    type: string
  - name: email
    description: >
      The customer's primary email address. Used for communication, login identification,
      and potentially linking across different systems. Assumed to be unique per customer.
    type: string
  - name: signup_year
    description: >
      The year the customer first registered or created their account. Useful for cohort
      analysis based on tenure (e.g., comparing behavior of 2022 vs 2023 signups).
    type: integer
  - name: user_created_at
    description: >
      The precise date and time (UTC recommended) when the customer account was created.
      Provides granular detail for time-based cohorting or analyzing initial user activity.
    type: timestamp
    searchable: true
  - name: country
    description: >
      The country where the customer is located, typically derived from registration
      or geo-IP lookup. Often stored as ISO 3166-1 alpha-2 code (e.g., 'US', 'GB'). Used for
      geographical segmentation and analysis.
    type: string
metrics:
  - name: lifetime_value
    description: >
      The total cumulative revenue generated from this customer across all their orders.
      Represents the historical monetary value of the customer. Assumes 'amount' is available from a joined 'orders' model.
    expr: "SUM(orders.amount)" # Requires 'orders' relationship
  - name: average_order_value_over_50
    description: >
      The average value of orders placed by this customer where the individual order amount
      exceeded $50. Requires joining with the 'orders' model via the 'orders' relationship.
    expr: "AVG(CASE WHEN orders.amount > 50 THEN orders.amount ELSE NULL END)" # Requires 'orders' relationship
relationships:
  - name: orders # Link TO orders model
    source_col: id # Key in customers (current model)
    ref_col: customer_id # Key in orders (related model)
    description: >
      Links this customer to all their associated order records in the 'orders' model.
      Enables analysis of purchasing history and behavior for each customer.
    cardinality: one-to-many # One customer can have many orders
    type: LEFT
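The comments in the file above note that both metrics require the 'orders' relationship. A consistency check along those lines can be sketched in Python as follows. This is not Buster's own validation. The function name check_model is hypothetical, the model is written as a trimmed Python dict in place of the parsed YAML, and the regular expression is a rough heuristic that treats any identifier followed by a dot in an expr as a reference to another model.

```python
import re

# Trimmed version of models/customers.yml as a YAML parser would load it
# (descriptions and types omitted for brevity).
model = {
    "name": "customers",
    "dimensions": [
        {"name": n}
        for n in ("id", "name", "email", "signup_year", "user_created_at", "country")
    ],
    "metrics": [
        {"name": "lifetime_value", "expr": "SUM(orders.amount)"},
        {
            "name": "average_order_value_over_50",
            "expr": "AVG(CASE WHEN orders.amount > 50 THEN orders.amount ELSE NULL END)",
        },
    ],
    "relationships": [
        {"name": "orders", "source_col": "id", "ref_col": "customer_id"},
    ],
}


def check_model(model: dict) -> list:
    """Return a list of inconsistencies between metrics, dimensions, and relationships."""
    problems = []
    dimension_names = {d["name"] for d in model.get("dimensions", [])}
    relationships = model.get("relationships", [])
    relationship_names = {r["name"] for r in relationships}

    # Each relationship should join on a column this model actually exposes.
    for rel in relationships:
        if rel["source_col"] not in dimension_names:
            problems.append(
                f"relationship '{rel['name']}': source_col '{rel['source_col']}' is not a dimension"
            )

    # Heuristic: "orders.amount" refers to a model named "orders".
    for metric in model.get("metrics", []):
        referenced = sorted(set(re.findall(r"\b([A-Za-z_]\w*)\.", metric["expr"])))
        for other in referenced:
            if other != model["name"] and other not in relationship_names:
                problems.append(
                    f"metric '{metric['name']}' references '{other}' without a relationship"
                )
    return problems


print(check_model(model))  # [] for the example model
```

Removing the relationships entry from the dict makes the function report one problem per metric, which is the situation the "Requires 'orders' relationship" comments warn about.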
Step 6: Deploy Your Semantic Layer
Once you’ve created and configured your semantic models, deploy them to make them available for querying:
This command:
- Validates your semantic models
- Deploys them to your Buster instance
- Makes them immediately available for querying
Step 7: Chat with Your Data
Congrats! You can now chat with your data in Buster. You can ask ad-hoc questions, generate reports and dashboards, and more.