Prerequisites

Before you begin, you’ll need:

  • An existing dbt project with well-structured data models
  • A Buster account (get started at buster.so)

If you use SQLMesh, AtScale, or Cube, please contact us for personalized assistance. If you don’t use a data modeling tool, follow our manual onboarding process instead.

Step 1: Install the CLI

brew tap buster-so/buster
brew install buster

Don’t use Homebrew? See the installation guide for other options.

Step 2: Authenticate with Buster

Before you can use Buster, you’ll need to authenticate with your Buster account:

buster auth

This command will prompt you for an API key. You can find it in the Buster platform.

If you run Buster locally, use the --local flag to authenticate with your local Buster instance instead.

Step 3: Initialize Your Project & Connect Data Source

Let’s initialize your Buster project and connect to your data source in one step using the buster init command:

buster init

This interactive command:

  1. Asks you to select your data warehouse type (Postgres, BigQuery, Snowflake, etc.)
  2. Prompts for connection details like host, port, credentials, etc.
  3. Tests the connection to ensure everything works
  4. Detects dbt project configurations if present
    • Recognizes your dbt project structure
    • Finds model paths automatically
    • Discovers your dbt catalog for semantic model generation
  5. Creates a buster.yml file with your project configuration:

projects:
  - data_source_name: demo_db
    schema: analytics
    database: buster
    model_paths:
      - models
    semantic_model_paths:
      - models

The command handles both project initialization and data source onboarding in a single workflow, so no separate connection step is needed.

See our Data Sources guide for specific database connection instructions and our Init Command documentation for more details on the setup process.
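As a quick sanity check on the generated file, the sketch below looks for the keys shown in the example buster.yml above. It is only an illustration: the list of expected keys is taken from that example and is an assumption, not Buster’s authoritative schema, and the helper name missing_keys is hypothetical. The config is written as a Python dict here; in practice you would load it from buster.yml with a YAML parser.

```python
# Keys taken from the example buster.yml above (an assumption, not an official schema).
EXPECTED_KEYS = ("data_source_name", "schema", "database", "model_paths")


def missing_keys(config: dict) -> dict[str, list[str]]:
    """Return {project label: [missing keys]} for every incomplete project entry."""
    problems = {}
    for index, project in enumerate(config.get("projects", [])):
        missing = [key for key in EXPECTED_KEYS if key not in project]
        if missing:
            # Fall back to the list position when the project has no name yet.
            label = project.get("data_source_name", f"projects[{index}]")
            problems[label] = missing
    return problems


# The same structure as the example file, expressed as a dict.
config = {
    "projects": [
        {
            "data_source_name": "demo_db",
            "schema": "analytics",
            "database": "buster",
            "model_paths": ["models"],
            "semantic_model_paths": ["models"],
        }
    ]
}

problems = missing_keys(config)
print(problems or "all expected keys present")
```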

Step 4: (Optional) Copy the AI Agent Documentation

This documentation guides AI agents (such as Claude, GPT, or Cursor) when they work with Buster’s semantic layer and configuration files. If you plan to use AI agents to help build or manage your Buster project, it is worth reviewing these guidelines first.

See the AI Agent Documentation.

Step 5: Create Semantic Models

The buster init command (Step 3) can generate your initial set of semantic models. If you skipped that part, or need base semantic models for dbt models added after initialization, run the buster generate command. It analyzes your dbt project’s catalog to create the foundational semantic model files.

buster generate
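To see what a dbt catalog contains before generating, the sketch below lists the columns recorded for each model. It assumes the standard dbt artifact layout (a catalog.json written by dbt docs generate, by default under target/, with a top-level "nodes" mapping). How Buster itself reads the catalog is not described here, and the helper names are hypothetical.

```python
import json
from pathlib import Path


def load_catalog(path: str = "target/catalog.json") -> dict:
    """Read a dbt catalog artifact from disk (the default path is dbt's usual location)."""
    return json.loads(Path(path).read_text())


def columns_by_model(catalog: dict) -> dict[str, list[str]]:
    """Map each dbt model name to a sorted list of its column names."""
    result = {}
    for unique_id, node in catalog.get("nodes", {}).items():
        if not unique_id.startswith("model."):
            continue  # skip seeds, snapshots, and other non-model nodes
        name = node["metadata"]["name"]
        result[name] = sorted(node.get("columns", {}))
    return result


# A tiny hand-written catalog fragment, used instead of a real file.
sample_catalog = {
    "nodes": {
        "model.demo.customers": {
            "metadata": {"name": "customers"},
            "columns": {"id": {}, "email": {}, "country": {}},
        },
        "seed.demo.country_codes": {
            "metadata": {"name": "country_codes"},
            "columns": {"code": {}},
        },
    }
}

for model, columns in columns_by_model(sample_catalog).items():
    print(model, columns)
```

With a real project, columns_by_model(load_catalog()) gives the same overview for every model dbt has cataloged.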

Buster analyzes your SQL models (like the examples below for models/orders.sql and models/customers.sql) and creates semantic model files for each one:

-- Example model: models/orders.sql
WITH base_orders AS (
    SELECT * FROM {{ ref('stg_orders') }} -- Assuming stg_orders has order_id, customer_id, order_date, amount
),
customers AS (
    SELECT id AS customer_id, user_created_at FROM {{ ref('customers') }} -- Use customers model
)
SELECT
    bo.order_id,
    bo.customer_id,
    bo.order_date,
    bo.amount AS order_amount,
    (bo.order_date <= DATE_ADD(c.user_created_at, INTERVAL 30 DAY)) AS ordered_within_30_days_of_signup -- DATE_ADD syntax is dialect-specific; adjust for your warehouse
FROM
    base_orders bo
LEFT JOIN
    customers c ON bo.customer_id = c.customer_id

-- Example model: models/customers.sql
SELECT
    id,                     -- Primary key for customers
    name,                   -- Customer's full name
    email,                  -- Customer's email address
    user_created_at,        -- Timestamp of account creation
    EXTRACT(YEAR FROM user_created_at) AS signup_year, -- Year of account creation
    country                 -- Customer's country
FROM
    {{ ref('stg_customers') }} -- Assumes stg_customers provides these base columns
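The two derived columns in these example models (ordered_within_30_days_of_signup and signup_year) can be read as simple date arithmetic. The sketch below restates that logic in Python purely for illustration. The function names mirror the column names, and the None handling is an assumption that mimics the SQL NULL produced when the LEFT JOIN finds no matching customer.

```python
from datetime import datetime, timedelta
from typing import Optional


def ordered_within_30_days_of_signup(
    order_date: datetime, user_created_at: Optional[datetime]
) -> Optional[bool]:
    """Mirror of: order_date <= DATE_ADD(user_created_at, INTERVAL 30 DAY)."""
    if user_created_at is None:
        return None  # no matching customer row, so the SQL expression yields NULL
    return order_date <= user_created_at + timedelta(days=30)


def signup_year(user_created_at: datetime) -> int:
    """Mirror of: EXTRACT(YEAR FROM user_created_at)."""
    return user_created_at.year


signup = datetime(2023, 1, 1)
print(ordered_within_30_days_of_signup(datetime(2023, 1, 31), signup))  # True (exactly day 30)
print(ordered_within_30_days_of_signup(datetime(2023, 2, 1), signup))   # False (day 31)
print(signup_year(signup))                                              # 2023
```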

These semantic model files add business context to your SQL models. Buster will create a separate YAML file for each model (for example, models/customers.yml shown below based on models/customers.sql):

# models/customers.yml
name: customers
description: >
  Represents individual customer entities. Contains contact information, registration details,
  location, and derived metrics like lifetime value. This is a central model for
  understanding customer attributes and behavior segments. Essential for joining customer
  data with transactions, support interactions, etc.
dimensions:
  - name: id
    description: >
      The unique identifier for a customer, typically corresponds to the primary key
      in the source customer table. Essential for joining with event models like orders.
    type: string
  - name: name
    description: The full name of the customer. Used for display and personalization.
    type: string
  - name: email
    description: >
      The customer's primary email address. Used for communication, login identification,
      and potentially linking across different systems. Assumed to be unique per customer.
    type: string
  - name: signup_year
    description: >
      The year the customer first registered or created their account. Useful for cohort
      analysis based on tenure (e.g., comparing behavior of 2022 vs 2023 signups).
    type: integer
  - name: user_created_at
    description: >
      The precise date and time (UTC recommended) when the customer account was created.
      Provides granular detail for time-based cohorting or analyzing initial user activity.
    type: timestamp
    searchable: true
  - name: country
    description: >
      The country where the customer is located, typically derived from registration
      or geo-IP lookup. Often stored as ISO 3166-1 alpha-2 code (e.g., 'US', 'GB'). Used for
      geographical segmentation and analysis.
    type: string
metrics:
  - name: lifetime_value
    description: >
      The total cumulative revenue generated from this customer across all their orders.
      Represents the historical monetary value of the customer. Assumes 'amount' is available from a joined 'orders' model.
    expr: "SUM(orders.amount)" # Requires 'orders' relationship
  - name: average_order_value_over_50
    description: >
      The average value of orders placed by this customer where the individual order amount
      exceeded $50. Requires joining with the 'orders' model via the 'orders' relationship.
    expr: "AVG(CASE WHEN orders.amount > 50 THEN orders.amount ELSE NULL END)" # Requires 'orders' relationship
relationships:
  - name: orders # Link TO orders model
    source_col: id           # Key in customers (current model)
    ref_col: customer_id    # Key in orders (related model)
    description: >
      Links this customer to all their associated order records in the 'orders' model.
      Enables analysis of purchasing history and behavior for each customer.
    cardinality: one-to-many # One customer can have many orders
    type: LEFT
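To make the metrics and the relationship above concrete, the sketch below runs the two metric expressions against a small in-memory SQLite database, joining customers to orders on id = customer_id with a LEFT join. It is an illustration of what the expressions compute, not the SQL that Buster generates. The sample rows are invented, and the orders table uses an amount column to match the metric expressions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE customers (id TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id TEXT, customer_id TEXT, amount REAL);

    -- Invented sample data: one customer with three orders, one with none.
    INSERT INTO customers VALUES ('c1', 'Ada'), ('c2', 'Grace');
    INSERT INTO orders VALUES
        ('o1', 'c1', 40.0),
        ('o2', 'c1', 60.0),
        ('o3', 'c1', 100.0);
    """
)

# The join follows the 'orders' relationship (source_col: id, ref_col: customer_id);
# the two aggregates are the metric expressions from customers.yml.
QUERY = """
    SELECT
        customers.id,
        SUM(orders.amount) AS lifetime_value,
        AVG(CASE WHEN orders.amount > 50 THEN orders.amount ELSE NULL END)
            AS average_order_value_over_50
    FROM customers
    LEFT JOIN orders ON customers.id = orders.customer_id
    GROUP BY customers.id
    ORDER BY customers.id
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
# ('c1', 200.0, 80.0)
# ('c2', None, None)
```

Because the join type is LEFT, the customer without orders still appears, with NULL (None in Python) for both metrics rather than being dropped from the result.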

Step 6: Deploy Your Semantic Layer

Once you’ve created and configured your semantic models, deploy them to make them available for querying:

buster deploy

This command:

  1. Validates your semantic models
  2. Deploys them to your Buster instance
  3. Makes them immediately available for querying

Step 7: Chat with Your Data

Congrats! You can now chat with your data in Buster: ask ad-hoc questions, generate reports and dashboards, and more.