Generate Semantic Models

The buster generate command creates or updates semantic model YAML definitions from your dbt project, so you do not have to write and maintain those definitions by hand.

Basic Usage

buster generate

Command Options

Option                        Description
--path PATH                   Optional path to a specific dbt model .sql file or a directory of models to process. If not provided, uses the model_paths in buster.yml.
--output-file FILE, -o FILE   Optional path to the semantic model YAML file to update. If not provided, uses semantic_models_file from buster.yml.

How It Works

The generate command:

  1. Checks if a buster.yml file exists in your current directory
  2. Runs dbt docs generate to refresh the dbt catalog (if you confirm)
  3. Parses the catalog.json file to extract model metadata
  4. Creates dimensions and measures based on column data types (see Type Classification)
  5. Generates YAML semantic model files that are compatible with Buster
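The first two steps can be sketched in Python as follows. This is only an illustration, not Buster's implementation: the function names `has_buster_config` and `refresh_catalog` are hypothetical, and the sketch assumes dbt writes its catalog to the default `target/` directory.

```python
import subprocess
from pathlib import Path


def has_buster_config(project_dir: Path) -> bool:
    """Step 1: check that buster.yml exists in the given directory."""
    return (project_dir / "buster.yml").is_file()


def refresh_catalog(project_dir: Path, confirmed: bool) -> Path:
    """Step 2: optionally refresh the dbt catalog, then return its expected path.

    Assumes dbt's default target path ("target/"); a project that overrides
    target-path would need a different location.
    """
    if confirmed:
        # `dbt docs generate` rewrites catalog.json from the warehouse metadata.
        subprocess.run(["dbt", "docs", "generate"], cwd=project_dir, check=True)
    return project_dir / "target" / "catalog.json"
```

Passing `confirmed=False` skips the dbt call and reuses whatever catalog is already on disk, which mirrors the confirmation prompt described in step 2.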

Catalog Processing

The command processes your dbt catalog by:

  1. Finding all SQL files in specified model paths
  2. Matching each SQL file with its corresponding entry in the catalog
  3. Extracting metadata like column names, types, and descriptions
  4. Generating semantic model YAML content for each model
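As a rough sketch of the matching and extraction steps, the Python below pairs SQL files with entries in a parsed catalog.json. It relies on the dbt catalog layout (`nodes`, each with `metadata` and `columns`). The rule that a file's stem equals the model name, and the helper names themselves, are assumptions made for illustration; the actual command may match files differently.

```python
from pathlib import Path


def load_catalog_models(catalog: dict) -> dict:
    """Index catalog nodes by lower-cased model name, skipping non-model nodes."""
    models = {}
    for node_id, node in catalog.get("nodes", {}).items():
        if not node_id.startswith("model."):
            continue  # seeds, snapshots, and other node types are ignored
        models[node["metadata"]["name"].lower()] = node
    return models


def match_sql_files(sql_files, catalog: dict) -> dict:
    """Map each SQL file to its catalog columns (assumes file stem == model name)."""
    models = load_catalog_models(catalog)
    matched = {}
    for sql_file in sql_files:
        node = models.get(Path(sql_file).stem.lower())
        if node is None:
            continue  # no catalog entry, e.g. the model has not been built yet
        columns = sorted(node["columns"].values(), key=lambda c: c["index"])
        matched[str(sql_file)] = [
            {
                "name": col["name"],
                "type": col["type"],
                "description": col.get("comment") or "",
            }
            for col in columns
        ]
    return matched
```

With a real project, the `catalog` argument would come from `json.load` on the catalog.json file and `sql_files` from something like `Path("models").rglob("*.sql")`.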

Type Classification

The command uses these rules to classify columns:

  • Measures are created for:

    • Integer and exact numeric types (int, numeric, decimal)
    • Floating-point types (real, double, float)
    • Money and generic number types (money, number)
  • Dimensions are created for other types:

    • String types
    • Date/time types
    • Boolean types
    • And any other non-numeric types
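The classification rules above might look like this in Python. This is a minimal sketch: the exact matching logic is not documented here, so the set of type names (including variants such as bigint or float8) and the choice to compare only the base type name are assumptions.

```python
import re

# Base type names treated as numeric. The first eight come from the rules
# above; the remaining variants are assumed for illustration.
NUMERIC_TYPES = {
    "int", "numeric", "decimal", "real", "double", "float", "money", "number",
    "integer", "bigint", "smallint", "tinyint", "int2", "int4", "int8",
    "int64", "float4", "float8", "float64",
}


def classify_column(data_type: str) -> str:
    """Return "measure" for numeric column types and "dimension" otherwise."""
    normalized = data_type.strip().lower()
    # Keep only the base name: "numeric(10,2)" -> "numeric",
    # "double precision" -> "double".
    base = re.split(r"[\s(]", normalized)[0]
    return "measure" if base in NUMERIC_TYPES else "dimension"
```

Comparing the whole base name rather than searching for a substring keeps types such as interval or point from being mistaken for integers.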

Output Modes

The command supports two modes for placing generated YAML files:

  1. Side-by-side: Places YAML files next to their corresponding SQL files (default)
  2. Dedicated Directory: Places all YAML files in a specified directory structure

When using side-by-side mode, the command automatically updates your .dbtignore file to exclude the generated YAML files from dbt processing.
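To make the difference between the two modes concrete, here is a small Python sketch of how the destination path could be computed. The `yaml_path_for` helper, the `.yml` extension, and the idea that the dedicated directory mirrors the layout under the model root are all assumptions, not documented behavior.

```python
from pathlib import Path
from typing import Optional


def yaml_path_for(sql_file: Path, model_root: Path,
                  output_dir: Optional[Path] = None) -> Path:
    """Return where the YAML for `sql_file` would be written.

    With no output_dir, the YAML sits next to the SQL file (side-by-side).
    With an output_dir, the path relative to model_root is recreated under it.
    """
    if output_dir is None:
        return sql_file.with_suffix(".yml")
    relative = sql_file.relative_to(model_root)
    return (output_dir / relative).with_suffix(".yml")
```

For example, `models/marketing/leads.sql` maps to `models/marketing/leads.yml` in side-by-side mode, and to `semantic_models/marketing/leads.yml` when the hypothetical output directory is `semantic_models`.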

Examples

Basic Generation

Generate semantic models using paths from buster.yml:

buster generate

Process Specific Path

Generate models for SQL files in a specific directory:

buster generate --path models/marketing

Specify Output File

Generate models and place them in a specific file:

buster generate --output-file semantic_models/marketing.yml

Updating Existing Models

When you run the command on SQL files that already have corresponding YAML models:

  • New columns will be added as dimensions or measures
  • Existing dimensions and measures will be preserved with their current settings
  • The command will not remove custom configurations like metrics or relationships

This makes it safe to run the command regularly as your dbt models evolve.
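The update behavior described above amounts to an additive merge. The following Python sketch shows one way to express it, operating on models already loaded as dictionaries. The `merge_model` name is hypothetical, and treating a column as "existing" if it appears in either section is an assumption.

```python
import copy


def merge_model(existing: dict, generated: dict) -> dict:
    """Add newly generated fields to an existing model without touching the rest."""
    merged = copy.deepcopy(existing)  # never mutate the caller's model
    sections = ("dimensions", "measures")
    # A column counts as known if it already appears in either section, so a
    # field the user moved from measures to dimensions is not re-added.
    known = {f["name"] for key in sections for f in merged.get(key, [])}
    for key in sections:
        for field in generated.get(key, []):
            if field["name"] not in known:
                merged.setdefault(key, []).append(copy.deepcopy(field))
                known.add(field["name"])
    return merged
```

Keys that the generator does not produce, such as metrics or relationships, are copied from the existing model unchanged, which is what makes repeated runs safe.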

Integration with buster init

The init command can run generate for you, creating semantic models automatically as part of project initialization.

Example Output

Here’s an example of what a generated semantic model looks like:

name: orders
description: Customer order data
database: analytics
schema: public
dimensions:
  - name: order_id
    description: Unique identifier for the order
    type: string
  - name: customer_id
    description: Customer who placed the order
    type: string
  - name: order_date
    description: Date the order was placed
    type: timestamp
measures:
  - name: total_amount
    description: Total order amount
    type: number
  - name: item_count
    description: Number of items in the order
    type: integer

Success Metrics

After running the command, you’ll see a summary of:

  • Number of SQL models processed
  • Number of new semantic models generated
  • Number of existing models updated
  • Number of columns added, updated, or removed

Best Practices

  1. Run Regularly: Use the command whenever your dbt models change
  2. Customize After Generation: Add metrics, filters, and relationships after the initial generation
  3. Version Control: Commit both SQL and YAML files to track changes over time
  4. Validate Models: Run buster parse after generation to ensure all models are valid