Manual onboarding is necessary if you do not have a dbt project. In this case, you will need to build your Buster project manually. This means you won’t be able to use buster init, buster generate, or any command that relies on dbt metadata.
To onboard manually, follow these steps:
Manual Onboarding Steps
- Connect your data source in the Buster UI:
  You can do this at https://platform.buster.so/app/settings/datasources/add.
  For more detailed instructions on connecting data sources, please refer to our Data Sources Overview.
- Build your buster.yml file:
  After connecting your data source, you’ll need to create and configure your buster.yml file.
- Organize your project folders:
  It’s recommended to create a folder for each database or schema. Each of these will be configured as a separate project for Buster to reference in the buster.yml file.
- Create your Semantic Models:
  Within each project folder, you will need to create your Semantic Models. These models should align with the tables in your database or data warehouse.
- Deploy your models:
  Once your data source is connected and your buster.yml and semantic models are configured, you can parse and deploy your project. Use the following commands:
    buster parse
    buster deploy
Even with manual onboarding, you can still utilize the buster parse and buster deploy commands.
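If you want to script the final step, the two commands can be chained so that deploy only runs after parse succeeds. The following is a minimal Python sketch, not part of the Buster CLI. The helper names run_cli and parse_then_deploy are hypothetical, and the sketch assumes that buster returns a non-zero exit code when a step fails, which this page does not confirm.

```python
import subprocess
from typing import Callable, Sequence

# The two CLI commands named in the steps above, in the order they are run.
COMMANDS = (("buster", "parse"), ("buster", "deploy"))


def run_cli(command: Sequence[str]) -> int:
    """Run one command and return its exit code."""
    return subprocess.run(list(command)).returncode


def parse_then_deploy(run: Callable[[Sequence[str]], int] = run_cli) -> bool:
    """Run each command in order and stop at the first non-zero exit code."""
    for command in COMMANDS:
        # Assumption: a non-zero exit code means the step failed.
        if run(command) != 0:
            print("stopped at:", " ".join(command))
            return False
    return True
```

Calling parse_then_deploy() from the project root runs the real commands. Passing a different run callable makes the sequencing easy to try out without the CLI installed.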
Example Project Structure
Here is an example of how you might structure your project:
your-buster-project/
├── buster.yml
├── marketing/                  # Folder for the 'marketing_analytics' schema
│   └── semantic_models/
│       ├── customers.yml
│       └── campaigns.yml
├── finance/                    # Folder for the 'finance' schema
│   └── semantic_models/
│       ├── transactions.yml
│       └── accounts.yml
└── sales/                      # Folder for the 'sales_data' database (if it's a separate data source)
    └── semantic_models/
        ├── orders.yml
        └── products.yml
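The layout above can be created by hand, or with a short script. The following is a minimal sketch using only the Python standard library. The scaffold function and the LAYOUT mapping are hypothetical helpers, and the folder and file names are simply the ones from this example, not names that Buster requires. The script creates empty files only, so you still need to fill in buster.yml and each semantic model.

```python
from pathlib import Path

# Layout copied from the example tree: project folder -> semantic model files.
# These names are the example's, not names that Buster requires.
LAYOUT = {
    "marketing": ["customers.yml", "campaigns.yml"],
    "finance": ["transactions.yml", "accounts.yml"],
    "sales": ["orders.yml", "products.yml"],
}


def scaffold(root: Path) -> list[Path]:
    """Create an empty buster.yml plus one semantic_models/ folder per project."""
    created = []
    root.mkdir(parents=True, exist_ok=True)

    config = root / "buster.yml"
    config.touch(exist_ok=True)  # left empty; the contents are written by hand
    created.append(config)

    for project, model_files in LAYOUT.items():
        models_dir = root / project / "semantic_models"
        models_dir.mkdir(parents=True, exist_ok=True)
        for file_name in model_files:
            model_path = models_dir / file_name
            model_path.touch(exist_ok=True)  # existing files are not overwritten
            created.append(model_path)
    return created
```

For example, scaffold(Path("your-buster-project")) returns the seven file paths it created or found, and it is safe to run again because existing files are left untouched.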
Example buster.yml
This buster.yml file defines three projects, corresponding to the folders in the example structure above.
projects:
  - path: ./marketing
    data_source_name: your_data_warehouse # e.g., snowflake_prod, bigquery_main
    schema: marketing_analytics
    database: main_db # or the relevant database for this schema
    semantic_model_paths:
      - semantic_models/
  - path: ./finance
    data_source_name: your_data_warehouse
    schema: finance
    database: main_db
    semantic_model_paths:
      - semantic_models/
  - path: ./sales # Assuming 'sales' is a different database or requires separate handling
    data_source_name: sales_specific_source # Could be the same or different from above
    database: sales_data
    schema: public # Or the relevant schema for the sales_data database
    semantic_model_paths:
      - semantic_models/
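Before parsing, it can help to confirm that every folder the configuration points at actually exists. The following is a minimal sketch of such a check, not a Buster feature. The PROJECTS list is a hand-copied mirror of the path and semantic_model_paths values in the example (in practice you would load buster.yml with a YAML parser), missing_model_dirs is a hypothetical helper, and the sketch assumes that each semantic_model_paths entry is resolved relative to its project's path, as the example folder structure suggests.

```python
from pathlib import Path

# Hand-copied from the example buster.yml; only the path-related keys are kept.
PROJECTS = [
    {"path": "./marketing", "semantic_model_paths": ["semantic_models/"]},
    {"path": "./finance", "semantic_model_paths": ["semantic_models/"]},
    {"path": "./sales", "semantic_model_paths": ["semantic_models/"]},
]


def missing_model_dirs(root: Path, projects: list[dict]) -> list[Path]:
    """Return every configured semantic model folder that does not exist under root."""
    missing = []
    for project in projects:
        for model_path in project["semantic_model_paths"]:
            # Assumption: model paths are relative to the project's own path.
            candidate = root / project["path"] / model_path
            if not candidate.is_dir():
                missing.append(candidate)
    return missing
```

An empty result from missing_model_dirs(Path("your-buster-project"), PROJECTS) means every configured folder was found. Any returned path points at a folder to create or a path to correct in buster.yml.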
Example Semantic Models
Here are a couple of simplified semantic model examples that would reside in the respective semantic_models folders.
marketing/semantic_models/customers.yml
name: customers
description: Model representing customer data from the marketing analytics schema.
# data_source_name, database, and schema will be inherited from buster.yml
dimensions:
  - name: customer_id
    description: Unique identifier for the customer
    type: VARCHAR
  - name: email
    description: Customer's email address
    type: VARCHAR
    searchable: true
  - name: acquisition_date
    description: Date the customer was acquired
    type: DATE
measures:
  - name: total_spent
    description: Total amount spent by the customer
    type: DECIMAL # Assuming this comes from a related orders/transactions table
relationships:
  - name: campaigns
    source_col: customer_id
    ref_col: customer_id # Assuming campaigns model has a customer_id
    description: Links to marketing campaigns the customer was part of.
finance/semantic_models/transactions.yml
name: transactions
description: Model for financial transactions.
# data_source_name, database, and schema will be inherited from buster.yml
dimensions:
  - name: transaction_id
    description: Unique identifier for the transaction
    type: INTEGER
  - name: transaction_date
    description: Date of the transaction
    type: TIMESTAMP
  - name: account_id
    description: Identifier for the associated account
    type: VARCHAR
measures:
  - name: transaction_amount
    description: The amount of the transaction
    type: DECIMAL
metrics:
  - name: average_transaction_value
    expr: "SUM(transaction_amount) / COUNT(transaction_id)"
    description: Average value per transaction
relationships:
  - name: accounts
    source_col: account_id
    ref_col: account_id # Assuming accounts model has an account_id
    description: Links to the accounts model.
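To make the average_transaction_value expression concrete, the sketch below reproduces the same arithmetic in plain Python. The sample rows are invented for illustration and the function is a hypothetical stand-in for what the SQL expression computes; it is not how Buster evaluates metrics. It follows standard SQL semantics, where COUNT(transaction_id) counts only rows whose transaction_id is not NULL.

```python
from decimal import Decimal
from typing import Optional

# Invented sample rows: (transaction_id, transaction_amount).
ROWS = [
    (1, Decimal("20.00")),
    (2, Decimal("35.50")),
    (3, Decimal("4.50")),
]


def average_transaction_value(
    rows: list[tuple[Optional[int], Decimal]],
) -> Decimal:
    """Mirror SUM(transaction_amount) / COUNT(transaction_id) over the rows."""
    total = sum((amount for _, amount in rows), Decimal("0"))  # SUM(...)
    # COUNT(column) skips NULLs, so rows without an id are not counted.
    count = sum(1 for transaction_id, _ in rows if transaction_id is not None)
    if count == 0:
        raise ValueError("no transactions to average")
    return total / count
```

With the three sample rows, the amounts add up to 60.00 across 3 transactions, so the metric evaluates to 20.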