Creating Agents

An agent is a YAML configuration file that defines when and how to automate a specific data engineering task. Agents can respond to pull requests, run on schedules, react to data stack events, and take actions like creating PRs, posting comments, or sending notifications. This guide covers everything you need to create, configure, and deploy effective agents for your workflows.

Getting Started with Agents

To create an agent, you need three core elements: a name, at least one trigger, and a prompt. The simplest agent looks like this:

agent.yaml

name: my-first-agent
triggers:
  - type: pull_request
prompt: |
  Review this PR for dbt best practices.
  Comment if you find any issues.

Start with a single, specific task. You can always create additional agents for other workflows.

Reference

This section contains the complete reference for agent configuration files.

Core Fields

name

string

required

The unique identifier for your agent. Used in logs, the web interface, and file naming. Must be lowercase with hyphens (kebab-case). Example: schema-change-handler

description

string

A brief explanation of what the agent does. Appears in the agent list and helps team members understand the agent’s purpose. Example: Adapts staging models to source schema changes

triggers

array

required

Defines when the agent runs. You can specify multiple triggers for the same agent.

type

"pull_request" | "scheduled" | "event" | "manual"

required

The trigger type determines when the agent activates.

pull_request: Runs when PRs are opened, updated, or commented on
scheduled: Runs on a recurring schedule using cron syntax
event: Runs when specific events occur in your data stack
manual: Only runs when explicitly triggered via API or web interface

cron

string

Cron expression for scheduled triggers. Required when type is scheduled. Example: "0 9 * * *" runs daily at 9 AM.

Use crontab.guru to validate your cron expressions.

timezone

string

Timezone for scheduled triggers. Uses IANA timezone names. Defaults to UTC. Example: "America/New_York"
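As an illustration, the cron and timezone fields above could be combined as in the following sketch. The agent name and prompt are hypothetical. The second trigger is included only to show that one agent can list more than one trigger.

```yaml
# Hypothetical agent: the name and prompt are illustrative only.
name: weekly-docs-audit
triggers:
  # Minute 0, hour 9, any day of month, any month, Monday.
  - type: scheduled
    cron: "0 9 * * 1"
    timezone: "America/New_York"
  # Also allow on-demand runs via the API or web interface.
  - type: manual
prompt: |
  List models in models/marts/ that are missing descriptions.
```

Quoting the cron expression keeps YAML from misreading the asterisks as alias markers.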

event_name

string

The specific event to listen for. Required when type is event. Common events:

schema_change_detected
data_quality_failure
model_build_failed

See the Triggers & Scheduling guide for a complete list of event types and configuration options.

source

string

The system or integration that emits the event. Required when type is event. Examples: fivetran, airbyte, dbt_cloud, custom

filters

object

Additional conditions that must be met for the trigger to activate.

schema

string

Filter events by schema name. Example: raw.salesforce

table

string

Filter events by table name. Example: accounts

branch

string

Filter PR triggers by branch pattern. Example: main or feature/*

paths

array

Filter PR triggers by changed file paths. Example: ["models/staging/", "models/marts/"]
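A pull request trigger narrowed by the branch and paths filters might look like the sketch below. It assumes filters is nested under the individual trigger entry, as in the schema-change-handler example later in this guide. Whether all filters must match at once is not stated here, so treat that as something to confirm on a test run.

```yaml
triggers:
  - type: pull_request
    filters:
      # Branch pattern, taken from the examples above.
      branch: main
      # Only react to PRs that touch these directories.
      paths: ["models/staging/", "models/marts/"]
```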

prompt

string

required

Instructions for the agent. Write this like you’re instructing a colleague: be clear about goals but let the agent determine implementation details. The agent has access to:

Your repository files (can read, analyze, and modify)
Your data warehouse (can run queries and analyze data)
dbt project metadata (models, tests, documentation)
Git operations (can create branches, commits, and PRs)

Structure your prompt with clear steps when you want a specific workflow. Use open-ended instructions when you want the agent to figure out the best approach.

Actions and Notifications

actions

object

Define what the agent can do when it runs. By default, agents can create PRs, post comments, and make file changes.

create_pr

boolean

Whether the agent can create pull requests. Defaults to true.

comment

boolean

Whether the agent can post comments on PRs or issues. Defaults to true.

commit

boolean

Whether the agent can make direct commits to branches. Defaults to true.

Disable this for agents that should only suggest changes, not make them directly.

run_sql

boolean

Whether the agent can execute SQL queries against your warehouse. Defaults to true.
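For example, a review-only agent that may query the warehouse and comment, but never changes the repository, could be sketched as follows. This assumes the four flags sit directly under actions, as the field list above suggests.

```yaml
actions:
  create_pr: false   # do not open pull requests
  comment: true      # report findings as PR comments
  commit: false      # suggest changes only, never commit directly
  run_sql: true      # allow queries against the warehouse
```

Each flag defaults to true, so only the ones set to false strictly need to be listed. Spelling out all four makes the intent easier to review.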

notifications

object

Configure where and how the agent sends notifications about its activities.

slack

object

Send notifications to Slack channels.

channel

string

required

Slack channel ID or name. Example: #data-alerts or C01234ABCDE

on_success

boolean

Send notification when agent completes successfully. Defaults to true.

on_failure

boolean

Send notification when agent encounters an error. Defaults to true.

template

string

Custom message template using variables like {agent_name}, {trigger}, {outcome}.
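A Slack configuration that reports failures only, with a custom message, might look like this sketch. The template wording is illustrative. Only the three variables named above are used, and how they are rendered in the final message is an assumption.

```yaml
notifications:
  slack:
    channel: "#data-alerts"
    on_success: false   # stay quiet when the run succeeds
    on_failure: true
    # Quoted so YAML does not read the leading brace as a flow mapping.
    template: "{agent_name} ({trigger}) finished with outcome: {outcome}"
```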

email

object

Send email notifications to specified addresses.

array

required

List of email addresses. Example: ["[email protected]"]

on_success

boolean

Send email when agent completes successfully. Defaults to false.

on_failure

boolean

Send email when agent encounters an error. Defaults to true.

Restrictions and Permissions

restrictions

object

Limit what the agent can access and modify. Useful for constraining agents to specific parts of your project.

files

object

Control which files the agent can read and modify.

allow

array

List of paths the agent can access. Supports glob patterns. Example: ["models/staging/", "models/marts/sales/"]

deny

array

List of paths the agent cannot access. Takes precedence over allow. Example: ["models/sensitive/", "*.env"]

read_only

array

List of paths the agent can read but not modify. Example: ["dbt_project.yml", "profiles.yml"]

git_operations

object

Control what Git operations the agent can perform.

can_push

boolean

Whether the agent can push commits. Defaults to true.

can_create_branch

boolean

Whether the agent can create new branches. Defaults to true.

branch_prefix

string

Required prefix for any branches the agent creates. Example: "agent/" results in branches like agent/schema-update

sql

object

Control what SQL operations the agent can execute.

read_only

boolean

Restrict agent to SELECT queries only. Defaults to true.

Only disable this if your agent needs to create or modify tables. Most agents should stay read-only.

allowed_schemas

array

List of schemas the agent can query. Example: ["raw", "staging", "analytics"]

max_query_time

integer

Maximum query execution time in seconds. Defaults to 300.
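Putting the three groups together, a restrictions block might look like the sketch below. The values are the examples from the field descriptions above, except max_query_time, which is an arbitrary illustrative value. How allow and read_only interact for a path listed in only one of them is not specified here, so verify that on a test run.

```yaml
restrictions:
  files:
    allow: ["models/staging/", "models/marts/sales/"]
    # deny takes precedence over allow.
    deny: ["models/sensitive/", "*.env"]
    # Readable but not modifiable.
    read_only: ["dbt_project.yml", "profiles.yml"]
  git_operations:
    can_push: true
    can_create_branch: true
    # Branches are created as agent/<something>.
    branch_prefix: "agent/"
  sql:
    read_only: true            # SELECT queries only (the default)
    allowed_schemas: ["raw", "staging", "analytics"]
    max_query_time: 120        # seconds; the default is 300
```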

Context and Configuration

context

object

Provide additional information and configuration to the agent.

files

array

List of files to always include in the agent’s context. Useful for style guides, conventions, or reference documentation. Example: [".github/STYLE_GUIDE.md", "docs/conventions.md"]

variables

object

Custom variables accessible in the prompt using {variable_name} syntax.

context:
  variables:
    team_slack: "#data-team"
    review_threshold: 3
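A prompt could then reference these variables with the {variable_name} syntax, roughly as sketched here. The prompt wording is hypothetical, and the exact substitution behavior (for example, how non-string values are rendered) is an assumption.

```yaml
context:
  variables:
    team_slack: "#data-team"
    review_threshold: 3

prompt: |
  Review the changed models in this PR.
  If you find more than {review_threshold} issues, mention {team_slack}
  in your summary comment.
```

Keeping values like thresholds in variables means several agents can share one prompt style and differ only in configuration.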

examples

array

Example scenarios to help guide the agent’s behavior. Each example should show input and expected output.

enabled

boolean

Whether the agent is active. Set to false to temporarily disable without deleting. Defaults to true.

timeout

integer

Maximum execution time in seconds before the agent is stopped. Defaults to 900 (15 minutes).
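As a sketch, an agent that is paused and given a longer time limit might be configured as below. This assumes enabled and timeout are top-level keys alongside name and prompt. The prompt text is illustrative.

```yaml
name: docs-audit
enabled: false   # temporarily disabled; the file stays in the repo
timeout: 1800    # 30 minutes instead of the default 900 seconds
triggers:
  - type: manual
prompt: |
  Audit model documentation and summarize the gaps you find.
```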

Examples

Schema Change Handler

schema-change-handler.yaml

name: schema-change-handler
description: Adapts staging models to source schema changes

triggers:
  - type: event
    event_name: schema_change_detected
    source: fivetran
    filters:
      schema: raw.salesforce

prompt: |
  A schema change was detected in raw.salesforce.

  1. Check what changed (new columns, removed columns, renamed tables)
  2. Find the corresponding staging model in models/staging/salesforce/
  3. Update the model to match our naming conventions:
     - Snake case for all columns
     - Prefix boolean columns with "is_" or "has_"
     - Use full words, no abbreviations
  4. Run dbt parse to validate the changes
  5. Create a PR titled "fix(staging): Adapt to Salesforce schema change"

  In the PR description, list what changed and why you made each update.
  If a column was removed, explain the impact on downstream models.

restrictions:
  files:
    allow: ["models/staging/salesforce/"]
  git_operations:
    branch_prefix: "agent/schema-change-"

notifications:
  slack:
    channel: "#data-alerts"
    on_success: true
    on_failure: true

Best Practices

Writing Effective Prompts

Be specific about the desired outcome

Instead of “check for issues,” specify what issues to look for and what to do when found. Include examples of good and bad patterns when relevant.

# Vague
prompt: Check this PR

# Specific
prompt: |
  Check if new columns in staging models follow our naming convention:
  - Boolean columns must start with "is_" or "has_"
  - Date columns must end with "_date" or "_at"
  - Amount columns must end with "_amount" or "_value"

Provide context and examples

Reference your team’s style guides, documentation standards, or example implementations. The agent can read these files and apply the patterns consistently.

context:
  files:
    - ".github/DBT_STYLE_GUIDE.md"
    - "models/_example_mart.sql"

Structure complex prompts with numbered steps

For multi-step workflows, use numbered lists. This helps the agent understand the sequence and dependencies between actions.

prompt: |
  1. First, identify all changed models
  2. Then, run dbt parse to validate syntax
  3. If validation passes, check downstream dependencies
  4. Finally, create a summary comment with findings

Specify the output format

Tell the agent how to present results: should it create a PR, post a comment, update a file, or send a notification? Specify what information the output should contain.

prompt: |
  Generate a markdown report with:
  - Summary statistics at the top
  - Detailed findings grouped by severity
  - Code snippets showing the issues
  - Suggested fixes for each item
  
  Post this report as a PR comment.

Choosing the Right Trigger

Match your trigger to the workflow pattern:

pull_request: Code review, validation, documentation checks
scheduled: Regular audits, monitoring, reporting
event: Reactive responses to data stack changes
manual: Ad-hoc tasks or workflows that need approval before running

Testing and Iterating

After deploying an agent, monitor its first few runs carefully:

Check the run logs

Go to the Runs page in the web interface to see exactly what the agent did. Review the files it accessed, queries it ran, and decisions it made.

Refine the prompt

If the agent didn’t behave as expected, update the prompt with more specific instructions. Look for places where your instructions were ambiguous or missing important details.

Adjust restrictions if needed

If the agent tried to access files or perform actions it shouldn’t, add restrictions to constrain its behavior. Start permissive and tighten as you understand the agent’s needs.

Monitor outcomes over time

Track how often the agent succeeds vs. fails, and whether its actions are helpful. Adjust the prompt and configuration based on real-world usage patterns.

Every agent run is logged with full transparency. You can always audit what happened and why.

Deployment

Save your agent configuration to .github/buster/agents/ in your repository. The filename should match your agent name:

.github/buster/agents/
  ├── schema-change-handler.yaml
  ├── pr-review.yaml
  └── docs-audit.yaml

Agent configurations must be on your default branch (usually main) to be active. Changes in feature branches won’t trigger agents until merged.

Once pushed to your default branch, agents activate automatically. You’ll see them listed in the web interface and they’ll start responding to their configured triggers.
