Writing good prompts is the key to building reliable agents. This guide covers patterns, anti-patterns, and strategies for crafting prompts that produce consistent, high-quality results.

The Spectrum of Specificity

Prompts can range from highly directive to open-ended. The right level depends on your use case.

Directive Prompts (High Specificity)

Use when you want consistent, predictable behavior and know exactly what should happen.
prompt: |
  When models change:
  1. Profile each model (row count, column types, null rates)
  2. Update the YAML file with this exact format:
     - Model description: "[Purpose] [Grain] [~Row count]"
     - Column descriptions: "[Business meaning] [Data type]"
  3. Run `dbt parse` to validate
  4. If valid: commit with message "docs: Update [model_name]"
  5. If invalid: comment with the exact parse error
When to use directive prompts:
  • Documentation generation with specific formats
  • Code review with defined standards
  • Test generation following rules
  • Compliance checks with requirements

Open-Ended Prompts (Low Specificity)

Use when you want the agent to adapt to different situations and make intelligent decisions.
prompt: |
  Review this PR for data modeling issues.
  
  Consider:
  - Are joins efficient?
  - Is the grain clear?
  - Are naming conventions followed?
  - Will this break downstream models?
  
  Provide specific, actionable feedback.
When to use open-ended prompts:
  • Code review for varied issues
  • Investigating data quality problems
  • Impact analysis
  • Root cause diagnosis

Numbered Steps vs. Natural Language

Use Numbered Steps When:

The workflow is sequential
prompt: |
  1. Extract JSON schema from recent records
  2. Compare with baseline in .buster/schemas/
  3. If changes detected:
     a) Classify severity (critical/high/medium)
     b) Update staging model
     c) Create PR with changes
  4. If no changes: update baseline only
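The comparison in steps 1–3 can be sketched as a plain dict diff. This is an illustrative sketch, not part of any agent API; the function name, severity labels, and `{column: type}` representation are all hypothetical:

```python
def diff_schema(baseline: dict, current: dict) -> dict:
    """Compare two {column_name: type} mappings and classify the change."""
    removed = sorted(set(baseline) - set(current))
    added = sorted(set(current) - set(baseline))
    retyped = sorted(c for c in baseline.keys() & current.keys()
                     if baseline[c] != current[c])
    if removed or retyped:
        severity = "critical"   # breaking: downstream models may fail
    elif added:
        severity = "medium"     # additive: safe, but worth surfacing
    else:
        severity = None         # no change: just refresh the baseline
    return {"removed": removed, "added": added, "retyped": retyped,
            "severity": severity}
```

A dropped or retyped column is treated as breaking, while a new column only warrants a lower-severity note, mirroring branch 3 vs. branch 4 of the prompt.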
You need consistent ordering
prompt: |
  For each changed model:
  1. First, profile the model  # Must happen first
  2. Then, generate documentation
  3. Finally, validate with dbt parse
The task has clear phases
prompt: |
  Phase 1: Analysis
  1. Identify all changed models
  2. Check for downstream dependencies
  
  Phase 2: Action
  3. Based on findings, create issue OR PR
  4. Send appropriate Slack notification

Use Natural Language When:

The agent needs flexibility
prompt: |
  Investigate why the null rate in orders.revenue increased.
  
  Query recent data, compare with historical patterns, and 
  determine if this is a data quality issue or expected behavior.
  
  If it's a problem, create an issue with your findings and
  suggested remediation steps.
The logic is conditional
prompt: |
  Check for breaking changes in this PR.
  
  If you find removed columns, search for downstream usage
  and assess impact. High-impact changes need immediate
  attention with Slack alerts. Low-impact changes just need
  a comment noting the change.
  
  Use your judgment on severity thresholds.

Common Patterns

Pattern 1: Analyze → Decide → Act

Many successful agents follow this three-phase structure:
prompt: |
  ## Analyze
  Profile the changed models to understand:
  - Row counts and column structure
  - Data types and null rates
  - Common values for categoricals
  
  ## Decide
  Determine what documentation updates are needed:
  - New models: Full documentation
  - Modified models: Update changed columns
  - Deleted models: Remove YAML
  
  ## Act
  Make the appropriate updates:
  - Update YAML files
  - Validate with dbt parse
  - Commit if valid, comment if errors
This pattern works well for most automation tasks.

Pattern 2: Check → Branch

Use conditional branching for different scenarios:
prompt: |
  Check the PR size and complexity.
  
  If PR has >20 changed files OR >1000 lines:
    - Run comprehensive review
    - Check for performance issues
    - Verify test coverage
    - Require manual approval
  
  Else if PR has 5-20 changed files:
    - Run standard review
    - Flag obvious issues
    - Auto-approve if checks pass
  
  Else (small PR):
    - Quick validation only
    - Auto-approve immediately
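The branching above is just a pair of thresholds; a minimal Python sketch (function name and tier labels are hypothetical) makes the decision table explicit:

```python
def review_tier(changed_files: int, changed_lines: int) -> str:
    """Map PR size to a review tier, mirroring the branches above."""
    if changed_files > 20 or changed_lines > 1000:
        return "comprehensive"   # full review + manual approval
    if changed_files >= 5:
        return "standard"        # standard review, auto-approve on pass
    return "quick"               # quick validation, auto-approve
```

Writing the branches out like this is also a useful way to sanity-check a prompt for gaps, such as a small PR with a huge line count, which the `OR` in the first condition deliberately catches.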

Pattern 3: Detect → Investigate → Remediate

For responding to problems:
prompt: |
  ## Detect
  Schema change detected in {{event.table}}.
  Change type: {{event.change_type}}
  Affected columns: {{event.columns}}
  
  ## Investigate
  Find which staging models reference this source.
  For each affected model:
  - Will it break with this change?
  - Is the change additive (safe) or breaking (risky)?
  - What downstream models depend on it?
  
  ## Remediate
  Based on severity:
  - Breaking change: Create issue, send urgent Slack alert
  - Additive change: Create PR with updated model, normal notification
  - Metadata only: Log it, no action needed
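The Remediate step is a straight lookup from change type to actions. A hedged sketch (the route names and action strings are illustrative, not a real agent interface):

```python
def remediation(change_kind: str) -> dict:
    """Route a detected change to actions, per the Remediate step above."""
    routes = {
        "breaking": {"action": "create_issue", "slack": "urgent"},
        "additive": {"action": "create_pr",    "slack": "normal"},
        "metadata": {"action": "log_only",     "slack": None},
    }
    return routes[change_kind]
```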

Anti-Patterns to Avoid

❌ Anti-Pattern 1: Assuming Context

# Bad - Agent doesn't know what "the right format" means
prompt: |
  Update documentation in the right format.

# Good - Explicit format specification
prompt: |
  Update documentation with this format:
  - Model description: "[Purpose] [Grain] [~Row count]"
  - Column descriptions: "[Business meaning] [Data type] [Null % if >5%]"
  Use sentence case and proper punctuation.

❌ Anti-Pattern 2: Vague Quality Criteria

# Bad - What constitutes "bad" code?
prompt: |
  Flag bad code in this PR.

# Good - Specific criteria
prompt: |
  Flag these issues:
  - SELECT * in staging models
  - Joins on non-indexed columns
  - Models without tests on ID columns
  - Hardcoded dates or IDs
  - CamelCase column names
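Criteria this concrete are mechanically checkable. A minimal sketch of three of the rules as regex checks over a model's SQL text (illustrative patterns only, not a complete linter):

```python
import re

# Each rule pairs a name with a regex over the model's SQL text.
RULES = [
    ("select_star", re.compile(r"\bSELECT\s+\*", re.IGNORECASE)),
    ("hardcoded_date", re.compile(r"'\d{4}-\d{2}-\d{2}'")),
    ("camelcase_column", re.compile(r"\bAS\s+[a-z]+[A-Z]\w*")),
]

def flag_issues(sql: str) -> list:
    """Return the names of all rules the SQL text violates."""
    return [name for name, pattern in RULES if pattern.search(sql)]
```

When criteria can be expressed this precisely, it is often worth asking whether a deterministic check should run alongside (or instead of) the agent.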

❌ Anti-Pattern 3: Missing Error Handling

# Bad - No guidance on failures
prompt: |
  Run dbt test and report results.

# Good - Handles multiple outcomes
prompt: |
  Run dbt test on changed models.
  
  If all tests pass:
    - Post success comment
    - Set status to passing
  
  If tests fail:
    - Post comment with failure details
    - Include failed test names and sample failing records
    - Set status to failing
    - Send Slack alert if >5 tests fail
  
  If tests error (can't run):
    - Post comment explaining the error
    - Set status to error (don't block merge)
    - Create issue for infrastructure team
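The three outcomes above form a small decision function. A sketch (field names and the `>5` alert threshold are taken from the prompt; the function shape itself is hypothetical):

```python
def report(passed: int, failed: int, errored: bool) -> dict:
    """Decide status and notifications for a dbt test run, per the branches above."""
    if errored:
        # Infrastructure problem, not a data problem: don't block merge.
        return {"status": "error", "slack": False, "block_merge": False}
    if failed == 0:
        return {"status": "passing", "slack": False, "block_merge": False}
    # Failures block merge; escalate to Slack only when >5 tests fail.
    return {"status": "failing", "slack": failed > 5, "block_merge": True}
```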

❌ Anti-Pattern 4: Doing Too Much

# Bad - One agent doing everything
prompt: |
  Review this PR for:
  - SQL quality issues
  - Documentation completeness
  - Test coverage
  - Performance problems
  - Naming conventions
  - Breaking changes
  - Security issues
  And fix any issues found automatically.

# Good - Focused agents
# Agent 1: PR Review (read-only)
prompt: |
  Review SQL quality and flag issues.
  Comment with findings but don't modify code.

# Agent 2: Docs Updater (write)
prompt: |
  Update documentation for changed models.
  
# Agent 3: Test Generator (write)
prompt: |
  Generate tests for new models.
Split complex tasks into multiple focused agents rather than one mega-agent.

Debugging Prompts That Don’t Work

Strategy 1: Add Explicit Reasoning

Ask the agent to explain its thinking:
prompt: |
  Review this PR for breaking changes.
  
  For each model you analyze, explain:
  1. What you're checking
  2. What you found
  3. Why it's a concern (or not)
  4. What action you're taking
  
  Include this reasoning in your PR comment.
This makes it easier to see where the agent’s logic goes wrong.

Strategy 2: Start Minimal, Add Incrementally

If a complex prompt isn’t working, simplify:
# Start here
prompt: |
  List all models that changed in this PR.

# Then add
prompt: |
  List all models that changed in this PR.
  For each model, check if it has a YAML file.

# Then add more
prompt: |
  List all models that changed in this PR.
  For each model, check if it has a YAML file.
  If missing, create a basic YAML with model name and empty description.

# Finally
prompt: |
  List all models that changed in this PR.
  For each model, check if it has a YAML file.
  If missing:
    1. Profile the model to get row count and columns
    2. Create YAML with model and column descriptions
    3. Validate with dbt parse
    4. Commit if valid
Add complexity only after each step works.

Strategy 3: Provide Examples

Show the agent what you want:
prompt: |
  Generate documentation for changed models.
  
  Format example:
  ```yaml
  - name: fct_orders
    description: |
      Order transaction fact table. One row per order.
      Grain: One row per order_id.
      Contains: ~1.2M orders from 2022-present.
    columns:
      - name: order_id
        description: Unique identifier for each order. Primary key.
      - name: customer_id
        description: Foreign key to dim_customers. Null rate: 0.1% (guest orders).
```
Match this style and level of detail.

Strategy 4: Add Constraints

If the agent is doing the wrong thing, add explicit constraints:

# Problem: Agent was modifying source files
prompt: |
  Review models for issues.
  
  DO NOT modify any SQL files.
  DO NOT modify source definitions.
  Only add comments to the PR.

# Problem: Agent was creating too many PRs
prompt: |
  Update documentation for changed models.
  
  Create ONE PR with all changes.
  DO NOT create separate PRs per model.

# Problem: Agent was being too verbose
prompt: |
  Review this PR and comment with findings.
  
  Keep comments under 500 words.
  Focus on HIGH and MEDIUM priority issues only.
  Don't mention LOW priority formatting issues.

Progressive Refinement Examples

Example 1: Documentation Updater

Version 1: Too Vague
prompt: Update docs for changed models.
Problem: Agent didn’t know what to update or how.

Version 2: Added Steps
prompt: |
  1. Find changed models
  2. Update their YAML files
  3. Commit changes
Problem: Didn’t specify what to put in YAML.

Version 3: Added Format
prompt: |
  For each changed model:
  1. Profile it (row count, columns)
  2. Update YAML with:
     - Model description with row count
     - Column descriptions with types
  3. Validate and commit
Problem: Documentation quality was inconsistent.

Version 4: Added Standards (Final)
prompt: |
  For each changed model:
  
  1. Profile using retrieve_metadata:
     - Row count
     - Column types and null rates
     - Distinct counts for categoricals
  
  2. Update YAML following our standards:
     - Model description format: "[Purpose]. [Grain]. [~Row count]."
     - Column descriptions: "[Business meaning]. [Data type]."
     - Note null rates if >5%
     - Use sentence case, end with period
     - Preserve existing descriptions if still accurate
  
  3. Validate with `dbt parse`
  
  4. If valid: commit with message "docs: Update [model_name]"
     If invalid: comment with parse error and suggested fix
Result: Consistent, high-quality documentation

Example 2: Breaking Change Detector

Version 1: Too Broad
prompt: Check for issues in this PR.
Problem: Found generic issues, missed breaking changes.

Version 2: More Specific
prompt: Check for breaking changes in model columns.
Problem: Didn’t know what constitutes “breaking”.

Version 3: Defined Breaking
prompt: |
  Breaking changes to check:
  - Removed columns
  - Renamed columns
  - Changed column types
  
  Flag if found.
Problem: Flagged everything, even safe changes.

Version 4: Added Impact Analysis (Final)
prompt: |
  Detect breaking changes:
  1. Removed columns
  2. Type changes (VARCHAR → INT, etc.)
  3. Filter additions (fewer rows)
  
  For each change:
  - Search downstream models for usage
  - Classify severity:
    * HIGH: Column removed + used downstream
    * MEDIUM: Type change + used in calculations
    * LOW: Column added (non-breaking)
  
  Comment with:
  - List of changes and severity
  - Downstream models affected
  - Recommended fixes
  
  Only flag HIGH and MEDIUM severity.
Result: Accurate detection with actionable feedback
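The severity rules from the final version can be sketched as a small classifier (a hedged illustration; the function and argument names are hypothetical):

```python
def classify(change: str, used_downstream: bool, used_in_calcs: bool = False) -> str:
    """Severity rules from the final prompt version above (illustrative only)."""
    if change == "removed" and used_downstream:
        return "HIGH"
    if change == "type_change" and used_in_calcs:
        return "MEDIUM"
    return "LOW"

def worth_flagging(severity: str) -> bool:
    # Only HIGH and MEDIUM make it into the PR comment.
    return severity in ("HIGH", "MEDIUM")
```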

Best Practices Summary

• Begin prompts with what success looks like, not implementation details.
  Good: “Generate comprehensive documentation that includes purpose, grain, and column meanings.”
  Bad: “Query the table, count rows, list columns, write YAML…”
• Tell the agent how to handle unusual situations.
  Good: “If the model is >10M rows, sample 10K rows for profiling. If it’s a view, note that in the description.”
  Bad: Assume the agent knows what to do with large models.
• Show examples of what you want the agent to produce.
  Good: Include example YAML, example PR comments, example Slack messages.
  Bad: Say “create appropriate output” without examples.
• Reference specific tools when you need them.
  Good: “Use retrieve_metadata to profile columns. Use grep to find downstream references.”
  Bad: Assume the agent will choose the right tool.
• Start simple, validate, then add complexity.
  Good: Build the prompt step by step, testing each addition.
  Bad: Write a huge prompt all at once without testing.

Next Steps

• Creating Agents — complete guide to agent configuration
• Testing & Debugging — test and refine your prompts locally
• Tools & Permissions — understand what tools agents can use