Writing good prompts is the key to building reliable agents. This guide covers patterns, anti-patterns, and strategies for crafting prompts that produce consistent, high-quality results.
Use when you want consistent, predictable behavior and know exactly what should happen.
```yaml
prompt: |
  When models change:
  1. Profile each model (row count, column types, null rates)
  2. Update the YAML file with this exact format:
     - Model description: "[Purpose] [Grain] [~Row count]"
     - Column descriptions: "[Business meaning] [Data type]"
  3. Run `dbt parse` to validate
  4. If valid: commit with message "docs: Update [model_name]"
  5. If invalid: comment with the exact parse error
```
Use when you want the agent to adapt to different situations and make intelligent decisions.
```yaml
prompt: |
  Review this PR for data modeling issues. Consider:
  - Are joins efficient?
  - Is the grain clear?
  - Are naming conventions followed?
  - Will this break downstream models?

  Provide specific, actionable feedback.
```
```yaml
prompt: |
  1. Extract JSON schema from recent records
  2. Compare with baseline in .buster/schemas/
  3. If changes detected:
     a) Classify severity (critical/high/medium)
     b) Update staging model
     c) Create PR with changes
  4. If no changes: update baseline only
```
✅ You need consistent ordering
```yaml
prompt: |
  For each changed model:
  1. First, profile the model  # Must happen first
  2. Then, generate documentation
  3. Finally, validate with dbt parse
```
✅ The task has clear phases
```yaml
prompt: |
  Phase 1: Analysis
  1. Identify all changed models
  2. Check for downstream dependencies

  Phase 2: Action
  3. Based on findings, create issue OR PR
  4. Send appropriate Slack notification
```
```yaml
prompt: |
  Investigate why the null rate in orders.revenue increased.
  Query recent data, compare with historical patterns, and
  determine if this is a data quality issue or expected behavior.
  If it's a problem, create an issue with your findings and
  suggested remediation steps.
```
✅ The logic is conditional
```yaml
prompt: |
  Check for breaking changes in this PR. If you find removed
  columns, search for downstream usage and assess impact.
  High-impact changes need immediate attention with Slack alerts.
  Low-impact changes just need a comment noting the change.
  Use your judgment on severity thresholds.
```
Many successful agents follow this three-phase structure:
```yaml
prompt: |
  ## Analyze
  Profile the changed models to understand:
  - Row counts and column structure
  - Data types and null rates
  - Common values for categoricals

  ## Decide
  Determine what documentation updates are needed:
  - New models: Full documentation
  - Modified models: Update changed columns
  - Deleted models: Remove YAML

  ## Act
  Make the appropriate updates:
  - Update YAML files
  - Validate with dbt parse
  - Commit if valid, comment if errors
```
This pattern works well for most automation tasks.
Use conditional branching for different scenarios:
```yaml
prompt: |
  Check the PR size and complexity.

  If PR has >20 changed files OR >1000 lines:
  - Run comprehensive review
  - Check for performance issues
  - Verify test coverage
  - Require manual approval

  Else if PR has 5-20 changed files:
  - Run standard review
  - Flag obvious issues
  - Auto-approve if checks pass

  Else (small PR):
  - Quick validation only
  - Auto-approve immediately
```
```yaml
prompt: |
  ## Detect
  Schema change detected in {{event.table}}.
  Change type: {{event.change_type}}
  Affected columns: {{event.columns}}

  ## Investigate
  Find which staging models reference this source.
  For each affected model:
  - Will it break with this change?
  - Is the change additive (safe) or breaking (risky)?
  - What downstream models depend on it?

  ## Remediate
  Based on severity:
  - Breaking change: Create issue, send urgent Slack alert
  - Additive change: Create PR with updated model, normal notification
  - Metadata only: Log it, no action needed
```
```yaml
# Bad - Agent doesn't know what "the right format" means
prompt: |
  Update documentation in the right format.

# Good - Explicit format specification
prompt: |
  Update documentation with this format:
  - Model description: "[Purpose] [Grain] [~Row count]"
  - Column descriptions: "[Business meaning] [Data type] [Null % if >5%]"
  Use sentence case and proper punctuation.
```
```yaml
# Bad - What constitutes "bad" code?
prompt: |
  Flag bad code in this PR.

# Good - Specific criteria
prompt: |
  Flag these issues:
  - SELECT * in staging models
  - Joins on non-indexed columns
  - Models without tests on ID columns
  - Hardcoded dates or IDs
  - CamelCase column names
```
```yaml
# Bad - No guidance on failures
prompt: |
  Run dbt test and report results.

# Good - Handles multiple outcomes
prompt: |
  Run dbt test on changed models.

  If all tests pass:
  - Post success comment
  - Set status to passing

  If tests fail:
  - Post comment with failure details
  - Include failed test names and sample failing records
  - Set status to failing
  - Send Slack alert if >5 tests fail

  If tests error (can't run):
  - Post comment explaining the error
  - Set status to error (don't block merge)
  - Create issue for infrastructure team
```
```yaml
prompt: |
  Review this PR for breaking changes.

  For each model you analyze, explain:
  1. What you're checking
  2. What you found
  3. Why it's a concern (or not)
  4. What action you're taking

  Include this reasoning in your PR comment.
```
This makes it easier to see where the agent’s logic goes wrong.
```yaml
# Start here
prompt: |
  List all models that changed in this PR.

# Then add
prompt: |
  List all models that changed in this PR.
  For each model, check if it has a YAML file.

# Then add more
prompt: |
  List all models that changed in this PR.
  For each model, check if it has a YAML file.
  If missing, create a basic YAML with model name and empty description.

# Finally
prompt: |
  List all models that changed in this PR.
  For each model, check if it has a YAML file.
  If missing:
  1. Profile the model to get row count and columns
  2. Create YAML with model and column descriptions
  3. Validate with dbt parse
  4. Commit if valid
```
````yaml
prompt: |
  Generate documentation for changed models.

  Format example:
  ```yaml
  - name: fct_orders
    description: |
      Order transaction fact table. One row per order.
      Grain: One row per order_id.
      Contains: ~1.2M orders from 2022-present.
    columns:
      - name: order_id
        description: Unique identifier for each order. Primary key.
      - name: customer_id
        description: Foreign key to dim_customers. Null rate: 0.1% (guest orders).
  ```

  Match this style and level of detail.
````
---

### Strategy 4: Add Constraints

If the agent is doing the wrong thing, add explicit constraints:

```yaml
# Problem: Agent was modifying source files
prompt: |
  Review models for issues.
  DO NOT modify any SQL files.
  DO NOT modify source definitions.
  Only add comments to the PR.

# Problem: Agent was creating too many PRs
prompt: |
  Update documentation for changed models.
  Create ONE PR with all changes.
  DO NOT create separate PRs per model.

# Problem: Agent was being too verbose
prompt: |
  Review this PR and comment with findings.
  Keep comments under 500 words.
  Focus on HIGH and MEDIUM priority issues only.
  Don't mention LOW priority formatting issues.
```
Problem: Didn't specify what to put in the YAML.

Version 3: Added Format
```yaml
prompt: |
  For each changed model:
  1. Profile it (row count, columns)
  2. Update YAML with:
     - Model description with row count
     - Column descriptions with types
  3. Validate and commit
```
Problem: Documentation quality was inconsistent.

Version 4: Added Standards (Final)
```yaml
prompt: |
  For each changed model:
  1. Profile using retrieve_metadata:
     - Row count
     - Column types and null rates
     - Distinct counts for categoricals
  2. Update YAML following our standards:
     - Model description format: "[Purpose]. [Grain]. [~Row count]."
     - Column descriptions: "[Business meaning]. [Data type]."
     - Note null rates if >5%
     - Use sentence case, end with period
     - Preserve existing descriptions if still accurate
  3. Validate with `dbt parse`
  4. If valid: commit with message "docs: Update [model_name]"
     If invalid: comment with parse error and suggested fix
```
```yaml
prompt: |
  Detect breaking changes:
  1. Removed columns
  2. Type changes (VARCHAR → INT, etc.)
  3. Filter additions (fewer rows)

  For each change:
  - Search downstream models for usage
  - Classify severity:
    * HIGH: Column removed + used downstream
    * MEDIUM: Type change + used in calculations
    * LOW: Column added (non-breaking)

  Comment with:
  - List of changes and severity
  - Downstream models affected
  - Recommended fixes

  Only flag HIGH and MEDIUM severity.
```
Result: Accurate detection with actionable feedback
Begin prompts with what success looks like, not implementation details.

Good: "Generate comprehensive documentation that includes purpose, grain, and column meanings."

Bad: "Query the table, count rows, list columns, write YAML…"
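Rendered as prompts, the contrast might look like this (a sketch restating the examples above):

```yaml
# Good - leads with the outcome
prompt: |
  Generate comprehensive documentation that includes purpose,
  grain, and column meanings for every changed model.

# Bad - leads with implementation details
prompt: |
  Query the table, count rows, list columns, write YAML...
```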
Be explicit about edge cases
Tell the agent how to handle unusual situations.

Good: "If the model is >10M rows, sample 10K rows for profiling. If it's a view, note that in the description."

Bad: Assume the agent knows what to do with large models.
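Edge-case handling can be spelled out directly in the prompt. A sketch (the thresholds and the null-rate case are illustrative, not prescribed):

```yaml
prompt: |
  Profile each changed model. Handle these edge cases explicitly:
  - If the model has >10M rows: sample 10K rows instead of a full scan
  - If the model is a view: note "materialized as a view" in the description
  - If a column is 100% null: flag it as possibly deprecated
  - If profiling fails: comment with the error instead of guessing values
```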
Specify output format
Show examples of what you want the agent to produce.

Good: Include example YAML, example PR comments, example Slack messages.

Bad: Say "create appropriate output" without examples.
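One way to do this is to embed the exact output template in the prompt. A sketch, assuming a PR-comment output; the template fields are illustrative:

```yaml
prompt: |
  After reviewing, post a PR comment in exactly this format:

  ## Review Summary
  Status: [PASS | NEEDS CHANGES]

  ### Issues Found
  - [severity] [file]: [one-line description]

  ### Suggested Fixes
  - [concrete change, with a snippet where helpful]
```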
Use tools effectively
Reference specific tools when you need them.

Good: "Use retrieve_metadata to profile columns. Use grep to find downstream references."

Bad: Assume the agent will choose the right tool.
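In practice this can mean naming the tool for each step. A sketch (`retrieve_metadata`, `grep`, and `dbt parse` appear elsewhere in this guide; the Slack step is illustrative):

```yaml
prompt: |
  1. Use retrieve_metadata to profile each changed model
  2. Use grep to find downstream references to removed columns
  3. Run dbt parse to validate the updated YAML
  4. Send a Slack notification only for HIGH severity findings
```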
Test incrementally
Start simple, validate, then add complexity.

Good: Build the prompt step by step, testing each addition.

Bad: Write a huge prompt all at once without testing.