This agent runs nightly and updates dbt model documentation for every PR merged during the day. Rather than touching docs in each individual PR, which adds noise to reviews, it batches all documentation updates into a single nightly PR. This keeps your dbt project documentation comprehensive and up to date without interrupting development workflows.

Simple Example

name: nightly-docs-updater
description: Batch update documentation for all models changed today

triggers:
  - type: scheduled
    cron: "0 2 * * *"  # 2 AM daily
    timezone: "America/New_York"

tools:
  preset: standard

prompt: |
  Find all PRs merged in the last 24 hours that changed dbt models.
  
  For each changed model:
  1. Profile the model (row count, columns, types)
  2. Update the YAML documentation
  3. Validate with dbt parse
  
  Create a single PR with all documentation updates titled:
  "docs: Nightly documentation sync for [date]"

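The first step of the prompt, finding the models touched by merged PRs, can be sketched in Python. This is a minimal illustration, not part of the agent itself. It assumes the PR data has the shape returned by `gh pr list --json number,files` (each PR carries a `files` list whose entries have a `path`); the function name `changed_model_paths` and the sample data are hypothetical.

```python
from pathlib import PurePosixPath


def changed_model_paths(prs, models_dir="models"):
    """Return the sorted, de-duplicated .sql paths under models_dir touched by any PR."""
    paths = set()
    for pr in prs:
        for changed in pr.get("files", []):
            path = PurePosixPath(changed["path"])
            # Keep only SQL files that live somewhere under the models directory.
            if path.suffix == ".sql" and path.parts[:1] == (models_dir,):
                paths.add(str(path))
    return sorted(paths)


# Hypothetical sample data in the assumed shape.
sample_prs = [
    {"number": 1, "files": [{"path": "models/staging/stg_orders.sql"}, {"path": "README.md"}]},
    {"number": 2, "files": [{"path": "models/staging/stg_orders.sql"}, {"path": "models/marts/orders.yml"}]},
    {"number": 3, "files": [{"path": "macros/cents_to_dollars.sql"}]},
]
print(changed_model_paths(sample_prs))
```

A model changed by several PRs appears only once in the result, which matches the goal of producing a single batched documentation update per model.
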
More Robust Example

A production-ready configuration with comprehensive profiling and smart updates:
name: nightly-docs-updater
description: Comprehensive nightly documentation sync across all merged PRs

triggers:
  - type: scheduled
    cron: "0 2 * * *"  # 2 AM; weekdays only (see only_on_weekdays)
    timezone: "America/New_York"
    only_on_weekdays: true

tools:
  preset: standard

restrictions:
  files:
    allow: ["models/**/*.sql", "models/**/*.yml"]
  sql:
    read_only: true
    max_rows_returned: 1000
  git_operations:
    branch_prefix: "docs/nightly-"

notifications:
  slack:
    channel: "#data-documentation"
    on_success: true
    on_failure: true

prompt: |
  ## 1. Find changed models
  Get all PRs merged to main in the last 24 hours:
  
  # Note: `date -d` requires GNU date.
  gh pr list --state merged --base main --limit 100 \
    --search "merged:>$(date -d '24 hours ago' -Iseconds)" \
    --json number,url,files
  
  Extract every changed .sql file under the models/ directory.
  
  ## 2. Profile each changed model
  For each model that was modified or added:
  
  a) Skip if documentation was already updated today
     - Check git log for recent YAML changes
     - Skip if YAML was modified in the same PR
  
  b) Profile the model using retrieve_metadata:
     - Total row count (approximate)
     - Column list with data types
     - Null percentage per column
     - For categorical columns (<20 unique values): list the most common values
     - For numeric columns: min, max, and median
     - For date columns: min and max dates
  
  c) Analyze model purpose from:
     - Model name and location in project
     - SELECT statement structure
     - Joins and CTEs used
     - Comments in the SQL
  
  ## 3. Generate or update documentation
  
  For new models (no YAML exists):
  - Create complete documentation with all columns
  - Infer grain from GROUP BY or model structure
  - Estimate purpose from model name and SQL logic
  
  For existing models (YAML exists):
  - Preserve existing descriptions if still accurate
  - Add descriptions for new columns
  - Update data types if changed
  - Add null rates if > 5%
  - Flag removed columns in a comment
  
  Documentation standards:
  - Model descriptions: include purpose, grain, and approximate row count
  - Column descriptions: business meaning plus data type
  - Note null behavior if the null rate is > 5% or if nulls are meaningful
  - Use sentence case and proper grammar, and end each description with a period
  - For metrics/calculations: explain the formula
  - For foreign keys: note which model they reference
  
  ## 4. Validate all changes
  Run `dbt parse` to validate YAML syntax.
  
  If errors:
    - Try to fix common issues (indentation, quotes)
    - If still failing, skip that model and document the error
  
  ## 5. Create PR
  If any documentation was updated:
  
  a) Create PR with:
     - Title: "docs: Nightly documentation sync for {date}"
     - Body: 
       """
       Automated documentation updates for models changed in the last 24 hours.
       
       ## Models Updated
       {list of models with links to files}
       
       ## Summary
       - {X} models documented
       - {Y} new columns described
       - {Z} columns updated
       
       ## Source PRs
       {links to the merged PRs that triggered these updates}
       """
     - Labels: "documentation", "automated"
     - Reviewers: @data-platform-team
  
  b) Enable auto-merge so the PR merges once checks pass
  
  ## 6. Report results
  Send Slack message to #data-documentation:
  
  If updates made:
    📚 **Nightly Docs Update Complete**
    
    Updated documentation for {X} models
    PR: {pr_url}
    
    Models: {model_list}
  
  If no updates needed:
    ✅ **No Documentation Updates Needed**
    
    All models merged in the last 24 hours already have current documentation.
  
  If errors occurred:
    ⚠️  **Documentation Update Had Issues**
    
    Updated {X} models successfully
    Failed on {Y} models: {error_summary}
    
    PR: {pr_url}
    Review needed: {failed_models}
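
The profiling and documentation rules in steps 2 and 3 (noting null rates above 5%, treating columns with fewer than 20 unique values as categorical) can be sketched as follows. This is an illustration on in-memory values only. `profile_column`, its output keys, and the sample columns are hypothetical, and a real run would obtain these statistics from the warehouse through the agent's `retrieve_metadata` tool rather than from Python lists.

```python
from collections import Counter
from statistics import median

NULL_RATE_THRESHOLD = 0.05   # document null behavior above 5%
CATEGORICAL_MAX_UNIQUE = 20  # fewer than 20 unique values counts as categorical


def profile_column(values):
    """Summarize one column's values using the thresholds from the prompt."""
    non_null = [v for v in values if v is not None]
    null_rate = 1 - len(non_null) / len(values) if values else 0.0
    profile = {
        "null_rate": round(null_rate, 4),
        "note_nulls": null_rate > NULL_RATE_THRESHOLD,
    }
    is_numeric = bool(non_null) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in non_null
    )
    if is_numeric:
        # Numeric columns: report min, max, and median.
        profile.update(min=min(non_null), max=max(non_null), median=median(non_null))
    elif len(set(non_null)) < CATEGORICAL_MAX_UNIQUE:
        # Categorical columns: list the most common values (top 5 here, an arbitrary choice).
        profile["common_values"] = [v for v, _ in Counter(non_null).most_common(5)]
    return profile


# Hypothetical sample columns.
status = ["paid", "paid", "refunded", None, "paid", "pending", "paid", "paid", "paid", "paid"]
amount = [10.0, 25.5, 7.25, 40.0]
print(profile_column(status))
print(profile_column(amount))
```

Keeping the thresholds as named constants makes it easy to keep the sketch in line with the prompt if the 5% or 20-value cutoffs are changed.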