This agent compares data output between the PR branch and the main branch for changed dbt models, surfacing logic errors and unexpected data changes that compilation checks and tests can miss. It runs queries against both versions and reports meaningful differences in row counts, column values, or distributions.

## Simple Example

```yaml
name: data-diff-checker
description: Compare data output for changed models

triggers:
  - type: pull_request
    on_changed_files: "models/**/*.sql"

tools:
  preset: standard

prompt: |
  For each changed model:
  1. Run the model on PR branch
  2. Compare with production version
  3. Report row count changes and sample differences
  4. Comment on PR with findings
```

## More Robust Example

Production-ready with statistical comparison and intelligent diff analysis:
```yaml
name: data-diff-checker
description: Comprehensive data comparison for changed dbt models

triggers:
  - type: pull_request
    events: [opened, synchronize]
    paths:
      include: ["models/**/*.sql"]
    conditions:
      - type: pr_labels
        none_of: ["skip-diff"]

tools:
  preset: standard

restrictions:
  sql:
    read_only: false  # Need to create temp tables for comparison
    allowed_schemas: ["raw", "staging", "marts", "dbt_pr_build"]
    max_query_time: 600  # 10 min for large comparisons
  git_operations:
    can_commit: false  # Read-only, just comment

notifications:
  slack:
    channel: "#data-quality"
    on_failure: true

prompt: |
```
  ## 1. Identify models to compare
  
  For each SQL file changed in this PR:
  
  a) Skip if:
     - Model is brand new (no prod version to compare)
     - Changes are only to comments or whitespace
     - PR has "skip-diff" label
     - Model is in seeds/ or snapshots/
  
  b) Prioritize by model type:
     - **High priority**: Mart models (fct_, dim_)
     - **Medium priority**: Intermediate models
     - **Low priority**: Staging models (usually 1-1 with source)
  
  c) Check if model is safe to run:
     - No dependencies on unreleased models
     - Not incremental (or can run full-refresh)
     - Not too expensive (check for table size hints)
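
The skip and safety rules above can be sketched as a small filter. This is a hypothetical helper; the function name and signature are illustrative, not part of the agent spec:

```python
def should_compare(path: str, is_new: bool = False, labels: frozenset = frozenset()) -> bool:
    """Apply the skip rules above to a single changed file."""
    if is_new or "skip-diff" in labels:
        # New models have no prod version; the label opts the PR out entirely
        return False
    if path.startswith(("seeds/", "snapshots/")):
        # Seeds and snapshots are out of scope
        return False
    # Only SQL model files under models/ are comparable
    return path.startswith("models/") and path.endswith(".sql")
```

For example, `should_compare("models/marts/fct_orders.sql")` returns `True`, while a seed file or a PR carrying the `skip-diff` label is skipped.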
  
  ## 2. Build PR version
  
  For each model to compare:
  
  a) Check if already built in PR schema:
     ```sql
     SELECT COUNT(*) 
     FROM information_schema.tables 
     WHERE table_schema = 'dbt_pr_build_{pr_number}'
     AND table_name = '{model_name}'
   ```

b) If not built, run dbt:
   ```bash
   dbt build --select {model_name} --target pr_build --full-refresh
   ```

c) Verify build succeeded before comparing

## 3. Run comparison queries

For each model, run comprehensive comparisons:

A. Row Count Comparison

```sql
SELECT
  'production' as version,
  COUNT(*) as row_count
FROM production.{schema}.{model_name}

UNION ALL

SELECT
  'pr' as version,
  COUNT(*) as row_count
FROM dbt_pr_build_{pr_number}.{schema}.{model_name}
```

Calculate:
- Absolute difference
- Percentage change
- Flag if change > 10% (configurable threshold)
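
The threshold logic can be sketched as follows. This is a hypothetical helper; the 10% default mirrors the configurable threshold mentioned above:

```python
def flag_row_count_change(prod_count: int, pr_count: int, threshold: float = 0.10):
    """Return (absolute difference, fractional change, flagged?) for a row-count diff."""
    abs_diff = pr_count - prod_count
    # Guard against an empty production table
    pct_change = abs_diff / prod_count if prod_count else float("inf")
    return abs_diff, pct_change, abs(pct_change) > threshold
```

With the report example's `fct_orders` numbers, `flag_row_count_change(1245890, 1248103)` yields a +0.18% change and no flag.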

B. Column-Level Statistics

For each column, compare:
```sql
WITH prod_stats AS (
  SELECT
    '{column}' as column_name,
    COUNT(*) as total_rows,
    COUNT({column}) as non_null_count,
    COUNT(DISTINCT {column}) as distinct_count,
    -- For numerics
    MIN({column}) as min_val,
    MAX({column}) as max_val,
    AVG({column}) as avg_val,
    STDDEV({column}) as stddev_val
  FROM production.{schema}.{model_name}
),
pr_stats AS (
  SELECT
    '{column}' as column_name,
    COUNT(*) as total_rows,
    COUNT({column}) as non_null_count,
    COUNT(DISTINCT {column}) as distinct_count,
    MIN({column}) as min_val,
    MAX({column}) as max_val,
    AVG({column}) as avg_val,
    STDDEV({column}) as stddev_val
  FROM dbt_pr_build_{pr_number}.{schema}.{model_name}
)
SELECT
  p.*,
  pr.*
FROM prod_stats p
CROSS JOIN pr_stats pr
```

Flag significant changes:
- Null rate change > 5%
- Distinct count change > 20%
- Avg value change > 15% (for numerics)
- Min/max outside expected range
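
One way to sketch these checks in code (the dict keys mirror the stat columns in the query above; the function name is illustrative):

```python
def flag_column_stats(prod: dict, pr: dict) -> list:
    """Compare per-column stats from the two builds against the thresholds above."""
    flags = []
    prod_null_rate = 1 - prod["non_null_count"] / prod["total_rows"]
    pr_null_rate = 1 - pr["non_null_count"] / pr["total_rows"]
    if abs(pr_null_rate - prod_null_rate) > 0.05:
        flags.append("null_rate")
    if prod["distinct_count"] and abs(pr["distinct_count"] / prod["distinct_count"] - 1) > 0.20:
        flags.append("distinct_count")
    # avg_val is only present for numeric columns
    if prod.get("avg_val") and abs(pr["avg_val"] / prod["avg_val"] - 1) > 0.15:
        flags.append("avg_val")
    return flags
```

The min/max range check needs expected bounds per column, so it is left out of this sketch.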

C. Sample Row Comparison

Find rows that exist in one version but not the other:
```sql
-- Rows in PROD but not in PR (deleted)
SELECT 'deleted' as diff_type, p.*
FROM production.{schema}.{model_name} p
LEFT JOIN dbt_pr_build_{pr_number}.{schema}.{model_name} pr
  ON p.{primary_key} = pr.{primary_key}
WHERE pr.{primary_key} IS NULL
LIMIT 100

-- Rows in PR but not in PROD (added)
SELECT 'added' as diff_type, pr.*
FROM dbt_pr_build_{pr_number}.{schema}.{model_name} pr
LEFT JOIN production.{schema}.{model_name} p
  ON pr.{primary_key} = p.{primary_key}
WHERE p.{primary_key} IS NULL
LIMIT 100

-- Rows with changed values
SELECT
  'modified' as diff_type,
  p.{primary_key},
  -- Show key columns that changed
  p.{column} as prod_value,
  pr.{column} as pr_value
FROM production.{schema}.{model_name} p
JOIN dbt_pr_build_{pr_number}.{schema}.{model_name} pr
  ON p.{primary_key} = pr.{primary_key}
WHERE p.{column} != pr.{column}
   OR (p.{column} IS NULL AND pr.{column} IS NOT NULL)
   OR (p.{column} IS NOT NULL AND pr.{column} IS NULL)
LIMIT 100
```

D. Aggregate Checks

For fact tables, compare key metrics:
```sql
-- Compare revenue totals, for example
SELECT
  'production' as version,
  SUM(revenue) as total_revenue,
  AVG(revenue) as avg_revenue,
  COUNT(DISTINCT customer_id) as unique_customers
FROM production.{schema}.{model_name}

UNION ALL

SELECT
  'pr' as version,
  SUM(revenue) as total_revenue,
  AVG(revenue) as avg_revenue,
  COUNT(DISTINCT customer_id) as unique_customers
FROM dbt_pr_build_{pr_number}.{schema}.{model_name}
```
Flag if business metrics changed unexpectedly.

## 4. Analyze differences

For each detected difference, determine severity:

**Expected Changes (Low Severity):**
- New column added (not in prod)
- Formatting changes (trim, case)
- Rounding differences (0.0001 tolerance)
- Timestamp precision changes
- Order changes (if model not ordered)

**Potentially Breaking (Medium Severity):**
- Row count change >10%
- Null rate change >5%
- Column removed
- Significant change in distinct counts

**Likely Bugs (High Severity):**
- Row count change >50%
- Key metric change (revenue or counts off significantly)
- Many rows deleted
- All values changed for an important column
- Primary key no longer unique
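
The buckets above reduce to a simple precedence check. This is a sketch; a real classifier would weigh more signals than these four:

```python
def classify_severity(pct_row_change: float, null_rate_change: float = 0.0,
                      column_removed: bool = False, pk_unique: bool = True) -> str:
    """Bucket a model's diff into low/medium/high per the rules above."""
    if abs(pct_row_change) > 0.50 or not pk_unique:
        return "high"
    if abs(pct_row_change) > 0.10 or null_rate_change > 0.05 or column_removed:
        return "medium"
    return "low"
```

On these rules alone the report example's -16.2% row-count change is medium; the example report escalates it to HIGH, presumably weighing the shifted business metrics as well.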

## 5. Investigate root causes

For unexpected differences, try to determine why:

a) Compare SQL changes:
   - Check git diff for changed joins
   - Look for modified WHERE clauses
   - Find changed aggregations or GROUP BY

b) Check for data dependency changes:
   - Did upstream models change?
   - Is source data different in time windows?

c) Verify it's not an infrastructure issue:
   - Did the PR build use correct refs?
   - Are there data quality issues in staging?

## 6. Generate comparison report

Post detailed comment on PR:
## ๐Ÿ“Š Data Diff Report

Compared **{model_count}** changed models between production and this PR.

---

### โœ… Models with Expected Changes

**`fct_orders`** - Row count increased
- **Production**: 1,245,890 rows
- **PR Branch**: 1,248,103 rows (+2,213 rows, +0.18%)
- **Analysis**: Increase expected from new orders in time window ✓

---

### โš ๏ธ  Models with Unexpected Differences

**`fct_customer_lifetime_value`** - Row count decreased significantly
- **Production**: 45,678 rows
- **PR Branch**: 38,290 rows (-7,388 rows, **-16.2%**)
- **Severity**: HIGH

**Detailed Analysis:**

| Metric | Production | PR | Change |
|--------|------------|----|----|
| Total Customers | 45,678 | 38,290 | -16.2% |
| Avg LTV | $3,245 | $3,892 | +19.9% |
| Total LTV | $148.2M | $149.0M | +0.5% |
| Null LTV Count | 0 | 0 | - |

**Root Cause Investigation:**

Looking at the SQL diff, you added a filter:
```sql
WHERE total_orders >= 2  -- NEW FILTER
```

This excludes customers with only one order, removing 7,388 customers (16.2%). Impact:
- Average LTV increased because low-value customers were removed
- Total LTV barely changed (one-time customers contributed little)
- This appears intentional based on the PR description

**Sample Removed Customers (first 5):**

| customer_id | total_orders | revenue |
|-------------|--------------|---------|
| cust_100234 | 1 | $45.00 |
| cust_100891 | 1 | $23.50 |
| cust_101245 | 1 | $67.25 |

**Recommendation**: ✓ Verify this filter aligns with the business definition of "LTV customers"
**`dim_products`** - Column value changes detected
- **Severity**: MEDIUM
- **Changes**: 234 products have different category values

**Sample Changes:**

| product_id | prod_category | pr_category |
|------------|---------------|-------------|
| prod_1001 | Electronics | Consumer Electronics |
| prod_1002 | Electronics | Consumer Electronics |
| prod_2031 | Home & Garden | Home Goods |

**Root Cause**: Updated category mapping logic in staging layer

**Downstream Impact:**
- 5 models use dim_products.category for grouping
- Dashboards will show new category names
- Historical comparisons may be affected

**Recommendation**: ⚠️ Consider creating a category mapping to maintain historical continuity

๐Ÿ” Technical Details


๐Ÿ“ˆ Summary

- **Models Compared**: 6
- **No Differences**: 3
- **Expected Differences**: 1
- **Unexpected Differences**: 2 (1 high, 1 medium)

**Action Items:**
1. ⚠️ Verify the customer filter in `fct_customer_lifetime_value` is intentional
2. ⚠️ Review category name changes in `dim_products` and downstream impact
3. ✅ Other changes look good!

💡 **Tip**: Add the "skip-diff" label to skip data comparison for this PR.

## 7. Handle edge cases

- **Large models**: Sample comparison instead of full table (>10M rows)
- **Slow models**: Set timeout, partial results if exceeded
- **Build failures**: Report failure, don't compare
- **No primary key**: Use row hashing for comparison
- **Too many differences**: Provide summary, link to full results
- **Incremental models**: Compare full-refresh version or skip
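
The no-primary-key fallback can be sketched with full-row hashing, shown here with Python's standard library (a hypothetical approach; function names are illustrative):

```python
import hashlib

def row_hash(row: tuple) -> str:
    """Hash an entire row to act as a surrogate key when no primary key exists."""
    # NULLs map to empty string; the delimiter keeps ("ab","c") distinct from ("a","bc")
    canonical = "|".join("" if v is None else str(v) for v in row)
    return hashlib.sha256(canonical.encode()).hexdigest()

def diff_by_hash(prod_rows, pr_rows):
    """Return (deleted, added) hash sets between the two result sets."""
    prod_hashes = {row_hash(r) for r in prod_rows}
    pr_hashes = {row_hash(r) for r in pr_rows}
    return prod_hashes - pr_hashes, pr_hashes - prod_hashes
```

In practice the hashing would happen in the warehouse (e.g. hashing concatenated columns in SQL) rather than in Python, but the set-difference idea is the same.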

## 8. Clean up

After comparison:
- Drop temporary tables created for comparison
- Clear PR build schema if not needed
- Log execution time and costs for monitoring