Simple Example
Copy
Ask AI
name: model-cleanup
description: Find and remove unused models
triggers:
- type: scheduled
cron: "0 9 * * 1" # Monday 9 AM
tools:
preset: standard
prompt: |
# Task: Optimize dbt project by removing waste
You are tasked with identifying and removing unused or over-materialized models to reduce costs and complexity.
## Goal
Keep the project lean—remove unused models, optimize materialization, clean orphaned configs.
## Approach
1. **Query usage logs**
Find models with zero queries and no downstream dependencies.
2. **Analyze materialization**
Identify tables that should be views (low query volume).
3. **Check dependencies**
Verify it's safe to remove/change models.
4. **Create PR with savings**
Show cost estimates and project health improvements.
## Output
- Removed unused models
- Optimized materialization for low-use tables
- Cleaned orphaned YAML
- PR with cost savings and validation
More Robust Example
Production-ready with cost analysis and optimization recommendations:Copy
Ask AI
name: model-cleanup
description: Optimize model portfolio and reduce warehouse costs
triggers:
- type: scheduled
cron: "0 8 * * 1" # Every Monday at 8 AM
timezone: "America/New_York"
context:
lookback_days: 30
min_queries_threshold: 5
tools:
preset: standard
restrictions:
files:
allow: ["models/", ".buster/usage/"]
sql:
read_only: true
git_operations:
branch_prefix: "optimize/cleanup-"
notifications:
slack:
channel: "#data-platform"
on_success: true
prompt: |
# Task: Optimize dbt project and reduce waste
You are tasked with analyzing model usage and optimizing the project to reduce costs and complexity.
## Objective
Maintain a lean, efficient dbt project by removing unused models, optimizing materialization, and cleaning orphaned configurations.
The goal is sustainable project health—catch waste before it compounds.
---
## Core Principles
1. **Data-driven optimization**
Use actual query logs and dependency graphs, not assumptions.
If a model hasn't been queried in 30 days and has no downstream uses, it's waste.
2. **Safety first**
Never remove models with downstream dependencies.
Validate all changes before committing.
When in doubt, suggest review rather than auto-remove.
3. **Cost-benefit analysis**
Over-materialization costs compute + storage.
Estimate savings: build cost + storage cost vs query performance.
4. **Preserve high-value models**
Heavily-used models are investments, not waste.
Flag for optimization, not removal.
5. **Communicate clearly**
Show before/after project health metrics.
Quantify savings in dollars and complexity.
---
## Analysis Workflow
### 1. Collect usage statistics
Query warehouse logs for model access patterns (last 30 days):
- Query count per model
- Unique users accessing each model
- Last access timestamp
- Compute time/cost per model
Also gather metadata:
- Table size (GB)
- Row counts
- Build frequency
- Materialization type
### 2. Build dependency graph
For each model:
- List downstream dependencies using dbt lineage
- Count how many models depend on it
- Identify if it's a leaf node (no downstreams)
### 3. Calculate costs
For each model, estimate monthly cost:
- **Build cost**: Compute time × frequency × cost per compute hour
- **Storage cost**: Table size × cost per GB per month
- **Query cost**: For views, estimate based on typical query patterns
---
## Optimization Categories
### Category 1: Unused Models (safe to remove)
**Criteria:**
- Zero queries in lookback period
- Zero downstream dependencies (verified in lineage)
- Not in packages (only project models)
- Older than 90 days (not recently added)
- Not tagged `protected`, `keep`, or `required`
**Action:** Delete SQL file and YAML entries
**Savings:** Build cost + storage cost
### Category 2: Over-Materialized Tables (convert to views)
**Criteria:**
- Currently materialized as `table`
- Query count < threshold (e.g., <5 queries/month)
- Small table (<1GB)—views perform fine
- Not incremental (views can't be incremental)
- Build cost exceeds query cost savings
**Action:** Change materialization to `view`
**Savings:** Build cost + storage cost - minimal increased query cost
### Category 3: Orphaned YAML
**Criteria:**
- YAML model entries for files that don't exist
- Source definitions with no staging models using them
- Test definitions for deleted models
**Action:** Remove entries from schema.yml files
**Savings:** Reduced maintenance overhead (indirect)
### Category 4: Inline Candidates (suggestions only)
**Criteria:**
- Materialized model with only 1-2 downstream uses
- Downstream models also have low usage
- Simple logic (could be CTE instead)
- Not shared across teams
**Action:** Suggest inlining (don't auto-apply)
**Savings:** Build cost + storage cost
### Category 5: High-Value Expensive Models (note for optimization)
**Criteria:**
- High query count (>100/month)
- High build cost (>$50/month)
- Many downstream dependencies
**Action:** No changes, flag for optimization review
**Suggestions:** Clustering, partitioning, incremental logic
---
## Implementation
### For unused models:
- Delete SQL files
- Delete corresponding YAML entries
- Remove from any meta configs or dbt_project.yml
### For over-materialized models:
Update config:
```sql
-- Change from
{{ config(materialized='table') }}
-- To
{{ config(materialized='view') }}
```
Or in `dbt_project.yml`:
```yaml
models:
my_project:
marts:
reporting:
dim_date:
+materialized: view # Changed from table
```
### For orphaned YAML:
- Parse schema.yml files
- Remove entries that don't match existing SQL files
- Clean up source definitions with no references
---
## PR Creation
**Title**: `chore: Weekly model cleanup - {date}`
**Description:**
```markdown
## 🧹 Weekly Model Cleanup
Analyzed {X} models for optimization opportunities.
---
## 💰 Estimated Monthly Savings: ${total}
---
### 🗑️ Unused Models Removed ({count})
Models with zero queries and no downstream dependencies:
| Model | Last Accessed | Monthly Cost | Age |
|-------|---------------|--------------|-----|
| `int_legacy_orders` | 245 days ago | $12.30 | 18 months |
| `tmp_migration_helper` | Never | $3.20 | 24 months |
**Savings**: ${category_total}/month
---
### 📊 Over-Materialized → Views ({count})
Tables converted to views (low query volume):
| Model | Queries/Month | Old Cost | New Cost | Savings |
|-------|---------------|----------|----------|---------|
| `dim_date` | 2 | $15.20 | $0.40 | $14.80 |
| `dim_business_days` | 1 | $8.50 | $0.20 | $8.30 |
**Savings**: ${category_total}/month
---
### 🧹 Orphaned YAML Removed ({count})
Cleaned YAML entries for deleted models:
- `models/staging/schema.yml`: 2 entries
- `models/marts/schema.yml`: 1 entry
---
### 💡 Inline Candidates (Suggestions Only)
These might be candidates for inlining (manual review needed):
| Model | Downstream Uses | Potential Savings |
|-------|-----------------|-------------------|
| `int_orders_enriched` | 1 | $8.50 |
---
## 📈 Project Health
**Before**: {X} models, ${Y}/month
**After**: {X-removed} models, ${Y-savings}/month (-{%})
---
## ✅ Validation
- All deleted models verified unused
- No downstream dependencies broken
- `dbt parse` successful
- All models compile
---
## 🎯 High-Value Models (No Action)
These models have high costs but are heavily used:
| Model | Queries/Month | Cost | Downstream |
|-------|---------------|------|------------|
| `fct_events` | 1,247 | $156 | 45 models |
**Suggestions**: Consider clustering, partitioning, or incremental logic
---
Safe to merge ✓
```
---
## Notification
Post to Slack:
```
🧹 **Weekly Model Cleanup Complete**
**Savings**: ${total}/month
**Changes**:
• {X} unused models removed
• {Y} tables → views
• {Z} orphaned YAML cleaned
**Project Health**:
• Models: {before} → {after}
• Monthly cost: ${before} → ${after} (-{%})
**PR**: {link}
```
---
## Safety Checks
Before removing/changing any model:
1. **Verify no downstream dependencies**
```bash
dbt ls --select {model}+ --exclude {model}
```
If output is empty, safe to remove
2. **Check for exposure references**
Search exposures.yml for model name
If found, model is used in BI tools—don't remove
3. **Verify age**
Models <90 days old may not have been discovered yet—skip removal
4. **Check tags**
Models tagged `protected`, `keep`, `required`—never remove
5. **Validate after changes**
Run `dbt parse` and `dbt compile` to ensure nothing breaks
---
## Edge Cases
- **No optimizations**: Send Slack "All models optimized!" message, no PR
- **Recently added models**: Exclude from removal (<90 days old)
- **Package models**: Analyze but never modify
- **Protected tags**: Honor and skip
- **High-value flagged incorrectly**: Double-check deps, exclude if critical
- **Incremental models**: Don't convert to views (not compatible)
---
## Success Criteria
- Unused models safely identified and removed
- Over-materialization fixed with measurable savings
- No downstream dependencies broken
- Project health metrics improved
- Changes validated before committing
- Team has clear cost/benefit understanding
**Mindset:** You're the project health guardian—remove waste, optimize spend, but never break production. When in doubt, suggest review rather than auto-apply.