Buster maintains a context layer for your data stack: a repository that serves as the source of truth for everything agents need to know to do their work.
Why it exists
Data engineering tasks require context that isn’t easy to gather from code alone:
The data is often the source of truth, not the code
Tooling is siloed (your warehouse doesn’t know about your dashboards, and your Airflow jobs don’t know about your dbt models)
Critical knowledge lives in people’s heads and never gets documented
What it captures
The context layer aggregates information across your entire stack:
Structure: Tables, columns, relationships, models, pipelines, dashboards
Lineage: How data flows across systems, including upstream and downstream dependencies
Business logic: Rules, constraints, and nuances specific to your data (e.g., “exclude data center signups from city-level metrics”)
Tribal knowledge: Context that usually lives in someone’s head
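To make the lineage item above concrete, upstream and downstream dependencies can be thought of as a directed graph. The following is a minimal sketch, not Buster's actual storage format: the asset names, the `EDGES` list, and both helper functions are hypothetical and exist only for illustration.

```python
from collections import defaultdict, deque

# Hypothetical lineage edges (source -> target); all names are invented.
EDGES = [
    ("raw.signups", "stg_signups"),
    ("stg_signups", "fct_signups_by_city"),
    ("raw.cities", "fct_signups_by_city"),
    ("fct_signups_by_city", "dashboard.city_growth"),
]


def build_graph(edges):
    """Return (downstream, upstream) adjacency maps built from the edge list."""
    downstream, upstream = defaultdict(set), defaultdict(set)
    for source, target in edges:
        downstream[source].add(target)
        upstream[target].add(source)
    return downstream, upstream


def reachable(start, adjacency):
    """Breadth-first search: every node reachable from `start`, excluding `start`."""
    seen, queue = set(), deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


DOWNSTREAM, UPSTREAM = build_graph(EDGES)

# Everything affected if stg_signups changes.
print(sorted(reachable("stg_signups", DOWNSTREAM)))
# Everything fct_signups_by_city depends on, directly or indirectly.
print(sorted(reachable("fct_signups_by_city", UPSTREAM)))
```

Walking the same graph in both directions answers two different questions: which assets a change might break, and which sources a suspicious number could have come from.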
How it’s structured
The context layer lives in a Git repository as structured files:
YAML (.yml) files for structured metadata (datasets, columns, relationships, lineage, DAGs, etc.)
Markdown (.md) files for unstructured context (business logic, nuances, documentation)
This file-based approach means agents can search, read, and update the context using standard tools.
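As a rough illustration of what "standard tools" can mean here, the sketch below searches a directory of .yml and .md files for a term using only the Python standard library. The directory layout, file names, and file contents are assumptions made up for the demo; the real repository layout may differ.

```python
import tempfile
from pathlib import Path


def search_context(root, term):
    """Return (relative path, line number, line) for each case-insensitive match."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in {".yml", ".md"}:
            lines = path.read_text(encoding="utf-8").splitlines()
            for number, line in enumerate(lines, start=1):
                if term.lower() in line.lower():
                    relative = path.relative_to(root).as_posix()
                    hits.append((relative, number, line.strip()))
    return hits


def build_demo_repo(root):
    """Write two invented context files so the search has something to find."""
    root = Path(root)
    (root / "datasets").mkdir()
    (root / "docs").mkdir()
    (root / "datasets" / "signups.yml").write_text(
        "name: signups\ncolumns:\n  - name: city\n  - name: signup_source\n",
        encoding="utf-8",
    )
    (root / "docs" / "signups.md").write_text(
        "Exclude data center signups from city-level metrics.\n",
        encoding="utf-8",
    )


with tempfile.TemporaryDirectory() as tmp:
    build_demo_repo(tmp)
    for hit in search_context(tmp, "city"):
        print(hit)
```

Because the context is plain text under version control, the same idea works with `grep`, an editor, or a pull request, with no dedicated API client required.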
How it stays up to date
The context layer is continuously maintained through multiple channels:
Initial setup: When you connect your data stack, Buster automatically documents it and builds the initial context
Change triggers: PRs, schema changes, failed jobs, and other events trigger agents to update the relevant context
Agent discoveries: While doing work, agents may discover nuances about your stack and document them
Human feedback: When you correct an agent’s work or provide feedback, that input is captured in the context layer so it isn’t lost
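One way to picture the change triggers above is as a routing table from event type to the context files worth revisiting. This is only a sketch under stated assumptions: the `Event` class, the `ROUTES` table, the event names, and the file paths are all hypothetical, and the source does not describe how Buster actually dispatches events.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Event:
    kind: str   # hypothetical event type, e.g. "pull_request" or "failed_job"
    asset: str  # the table, model, or job the event concerns


# Hypothetical routing table: which areas of the context each event might touch.
ROUTES = {
    "pull_request": ["datasets", "lineage"],
    "schema_change": ["datasets"],
    "failed_job": ["docs"],
}

# Structured areas are assumed to be .yml files, unstructured ones .md files.
EXTENSIONS = {"datasets": "yml", "lineage": "yml", "docs": "md"}


def plan_updates(event):
    """Return the invented context file paths an agent would revisit for this event."""
    areas = ROUTES.get(event.kind, [])
    return [f"{area}/{event.asset}.{EXTENSIONS[area]}" for area in areas]


print(plan_updates(Event("schema_change", "signups")))
print(plan_updates(Event("pull_request", "signups")))
```

Unknown event types return an empty plan in this sketch, so adding a new trigger only means adding a row to the table.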
How agents use it
Every Buster agent has access to the context layer, regardless of the task. Whether it’s reviewing a PR, triaging a failed job, or handling another task, the agent can reference the context to understand:
What it’s working on
What’s upstream and downstream
What business rules apply
What’s been learned from past work
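The four questions above could be assembled into a short brief before an agent starts a task. The sketch below assumes a hypothetical `AssetContext` record and `build_brief` helper; neither is part of Buster, and the example values are invented.

```python
from dataclasses import dataclass, field


@dataclass
class AssetContext:
    """Hypothetical in-memory view of what the context layer records for one asset."""
    description: str
    upstream: list = field(default_factory=list)
    downstream: list = field(default_factory=list)
    business_rules: list = field(default_factory=list)
    learnings: list = field(default_factory=list)


def build_brief(name, ctx):
    """Render the four questions as a plain-text brief, one line per answer."""
    def fmt(items):
        return ", ".join(items) if items else "(none recorded)"

    return "\n".join(
        [
            f"What it's working on: {name} ({ctx.description})",
            f"Upstream: {fmt(ctx.upstream)}",
            f"Downstream: {fmt(ctx.downstream)}",
            f"Business rules: {fmt(ctx.business_rules)}",
            f"Past learnings: {fmt(ctx.learnings)}",
        ]
    )


example = AssetContext(
    description="signups aggregated by city",
    upstream=["stg_signups"],
    downstream=["dashboard.city_growth"],
    business_rules=["exclude data center signups from city-level metrics"],
)
print(build_brief("fct_signups_by_city", example))
```

Empty fields are rendered explicitly as "(none recorded)" so that a gap in the context is visible rather than silently skipped.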