How does it work?
Buster handles a request much as a data analyst would: it consults a specialized data catalog to find the relevant context, plans the work, and then executes that plan step by step. Here’s a breakdown of the key stages:
1. Data Catalog Foundation
At its core, Buster relies on a self-maintained Data Catalog. This isn’t just a passive inventory; it’s the knowledge base Buster uses to understand your data landscape.
- Catalog Components: The catalog integrates various data artifacts, primarily:
  - Models: These often originate from data modeling tools like dbt, representing the structured tables and views containing your data. Buster offers native integration with dbt to seamlessly transform your dbt models into semantic models.
  - Semantic Definitions: Accompanying these models are definition files (commonly YAML). These files enrich the raw models with business context and semantic meaning, defining elements like:
    - Dimensions and Measures
    - Metrics
    - Filters
    - Enumerations (valid values for fields)
    - Other common semantic layer components.
- Deployment: This rich context, forming the data catalog, is actively deployed to Buster. Currently, this is done using a Command Line Interface (CLI) tool. Typically, you’ll run this deployment from within the repository where you manage your data models (e.g., your dbt project directory), ensuring Buster always has the latest understanding of your data structure and semantics.
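To make the catalog components above more concrete, here is a minimal Python sketch of how a semantic model with dimensions, measures, and enumerations might be represented in memory. The class names (`Dimension`, `Measure`, `SemanticModel`), their fields, and the `describe` helper are all hypothetical and chosen for illustration; they are not Buster's actual schema or YAML format.

```python
from dataclasses import dataclass, field


@dataclass
class Dimension:
    name: str
    description: str = ""
    # Enumeration: the valid values for this field, if it has a fixed set.
    allowed_values: list[str] = field(default_factory=list)


@dataclass
class Measure:
    name: str
    expression: str  # SQL fragment, e.g. "SUM(amount)"
    description: str = ""


@dataclass
class SemanticModel:
    name: str  # assumed to match the underlying dbt model name
    dimensions: list[Dimension] = field(default_factory=list)
    measures: list[Measure] = field(default_factory=list)

    def describe(self) -> str:
        """Flatten the model into text that a search step could index."""
        parts = [self.name]
        parts += [f"{d.name}: {d.description}" for d in self.dimensions]
        parts += [f"{m.name}: {m.description}" for m in self.measures]
        return "\n".join(parts)


# Hypothetical example model; names and values are made up.
orders = SemanticModel(
    name="orders",
    dimensions=[
        Dimension("status", "Order lifecycle state", ["placed", "shipped", "returned"]),
    ],
    measures=[Measure("total_revenue", "SUM(amount)", "Gross revenue")],
)
```

Keeping the business descriptions next to the raw field names is what would let a later search step match a question on meaning rather than on column names alone.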
2. Answering User Questions
When presented with a user request, Buster follows a structured approach to find and plan the necessary actions:
- Intelligent Search: Buster doesn’t just do keyword matching. It performs a semantic search across its data catalog. This process is designed to mimic how a human analyst would intuitively search for the most relevant tables, columns, metrics, and definitions needed to address the specific question asked.
- Planning Phase: Once Buster identifies the potentially relevant information from the catalog, it enters a planning mode.
  - Analysis & Clarification: Buster analyzes the user’s request against the context it found. At this point, it might determine that more information is needed and ask the user clarifying questions. If the required context isn’t available in its catalog, it explicitly states that it cannot answer the question.
  - Plan Formulation: If Buster determines it can answer the question, it formulates a step-by-step plan. This plan acts as an internal “todo list”, outlining the specific actions required. Examples include “build metric X using dimension Y and measure Z,” or “create a new dashboard and add metrics A, B, and C.”
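The search and planning stages above can be sketched roughly as follows. This is an illustration under stated assumptions, not Buster's implementation: it assumes catalog entries carry precomputed embedding vectors (the vectors here are made up), ranks them by cosine similarity, and then triages the request into one of the three outcomes described above. The names `CatalogEntry`, `search`, `plan`, and the `min_score` threshold are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class CatalogEntry:
    name: str
    embedding: list[float]  # hypothetical precomputed vector for this entry


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(query_vec, catalog, min_score=0.5):
    """Return catalog entries scoring at least min_score, best match first."""
    scored = [(cosine(query_vec, e.embedding), e) for e in catalog]
    relevant = [(s, e) for s, e in scored if s >= min_score]
    return [e for _, e in sorted(relevant, key=lambda pair: pair[0], reverse=True)]


def plan(request: str, hits: list[CatalogEntry], ambiguous: bool) -> dict:
    """Triage a request: cannot answer, ask for clarification, or build a todo list."""
    if not hits:
        return {"outcome": "cannot_answer", "reason": "no relevant context in catalog"}
    if ambiguous:
        names = ", ".join(h.name for h in hits)
        return {"outcome": "clarify", "question": f"Which of these did you mean: {names}?"}
    return {"outcome": "plan", "todos": [f"build metric from {h.name}" for h in hits]}


catalog = [
    CatalogEntry("orders.total_revenue", [0.9, 0.1, 0.0]),
    CatalogEntry("customers.signup_date", [0.0, 0.2, 0.9]),
]
query_vec = [1.0, 0.0, 0.0]  # stand-in for an embedded "how much revenue did we make?"
hits = search(query_vec, catalog)
result = plan("how much revenue did we make?", hits, ambiguous=False)
```

In this toy run only the revenue entry clears the threshold, so the request resolves to a one-item plan. With an empty `hits` list the same function reports that it cannot answer, which mirrors the behavior described above when the catalog lacks the needed context.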
3. Execution Loop
With a clear plan in place, Buster enters an iterative execution loop to fulfill the request:
- Iterative Creation: Buster systematically works through its todo list. This involves generating the necessary data artifacts:
  - Visualizations (Metrics): In Buster, a “Metric” is typically a configured visualization or a specific data calculation that is ready for display.
  - Dashboards & Reports: Buster constructs dashboards or reports to present the generated metrics and insights.
- Safe & Semantic Query Generation: When accessing data, Buster parses the SQL and uses the semantic layer definitions to enforce correct join paths between tables and to inject predefined expressions (such as custom dimensions or measures). It also includes safeguards against potentially malicious queries and enforces access controls, so users see only the data their dataset permissions allow.
- Version Controlled Artifacts: Crucially, Buster creates these metrics and dashboards as files. This output allows them to be easily integrated into standard version control systems (like Git), enabling tracking, collaboration, and reproducibility.
- Review & Progress: During the execution loop, Buster includes an intermediary review step. It checks its todo list, identifies the tasks it has successfully completed, and marks them off.
- Completion: Buster continues this cycle of creation and review until every item on its initial plan (the todo list) has been successfully accomplished.
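The create-and-review cycle described above can be pictured with a short Python sketch. It is only an illustration: the `Todo` class, the `run_plan` function, the `max_rounds` cap, and the `fake_execute` stand-in for artifact generation are all hypothetical, and the source does not say how Buster handles a step that fails.

```python
from dataclasses import dataclass


@dataclass
class Todo:
    description: str
    done: bool = False


def run_plan(todos, execute, max_rounds=5):
    """Work through the todo list, reviewing progress after each round.

    `execute` is a callable that tries to build one artifact and returns
    True on success. `max_rounds` is an assumed safety cap so that a step
    which never succeeds cannot loop forever.
    """
    for _ in range(max_rounds):
        pending = [t for t in todos if not t.done]
        if not pending:
            break  # Completion: every item on the plan is done
        # Creation: attempt each outstanding item once this round.
        results = [(todo, execute(todo)) for todo in pending]
        # Review: mark off the tasks that were completed successfully.
        for todo, succeeded in results:
            if succeeded:
                todo.done = True
    return all(t.done for t in todos)


attempts = {}


def fake_execute(todo):
    # Stand-in for artifact generation: the dashboard step fails once, then succeeds.
    attempts[todo.description] = attempts.get(todo.description, 0) + 1
    first_try = attempts[todo.description] == 1
    return not (todo.description.startswith("create dashboard") and first_try)


todos = [Todo("build metric X"), Todo("create dashboard with X")]
finished = run_plan(todos, fake_execute)
```

Here the metric is built in the first round, the dashboard step is retried in the second, and the loop exits once nothing is pending. Separating the creation pass from the review pass keeps the "mark it off" decision in one place, which is the property the review step above is meant to provide.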