Relationships define how different semantic models connect to each other, effectively specifying join conditions for queries. Defining these connections is essential for enabling analyses that combine data from multiple related models.
By establishing relationships, you empower the semantic layer to understand the logical links between your data entities. This allows metrics, dimensions, and filters to reference fields across different models seamlessly, provided the necessary joins are included in the query.
Here’s the basic structure for defining relationships within a model’s YAML configuration:
Copy
Ask AI
# Inside a model definition (e.g., orders.yml)relationships: - name: related_model_name # Required: Name of the model being linked TO (e.g., customers) source_col: join_key_in_current_model # Required: Key field in the *current* model (e.g., customer_id) ref_col: join_key_in_related_model # Required: Key field in the *related* model (e.g., id) description: "Business context" # Required: Clear explanation of the link type: LEFT # Optional: Join type (LEFT, INNER, RIGHT, FULL OUTER). Default: LEFT cardinality: many-to-one # Optional: Nature of the link (one-to-one, one-to-many, many-to-one, many-to-many)
The relationships key defines a list of connections from the current model to other models.
source_col: Refers to the column in the current model’s underlying table/view that holds the key value. Often a foreign key.
ref_col: Refers to the column in the related model’s underlying table/view that holds the corresponding key value. Often a primary key.
Example: Orders to Customers
Assuming an orders model needs to link to a customers model:
Copy
Ask AI
# In orders.ymlname: ordersdescription: > Represents individual customer order transactions. Contains details about each order, including identifier, customer linkage, timing, status, and financial value. This is an event-based model crucial for sales analysis, revenue tracking, and understanding purchase patterns.dimensions: - name: order_id description: Unique identifier for the order transaction. type: string - name: customer_id description: Foreign key linking to the 'customers' model. type: string - name: order_date description: Date the order was placed. type: date # ... other dimensions like status ...measures: - name: amount # Renamed from total_amount description: Total monetary value of the order (e.g., in USD). type: number # ... other measures like item_count ...relationships: - name: customer # Link TO customers model source_col: customer_id # Key in orders (current) ref_col: id # Key in customers (related) description: Links this order to the specific customer who placed it. cardinality: many-to-one type: LEFT# ... other relationships ...metrics: - name: revenue_from_recent_customers description: > Total revenue from orders placed by customers who signed up in the current calendar year. Requires joining with 'customers' model via 'customer' relationship. # References 'signup_year' from the related 'customers' model via 'customer' relationship # Assumes 'amount' is a measure in this model. expr: "SUM(CASE WHEN customer.signup_year = EXTRACT(YEAR FROM CURRENT_DATE()) THEN orders.amount ELSE 0 END)"# ... other metrics ...
Copy
Ask AI
# In customers.yml (Referenced by the metric above)name: customersdescription: > Represents individual customer entities. Contains contact information, registration details, location, and potentially derived metrics. Used as a dimension table for analyzing orders.dimensions: - name: id description: The unique identifier for the customer (primary key). type: string - name: signup_year description: > The year the customer registered. Used in the cross-model metric example to analyze revenue based on customer tenure. type: integer # ... other dimensions like name, email, country ...# ... measures / metrics ...# ... relationships ...
This field dictates the SQL join behavior when this relationship is used in a query:
LEFT (Default): Includes all records from the current model (where the relationship is defined) and matching records from the related model. Fields from the related model will be NULL if no match is found.
INNER: Includes only records where a match exists in both models based on the join keys.
RIGHT: Includes all records from the related model and matching records from the current model. Fields from the current model will be NULL if no match is found. (Use with caution, often LEFT from the other model is clearer).
FULL OUTER: Includes all records from both models. Fields will be NULL where no match exists in the corresponding model.
This describes the expected numerical relationship between records connected by the join keys:
one-to-one: Each record in the current model maps to at most one record in the related model, and vice-versa (e.g., users to user_profiles).
one-to-many: Each record in the current model can map to multiple records in the related model (e.g., customers to orders).
many-to-one: Multiple records in the current model can map to a single record in the related model (e.g., orders to customers). This is the inverse of one-to-many.
many-to-many: Records in one model can map to multiple records in the other, and vice-versa (e.g., products to orders). This typically requires a junction model.
Direct many-to-many links are best handled via an intermediate junction model. Each primary model then defines a one-to-many relationship to the junction model. The junction model itself defines many-to-one relationships back to each primary model.
Example: Products and Orders via order_items Junction
Copy
Ask AI
# In products.ymlname: productsdescription: > Information about products offered for sale. Contains product identifiers and descriptive attributes like name and category.dimensions: - name: id description: The unique identifier (primary key) for the product. type: string - name: name description: The display name of the product. type: string - name: category description: The category the product belongs to (e.g., 'Electronics', 'Apparel'). type: stringrelationships: - name: order_items # Link TO the junction model source_col: id # Key in products (current model) ref_col: product_id # Key in order_items (related model) description: > Links this product to its corresponding line items across all orders via the 'order_items' junction model. Allows analysis of which orders contain this product. cardinality: one-to-many # One product can be in many order items type: LEFT# ... other relationships ...
Copy
Ask AI
# In orders.ymlname: ordersdescription: > Represents individual customer order transactions. Contains details about each order, including identifier, customer linkage, timing, status, and financial value. This is an event-based model crucial for sales analysis, revenue tracking, and understanding purchase patterns.dimensions: - name: order_id description: > The unique identifier for each order transaction. Serves as the primary key for this model. Used for looking up specific orders or counting unique orders. type: string - name: customer_id # Changed from user_id for consistency description: > Foreign key referencing the 'id' in the 'customers' model. Links the order back to the specific customer who placed it. Essential for customer-centric analysis. type: string - name: order_date # Changed from order_timestamp for consistency description: > The date the order was placed. Used for time-series analysis, filtering by date ranges (e.g., daily, weekly, monthly revenue), and cohort analysis based on order time. type: date searchable: true # Common to filter/search by date - name: status description: > The current fulfillment or payment status of the order (e.g., 'pending', 'shipped', 'delivered', 'cancelled'). Used for operational reporting and filtering orders by state. type: string options: ["pending", "shipped", "delivered", "cancelled"] # Example predefined valuesmeasures: - name: amount # Changed from total_amount/amount_col description: > The total monetary value of the order, typically in a specific currency (e.g., USD). This is the primary value used for revenue calculations (SUM, AVG). type: number # ... other measures like item_count ...relationships: - name: customer # Renamed from 'customers' source_col: customer_id # Key in orders (current) ref_col: id # Key in customers (related) description: > Links this order back to the specific customer who placed it, using the 'customers' model. Enables retrieving customer details (e.g., name, email, country) for order analysis. cardinality: many-to-one # Many orders belong to one customer type: LEFT # Ensures all orders are kept even if customer lookup fails# ... other relationships ...metrics: - name: revenue_from_recent_customers description: > Total revenue from orders placed by customers who signed up in the current calendar year. Requires joining with 'customers' model via 'customer' relationship. # References 'signup_year' from the related 'customers' model via 'customer' relationship # Assumes 'amount' is a measure in this model. expr: "SUM(CASE WHEN customer.signup_year = EXTRACT(YEAR FROM CURRENT_DATE()) THEN orders.amount ELSE 0 END)"# ... other metrics ...
Copy
Ask AI
# In order_items.yml (The Junction Model)name: order_itemsdescription: > Junction model linking individual products to specific orders. Each record represents a line item within an order, connecting one order to one product and specifying the quantity. Essential for resolving the many-to-many relationship between orders and products.dimensions: - name: order_id # Foreign key to orders description: Identifier linking this line item back to its parent order in the 'orders' model. type: string - name: product_id # Foreign key to products description: Identifier linking this line item back to the specific product in the 'products' model. type: stringmeasures: - name: quantity description: The quantity of the specific product included in this line item of the order. type: integerrelationships: - name: orders # Link back TO orders model source_col: order_id # Key in order_items (current model) ref_col: order_id # Key in orders (related model) - Note: Changed ref_col for consistency description: Links this order line item back to its parent order record. cardinality: many-to-one # Many line items belong to one order type: LEFT - name: products # Link back TO products model source_col: product_id # Key in order_items (current model) ref_col: id # Key in products (related model) description: Links this order line item back to the specific product record. cardinality: many-to-one # Many line items can refer to the same product type: LEFT# ... other relationships ...
Once defined, relationships enable referencing fields (dimensions, measures) from related models within metrics or filters using dot notation: relationship_name.field_name.
Example: Cross-Model Metric Reference
Copy
Ask AI
# In orders.ymlname: ordersdescription: Contains order information.# ... dimensions (including order_id, customer_id, amount) ...# ... measures ...relationships: - name: customer # Renamed from 'customers' for consistency source_col: customer_id ref_col: id description: Links this order to the customer who placed it. cardinality: many-to-one type: LEFT# ... other relationships ...metrics: - name: revenue_per_customer_signup_year # 'amount' is a measure in this 'orders' model. # 'customer.signup_year' references 'signup_year' dimension # from the related 'customers' model via the 'customer' relationship defined above. expr: "SUM(orders.amount) / NULLIF(COUNT(DISTINCT customer.signup_year), 0)" description: "Average revenue per unique customer signup year represented in the orders. Requires joining via the 'customer' relationship."# ... other metrics ...
Copy
Ask AI
# In customers.yml (Referenced by the metric above)name: customersdescription: Contains customer information.dimensions: - name: id # Primary key referenced by orders.customer_id type: string - name: signup_year # Dimension used in the cross-model metric type: integer# ... other dimensions ...# ... measures ...# ... relationships ...
Ensure source_col and ref_col correctly identify the join keys in the respective models. Mismatches lead to failed or incorrect joins.
Correct Cardinality & Type
Define cardinality and type accurately based on the data’s structure and intended join logic. Defaults are convenient but should be verified.
Mandatory Descriptions
Always provide clear, mandatory descriptions explaining the business purpose of the relationship. This is crucial for understanding the model.
Bidirectional Definitions
For clarity and completeness, consider defining relationships in both directions between connected models (e.g., orders -> customers and customers -> orders).
Junction Models for M:N
Use explicit junction models to resolve many-to-many relationships clearly. Avoid attempting direct many-to-many definitions.
Consistency
Use consistent naming conventions for relationships and related fields across your models.
Illustrating bidirectional relationships and referencing related data.
Copy
Ask AI
# models/orders.ymlname: ordersdescription: > Represents individual customer order transactions. Contains details about each order, including identifier, customer linkage, timing, status, and financial value. This is an event-based model crucial for sales analysis, revenue tracking, and understanding purchase patterns.dimensions: - name: order_id description: > The unique identifier for each order transaction. Serves as the primary key for this model. Used for looking up specific orders or counting unique orders. type: string - name: customer_id # Changed from user_id for consistency description: > Foreign key referencing the 'id' in the 'customers' model. Links the order back to the specific customer who placed it. Essential for customer-centric analysis. type: string - name: order_date # Changed from order_timestamp for consistency description: > The date the order was placed. Used for time-series analysis, filtering by date ranges (e.g., daily, weekly, monthly revenue), and cohort analysis based on order time. type: date searchable: true # Common to filter/search by date - name: status description: > The current fulfillment or payment status of the order (e.g., 'pending', 'shipped', 'delivered', 'cancelled'). Used for operational reporting and filtering orders by state. type: string options: ["pending", "shipped", "delivered", "cancelled"] # Example predefined valuesmeasures: - name: amount # Changed from total_amount/amount_col description: > The total monetary value of the order, typically in a specific currency (e.g., USD). This is the primary value used for revenue calculations (SUM, AVG). type: number # ... other measures like item_count ...relationships: - name: customer # Renamed from 'customers' source_col: customer_id # Key in orders (current) ref_col: id # Key in customers (related) description: > Links this order back to the specific customer who placed it, using the 'customers' model. Enables retrieving customer details (e.g., name, email, country) for order analysis. cardinality: many-to-one # Many orders belong to one customer type: LEFT # Ensures all orders are kept even if customer lookup fails# ... other relationships ...metrics: - name: revenue_from_recent_customers description: > Total revenue from orders placed by customers who signed up in the current calendar year. Requires joining with 'customers' model via 'customer' relationship. # References 'signup_year' from the related 'customers' model via 'customer' relationship # Assumes 'amount' is a measure in this model. expr: "SUM(CASE WHEN customer.signup_year = EXTRACT(YEAR FROM CURRENT_DATE()) THEN orders.amount ELSE 0 END)"# ... other metrics ...
Copy
Ask AI
# models/customers.ymlname: customersdescription: > Represents individual customer entities. Contains contact information, registration details, location, and derived metrics. This is a central model for understanding customer attributes and behavior, and for linking to their order history.dimensions: - name: id description: The unique identifier for a customer (primary key). type: string - name: name description: The full name of the customer. type: string - name: signup_year description: > The year the customer first registered. Referenced by metrics in the 'orders' model to analyze behavior based on customer tenure. type: integer # ... other dimensions like email, country ...# ... measures / metrics ...relationships: - name: orders # Link back TO orders model source_col: id # Key in customers (current model) ref_col: customer_id # Key in orders (related model) description: > Links this customer to all their associated order records in the 'orders' model. Enables analysis of purchasing history and behavior for each customer. cardinality: one-to-many # One customer can have many orders type: LEFT # Standard join type when starting analysis from the customer# ... other relationships ...