1. Introduction to Metrics

Metrics are calculations based on measures or dimensions within your semantic models. They represent key business performance indicators (KPIs) and other frequently used calculations, encapsulating business logic centrally.

Here’s the basic structure for defining a metric:

metrics:
  - name: metric_name     # Required - A unique identifier for the metric
    expr: "SQL_expression" # Required - SQL-like expression defining the calculation
    description: "Business definition" # Required - Clear explanation of the metric
    args:                   # Optional - Used for parameterized metrics
      - name: parameter_name
        type: parameter_type
        description: "Parameter description"
        # ... other parameter fields (e.g., default)

2. Defining Basic Metrics

At their core, metrics combine a name, a calculation expression (expr), and a description.

Required Fields

  • name: A unique identifier for the metric within the model. Use clear, descriptive names.
  • expr: A SQL-like expression that defines the calculation. This expression forms the basis of the metric’s value.
  • description: A clear explanation of what the metric represents in business terms. This is mandatory and crucial for understanding.

Writing the expr

The expr field is where you define the metric’s calculation using SQL syntax compatible with your data warehouse.

  • Referencing Fields: You can directly reference dimensions and measures defined within the same model (e.g., amount, user_id).
  • Aggregations: Use standard SQL aggregation functions: SUM, AVG, COUNT, COUNT DISTINCT, MIN, MAX, etc.
  • Operators: Arithmetic (+, -, *, /), logical (AND, OR, NOT), comparison (=, >, <, etc.) are supported.
  • Functions: Most standard SQL functions like CASE, COALESCE, NULLIF, date functions, string functions, etc., can be used.

When referencing fields within the same model in the expr, you typically don’t need a table prefix (e.g., SUM(amount) is usually sufficient if amount is a measure in the current model). However, using prefixes (e.g., SUM(orders.amount)) can enhance clarity, especially in complex models. See Best Practices for more on prefixes when using related models.

Example: Basic Metric

metrics:
  - name: total_revenue
    expr: "SUM(amount)"
    description: Total revenue from all orders in the queried context.

  - name: order_count
    expr: "COUNT(DISTINCT order_id)"
    description: The total number of unique orders.

3. Advanced Metric Types

Beyond basic calculations, you can create more dynamic and interconnected metrics.

Parameterized Metrics

Create flexible metrics that accept user inputs at query time. Define parameters using the args field and reference them in the expr using double curly braces {{parameter_name}}.

  • args: A list defining each parameter.
    • name: The parameter name used in the expr (e.g., {{days}}) (required).
    • type: Data type (string, integer, number, date, boolean) (required).
    • description: Explanation for the user (required).
    • default (Optional): A default value if the user doesn’t provide one.

Example: Parameterized Metric

metrics:
  - name: revenue_in_last_n_days
    description: Calculates total revenue over the last N days, specified by the user.
    # Assumes 'order_date' and 'amount' are fields in the current model
    expr: "SUM(CASE WHEN order_date >= DATE_SUB(CURRENT_DATE(), INTERVAL {{days}} DAY) THEN amount ELSE 0 END)"
    args:
      - name: days
        type: integer
        description: Number of days to include in the revenue calculation.
        default: 30 # Example default value

Cross-Model Metrics

Reference fields (dimensions or measures) from related models using dot notation (related_model_name.field_name). This requires an entity relationship to be defined between the models involved.

Example: Cross-Model Metric

# In the 'orders' model, assuming an entity named 'customer' links to a 'customers' model
metrics:
  - name: revenue_per_customer
    # Assumes 'amount' is in 'orders', 'customer.id' references the 'id' in the related 'customers' model
    expr: "SUM(amount) / NULLIF(COUNT(DISTINCT customer.id), 0)" # Use NULLIF for safety
    description: "Average revenue per unique customer. Requires joining with the 'customer' entity."

Using cross-model references requires careful consideration of joins when querying. See “Handling Joins and Related Models” below.

4. Best Practices & Considerations

Follow these guidelines to create robust, understandable, and maintainable metrics.

General Best Practices

Business Focus

Define metrics that represent meaningful business KPIs, translating raw data into actionable insights.

Clear Naming & Descriptions

Use consistent, descriptive names and provide mandatory, clear descriptions explaining the calculation and business purpose.

Leverage Existing Components

Build metrics by referencing existing measures and dimensions within the model rather than duplicating complex logic.

Document Dependencies

If a metric relies on parameters or data from related models (via entities), clearly state these requirements in its description.

Expression Best Practices

  • Type Consistency: Ensure metric expressions resolve to appropriate data types (usually numeric) suitable for aggregation or further calculations.
  • Use Table Prefixes for Related Models: Always use table-qualified names (e.g., customer.id, orders.amount) in your expr when referencing fields from related models (via entities) to prevent ambiguity. Prefixes are optional but recommended for clarity even for fields in the primary model.
  • SQL Conventions:
    • Wrap complex expressions or calculations in parentheses () to ensure correct operator precedence.
    • Handle potential division-by-zero errors using NULLIF(denominator, 0). Example: SUM(profit) / NULLIF(SUM(revenue), 0).
    • Be mindful of nulls. Use COALESCE or other relevant SQL functions to handle them appropriately within calculations, especially when dealing with outer joins.
  • Avoid Circular References: Do not define metrics that refer back to themselves directly or indirectly.

Good Expression Example (Parameterized):

metrics:
  - name: avg_order_value_by_date
    description: Average order value for orders within a specified date range. Requires 'orders.created_at' and 'orders.amount' fields.
    # Uses table prefixes for clarity
    expr: "SUM(CASE WHEN orders.created_at BETWEEN '{{start_date}}' AND '{{end_date}}' THEN orders.amount ELSE 0 END) / NULLIF(COUNT(CASE WHEN orders.created_at BETWEEN '{{start_date}}' AND '{{end_date}}' THEN orders.id ELSE NULL END), 0)"
    args:
      - name: start_date
        type: date
        description: The start date (YYYY-MM-DD). # Required
      - name: end_date
        type: date
        description: The end date (YYYY-MM-DD). # Required

Parameter Best Practices

  • Explicit Typing: Always define the type for each parameter (string, integer, number, date, boolean). This enables validation and correct query generation.
  • Clear Documentation: Provide a meaningful description for each parameter.
  • Placeholders: Reference parameters in the expr using double curly braces: {{parameter_name}}.
  • Default Values: Provide default values for optional parameters to simplify usage.
  • Validation Notes:
    • string: Will be automatically quoted if needed during query generation.
    • number/integer: Must be valid numeric values.
    • date: Should ideally be provided in ISO format (YYYY-MM-DD).
    • boolean: Should be true/false (or potentially 1/0 depending on backend).

Parameter Definition Pattern to Avoid:

# Bad: Parameter type is missing, hindering validation and usage.
args:
  - name: min_amount
    # type: number <-- Missing!
    description: Minimum order amount to consider.
    default: 100
  • Primary Association: Each metric belongs primarily to the model where it’s defined.
  • Explicit Joins Required in Query: If a metric’s expr references fields from a related model (e.g., customer.id), the query using that metric must explicitly include the necessary join to that related model via its defined entity.
  • No Automatic Joins: The semantic layer does not automatically infer or add joins based solely on the metric’s expr. It validates that joins used in a query correspond to defined entities.
  • Robustness with Joins: When dealing with related models (especially via outer joins), consider using CASE statements or checking for nulls on fields from those models to make the metric resilient to missing data.

5. Comprehensive Example

Here’s an example within an orders model context, incorporating various metric types and practices:

models:
  - name: orders
    description: Customer order data.
    dimensions:
      - name: order_id
        type: string
      - name: order_date # Renamed for clarity
        type: date
      - name: status
        type: string
      - name: customer_id # Foreign key to a customers model
        type: string
    measures:
      - name: amount
        description: Order amount in USD
        type: number
      - name: item_count
        description: Number of items in the order
        type: integer
    entities:
      - name: customer # Defines relationship to a 'customers' model
        primary_key: id # Assumes 'id' is the PK in 'customers'
        foreign_key: customer_id # FK in 'orders'
        # ... other relationship details (e.g., join type)
    metrics:
      # Basic Metrics
      - name: total_revenue
        expr: "SUM(orders.amount)" # Prefix added for consistency
        description: Total revenue from all orders in the queried context.

      - name: average_order_value
        expr: "AVG(orders.amount)" # Prefix added
        description: The average value of a single order.

      - name: order_count
        expr: "COUNT(DISTINCT orders.order_id)" # Prefix added
        description: The total number of unique orders.

      - name: average_items_per_order
        expr: "AVG(orders.item_count)" # Prefix added
        description: Average number of items included in each order.

      # Parameterized Metric
      - name: revenue_by_period
        expr: "SUM(CASE WHEN orders.order_date >= DATE_SUB(CURRENT_DATE(), INTERVAL {{days}} DAY) THEN orders.amount ELSE 0 END)"
        description: Revenue for a specified time period (N days back from today).
        args:
          - name: days
            type: integer
            description: Number of days to look back.
            default: 30 # Default to 30 days

      # Cross-Model Metric
      - name: revenue_per_customer
        expr: "SUM(orders.amount) / NULLIF(COUNT(DISTINCT customer.id), 0)" # Prefixes used
        description: "Average revenue per unique customer. Requires joining with the 'customer' entity."

      # More Complex Parameterized Metric
      - name: revenue_growth_vs_prior_periods
        description: "Calculates the percentage growth in revenue compared to a specified number of prior periods. Requires 'orders.order_date' and 'orders.amount'. Note: Exact LAG syntax depends on SQL dialect."
        expr: "(SUM(orders.amount) - LAG(SUM(orders.amount), {{periods}}) OVER (ORDER BY orders.order_date)) / NULLIF(LAG(SUM(orders.amount), {{periods}}) OVER (ORDER BY orders.order_date), 0)"
        args:
          - name: periods
            type: integer
            description: Number of periods to look back for comparison (e.g., 1 for previous period). # Required

6. Next Steps

With your metrics defined, consider:

  1. Defining Filters for reusable query conditions.
  2. Establishing Relationships (Entities) to link different models together, enabling cross-model metrics.
  3. Reviewing Complete Examples of semantic model definitions.

By following these structures and practices, your metrics will be powerful, understandable, and easier to manage within your semantic layer.