Metrics
Canonical metric definitions compiled into a queryable metadata table — precise enough to derive tool-specific semantic layer schemas, and rich enough to inform LLMs and analysts writing SQL.
Metrics are named business calculations — Revenue, GBV, Take Rate,
Conversion Rate — defined once in configuration and compiled into a queryable
nexus_metrics_metadata table. For the rationale behind this approach, see
Metric Definitions Should be Maintained in DBT.
A metric definition serves three purposes:
- Qualitative information for LLMs and analysts — descriptions, aliases, what questions a metric answers
- Lineage — which tables and columns a metric depends on, how it's derived
- Precise SQL — the exact expression needed to compute the metric, ensuring consistency without reinventing the wheel
The definitions are structured and precise enough to deterministically derive tool-specific schemas (see Generating Semantic Layer Schemas), and rich enough to inform an LLM writing SQL directly.
Schema
| Field | Required | Type | Description |
|---|---|---|---|
name |
yes | string | Machine-readable identifier |
label |
no | string | Human-readable display name |
aliases |
no | string[] | Alternative names people use for this metric |
type |
yes | enum | measure, count, or derived |
tables |
conditional | string[] | All tables needed (required for measure/count) |
metric_sql |
yes | string | Column expression (measure/count) or formula (derived) |
filter |
no | string[] | Fully qualified predicates, ANDed together |
format |
no | enum | currency, integer, percentage, decimal |
unit |
no | string | USD, EUR, degrees_fahrenheit, etc. |
polarity |
no | enum | positive (higher is better) or negative (lower is better) |
precision |
no | integer | Decimal places to display |
tags |
no | string[] | Organizational categories |
description |
yes | string | What it means and how it's derived |
example_questions |
no | string[] | Business questions this metric answers |
Naming Conventions
name:snake_case, lowercase. This is the machine-readable identifier used in code, SQL references, and derived metric formulas. Examples:revenue,gbv,take_rate,converting_persons,ad_spend.label: Title Case, human-readable. Used in dashboards, documentation, and LLM responses. Examples:Revenue,Gross Booking Value,Take Rate.aliases: Lowercase, natural language. How people actually refer to the metric in conversation. Examples:total revenue,gross bookings.tags: Lowercase, single words or short phrases. Organizational categories for filtering and grouping. Examples:financial,marketplace,acquisition.
Metrics are grouped by their primary model — the table that provides join context and serves as the query anchor:
vars:
nexus:
metrics:
<model_name>:
- { name, type, tables, metric_sql, filter, description, ... }
Metric Types
measure — Numeric Columns
A numeric column that the consumer aggregates. The definition specifies what to aggregate and what to filter on; the consumer chooses the aggregation function (SUM, AVG, MEDIAN, MIN, MAX).
nexus_event_measurements:
- name: revenue
label: Revenue
aliases: [total revenue, earned revenue, platform revenue]
type: measure
tables: [nexus_event_measurements, nexus_event_dimensions]
metric_sql: nexus_event_measurements.revenue
filter:
- nexus_event_dimensions.is_revenue_earned = true
format: currency
unit: USD
polarity: positive
precision: 2
tags: [financial, marketplace]
description: >
Platform revenue on paid bookings.
example_questions:
- What was our revenue last month?
- How much did we earn?
- What is average revenue per booking?
An LLM asked "what was average revenue per booking?" reads the definition and
writes AVG(m.revenue) ... WHERE d.is_revenue_earned. An analyst sees the
canonical column, filter, and table dependencies without guessing.
count — Countable Columns
A column to count or count-distinct. The aggregation is inherently COUNT — but the resulting count is a number that can be further aggregated (max daily count, median weekly count, etc.).
nexus_events:
- name: converting_persons
label: Converting Persons
type: count
tables: [nexus_events, nexus_entity_participants, nexus_event_dimensions]
metric_sql: DISTINCT nexus_entity_participants.entity_id
filter:
- nexus_event_dimensions.is_conversion = true
- nexus_entity_participants.entity_type = "person"
polarity: positive
tags: [acquisition]
description: >
Unique persons who had a conversion event.
- name: total_persons
label: Total Persons
type: count
tables: [nexus_events, nexus_entity_participants]
metric_sql: DISTINCT nexus_entity_participants.entity_id
filter:
- nexus_entity_participants.entity_type = "person"
tags: [acquisition]
description: >
Total unique persons across all events.
derived — Formulas
A formula referencing other metrics by name, or concrete SQL with aggregations baked in. The formula IS the metric — the consumer can meta-aggregate the result (e.g. median daily conversion rate) but cannot swap internal aggregation functions.
nexus_events:
- name: conversion_rate
label: Conversion Rate
type: derived
metric_sql: converting_persons / NULLIF(total_persons, 0)
format: percentage
polarity: positive
precision: 4
tags: [acquisition]
description: >
Fraction of persons who converted.
example_questions:
- What percentage of visitors convert?
- What is our conversion rate by channel?
nexus_event_measurements:
- name: take_rate
label: Take Rate
type: derived
tables: [nexus_event_measurements, nexus_event_dimensions]
metric_sql: >
(SUM(nexus_event_measurements.service_fee)
+ SUM(nexus_event_measurements.koala_commission))
/ NULLIF(SUM(nexus_event_measurements.gross_merchandise_value), 0)
filter:
- nexus_event_dimensions.is_revenue_earned = true
format: percentage
polarity: positive
precision: 4
tags: [financial, marketplace]
description: >
Platform take rate as a fraction of GBV.
example_questions:
- What percentage of GBV does the platform keep?
- How has our take rate changed over time?
Derived metrics that reference other metrics by name (like conversion_rate)
solve the filter scoping problem: each component carries its own filters, so the
numerator and denominator can have different filter conditions without ambiguity.
Design Principles
Fully Qualified References
Every column reference in metric_sql and filter includes its table name.
Zero ambiguity about what comes from where:
metric_sql: nexus_event_measurements.revenue # not just "revenue"
filter:
- nexus_event_dimensions.is_revenue_earned = true # not just "is_revenue_earned"
Tables Declare Dependencies, Not Join Paths
tables says "you need these tables to compute this metric." The consumer
knows how to join them — from nexus schema knowledge, semantic layer join
definitions, or the nexus metadata tables. The metric definition doesn't repeat
join logic.
Filters Are Predicates, Not WHERE Clauses
Each filter entry is a precise, fully qualified boolean expression. Multiple entries are ANDed. OR conditions go inside a single entry or — better — into a dimension that encodes the business logic at build time.
Complexity Belongs in the Transformation Layer
If a metric filter is getting complex (OR across sources, temporal boundaries,
data-level reasoning), push that complexity into a dimension model in dbt. The
dimension model encodes the logic once, the metric filter becomes
dimension.is_x = true. Both live in the same dbt project — one PR creates the
dimension and uses it in the metric.
Metrics Grouped by Primary Model
The model key determines:
- Query anchor — where an LLM starts the FROM clause
- Organization — related metrics live together
- Schema generation — which semantic layer object the metric belongs to
Metrics Must Live on Event-Layer Models
Metrics must be defined on tables that participate in the event layer —
tables with event_id and occurred_at columns. This includes nexus core
tables (nexus_events, nexus_event_measurements, nexus_event_dimensions,
nexus_entity_participants) and source event tables (koala_app_events,
stripe_events, etc.).
Do not define metrics on normalized or snapshot models (e.g.
koala_app_listings, koala_app_users). The semantic layer's query engine
assumes every metric can be:
- Joined via
event_id— normalized models don't have one - Filtered by
occurred_at— normalized models don't have a time axis - Grouped by day — meaningless without
occurred_at
Metrics on non-event tables will appear in the metadata catalog but cannot produce sparklines, trend charts, or period-over-period comparisons.
If you need a metric over data that currently lives in a normalized table, the
fix is upstream: create an event model in the source's intermediate/ layer
that transforms the normalized records into Nexus-compatible events with
event_id and occurred_at, then define the metric on that event table.
# Good — metric on an event-layer table
nexus_events:
- name: active_listings
type: count
tables: [nexus_events, koala_app_events]
metric_sql: DISTINCT koala_app_events.listing_id
filter:
- nexus_events.event_name = 'listing approved'
- koala_app_events.status = 'active'
# Bad — metric on a normalized snapshot table
koala_app_listings:
- name: active_listings
type: count
tables: [koala_app_listings]
metric_sql: koala_app_listings.listing_id
# No event_id, no occurred_at — breaks time-series queries
Metrics are not limited to the nexus compressed schema, however. Source-specific event tables are valid targets:
stripe_events:
- name: stripe_fees
label: Stripe Processing Fees
type: measure
tables: [stripe_events]
metric_sql: stripe_events.fee
format: currency
unit: USD
polarity: negative
precision: 2
tags: [financial]
description: >
Stripe processing fees across all payment events.
example_questions:
- How much are we paying Stripe?
- What are our payment processing costs?
Why Not a Traditional Semantic Layer?
Traditional semantic layers do several things:
- Define which tables exist and how they join
- Declare dimensions with types
- Declare metrics with aggregation rules
- Translate business questions into deterministic SQL
- Expose an API, pre-aggregations, documentation, etc.
Nexus already handles (1) and (2) through its core tables, facets, and metadata. The join paths are fixed by the nexus schema. The metadata tables catalog every available measurement, dimension, trait, and identifier.
Nexus metrics handle (3) — canonical metric declarations with enough precision to derive (4) and (5) deterministically.
LLMs Change the Equation
The traditional semantic layer's primary value is (4): translating business questions into SQL through a constrained query API. LLMs change this fundamentally — they can write SQL directly given good context.
The shift is from constraining outputs (a query API that only allows valid queries) to informing outputs (a rich schema and metric catalog that helps an LLM write correct SQL).
Nexus metrics are informing: here's the canonical definition, adapt as needed.
An LLM can compute MEDIAN(daily_conversion_rate) — a nested aggregation that
many semantic layers can't express — because the metric definition gave it
everything it needs.
The Right Boundary
| Responsibility | Nexus | Query layer |
|---|---|---|
| Cross-source extraction and labeling | measurements, dimensions | — |
| Join paths and entity graph | core schema | — |
| Discoverability catalog | metadata tables | — |
| Canonical metric definitions | nexus_metrics_metadata |
— |
| Deterministic schema generation | metric definitions | semantic layer tools |
| SQL generation from questions | — | LLM, semantic layer, analyst |
| Time intelligence (YoY, MTD) | — | LLM, semantic layer, analyst |
| Pre-aggregation / caching | — | semantic layer, BI tool |
| Visualization | — | dashboard, artifact |
Configuration
Metrics are defined in dbt_project.yml under vars.nexus.metrics, grouped by
primary model:
vars:
nexus:
metrics:
nexus_event_measurements:
- name: revenue
label: Revenue
aliases: [total revenue, earned revenue]
type: measure
tables: [nexus_event_measurements, nexus_event_dimensions]
metric_sql: nexus_event_measurements.revenue
filter:
- nexus_event_dimensions.is_revenue_earned = true
format: currency
unit: USD
polarity: positive
precision: 2
tags: [financial]
description: >
Platform revenue on paid bookings.
example_questions:
- What was our revenue last month?
- How much did we earn?
- name: gbv
label: Gross Booking Value
aliases: [gross bookings, total bookings value]
type: measure
tables: [nexus_event_measurements, nexus_event_dimensions]
metric_sql: nexus_event_measurements.gross_merchandise_value
filter:
- nexus_event_dimensions.is_revenue_earned = true
format: currency
unit: USD
polarity: positive
precision: 2
tags: [financial, marketplace]
description: >
Total gross merchandise value on paid booking events.
Includes guest fees, owner payout, and platform commission.
example_questions:
- What was total GBV this quarter?
- How much merchandise value did we process?
- name: take_rate
label: Take Rate
type: derived
tables: [nexus_event_measurements, nexus_event_dimensions]
metric_sql: >
(SUM(nexus_event_measurements.service_fee)
+ SUM(nexus_event_measurements.koala_commission))
/ NULLIF(SUM(nexus_event_measurements.gross_merchandise_value), 0)
filter:
- nexus_event_dimensions.is_revenue_earned = true
format: percentage
polarity: positive
precision: 4
tags: [financial, marketplace]
description: >
Platform take rate as a fraction of GBV.
example_questions:
- What percentage of GBV does the platform keep?
- name: ad_spend
label: Ad Spend
type: measure
tables: [nexus_event_measurements]
metric_sql: nexus_event_measurements.spend
format: currency
unit: USD
polarity: negative
precision: 2
tags: [marketing]
description: >
Total advertising spend across Facebook Ads and Google Ads.
nexus_events:
- name: converting_persons
label: Converting Persons
type: count
tables: [nexus_events, nexus_entity_participants, nexus_event_dimensions]
metric_sql: DISTINCT nexus_entity_participants.entity_id
filter:
- nexus_event_dimensions.is_conversion = true
- nexus_entity_participants.entity_type = "person"
polarity: positive
tags: [acquisition]
description: >
Unique persons who had a conversion event.
- name: total_persons
label: Total Persons
type: count
tables: [nexus_events, nexus_entity_participants]
metric_sql: DISTINCT nexus_entity_participants.entity_id
filter:
- nexus_entity_participants.entity_type = "person"
tags: [acquisition]
description: >
Total unique persons across all events.
- name: conversion_rate
label: Conversion Rate
type: derived
metric_sql: converting_persons / NULLIF(total_persons, 0)
format: percentage
polarity: positive
precision: 4
tags: [acquisition]
description: >
Fraction of persons who converted.
example_questions:
- What percentage of visitors convert?
stripe_events:
- name: stripe_fees
label: Stripe Processing Fees
type: measure
tables: [stripe_events]
metric_sql: stripe_events.fee
format: currency
unit: USD
polarity: negative
precision: 2
tags: [financial]
description: >
Stripe processing fees across all payment events.
When dbt's vars.yml feature ships (v1.12+), metrics can be extracted to their
own file with zero changes to the model.
The Model: nexus_metrics_metadata
The nexus_metrics_metadata model reads metric definitions from vars and
compiles them into a queryable table. The model iterates over the model-grouped
structure and flattens it into rows:
{{ config(materialized='table') }}
{% set all_metrics = var('nexus', {}).get('metrics', {}) %}
{% set has_metrics = all_metrics | length > 0 %}
{% if has_metrics %}
{% set ns = namespace(first=true) %}
{% for model_name, metrics_list in all_metrics.items() %}
{% for metric in metrics_list %}
{% if not ns.first %}UNION ALL{% endif %}
{% set ns.first = false %}
SELECT
'{{ model_name }}' AS model,
'{{ metric.name }}' AS metric_name,
'{{ metric.get("label", metric.name) }}' AS label,
{{ "'" ~ metric.aliases | join(", ") ~ "'" if metric.get('aliases') else 'NULL' }} AS aliases,
'{{ metric.type }}' AS metric_type,
{{ "'" ~ (metric.tables | join(", ")) ~ "'" if metric.get('tables') else 'NULL' }} AS tables,
'{{ metric.metric_sql | trim }}' AS metric_sql,
{{ "'" ~ (metric.filter | join(" AND ")) ~ "'" if metric.get('filter') else 'NULL' }} AS filter,
{{ "'" ~ metric.format ~ "'" if metric.get('format') else 'NULL' }} AS format,
{{ "'" ~ metric.unit ~ "'" if metric.get('unit') else 'NULL' }} AS unit,
{{ "'" ~ metric.polarity ~ "'" if metric.get('polarity') else 'NULL' }} AS polarity,
{{ metric.precision if metric.get('precision') is not none else 'NULL' }} AS precision,
{{ "'" ~ (metric.tags | join(", ")) ~ "'" if metric.get('tags') else 'NULL' }} AS tags,
'{{ metric.get("description", "") | trim }}' AS description,
{{ "'" ~ (metric.example_questions | join("; ")) ~ "'" if metric.get('example_questions') else 'NULL' }} AS example_questions
{% endfor %}
{% endfor %}
{% else %}
SELECT
CAST(NULL AS VARCHAR) AS model,
CAST(NULL AS VARCHAR) AS metric_name,
CAST(NULL AS VARCHAR) AS label,
CAST(NULL AS VARCHAR) AS aliases,
CAST(NULL AS VARCHAR) AS metric_type,
CAST(NULL AS VARCHAR) AS tables,
CAST(NULL AS VARCHAR) AS metric_sql,
CAST(NULL AS VARCHAR) AS filter,
CAST(NULL AS VARCHAR) AS format,
CAST(NULL AS VARCHAR) AS unit,
CAST(NULL AS VARCHAR) AS polarity,
CAST(NULL AS INTEGER) AS precision,
CAST(NULL AS VARCHAR) AS tags,
CAST(NULL AS VARCHAR) AS description,
CAST(NULL AS VARCHAR) AS example_questions
WHERE 1 = 0
{% endif %}
Metadata Table Schema
| Column | Type | Description |
|---|---|---|
model |
STRING | Primary model (query anchor and grouping) |
metric_name |
STRING | Machine-readable identifier |
label |
STRING | Human-readable display name |
aliases |
STRING | Comma-separated alternative names |
metric_type |
STRING | measure, count, or derived |
tables |
STRING | Comma-separated table dependencies |
metric_sql |
STRING | Column expression or formula |
filter |
STRING | Predicates ANDed together |
format |
STRING | Display format (currency, percentage, etc.) |
unit |
STRING | Unit of measurement |
polarity |
STRING | positive or negative |
precision |
INTEGER | Decimal places |
tags |
STRING | Comma-separated organizational categories |
description |
STRING | Plain-English explanation |
example_questions |
STRING | Semicolon-separated business questions this metric answers |
Querying Metrics Metadata
Discover Available Metrics
SELECT * FROM nexus_metrics_metadata
LLM Context
An LLM writing SQL queries the metrics catalog alongside measurement and dimension metadata:
SELECT
model,
metric_name,
metric_type,
metric_sql,
filter,
description,
example_questions
FROM nexus_metrics_metadata
ORDER BY model, metric_name
Given the revenue metric definition, an LLM constructs:
SELECT
DATE_TRUNC('month', m.occurred_at) AS month,
SUM(m.revenue) AS revenue
FROM nexus_event_measurements m
JOIN nexus_event_dimensions d ON m.event_id = d.event_id
WHERE d.is_revenue_earned = true
GROUP BY 1
ORDER BY 1
The metric definition provided the column, the filter, and the table dependencies. The LLM assembled them into a query with time granularity, grouping, and ordering appropriate for the question.
Human Documentation
Metric descriptions, labels, and aliases serve double duty. A dashboard showing a "Revenue" metric card populates a tooltip from the description:
Revenue: Platform revenue on paid bookings.
The aliases field means an LLM recognizes "total revenue", "earned revenue",
and "platform revenue" as the same metric. And example_questions helps an LLM
match a user's natural language question to the right metric.
Relationship to Other Facets
Metrics reference event-layer tables — nexus core tables, source event tables,
or any model with event_id and occurred_at.
Source Data
→ Events (what happened)
→ Measurements (quantitative values)
→ Dimensions (categorical filters)
→ Source event tables (source-specific columns)
→ Metrics (canonical business calculations over event-layer tables)
Measurements and dimensions are in the data — rows in tables, computed at build time from source records. Metrics are about the data — definitions that describe how to query the data. This is why measurements and dimensions are core nexus tables, while metrics are a metadata table compiled from configuration.
Normalized and snapshot tables (e.g. Fivetran-synced application tables) are useful as dimension sources and reference data, but should not be metric hosts. If a business calculation needs data from a normalized table, transform that data into events upstream so the metric can be time-series compatible.
When Not to Define a Metric
Not every business metric needs a definition. Many metrics follow a standard pattern that's derivable from the event name alone — an LLM or analyst already knows how to write these queries given the nexus schema.
Pageviews — count of events with a specific event name:
SELECT COUNT(nexus_events.event_id)
FROM nexus_events
WHERE nexus_events.event_name = 'pageview'
Unique pageviewers — distinct persons who triggered the event:
SELECT COUNT(DISTINCT nexus_entity_participants.entity_id)
FROM nexus_events
LEFT JOIN nexus_entity_participants
ON nexus_events.event_id = nexus_entity_participants.event_id
WHERE nexus_events.event_name = 'pageview'
AND nexus_entity_participants.entity_type = 'person'
These are general patterns — COUNT(events) and COUNT(DISTINCT entities) by
event name — that apply to any event type. Defining a metric for each one would
be noise. The event metadata already tells an LLM everything it needs.
Define a metric when:
- The calculation isn't obvious from the schema (e.g., take rate involves specific columns and division)
- It requires specific filters beyond event name (e.g., revenue requires
is_revenue_earned = true) - It's a derived formula combining other metrics (e.g., conversion rate)
- You want to attach qualitative metadata — description, example questions, polarity — that wouldn't be discoverable otherwise
- Consistency matters and you don't want different people writing different SQL for the same concept
Skip a metric when the query is a predictable pattern over event_name that
anyone familiar with the schema could write.
Adding Metrics to a Client Project
- Define metrics in
dbt_project.ymlundervars.nexus.metrics, grouped by primary model - Create the model at
models/nexus/nexus_metrics_metadata.sql(copy the model template above) - Run
dbt build— the model compiles metric definitions into a queryable table
The metrics model has no upstream dependencies on other nexus tables — it reads purely from configuration. It can be built independently and updated by editing YAML.
Generating Semantic Layer Schemas
The metric definitions are structured and precise enough to deterministically
generate schemas for semantic layer tools like
Cube or
MetricFlow. A
measure type maps to a tool-native measure with consumer-chosen aggregation;
a count type maps to a count-distinct measure; a derived type maps to a
calculated metric referencing its components. The model key determines which
semantic layer object (cube, semantic model, etc.) the metric belongs to, and
tables declares the required joins.
If a metric can't be expressed precisely enough to generate a schema deterministically, it's a signal that the definition needs refinement — decompose into components, or push complexity into a dimension.
Related Documentation
- Event Measurements — Quantitative observations (the raw values metrics can aggregate)
- Event Dimensions — Categorical filters (the event selectors metrics can use)
- Metadata Tables — Other auto-generated catalogs
- Architecture Overview — Core tables ERD, key joins, and source-specific event fields