Nexus Data

dbt-nexus's substrate is join-heavy by design. Every artifact query typically stitches nexus_events ⋈ nexus_event_dimensions ⋈ nexus_event_measurements ⋈ nexus_entity_participants ⋈ nexus_entities ⋈ nexus_relationships, often 4–6 joins deep. This trades raw query speed for schema flexibility (any source can emit any dimension without DDL coordination) and write-time decoupling.

The package's job is to recover as much of that lost speed as possible at build time — pushing the canonical core models into shapes the warehouse can prune, hash-join, and reason about cheaply — without giving up the flexibility.

TL;DR

If you're on BigQuery and the feature flag is at its default, you already have:

Monthly partition prune on occurred_at-filtered queries against nexus_events, nexus_event_dimensions, nexus_event_measurements, and nexus_entity_participants. Typically ~10× scan-byte reduction on a 90-day window.
Block prune via clustering on the columns artifact queries actually filter (event_name, source, entity_type, relationship_type, event_id).
Join elimination via BigQuery informational PK/FK constraints — a LEFT JOIN nexus_event_dimensions whose dim columns aren't selected scans only the left side. BigQuery drops the join.

Together these compound: a date-windowed dashboard query that joins the substrate but selects narrowly often goes from "scan everything" to "scan one month of one table".

Scope of this section

The two build-time pages make the existing joins cheaper; the two architecture pages are about not running them at read time at all. Between them they cover query shape and materialization strategy. What remains genuinely out of scope:

Incremental rebuild mechanics. Whether to materialize is covered in Precompute architecture; the mechanics of cheap incremental refresh (partition write-truncate, late-arriving data, identity-resolution recompute) are a separate topic.
Snowflake build-time tuning. The toggle defaults off there because Snowflake clustering is billed background work, not free — opt in deliberately if you've measured a benefit. The read-path architecture is warehouse-agnostic; only the build-time partition/cluster specifics are BigQuery-shaped.
Diagnosing an un-prunable query. A query that doesn't filter by occurred_at can't partition-prune; see Measuring impact to tell that apart from a build-time miss.

Nexus Data

Performance

Pages

TL;DR

Scope of this section