Precompute architecture

How to accelerate the join-heavy substrate by not running the joins at read time — the precompute ladder, core & canonical queries, the ~6 grain tables the join graph collapses to, and edges vs. paths.

The BigQuery optimizations recover speed at build time by reshaping the core models so the warehouse can prune and hash-join them cheaply. They make the joins cheaper. This page is about the larger lever: not running the joins at read time at all — moving the join-heavy read path into materialized warehouse tables — without giving up the universality of raw SQL.

The fastest join is the one already computed.

The precompute ladder

Once an artifact is finalized, its queries are static in structure (parameterized only by date range, filters, and scope). That turns the whole problem into a single axis: at what grain do we precompute results?

Level What gets precomputed Lives in SQL universality Role
L1 — none nothing; live SQL on the core tables full today's default
L2 — semantic pre-agg rollups at a predefined metric × dimension grain warehouse lost — only predefined metrics emergent, never the primary
L3 — core-query materialization the artifact's own spine, materialized at its grain warehouse full — arbitrary SQL the primary lever
L4 — result cache in the app DB frozen result rows app silo n/a ruled out

Two lines decide everything here:

  • L1–L3 all land in warehouse tables. Only L4 leaves the warehouse. Materialized context must live in the client's warehouse, not an app silo — that's the warehouse-native principle (durability, portability, composability; see the internal strategy docs). So L4 — caching results in Convex — is off the table regardless of how tempting it looks.
  • L2 loses universality; L3 keeps it. A query generator (L2) can only emit what its metric vocabulary predefines. The bespoke, business-logic-heavy artifacts that are the actual product — advisor tools, audit views, worklists — can't be expressed in a constrained DSL. So precompute must operate on the artifact's SQL as written.

The consequence: L3 is the mechanism; L2 is a refactor that emerges from it. You never start at L2 (committing to a vocabulary up front is what costs you universality). You start from arbitrary SQL at L3 and generalize downward — factoring a shared rollup only where multiple artifacts demonstrably overlap.

Core & canonical queries

Every artifact's SQL already has two layers, even when it's one string:

  1. A core query — the parameter-independent join spine plus business logic (what counts as a "household", an "active" account, a "primary advisor").
  2. Component aggregations / filters on top — the metric card, the table, the chart.

This isn't theoretical. A real advisor-home dashboard defines a householdsCte (the core) and ~9 component queries that are each literally WITH ${householdsCte} SELECT <aggregation> FROM active_households …. The author already separated core from component — but the core spine executes ~9× per page load, once inside each component's SQL, because the warehouse sees nine distinct query texts and dedupes nothing. That redundancy is the cost.

The parameter boundary

Materializing a core query is not "materialize the result" — that freezes the filters (it's L4 in disguise). It's a split at the parameter boundary:

  • Parameter-independent half → precompute. All the joins and business logic, materialized at the finest grain that spans the artifact's known filterable dimensions (lifted to columns) plus time. The expensive work runs once per refresh.
  • Parameter-dependent half → stays live. The user's filter values, the date window, the scope, and the final aggregation become cheap predicates and group-bys on the materialized table.

Universality is preserved because the precompute is the artifact's SQL, just truncated at the parameter line. Parameter-relative logic — a "trailing 90 days" that depends on a user-supplied N, or a ranking computed over the filtered set — is the one thing that can't move below the boundary; it stays at read time, but on top of a join-free input.

Canonical queries

A canonical query is a core query shared across artifacts. The household/advisor spine isn't unique to one dashboard — entity_b_role = 'household' appears in 9 advisor artifacts, primary_advisor in 8. Every one rebuilds the same relationship/entity spine by hand, on every load.

Promote that spine to a single materialized model and all of them read it join-free. Canonical models emerge from observed reuse (a grep across the artifact corpus), not from a vocabulary designed up front — which is exactly why universality survives.

Grain tables: precomputing the join closure

Because the schema is fixed, you can precompute the joins themselves. The natural question is "how many permutations?" — and the answer is bounded, because the join graph is sparse.

The canonical join graph is ~12 nodes and ~13 edges — nowhere near a dense graph (which would be ~78). It's two hubs bridged by one table:

  • events hub (event_id): → dimensions, → measurements, → <source>_events, → participants, → touchpoints, → attribution
  • entities hub (entity_id): → participants, → identifiers, → states, → entity_states
  • bridge: nexus_entity_participants is the only edge between the hubs

"Every conceivable join" reads like a combinatorial explosion (2¹²), but it collapses, because 1:1 edges fold in for free and 1:N edges are grain boundaries where you stop. Walk the graph absorbing every 1:1 edge and halting at every fan-out, and you get ~6 maximal-lossless-join tables:

Grain table Join One row per
event-wide events ⋈ dimensions ⋈ measurements ⋈ <source>_events (all 1:1) event
entity-wide entities ⋈ entity_states ⋈ identifier rollup entity
activity participants ⋈ event-wide ⋈ entity-wide (the bridge) event × participant
state-history entities ⋈ states (1:N, temporal) entity × state
attribution touchpoints ⋈ attribution_model_results ⋈ channels attribution record
relationship relationships ⋈ entities×2 (the self-bridge) entity pair

Cover those ~6 and you've covered every canonical join — every query becomes a projection + filter + aggregation of one grain table. The combinatorial "which joins do I need" reduces to "pick a grain, select your columns" — O(grains), not O(2^tables).

Build-a-grain vs. between-grain joins

Grain tables don't make joins disappear — they change which joins remain:

  • Joins that build a grain — multi-way, fan-out, full-history reconstruction (entities ⋈ participants ⋈ events ⋈ dimensions). This is the expensive part. → absorbed into the grain tables, run once at refresh.
  • Joins between grains — a single hub-key join (event_id / entity_id), usually after a rollup that removes the fan-out. → left in the core query, and cheap.

"Emails with event counts" is entity_wide ⋈ (activity rolled up to per-entity COUNT) — the rollup collapses the fan-out, then it's a 1:1 join on entity_id, the exact column the grain tables are clustered on and declared PK/FK. "Events × attribution" is event_wide ⋈ attribution ON event_id. Both reduce to the single best-optimized join shape a warehouse has, instead of a 3–4-hop reconstruction through raw substrate.

The fan-out trap

"Precompute every join" is not "join everything into one wide table." Cross a 1:N edge into a wider table and you fan out and double-count — a per-event SUM(revenue) measured across the participant fan-out is simply wrong. This is why the join planner emits paths with warnings rather than a closure, and why nexus_relationships is deliberately excluded from its graph. The precompute unit is "maximal join at a fixed grain," six of them, each grain explicit — never a blind universal join.

Edges vs. paths

nexus_relationships is a graph (entity_a_id, entity_b_id both → entities), and graphs have paths. That's why you can't hand it a single canonical join path — the meaningful paths aren't in the schema, they're business semantics. This splits cleanly:

  • Edge — universal (schema). An entity-pair grain (relationships ⋈ entities ⋈ entities), one flat row per edge with both endpoints' attributes, clustered on (relationship_type, entity_a_role, entity_b_role). Makes any single hop a cheap role-filtered key-join, for every client. Lives in the substrate.
  • Path — client canonical (semantics). advisor —primary_advisor→ household —household→ account is one firm's vocabulary, not a schema edge. That traversal materializes as a canonical core query, self-joining the edge grain along the specific role chain, once at refresh. (Fixed-depth chains are simple joins; an unbounded-depth traversal — an org tree — becomes a recursive / transitive-closure materialization, still a canonical query.)

The rule of thumb: edges are universal grain tables; paths are materialized canonical queries. Relationships are exactly where the universal layer hands off to per-domain materialization.

Putting it together: the promotion ladder

These levels aren't a one-time choice — they're a promotion path, driven by traffic and a cost crossover (materialize when reads-per-refresh × join-cost-saved > refresh-cost):

  1. Every artifact ships at L1 (live SQL) — zero precompute cost, always fresh, correct for exploration and cold artifacts.
  2. A finalized, hot artifact is promoted to L3 — its core query is materialized at the parameter boundary; components become join-free reads.
  3. When several artifacts' cores share a grain, the shared spine is factored into a canonical model or a grain table (L2/substrate) — repoint all of them. Generalize downward from observed overlap.

The crossover is also the storage/build governor: cold artifacts that wouldn't pay back the refresh cost stay on L1. Nothing is precomputed speculatively.

Reference

  • BigQuery optimizations — the build-time reshaping (partition / cluster / PK-FK) that the grain and core tables inherit for free.
  • Caching & row-scoping — how the cross-user result cache and permission scoping interact with materialization.
  • Measuring impact — how to A/B the cost of a materialization vs. the live query.