Nexus Data

Learning Objectives

By the end of this lesson, you will be able to:

Describe the four built-in dbt materializations
Pick the right materialization for each layer of the Doe family project
Configure materialization at the project, directory, and model level
Recognize when to reach for incremental (and when not to)

What materialization means

A model is a select statement. The materialization decides what dbt does with that select — what physical object (or no object) it creates in the warehouse.

Materialization	What dbt creates
`view`	A view: `create or replace view ... as <select>`
`table`	A table, rebuilt from scratch every run
`incremental`	A table, but only new/changed rows are inserted on each run
`ephemeral`	No object at all — inlined as a CTE into downstream models

The first two are the workhorses. The third is for scale. The fourth is a niche tool.

`view` — light, always fresh, never stored

A view is just a saved SQL definition. Querying it re-executes the underlying select against whatever the source data currently is.

Use when:

The model is cheap to compute on the fly
You want the freshest possible data every time someone queries
You're in a staging layer where each select is light cleanup

Don't use when:

The query is expensive (every read pays the full cost)
Downstream models read from it repeatedly in the same run

{{ config(materialized='view') }}

select
    member,
    email,
    role
from {{ ref('family_members') }}
where email is not null

`table` — heavy, fast to read, rebuilt every run

A table is the materialized result of your select, stored to disk. dbt drops and recreates it every dbt run.

Use when:

Downstream models or queries read from it often
The select is expensive enough that you want to pay the cost once per run instead of per query
You need real warehouse statistics for the optimizer

Don't use when:

The model is small and trivial (view is enough)
You're rebuilding billions of rows every run (look at incremental)

{{ config(materialized='table') }}

select
    entity_id,
    display_name,
    last_interacted_at
from {{ ref('nexus_entities') }}
where entity_type = 'person'

For the Doe family, every marts/ model is a table. They're queried constantly (the family contacts list, the Christmas card list) and the underlying data isn't huge.

`incremental` — append-only at scale

incremental is a table that dbt builds in two modes:

Full refresh (first run, or --full-refresh): build from scratch
Incremental (subsequent runs): only insert rows that are new since the last build

For the Doe family this rarely matters — their data is small. But once you have 10M+ rows in a source, full-refreshing on every run becomes painful. incremental is how production nexus deployments keep nexus_events build times manageable.

A minimal incremental model:

{{ config(materialized='incremental', unique_key='event_id') }}

select
    event_id,
    occurred_at,
    event_type,
    source
from {{ ref('all_events') }}

{% if is_incremental() %}
    where occurred_at > (select max(occurred_at) from {{ this }})
{% endif %}

The {% if is_incremental() %} block runs only on incremental builds. On a full refresh it's skipped and dbt builds the full table.

We don't dive deeper here — the dbt incremental docs go into all the modes (append, merge, delete+insert) and the trade-offs. For now, just know that incremental exists and what it's for.

`ephemeral` — no object, inlined as a CTE

An ephemeral model never lands in the warehouse. Instead, dbt inlines its select as a CTE wherever it's ref()'d from.

Use when:

You want to factor out a snippet for readability without paying for another object
A model is referenced once and only once

Don't use when:

Multiple downstream models read it (you'll re-execute the CTE multiple times)
The select is heavy (each downstream pays the cost)

In practice you'll reach for ephemeral rarely. Mention it once, recognize it when you see it, move on.

Setting materialization

Materialization can be set at three levels — most specific wins.

1. Project level (`dbt_project.yml`)

Sets the default for entire directory subtrees:

models:
  doe_family:
    sources:
      +materialized: view
    staging:
      +materialized: view
    marts:
      +materialized: table

The + prefix is dbt's "this is a config" syntax.

2. Directory `_properties.yml` / inline YAML

Less common — you can override at a subdirectory level by adding a nested key.

3. Model file (`{{ config(...) }}`)

Highest precedence — set right inside the model:

{{ config(materialized='incremental', unique_key='event_id') }}

select ...

Use this for one-off overrides; let the project-level config carry the defaults.

The Doe family's materialization strategy

Layer	Default	Why
`sources/`	`view`	Trivial cleanup; freshness over speed
`staging/`	`view`	Mostly thin reshape; cheap to recompute
`marts/`	`table`	Read constantly by humans and downstream queries
(special)	`incremental`	Promoted on a case-by-case basis once a table gets large

This is a sensible default for any nexus project at personal-to-small-team scale. Most production deployments follow the same pattern.

Hands-On Exercise

Create models/marts/family_member_count.sql:

{{ config(materialized='view') }}

select count(*) as n_family_members
from {{ ref('family_members') }}

Run dbt build -s family_member_count. Inspect the result in BigQuery — note that it's a view, not a table.
Change the config to table:
```
{{ config(materialized='table') }}
```
Run dbt build -s family_member_count again. BigQuery now shows it as a table. dbt dropped and recreated it.
Remove the inline {{ config(...) }} entirely. In dbt_project.yml, confirm that models.doe_family.marts.+materialized is set to table (or set it). Rebuild and confirm it's still a table — the project-level default takes over.
(Optional, advanced) Convert a model to incremental with a unique_key and an is_incremental() filter. Run it twice and inspect the row count after each run.

Summary

Concept	Key takeaway
`view`	Saved SQL — freshest possible, pay on every read
`table`	Materialized result — fast reads, rebuilt each run
`incremental`	Table built additively — for large or expensive models
`ephemeral`	Inlined as a CTE — niche, use rarely
Where to set	Project-level for defaults, inline `{{ config() }}` for overrides
Doe family default	`view` for sources/staging, `table` for marts

Next Lesson

You can shape what each model does (ref(), source()) and how it's persisted (materialization). The third leg of the stool is Jinja — how dbt generates the SQL it actually sends to BigQuery. Head to 2.5 Jinja and programmatic SQL.

Learning Objectives

What materialization means

view — light, always fresh, never stored

table — heavy, fast to read, rebuilt every run

incremental — append-only at scale

ephemeral — no object, inlined as a CTE

Setting materialization

1. Project level (dbt_project.yml)

2. Directory _properties.yml / inline YAML

3. Model file ({{ config(...) }})

The Doe family's materialization strategy

Hands-On Exercise

Summary

Next Lesson

`view` — light, always fresh, never stored

`table` — heavy, fast to read, rebuilt every run

`incremental` — append-only at scale

`ephemeral` — no object, inlined as a CTE

1. Project level (`dbt_project.yml`)

2. Directory `_properties.yml` / inline YAML

3. Model file (`{{ config(...) }}`)