Materializations
How dbt persists each model — view, table, incremental, ephemeral — and how to choose between them.
Learning Objectives
By the end of this lesson, you will be able to:
- Describe the four built-in dbt materializations
- Pick the right materialization for each layer of the Doe family project
- Configure materialization at the project, directory, and model level
- Recognize when to reach for
incremental(and when not to)
What materialization means
A model is a select statement. The materialization decides what
dbt does with that select — what physical object (or no object) it
creates in the warehouse.
| Materialization | What dbt creates |
|---|---|
view |
A view: create or replace view ... as <select> |
table |
A table, rebuilt from scratch every run |
incremental |
A table, but only new/changed rows are inserted on each run |
ephemeral |
No object at all — inlined as a CTE into downstream models |
The first two are the workhorses. The third is for scale. The fourth is a niche tool.
view — light, always fresh, never stored
A view is just a saved SQL definition. Querying it re-executes the underlying select against whatever the source data currently is.
Use when:
- The model is cheap to compute on the fly
- You want the freshest possible data every time someone queries
- You're in a staging layer where each
selectis light cleanup
Don't use when:
- The query is expensive (every read pays the full cost)
- Downstream models read from it repeatedly in the same run
{{ config(materialized='view') }}
select
member,
email,
role
from {{ ref('family_members') }}
where email is not null
table — heavy, fast to read, rebuilt every run
A table is the materialized result of your select, stored to disk. dbt
drops and recreates it every dbt run.
Use when:
- Downstream models or queries read from it often
- The select is expensive enough that you want to pay the cost once per run instead of per query
- You need real warehouse statistics for the optimizer
Don't use when:
- The model is small and trivial (
viewis enough) - You're rebuilding billions of rows every run (look at
incremental)
{{ config(materialized='table') }}
select
entity_id,
display_name,
last_interacted_at
from {{ ref('nexus_entities') }}
where entity_type = 'person'
For the Doe family, every marts/ model is a table. They're queried
constantly (the family contacts list, the Christmas card list) and the
underlying data isn't huge.
incremental — append-only at scale
incremental is a table that dbt builds in two modes:
- Full refresh (first run, or
--full-refresh): build from scratch - Incremental (subsequent runs): only insert rows that are new since the last build
For the Doe family this rarely matters — their data is small. But
once you have 10M+ rows in a source, full-refreshing on every run
becomes painful. incremental is how production nexus deployments
keep nexus_events build times manageable.
A minimal incremental model:
{{ config(materialized='incremental', unique_key='event_id') }}
select
event_id,
occurred_at,
event_type,
source
from {{ ref('all_events') }}
{% if is_incremental() %}
where occurred_at > (select max(occurred_at) from {{ this }})
{% endif %}
The {% if is_incremental() %} block runs only on incremental builds.
On a full refresh it's skipped and dbt builds the full table.
We don't dive deeper here — the
dbt incremental docs
go into all the modes (append, merge, delete+insert) and the
trade-offs. For now, just know that incremental exists and what it's
for.
ephemeral — no object, inlined as a CTE
An ephemeral model never lands in the warehouse. Instead, dbt
inlines its select as a CTE wherever it's ref()'d from.
Use when:
- You want to factor out a snippet for readability without paying for another object
- A model is referenced once and only once
Don't use when:
- Multiple downstream models read it (you'll re-execute the CTE multiple times)
- The select is heavy (each downstream pays the cost)
In practice you'll reach for ephemeral rarely. Mention it once, recognize it when you see it, move on.
Setting materialization
Materialization can be set at three levels — most specific wins.
1. Project level (dbt_project.yml)
Sets the default for entire directory subtrees:
models:
doe_family:
sources:
+materialized: view
staging:
+materialized: view
marts:
+materialized: table
The + prefix is dbt's "this is a config" syntax.
2. Directory _properties.yml / inline YAML
Less common — you can override at a subdirectory level by adding a nested key.
3. Model file ({{ config(...) }})
Highest precedence — set right inside the model:
{{ config(materialized='incremental', unique_key='event_id') }}
select ...
Use this for one-off overrides; let the project-level config carry the defaults.
The Doe family's materialization strategy
| Layer | Default | Why |
|---|---|---|
sources/ |
view |
Trivial cleanup; freshness over speed |
staging/ |
view |
Mostly thin reshape; cheap to recompute |
marts/ |
table |
Read constantly by humans and downstream queries |
| (special) | incremental |
Promoted on a case-by-case basis once a table gets large |
This is a sensible default for any nexus project at personal-to-small-team scale. Most production deployments follow the same pattern.
Hands-On Exercise
-
Create
models/marts/family_member_count.sql:{{ config(materialized='view') }} select count(*) as n_family_members from {{ ref('family_members') }} -
Run
dbt build -s family_member_count. Inspect the result in BigQuery — note that it's a view, not a table. -
Change the config to
table:{{ config(materialized='table') }} -
Run
dbt build -s family_member_countagain. BigQuery now shows it as a table. dbt dropped and recreated it. -
Remove the inline
{{ config(...) }}entirely. Indbt_project.yml, confirm thatmodels.doe_family.marts.+materializedis set totable(or set it). Rebuild and confirm it's still a table — the project-level default takes over. -
(Optional, advanced) Convert a model to
incrementalwith aunique_keyand anis_incremental()filter. Run it twice and inspect the row count after each run.
Summary
| Concept | Key takeaway |
|---|---|
view |
Saved SQL — freshest possible, pay on every read |
table |
Materialized result — fast reads, rebuilt each run |
incremental |
Table built additively — for large or expensive models |
ephemeral |
Inlined as a CTE — niche, use rarely |
| Where to set | Project-level for defaults, inline {{ config() }} for overrides |
| Doe family default | view for sources/staging, table for marts |
Next Lesson
You can shape what each model does (ref(), source()) and how it's
persisted (materialization). The third leg of the stool is Jinja —
how dbt generates the SQL it actually sends to BigQuery. Head to
2.5 Jinja and programmatic SQL.