Examples and practicum
Three worked examples that bring Module 2 together, plus a consolidated end-to-end practicum.
Learning Objectives
By the end of this lesson, you will have:
- Walked through three worked dbt examples using the Doe family seed
- Stood up a complete dbt project end to end via the path you chose
- Built a real model that reads warehouse data, joins to a seed, and writes a result table
- Committed and pushed your work via Git
Worked example 1 — count the family
The simplest possible model. Create models/marts/family_count.sql:
select count(*) as family_size
from {{ ref('family_members') }}
Build it:
dbt build -s family_count
Confirm in BigQuery:
select * from `doe-family-dwh`.`doe_family_dev`.`family_count`;
You should see one row: family_size = 5.
Worked example 2 — list the kids' emails
A slightly less trivial model. Create
models/marts/kid_emails.sql:
select
member as name,
email
from {{ ref('family_members') }}
where role = 'child'
order by name
Build it:
dbt build -s kid_emails
You should see Jack, Joe, and Julie with their emails.
Worked example 3 — join the seed with a public BigQuery dataset
BigQuery ships a generous catalog of public datasets you can query for free. Let's join the family seed with one of them.
We'll use bigquery-public-data.usa_names.usa_1910_2013 — a list of US
baby names by year. For each Doe kid, count how many babies were given
that name in 2010.
First, register the public dataset as a dbt source. Create
models/sources/usa_names/_sources.yml:
version: 2
sources:
- name: usa_names
database: bigquery-public-data
schema: usa_names
tables:
- name: usa_1910_2013
Then create models/marts/kid_name_popularity.sql:
with kids as (
select member as name
from {{ ref('family_members') }}
where role = 'child'
),
names_2010 as (
select
name,
sum(number) as births_2010
from {{ source('usa_names', 'usa_1910_2013') }}
where year = 2010
group by name
)
select
k.name,
coalesce(n.births_2010, 0) as births_2010
from kids k
left join names_2010 n
on lower(k.name) = lower(n.name)
order by births_2010 desc
Build it:
dbt build -s kid_name_popularity
You've now joined hand-curated seed data with a real public warehouse
table — the same pattern you'll use in Module 3 to join the
family_members seed with resolved entities from Gmail / Calendar /
Notion.
Consolidated practicum
Time to put everything in Module 2 together.
Goal: build a model that reads from a public BigQuery dataset,
joins it with the family_members seed, applies at least one of the
concepts from Module 2 (a ref() chain, a non-default
materialization, a Jinja loop, or a macro), and push the result to
GitHub through a PR.
Step 1 — confirm Module 1 is done
- Cloud path: complete the assignment in 1.3 Setting up dbt Cloud
- Local path: complete the assignment in 1.4 Setting up dbt locally with VS Code
- Git repo created and connected: 1.5 Git and GitHub
You should be able to run dbt debug and see "All checks passed!"
Step 2 — seed + project structure
The seed from 2.1 and the layered layout from 2.7 should both be in place.
Step 3 — build something real
Suggested options:
- Easy mode: rebuild
kid_name_popularityfrom example 3 in your own project. - Harder: pick a different public BigQuery dataset (e.g.,
bigquery-public-data.austin_311,noaa_gsod,openaq) and join it withfamily_membersin a way that's meaningful to you. - Concept-driven: require yourself to use at least one of the Module 2 concepts. Examples:
- Hardest: add a custom test (uniqueness, not-null, or
accepted_values) on your seed and on your mart. See the dbt docs on tests.
Step 4 — version control
If you're going the local route:
git checkout -b m2-practicum
git add models/ seeds/ dbt_project.yml macros/
git commit -m "Module 2 practicum: kid name popularity model with macro"
git push -u origin m2-practicum
Then open a PR on GitHub and merge it.
Cloud users: commit through the dbt Cloud IDE, open the PR from your managed repo or from GitHub if you've wired one up.
Step 5 — verify
In BigQuery, query the resulting table. Confirm it contains what you expect.
Self-check
You're ready for Module 3 if you can answer yes to all of these:
- I can run
dbt debugand see "All checks passed!" - I have a project structured with
sources/,staging/, andmarts/ - I've loaded a seed file with
dbt seed - I've built a model that joins seed data with a
source()reference - I've used at least one of: a non-default materialization, a Jinja construct, or a macro
- My project is in Git with at least one merged PR
If any of those is "no", revisit the relevant lesson before continuing.
Summary
| What you built | Why it matters in Module 3 |
|---|---|
family_count |
Simplest possible model — proves your environment works |
kid_emails |
Filtering and ordering — basic mart-level model patterns |
kid_name_popularity |
Joining seed data to a warehouse source — the exact shape you'll use against resolved entities |
Next Lesson
You're ready. Head to 3.1 Prerequisite: Ingesting source data with os-nexus — where real data starts flowing into the warehouse.