Jinja and programmatic SQL

Jinja is what makes dbt SQL programmable — control flow, loops, compile-time SQL generation. The compiled SQL is the truth.

Learning Objectives

By the end of this lesson, you will be able to:

  • Read the three Jinja delimiters: {{ }}, {% %}, {# #}
  • Use if, for, and set to generate SQL at compile time
  • Inspect the compiled SQL to debug what Jinja produced
  • Recognize three patterns Jinja unlocks that raw SQL can't express cleanly

Why dbt uses a templating language

Raw SQL has two limitations dbt needs to work around:

  1. You can't reference another model's name without hardcoding it. Different environments (dev, prod, your teammate's schema) need different fully-qualified names. → {{ ref('model_name') }} solves this.

  2. You can't write SQL that generates SQL. Want to select all 47 columns from a table except 3? Want a union all over 20 tables that share a schema? Raw SQL has no way to express either without typing the whole thing out by hand.

Jinja — the templating language dbt embeds in .sql files — handles both. dbt compiles your Jinja-templated SQL into pure SQL, then sends that pure SQL to BigQuery. Jinja is the layer where SQL gets programmable.


The three Jinja delimiters

Every Jinja construct uses one of three delimiter pairs:

Delimiter Purpose Example
{{ ... }} Expression select * from {{ ref('family_members') }}
{% ... %} Statement / logic {% if var('include_kids') %}...{% endif %}
{# ... #} Comment {# this is not in the compiled SQL #}

Expressions evaluate to a string and get spliced into the SQL. Statements drive control flow but emit nothing on their own. Comments disappear at compile time.


Variables and set

You can declare a Jinja-level variable in the middle of a model:

{% set kid_roles = ['child', 'teen'] %}

select *
from {{ ref('family_members') }}
where role in ({{ "'" ~ kid_roles | join("', '") ~ "'" }})

(That last line concatenates the list into 'child', 'teen' at compile time.)

You can also pull variables from dbt_project.yml:

vars:
  active_year: 2026
where year = {{ var('active_year') }}

And override at the command line:

dbt run --vars '{active_year: 2025}'

if — conditional SQL

select
    member,
    email
    {% if var('include_roles', false) %}
    , role
    {% endif %}
from {{ ref('family_members') }}

If include_roles is true at compile time, the model includes the role column. If false, it doesn't. One model, two shapes — the choice is made when dbt compiles, not when BigQuery runs.

if is also great for environment branching using target.name:

select *
from {{ ref('nexus_events') }}
{% if target.name == 'dev' %}
where occurred_at >= timestamp_sub(current_timestamp(), interval 30 day)
{% endif %}

In dev you limit to 30 days for fast builds; in prod you process the full history. No code change at deploy time.


for — looping over a list

This is where Jinja really shines. Say you have five Doe family emails and want to count messages received by each one in a single model:

{% set family_emails = [
    'jane@doefamily.example',
    'john@doefamily.example',
    'jack@doefamily.example',
    'joe@doefamily.example',
    'julie@doefamily.example'
] %}

select
    {% for email in family_emails %}
    countif(recipient_email = '{{ email }}') as msgs_to_{{ email.split('@')[0] }}
    {%- if not loop.last %},{% endif %}
    {% endfor %}
from {{ source('gmail', 'recipients') }}

That compiles to:

select
    countif(recipient_email = 'jane@doefamily.example')  as msgs_to_jane,
    countif(recipient_email = 'john@doefamily.example')  as msgs_to_john,
    countif(recipient_email = 'jack@doefamily.example')  as msgs_to_jack,
    countif(recipient_email = 'joe@doefamily.example')   as msgs_to_joe,
    countif(recipient_email = 'julie@doefamily.example') as msgs_to_julie
from `doe-family-dwh`.`gmail`.`recipients`

One Jinja loop, five SQL expressions. Now imagine the list comes from a family_members seed via a compile-time run_query() call (an advanced trick): you've just built a model that adapts as you add and remove family members, with zero hand-edits.


Three patterns Jinja unlocks

1. Star-except

Raw SQL has no way to say "all columns except these three". Jinja can, via dbt_utils.star:

select {{ dbt_utils.star(from=ref('gmail_events'), except=['internal_id', 'raw_payload']) }}
from {{ ref('gmail_events') }}

At compile time, dbt queries the warehouse for the column list, then emits everything except the excluded ones. Pure SQL would force you to enumerate every kept column by hand.

2. Dynamic UNION ALL

Got 30 tables in os_nexus.event_* and you want to union them?

{% set event_tables = ['event_signup', 'event_login', 'event_purchase', ...] %}

{% for t in event_tables %}
select *, '{{ t }}' as source_table from {{ source('os_nexus', t) }}
{% if not loop.last %} union all {% endif %}
{% endfor %}

One model, N tables unioned, easy to maintain.

3. Conditional environment behavior

You already saw if target.name == 'dev'. The same idea drives nexus.limit_by_timestamp() (in Module 2) — a macro that injects a date filter only in dev, keeping local builds fast.


The compiled SQL is the truth

When something doesn't behave the way you expect, read the compiled SQL. It's the actual SQL dbt sent to BigQuery, with all Jinja resolved. dbt writes it to target/compiled/<project>/... after every run.

In dbt Cloud: open a model → click the Compiled tab.

Locally with dbt Power User: right-click a model → View Compiled SQL.

From the CLI:

dbt compile -s my_model
cat target/compiled/doe_family/models/marts/my_model.sql

Always read the compiled SQL when debugging Jinja. The error messages from BigQuery are about the compiled output, not what you wrote.


A note on whitespace

Jinja preserves whitespace by default, which can produce ugly compiled SQL. Two helpers:

  • {%- and -%} strip whitespace on that side of the statement
  • {{- and -}} do the same for expressions
select
    {% for col in cols -%}
    {{ col }}{% if not loop.last %},{% endif %}
    {% endfor %}

Compiles to cleaner SQL than without the -. Useful when reading compiled output during debugging.


Hands-On Exercise

  1. Create models/marts/family_email_summary.sql using a Jinja for loop to count messages received per family member. Use the family_members seed to drive the loop (hint: a hardcoded list is fine for this exercise — using a seed dynamically is a stretch goal).

  2. Run dbt compile -s family_email_summary. Open the compiled SQL and check it generated one countif(...) per family member.

  3. Add a top-of-file {% set debug = true %}. Wrap a no-op comment in {% if debug %}{% endif %}. Confirm in the compiled SQL that the comment is/isn't there depending on the value.

  4. (Stretch) Add {% if target.name == 'dev' %} limit 100 {% endif %} to one of your existing staging models. Run in dev — confirm the limit. Switch target to anything else — confirm the limit disappears.


Summary

Concept Key takeaway
Jinja Templating language dbt uses to make SQL programmable
Three delimiters {{ }} expression, {% %} statement, {# #} comment
Compile time Jinja runs before the SQL is sent to BigQuery — it generates the SQL
Patterns Star-except, dynamic unions, environment-conditional filters
Compiled SQL The actual SQL sent to BigQuery; always read it when Jinja confuses you
Whitespace control {%- ... -%} to strip noise in the compiled output

Next Lesson

Once your project starts repeating the same Jinja snippets in three places, factor them into reusable units called macros — and once someone else has already written the macro you need, install their package. Head to 2.6 Macros and packages.