Variables and Parameters
Variables allow you to make your dbt project more dynamic and configurable by passing values at runtime or setting them in configuration files. They enable you to create flexible data transformations that can adapt to different environments, use cases, and scenarios.
Understanding dbt Variables
Variables in dbt serve two primary purposes:
Make code reusable - Define values once and reference them throughout your project
Enable flexibility - Change behavior without modifying code
There are several ways to define and use variables in dbt:
Project variables - Defined in dbt_project.yml
Command-line variables - Passed at runtime with the --vars flag
Environment variables - Accessed via the env_var() Jinja function
Defining Variables in dbt_project.yml
The simplest way to define variables is in your dbt_project.yml file:
vars:
  # Simple scalar values
  start_date: '2020-01-01'
  end_date: '2022-12-31'

  # Lists
  excluded_countries: ['test', 'demo', 'internal']

  # Dictionaries
  partner_sales_targets:
    tier1: 1000000
    tier2: 500000
    tier3: 100000

  # Environment-specific variables: dictionaries keyed by target name,
  # looked up in models with var(target.name)
  dev:
    row_limit: 100
    debug_mode: true
  prod:
    row_limit: null
    debug_mode: false
These variables become available throughout your project via the var() function.
Using Variables in Models
Once defined, you can reference variables in your models using the var() function:
-- models/reporting/monthly_sales.sql
SELECT
    date_trunc('month', order_date) as month,
    SUM(amount) as monthly_sales
FROM {{ ref('stg_orders') }}
WHERE
    order_date >= '{{ var("start_date") }}'
    AND order_date <= '{{ var("end_date") }}'
GROUP BY 1
{% if var('row_limit', none) %}
LIMIT {{ var('row_limit') }}
{% endif %}
The var() function has two parameters:
The variable name
An optional default value that's used if the variable isn't defined
-- Using a default value
SELECT * FROM {{ ref('stg_users') }}
WHERE status = '{{ var("user_status", "active") }}'
Passing Variables at Runtime
For maximum flexibility, pass variables at runtime using the --vars flag:
dbt run --vars '{"start_date": "2023-01-01", "end_date": "2023-03-31"}'
You can pass complex structures too:
dbt run --vars '{"regions": ["north", "south"], "include_test_data": false}'
Runtime variables override any variables defined in dbt_project.yml.
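For example, assuming the dbt_project.yml shown earlier sets start_date: '2020-01-01', a value passed on the command line wins for that invocation only:
dbt run --vars '{"start_date": "2023-06-01"}'  # models compile with 2023-06-01, not 2020-01-01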
Working with Environment Variables
You can access environment variables using the env_var() Jinja function:
-- Configuring a model to use environment variables
{{
    config(
        schema=env_var('DBT_SCHEMA', 'analytics')
    )
}}

SELECT * FROM {{ ref('stg_orders') }}
This is particularly useful for sensitive information (like API keys) or values that vary by environment.
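One common pattern, sketched here with a hypothetical Postgres profile and made-up variable names, is to keep credentials out of version control by reading them from the environment in profiles.yml. Note that env_var() raises a compilation error if the variable is unset and no default is provided, which is usually the behavior you want for secrets:
# profiles.yml (illustrative sketch)
my_project:
  target: dev
  outputs:
    dev:
      type: postgres
      host: "{{ env_var('DBT_HOST', 'localhost') }}"
      port: 5432
      user: "{{ env_var('DBT_USER') }}"
      password: "{{ env_var('DBT_PASSWORD') }}"  # no default: fails loudly if unset
      dbname: analytics
      schema: analytics
      threads: 4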
Advanced Variable Techniques
Conditional Logic with Variables
Variables allow you to implement conditional logic in your models:
{% if var('data_source') == 'api' %}
SELECT * FROM {{ ref('stg_api_data') }}
{% else %}
SELECT * FROM {{ ref('stg_warehouse_data') }}
{% endif %}
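You can then switch sources per invocation. Because var('data_source') above has no default, the variable must be supplied at runtime (or given a default such as var('data_source', 'warehouse')):
dbt run --vars '{"data_source": "api"}'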
Dynamic Filtering
Create flexible filtering based on variable values:
SELECT *
FROM {{ ref('stg_transactions') }}
WHERE 1=1
{% if var('filter_by_date', false) %}
    AND transaction_date BETWEEN '{{ var("start_date") }}' AND '{{ var("end_date") }}'
{% endif %}
{% if var('filter_by_country', false) %}
    AND country IN (
        {% for country in var('countries', []) %}
            '{{ country }}'{% if not loop.last %},{% endif %}
        {% endfor %}
    )
{% endif %}
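With the defaults of false and [], these filters are skipped unless you opt in at runtime, for example:
dbt run --vars '{"filter_by_country": true, "countries": ["US", "CA"]}'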
Date/Time Variables
A common pattern for incremental models is using variables for date ranges:
{% set run_date = var('run_date', modules.datetime.date.today().strftime('%Y-%m-%d')) %}
SELECT *
FROM {{ source('events', 'daily_events') }}
WHERE event_date = '{{ run_date }}'
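The snippet above only filters to a single day; in an actual incremental model you would typically pair the variable with an incremental materialization and an is_incremental() guard. The following is an illustrative sketch (the model name and event_id unique key are assumptions):
-- models/events/daily_events_incremental.sql (illustrative sketch)
{{
    config(
        materialized='incremental',
        unique_key='event_id'
    )
}}

{% set run_date = var('run_date', modules.datetime.date.today().strftime('%Y-%m-%d')) %}

SELECT *
FROM {{ source('events', 'daily_events') }}
{% if is_incremental() %}
-- On incremental runs, only load the requested day; a full refresh loads everything
WHERE event_date = '{{ run_date }}'
{% endif %}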
Best Practices for Variables
Set meaningful defaults - Provide sensible default values to make your code more robust
Use descriptive names - Choose clear, explicit variable names that explain their purpose
Document variables - Add comments in dbt_project.yml to explain each variable's purpose
Consistent formatting - Maintain consistent casing and naming conventions
Avoid hardcoding - Use variables instead of hardcoding values that might change
Example: Well-Structured Variables
vars:
  # Analysis date range - used for filtering transaction data
  # Format: YYYY-MM-DD
  analysis_start_date: '2023-01-01'  # Inclusive
  analysis_end_date: '2023-12-31'    # Inclusive

  # Revenue recognition settings
  rev_rec_delay_days: 14     # Days to delay revenue recognition
  include_refunds: false     # Set to true to include refunded transactions

  # Environment-specific settings (dictionaries keyed by target name)
  dev:
    debug_mode: true         # Enables additional logging
    data_sample_pct: 10      # Only process 10% of data in dev
  prod:
    debug_mode: false
    data_sample_pct: 100     # Process all data in prod
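A model can then consume these values directly. The snippet below is an illustrative sketch: the fct_revenue model, the is_refunded column, and the Postgres-style INTERVAL arithmetic are assumptions, and date math syntax varies by warehouse:
-- models/finance/fct_revenue.sql (illustrative sketch)
SELECT
    order_id,
    amount,
    -- Recognize revenue rev_rec_delay_days after the order date
    order_date + INTERVAL '{{ var("rev_rec_delay_days", 14) }} days' AS revenue_recognized_at
FROM {{ ref('stg_orders') }}
{% if not var('include_refunds', false) %}
WHERE NOT is_refunded
{% endif %}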
Common Use Cases
Environment-Specific Configuration
Define different behavior based on your deployment environment:
# dbt_project.yml
vars:
  # Dictionaries keyed by target name (e.g. dev, prod)
  dev:
    schema_prefix: 'dev_'
    row_limit: 1000
  prod:
    schema_prefix: ''
    row_limit: null
-- models/model.sql
-- Look up the settings for the active target (e.g. dev or prod)
{% set env = var(target.name, {}) %}

{{
    config(
        schema=env.get('schema_prefix', 'dev_') ~ 'marketing'
    )
}}

SELECT * FROM {{ ref('stg_data') }}
{% if env.get('row_limit') %}
LIMIT {{ env.get('row_limit') }}
{% endif %}
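Switching environments is then just a matter of choosing the target:
dbt run --target dev   # dev_ schema prefix, 1000-row limit
dbt run --target prod  # no prefix, no limit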
Parameterized Reporting
Create reports with customizable parameters:
-- models/daily_sales_report.sql
{% set date_column = var('date_column', 'order_date') %}
{% set granularity = var('granularity', 'day') %}

SELECT
    DATE_TRUNC('{{ granularity }}', {{ date_column }}) as period,
    SUM(amount) as sales
FROM {{ ref('fct_orders') }}
GROUP BY 1
ORDER BY 1
Then run with different settings:
dbt run --select daily_sales_report --vars '{"granularity": "month", "date_column": "shipped_date"}'
By effectively using variables in your dbt project, you create more flexible, maintainable, and reusable data transformations that can easily adapt to different needs and environments without code changes.