# Variables and Parameters

Variables make a dbt project configurable: you set values in configuration files or pass them at runtime, and your models adapt to different environments and use cases without code changes.

### Understanding dbt Variables

Variables in dbt serve two primary purposes:

1. **Make code reusable** - Define values once and reference them throughout your project
2. **Enable flexibility** - Change behavior without modifying code

There are several ways to define and use variables in dbt:

* **Project variables** - Defined in `dbt_project.yml`
* **Command-line variables** - Passed at runtime
* **Environment variables** - Read from the process environment with the `env_var()` Jinja function

***

### Defining Variables in dbt\_project.yml

The simplest way to define variables is in your `dbt_project.yml` file:

```yaml
vars:
  # Simple scalar values
  start_date: '2020-01-01'
  end_date: '2022-12-31'
  
  # Lists
  excluded_countries: ['test', 'demo', 'internal']
  
  # Dictionaries
  partner_sales_targets: {
    'tier1': 1000000,
    'tier2': 500000,
    'tier3': 100000
  }
  
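  # Per-target settings. dbt does not switch vars by target automatically:
  # `dev` and `prod` are ordinary dictionary variables, looked up in Jinja
  # with var(target.name)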
  dev:
    row_limit: 100
    debug_mode: true
  prod:
    row_limit: null
    debug_mode: false
```

These variables become available throughout your project via the `var()` function.

***

### Using Variables in Models

Once defined, you can reference variables in your models using the `var()` function:

```sql
-- models/reporting/monthly_sales.sql
SELECT
  date_trunc('month', order_date) as month,
  SUM(amount) as monthly_sales
FROM {{ ref('stg_orders') }}
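WHERE 
  order_date >= '{{ var("start_date") }}'
  AND order_date <= '{{ var("end_date") }}'
GROUP BY 1
{# row_limit lives inside the dev/prod dictionaries, so look it up by target name #}
{% set row_limit = var(target.name, {}).get('row_limit') %}
{% if row_limit %}
LIMIT {{ row_limit }}
{% endif %}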
```

The `var()` function accepts two arguments:

1. The variable name
2. An optional default value that's used if the variable isn't defined

```sql
-- Using a default value
SELECT * FROM {{ ref('stg_users') }}
WHERE status = '{{ var("user_status", "active") }}'
```

{% hint style="info" %}
**Variable Behavior**

When you use the `var()` function:

* It will use the variable from `dbt_project.yml` if defined
* Command-line variables override values from `dbt_project.yml`
* If no variable is found and no default is specified, dbt will raise an error
* dbt does not select variables by target automatically; keys such as `dev` and `prod` are ordinary dictionary variables that you look up yourself, for example with `var(target.name)`
  {% endhint %}
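
The precedence rules in the note above can be modeled in a few lines of Python. This is a simplified sketch for illustration, not dbt's actual implementation; `resolve_var` and `_MISSING` are hypothetical names.

```python
# Simplified model of how var() resolves a name; not dbt's actual implementation.
_MISSING = object()  # sentinel so that None can still be a legitimate default


def resolve_var(name, project_vars, cli_vars, default=_MISSING):
    """Return the value var(name, default) would produce under the rules above."""
    if name in cli_vars:          # values passed with --vars win
        return cli_vars[name]
    if name in project_vars:      # then values from dbt_project.yml
        return project_vars[name]
    if default is not _MISSING:   # then the default passed to var()
        return default
    raise KeyError(f"Required var '{name}' not found and no default given")


project_vars = {"start_date": "2020-01-01", "end_date": "2022-12-31"}
cli_vars = {"start_date": "2023-01-01"}

print(resolve_var("start_date", project_vars, cli_vars))             # 2023-01-01
print(resolve_var("end_date", project_vars, cli_vars))               # 2022-12-31
print(resolve_var("user_status", project_vars, cli_vars, "active"))  # active
```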

***

### Passing Variables at Runtime

For maximum flexibility, pass variables at runtime using the `--vars` flag:

```bash
dbt run --vars '{"start_date": "2023-01-01", "end_date": "2023-03-31"}'
```

You can pass complex structures too:

```bash
dbt run --vars '{"regions": ["north", "south"], "include_test_data": false}'
```

Runtime variables override any variables defined in `dbt_project.yml`.
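
When dbt is launched from a script or an orchestrator, hand-writing the quoted `--vars` string is error-prone. The sketch below builds the command from a Python dictionary instead. It assumes a JSON-style dictionary is accepted by `--vars`, as in the examples above; `build_dbt_command` is a hypothetical helper, not part of dbt.

```python
import json
import shlex


def build_dbt_command(subcommand, variables, select=None):
    """Return a shell-safe dbt command line with variables passed via --vars."""
    args = ["dbt", subcommand]
    if select:
        args += ["--select", select]
    # Encode the dict as JSON, matching the --vars examples above
    args += ["--vars", json.dumps(variables)]
    # shlex.quote adds single quotes only where the shell needs them
    return " ".join(shlex.quote(arg) for arg in args)


print(build_dbt_command("run", {"start_date": "2023-01-01", "end_date": "2023-03-31"}))
# dbt run --vars '{"start_date": "2023-01-01", "end_date": "2023-03-31"}'
```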

***

### Working with Environment Variables

You can access environment variables using the `env_var` Jinja function:

```sql
-- Configuring a model to use environment variables
{{ 
  config(
    schema=env_var('DBT_SCHEMA', 'analytics')
  ) 
}}

SELECT * FROM {{ ref('stg_orders') }}
```

This is particularly useful for values that vary by environment, such as schema or database names.

{% hint style="info" %}
**Security Note**

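Avoid calling `env_var()` with secrets inside models or macros: the rendered value ends up in compiled SQL and can appear in logs. Keep credentials in `profiles.yml`, where `env_var()` is the usual way to supply them, and give secret variables the `DBT_ENV_SECRET` prefix so that dbt restricts where they can be used and scrubs them from logs.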
{% endhint %}

***

### Advanced Variable Techniques

**Conditional Logic with Variables**

Variables allow you to implement conditional logic in your models:

```sql
{% if var('data_source') == 'api' %}
  SELECT * FROM {{ ref('stg_api_data') }}
{% else %}
  SELECT * FROM {{ ref('stg_warehouse_data') }}
{% endif %}
```

**Dynamic Filtering**

Create flexible filtering based on variable values:

```sql
SELECT
  *
FROM {{ ref('stg_transactions') }}
WHERE 1=1
  {% if var('filter_by_date', false) %}
  AND transaction_date BETWEEN '{{ var("start_date") }}' AND '{{ var("end_date") }}'
  {% endif %}
  
  {% set countries = var('countries', []) %}
  {% if var('filter_by_country', false) and countries %}
  AND country IN (
    {% for country in countries %}
      '{{ country }}'{% if not loop.last %},{% endif %}
    {% endfor %}
  )
  {% endif %}
```
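
To see what the dynamic filter renders for a given set of variables, the same logic can be written in plain Python. This is an illustrative approximation of the template (whitespace aside), and `render_filters` is a hypothetical helper. It skips the country filter when the list is empty, because an empty `IN ()` list is invalid SQL in many warehouses.

```python
def render_filters(variables):
    """Build the WHERE clause the template would produce for the given vars."""
    clauses = ["WHERE 1=1"]
    if variables.get("filter_by_date", False):
        clauses.append(
            f"AND transaction_date BETWEEN '{variables['start_date']}' "
            f"AND '{variables['end_date']}'"
        )
    countries = variables.get("countries", [])
    if variables.get("filter_by_country", False) and countries:
        # Same output as the Jinja for-loop: quoted values separated by commas
        quoted = ", ".join(f"'{country}'" for country in countries)
        clauses.append(f"AND country IN ({quoted})")
    return "\n  ".join(clauses)


print(render_filters({"filter_by_country": True, "countries": ["US", "CA"]}))
# WHERE 1=1
#   AND country IN ('US', 'CA')
```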

**Date/Time Variables**

A common pattern for daily or incremental loads is to default a date variable to today and override it at runtime when reprocessing an earlier day:

```sql
{% set run_date = var('run_date', modules.datetime.date.today().strftime('%Y-%m-%d')) %}

SELECT 
  *
FROM {{ source('events', 'daily_events') }}
WHERE 
  event_date = '{{ run_date }}'
```
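
dbt's `modules.datetime` exposes Python's standard `datetime` module, so the default in the model above behaves like the following Python. This is a sketch of the fallback logic only; `default_run_date` is a hypothetical name.

```python
import datetime


def default_run_date(cli_vars):
    """Mirror var('run_date', <today>): use the passed value, else today's date."""
    today = datetime.date.today().strftime("%Y-%m-%d")
    return cli_vars.get("run_date", today)


print(default_run_date({"run_date": "2023-06-01"}))  # 2023-06-01
print(default_run_date({}))                          # today's date as YYYY-MM-DD
```

Note that `date.today()` uses the local clock and time zone of the machine running dbt, so passing `run_date` explicitly gives reproducible results.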

***

### Best Practices for Variables

| Practice                | Description                                                          |
| ----------------------- | -------------------------------------------------------------------- |
| Set meaningful defaults | Provide sensible default values to make your code more robust        |
| Use descriptive names   | Choose clear, explicit variable names that explain purpose           |
| Document variables      | Add comments in `dbt_project.yml` to explain each variable's purpose |
| Consistent formatting   | Maintain consistent casing and naming conventions                    |
| Avoid hardcoding        | Use variables instead of hardcoding values that might change         |

**Example: Well-Structured Variables**

```yaml
vars:
  # Analysis date range - used for filtering transaction data
  # Format: YYYY-MM-DD
  analysis_start_date: '2023-01-01'  # Inclusive
  analysis_end_date: '2023-12-31'    # Inclusive
  
  # Revenue recognition settings
  rev_rec_delay_days: 14             # Days to delay revenue recognition
  include_refunds: false             # Set to true to include refunded transactions
  
  # Per-target settings: plain dictionaries, looked up in Jinja with var(target.name)
  dev:
    debug_mode: true                 # Enables additional logging
    data_sample_pct: 10              # Only process 10% of data in dev
  prod:
    debug_mode: false
    data_sample_pct: 100             # Process all data in prod
```

***

### Common Use Cases

**Environment-Specific Configuration**

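dbt does not select variables by target on its own. To vary behavior by deployment environment, store one dictionary per target name and look it up with `target.name`: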

```yaml
# dbt_project.yml
vars:
  dev:
    schema_prefix: 'dev_'
    row_limit: 1000
  prod:
    schema_prefix: ''
    row_limit: null
```

```sql
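-- models/model.sql
-- Look up the dictionary stored under the active target's name;
-- targets without an entry fall back to an empty dictionary
{% set settings = var(target.name, {}) %}

{{ 
  config(
    schema=settings.get('schema_prefix', 'dev_') ~ 'marketing'
  ) 
}}

SELECT * FROM {{ ref('stg_data') }}
{% if settings.get('row_limit') %}
LIMIT {{ settings.get('row_limit') }}
{% endif %}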
```
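
A per-target lookup such as `var(target.name, {})` can be modeled in Python to check which settings each target receives. This is an illustrative sketch: `target_settings` is a hypothetical helper, and `ci` stands in for a target that has no entry in `vars`.

```python
def target_settings(project_vars, target_name):
    """Model var(target.name, {}): the dict stored under the target's name, or {}."""
    return project_vars.get(target_name, {})


project_vars = {
    "dev": {"schema_prefix": "dev_", "row_limit": 1000},
    "prod": {"schema_prefix": "", "row_limit": None},
}

for target_name in ("dev", "prod", "ci"):
    settings = target_settings(project_vars, target_name)
    schema = settings.get("schema_prefix", "dev_") + "marketing"
    row_limit = settings.get("row_limit")
    print(target_name, schema, row_limit)
# dev dev_marketing 1000
# prod marketing None
# ci dev_marketing None
```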

**Parameterized Reporting**

Create reports with customizable parameters:

```sql
-- models/daily_sales_report.sql
{% set date_column = var('date_column', 'order_date') %}
{% set granularity = var('granularity', 'day') %}

SELECT
  DATE_TRUNC('{{ granularity }}', {{ date_column }}) as period,
  SUM(amount) as sales
FROM {{ ref('fct_orders') }}
GROUP BY 1
ORDER BY 1
```

Then run with different settings:

```bash
dbt run --select daily_sales_report --vars '{"granularity": "month", "date_column": "shipped_date"}'
```
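
Because `granularity` and `date_column` are interpolated directly into the SQL, it can help to validate them before invoking dbt. The sketch below assumes a small wrapper script; `report_command` and both allowlists are hypothetical and should be adjusted to the columns and date parts your warehouse supports.

```python
import json
import shlex

# Hypothetical allowlists; adjust to your own model and warehouse
ALLOWED_GRANULARITIES = {"day", "week", "month", "quarter", "year"}
ALLOWED_DATE_COLUMNS = {"order_date", "shipped_date"}


def report_command(granularity="day", date_column="order_date"):
    """Validate report parameters, then build the dbt command that passes them."""
    if granularity not in ALLOWED_GRANULARITIES:
        raise ValueError(f"Unsupported granularity: {granularity!r}")
    if date_column not in ALLOWED_DATE_COLUMNS:
        raise ValueError(f"Unsupported date column: {date_column!r}")
    variables = {"granularity": granularity, "date_column": date_column}
    args = ["dbt", "run", "--select", "daily_sales_report", "--vars", json.dumps(variables)]
    return " ".join(shlex.quote(arg) for arg in args)


print(report_command("month", "shipped_date"))
# dbt run --select daily_sales_report --vars '{"granularity": "month", "date_column": "shipped_date"}'
```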

By effectively using variables in your dbt project, you create more flexible, maintainable, and reusable data transformations that can easily adapt to different needs and environments without code changes.
