# Table Materialization

## Table Materialization

A table materialization rebuilds your model as a physical table in your data warehouse during each dbt run. Unlike views, tables store the actual data rather than just the query logic, using a `CREATE TABLE AS` statement.

***

### How Table Materializations Work

When you materialize a model as a table, dbt executes the model's SQL query and stores the results as a table in your data warehouse. During each run, dbt:

1. Runs `DROP TABLE IF EXISTS` on the existing table
2. Executes `CREATE TABLE AS` with your model's SQL
3. Applies any configured table properties (like indexes, distribution keys, etc.)

This means:

* Data is physically stored in your warehouse
* Queries against the table are faster but builds take longer
* Data isn't automatically updated when source data changes

Under the hood, dbt executes a `CREATE TABLE AS` statement:

```sql
CREATE OR REPLACE TABLE "database"."schema"."customer_orders" AS (
    SELECT * FROM source_table WHERE condition = true
);
```

{% hint style="info" %}
Tables are ideal when query performance is more important than build time or real-time data needs.
{% endhint %}

***

### When to Use Table Materializations

Tables are particularly valuable for:

| Use Case                    | Why Tables Work Well                                                           |
| --------------------------- | ------------------------------------------------------------------------------ |
| Performance-critical models | Tables provide the fastest query performance, ideal for dashboards and reports |
| Complex transformations     | Compute-intensive operations only need to run once during build                |
| Frequently accessed data    | Multiple users or systems can query without recomputing                        |
| Downstream dependencies     | When many models reference this data, tables reduce overall processing         |

***

### Configuring Table Materializations

Tables can be configured at both the model and project level.

#### Model-Level Configuration

```sql
-- In your model SQL file
{{
    config(
        materialized='table',
        sort='order_date',
        dist='customer_id'
    )
}}

SELECT
    customer_id,
    order_date,
    SUM(amount) as total_amount
FROM {{ ref('stg_orders') }}
GROUP BY 1, 2
```

#### Project-Level Configuration

```yaml
# In your dbt_project.yml file
models:
  your_project:
    marts:
      +materialized: table
```

This sets all models in the `marts/` directory to materialize as tables.

***

### Performance Optimization

Different warehouses offer specific optimization options for tables:

| Warehouse | Optimization Options                                                                |
| --------- | ----------------------------------------------------------------------------------- |
| Snowflake | <p>• Clustering keys<br>• Search optimization<br>• Automatic query optimization</p> |
| BigQuery  | <p>• Partitioning<br>• Clustering<br>• Table expiration</p>                         |
| Redshift  | <p>• Distribution keys<br>• Sort keys<br>• Table compression</p>                    |

To apply these optimizations, use the `config()` function with warehouse-specific parameters:

```sql
{{
    config(
        materialized='table',
        snowflake_cluster_by=['customer_id', 'order_date'],
        bigquery_partition_by={
            "field": "order_date",
            "data_type": "date"
        }
    )
}}
```

***

### Advantages and Limitations

| Advantages                   | Limitations               |
| ---------------------------- | ------------------------- |
| ⚡ Fast query performance     | 🕒 Slower build times     |
| 📊 Efficient for BI tools    | 🔄 No automatic updates   |
| 💪 Great for complex queries | 💾 Uses more storage      |
| 🔀 Ideal for multiple users  | 📈 Higher warehouse costs |

***

### When to Consider Other Materializations

While tables are powerful, consider alternatives when:

* Data needs to be real-time (use views)
* Table is very large and only needs incremental updates (use incremental)
* Model is a simple intermediate transformation used by only one downstream model (use ephemeral)

***

### Best Practices

1. **Materialization Strategy**: Use tables for final reporting layers and complex transformations
2. **Build Frequency**: Schedule table rebuilds based on source data update frequency
3. **Performance Tuning**: Apply appropriate indexes, partitioning, or clustering for your warehouse
4. **Resource Management**: Schedule builds during off-peak hours for large tables
5. **Monitoring**: Track build times and storage usage to identify optimization opportunities

By using table materializations strategically, you can balance performance needs with resource utilization to create an efficient data transformation pipeline.
