Using Delete+Insert for Incremental Models

The delete+insert method is like a "replace-all" approach for updated records. It first removes all records that match your incremental criteria and then inserts the new versions of those records. Think of it as replacing an entire page in a book rather than editing individual words.

When to Use Delete+Insert Method

Batch updates where most records change
When merge performance becomes a bottleneck
Simpler change patterns
Large-scale updates
When most fields in each record need updating

Advantages and Trade-offs

Advantages

Trade-offs

Better performance than merge for bulk updates

Less granular control than merge

Simpler execution plan

All columns are updated (no column-specific updates)

Good for high-volume changes

Potential for higher resource usage during delete phase

Simpler SQL generated

Doesn't work well for individual record updates

Example Implementation

{{
  config(
    materialized='incremental',
    incremental_strategy='delete+insert',
    unique_key='id'
  )
}}

SELECT 
    id,
    status,
    batch_date,
    amount,
    updated_at
FROM {{ ref('stg_source') }}

{% if is_incremental() %}
    WHERE batch_date >= (SELECT MAX(batch_date) FROM {{ this }})
{% endif %}

In this example:

Records matching the WHERE condition will first be deleted
Then all new/updated records will be inserted
This is more efficient than merge when most fields need updating

How It Works

When you run a delete+insert incremental model, dbt:

Creates a temporary table with all the data from your query
Identifies records in the target table that match the incremental filter condition
Deletes those matching records from the target table
Inserts all records from the temporary table into the target table

The result is similar to a merge, but with a different execution strategy that can be more efficient in specific scenarios.

Step By Step Guide: Delete+Insert Method Implementation

Let's see how the delete+insert method works with a practical example.

Initial Setup

Follow the same initial setup as the Merge Method example, creating the staging model and first version of the incremental model.

Testing Delete+Insert Method

Update incremental model config (keep same staging data):

-- models/incremental_test.sql
{{
  config(
    materialized='incremental',
    incremental_strategy='delete+insert',
    unique_key='id'
  )
}}

SELECT 
    id,
    status,
    amount,
    updated_at
FROM {{ ref('stg_test_source') }}

{% if is_incremental() %}
    WHERE updated_at > (SELECT MAX(updated_at) FROM {{ this }})
{% endif %}

Update staging model with changes:

-- models/staging/stg_test_source.sql
WITH source_data AS (
    SELECT 1 as id,
           'inactive' as status,  -- Changed status
           150 as amount,         -- Changed amount
           CURRENT_TIMESTAMP() as updated_at
    UNION ALL
    SELECT 2 as id,
           'approved' as status,  -- Changed from 'pending'
           200 as amount,
           CURRENT_TIMESTAMP() as updated_at
    UNION ALL
    SELECT 4, 'new', 400, CURRENT_TIMESTAMP()  -- New record
)
SELECT * FROM source_data

Run staging model:

dbt run --select stg_test_source

Run incremental model:

dbt run --select incremental_test

Verify in Snowflake:

SELECT 
    * 
FROM 
    your_database.your_schema.incremental_test 
ORDER BY 
    id

Expected Result:

ID | STATUS   | AMOUNT | UPDATED_AT
1  | inactive | 150    | [new_timestamp]  -- Updated
2  | approved | 200    | [new_timestamp]  -- Updated with new status
3  | active   | 300    | [old_timestamp]  -- Unchanged (wasn't in staging)
4  | new      | 400    | [new_timestamp]  -- New

Notice that all records in the staging model (IDs 1, 2, and 4) were processed as a batch - deleted and reinserted. Record 3 remained unchanged as it wasn't in the staging data and didn't match the incremental condition.

Best Practices for Delete+Insert

Use with Batch Processing - This method works best when processing logical batches of data (like daily updates)
Optimize the Filter Condition - Choose an incremental filter that selects all records that might need updating but minimizes unnecessary processing
Consider Performance Impact - The delete operation affects indexes and can cause fragmentation; plan your maintenance accordingly
Pair with Partitioning - If your database supports it, partitioning can make delete+insert operations more efficient

The delete+insert method is particularly effective for batch-oriented workflows where most record fields change together, striking a balance between the flexibility of merge and the performance of append-only approaches.

PreviousUsing Merge for Incremental Models NextUsing Append for Incremental Models

Last updated 5 months ago

Was this helpful?