# Using Microbatch for Incremental Models

The microbatch method processes incremental updates as a series of smaller, time-based batches. It is designed for very large time-series datasets that are impractical to update in a single operation.

{% hint style="warning" %}
The microbatch method is currently in Beta. Features and syntax may change in future releases.
{% endhint %}

***

### When to Use Microbatch Method

* Very large time-series datasets
* When reliability is crucial
* When processing needs to be more resilient
* When database resource limits are a concern
* For tables with billions of rows
* When single-transaction operations time out

***

### Advantages and Trade-offs

| Advantages                        | Trade-offs                                          |
| --------------------------------- | --------------------------------------------------- |
| More efficient for large datasets | More complex setup                                  |
| Better error handling             | Requires careful batch size tuning                  |
| Improved resilience               | Only available on certain platforms                 |
| Reduced memory usage              | Slightly more overhead than single-batch operations |
| Can work around query timeouts    | Still in beta status                                |

### Example Implementation

```sql
{{
  config(
    materialized='incremental',
    incremental_strategy='microbatch',
    event_time='event_time',
    begin='2024-01-01',
    batch_size='day'
  )
}}

SELECT
    id,
    event_type,
    event_time,
    user_id,
    properties
FROM {{ ref('stg_events') }}
```

In this example:

* `event_time` names the column dbt uses to assign rows to batches
* `begin` sets the start of the model's history for the initial build and full refreshes
* `batch_size='day'` means each batch covers one day of data
* No `is_incremental()` filter is needed; dbt applies the time-range filter to each batch automatically, making the overall process more resilient to timeouts and failures
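Under the hood, dbt compiles a separate query per batch, each filtered to one time window on the model's event-time column. The exact SQL varies by adapter; a simplified sketch of what a single one-day batch might scan (dates and the unqualified table name are illustrative) looks like:

```sql
-- Simplified sketch of the time-window filter applied to one one-day batch
SELECT
    id,
    event_type,
    event_time,
    user_id,
    properties
FROM stg_events
WHERE event_time >= '2024-10-01'  -- batch start (inclusive)
  AND event_time <  '2024-10-02'  -- batch end (exclusive)
```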

***

### How It Works

When you run an incremental model with the microbatch method, dbt:

1. Splits the time range to be processed into batches of `batch_size` (for example, one batch per day)
2. Runs each batch as an independent query
3. Continues with remaining batches even if some fail
4. Reports overall success or partial failure; failed batches can be rerun with `dbt retry`

Rather than merging new rows into the existing table, each batch replaces the rows in its own time window. This keeps individual operations small and makes rerunning a batch idempotent, which is what adds reliability for very large datasets.
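The per-batch replacement can be pictured as a delete-plus-insert over the batch's time window. This is a conceptual sketch, not the exact SQL dbt generates (adapters with native partition support may overwrite the partition instead), and the table names are illustrative:

```sql
-- Conceptual sketch: one batch replaces its own time window
BEGIN;

DELETE FROM analytics.fct_events
WHERE event_time >= '2024-10-01'
  AND event_time <  '2024-10-02';

INSERT INTO analytics.fct_events (id, event_type, event_time, user_id, properties)
SELECT id, event_type, event_time, user_id, properties
FROM analytics.stg_events
WHERE event_time >= '2024-10-01'
  AND event_time <  '2024-10-02';

COMMIT;
```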

***

### Configuration Options

The microbatch method has several unique configuration options:

```sql
{{
  config(
    materialized='incremental',
    incremental_strategy='microbatch',

    -- Microbatch-specific options
    event_time='created_at',  -- Column used to assign rows to batches
    begin='2024-01-01',       -- Start of the model's history
    batch_size='day',         -- Granularity of each batch
    lookback=3                -- Also reprocess the 3 batches before the latest
  )
}}
```

| Option       | Description                                                                             | Default |
| ------------ | --------------------------------------------------------------------------------------- | ------- |
| `event_time` | Column used to assign rows to batches (required)                                         | None    |
| `begin`      | Start of the model's history, used for the initial build and full refreshes (required)   | None    |
| `batch_size` | Granularity of each batch: `hour`, `day`, `month`, or `year` (required)                  | None    |
| `lookback`   | Number of batches before the latest to reprocess, to catch late-arriving data            | 1       |

***

### Best Practices for Microbatch Method

1. **Tune Batch Size** - Pick a `batch_size` granularity that balances per-batch runtime against the total number of batches
2. **Configure `event_time` Upstream** - Set `event_time` on upstream models too, so dbt can filter their rows to each batch instead of scanning the full table
3. **Use `lookback` for Late-Arriving Data** - Reprocess recent batches so records that arrive after their event time are not missed
4. **Monitor Partial Failures** - Set up alerting for when some batches fail but others succeed, and use `dbt retry` to rerun only the failed batches

The microbatch method is still evolving but offers a promising solution for handling very large incremental models where other methods might time out or consume too many resources. It's particularly valuable for high-volume event data or IoT datasets where individual transactions might exceed database limitations.
