Build and Test Models with New Source Data

This template creates an optimized schedule that executes dbt™ models only when necessary by comparing manifests between runs. It reduces unnecessary model runs by building only models with code changes or fresher source data, making it ideal for resource optimization and efficient production deployments.

Key Benefits

  • Run only models whose code has changed since the last successful execution

  • Execute models with fresher source data, avoiding unnecessary processing of static data

  • Automatically retry failed models from previous runs for quick production recovery

Prerequisites

  • A Scheduler Environment connected to your data warehouse provider.

  • An existing Bolt schedule that has executed the dbt source freshness command, so that its artifacts are available for manifest comparison.

    • The steps to create this schedule and generate the initial artifacts are covered in Part 1 and Part 2 below.

    • If you already have a Bolt schedule that has executed the dbt source freshness command, you can skip ahead to Part 3: Create "Build and Test Models with New Source Data" from Template.

How to Configure

Implementing the "Build and Test Models with New Source Data" schedule requires a multi-step setup process. Follow these instructions carefully to ensure a successful configuration.

Part 1: Create a Source Freshness Schedule (Prerequisite)

If you don't have an existing Bolt schedule that executes the dbt source freshness command, you'll need to set that up first. This provides the necessary artifacts for the optimized build schedule.

  1. From the Bolt home screen, select + New Schedule and then select the "Snapshot Source Data Freshness" template.

  2. Name: Provide a relevant name (e.g. "Source Freshness Check")

  3. Click "Deploy" to publish the new Source Freshness schedule.

Hint: Refer to the Snapshot Source Data Freshness documentation for more details on configuring this template.

Part 2: Generate Initial Artifacts for Source Freshness Schedule (Prerequisite)

The purpose of this part is to execute an initial run of the Source Freshness Check schedule. This generates the manifest files (artifacts) that will be used as the baseline for the optimized build schedule to compare against in subsequent runs.

  1. From the Bolt home screen, click on the newly created Source Freshness Check schedule.

  2. Click "Run" to execute an initial run of the schedule and generate the manifest files (artifacts).

  3. Verify the initial run was successful by checking the run history.

Note: To view all generated artifacts, see the documentation on analyzing individual run details.

Part 3: Create "Build and Test Models with New Source Data" from Template

Create the main Bolt schedule that will execute the dbt model builds and tests. It leverages the artifacts (manifests, run results, sources) generated in the previous prerequisite steps to enable the optimized model rebuilds.

  1. From the Bolt home screen, select + New Schedule and then select the "Build and Test Models with New Source Data" template.

  2. Schedule Type: Select Deferred, which enables manifest comparison between runs.

  3. Name: Provide a relevant name (e.g. "Build and Test Models with New Source Data")

  4. Description (Optional): Describe the purpose of this schedule (e.g. "Automatically build and test only models with fresher source data")

  5. Deferred Schedule: Initially, select the existing "Source Freshness Check" schedule. This allows the first run to have manifest files (AKA artifacts) to compare against.

  6. Last Run Type: Select Last Run to use the artifacts from the most recent execution.

  7. Command Settings: Update the existing dbt commands:

dbt source freshness 
dbt build --select source_status:fresher+

Starting with only dbt build --select source_status:fresher+ lets the first run complete and generate the necessary artifacts before you switch to the more comprehensive command in Part 5.

  8. Notification Settings (Optional): Configure success, failure, and SLA breach alerts via email, Slack, or MS Teams.

  9. Click "Deploy" to save the new schedule.
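To make the source_status:fresher+ selector from step 7 more concrete, here is a minimal Python sketch of the comparison it implies: a source counts as "fresher" when its max_loaded_at in the current sources.json is later than in the previous one. This is an illustration, not dbt's or Bolt's implementation. The sample payloads, the source names, and the rule that a source missing from the previous run counts as fresher are assumptions made for the example.

```python
from __future__ import annotations

from datetime import datetime


def parse_ts(value: str) -> datetime:
    """Parse an ISO-8601 timestamp such as '2024-01-02T03:04:05Z'."""
    # Older Python versions reject a trailing 'Z' in fromisoformat().
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def fresher_sources(previous: dict, current: dict) -> list[str]:
    """Return unique_ids of sources whose max_loaded_at advanced between runs."""
    prev = {
        r["unique_id"]: r
        for r in previous.get("results", [])
        if "max_loaded_at" in r
    }
    fresher = []
    for result in current.get("results", []):
        if "max_loaded_at" not in result:
            continue  # skip entries without a load timestamp
        old = prev.get(result["unique_id"])
        # Assumption: a source with no previous record is treated as fresher.
        if old is None or parse_ts(result["max_loaded_at"]) > parse_ts(old["max_loaded_at"]):
            fresher.append(result["unique_id"])
    return sorted(fresher)


# Hypothetical, trimmed-down sources.json payloads from two consecutive runs.
previous_run = {
    "results": [
        {"unique_id": "source.shop.raw.orders", "max_loaded_at": "2024-01-01T00:00:00Z"},
        {"unique_id": "source.shop.raw.customers", "max_loaded_at": "2024-01-01T00:00:00Z"},
    ]
}
current_run = {
    "results": [
        {"unique_id": "source.shop.raw.orders", "max_loaded_at": "2024-01-02T00:00:00Z"},
        {"unique_id": "source.shop.raw.customers", "max_loaded_at": "2024-01-01T00:00:00Z"},
        {"unique_id": "source.shop.raw.payments", "max_loaded_at": "2024-01-02T00:00:00Z"},
    ]
}

print(fresher_sources(previous_run, current_run))
```

In this sketch only the orders and payments sources would trigger downstream builds; customers has not received new data, so its downstream models are left alone.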

Part 4: Generate Initial Artifacts for "Build and Test Models with New Source Data" Schedule

Execute an initial run of the "Build and Test Models with New Source Data" schedule. This generates the necessary manifest files (artifacts) that will be used for state comparison.

  1. From the Bolt home screen, click on the newly created "Build and Test Models with New Source Data" schedule.

  2. Click "Run" to execute an initial run and generate the manifest files (artifacts).

  3. Verify the initial run was successful by checking the run history.
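Besides checking the run history in the UI, the outcome of a run can also be read from its run_results.json artifact. The sketch below is a rough illustration that assumes you have the artifact's contents loaded as a Python dictionary. The node names are hypothetical, and only the results, unique_id, and status fields are used. Nodes with an error or fail status are the ones the result:error+ and result:fail+ selectors in Part 5 are meant to pick up.

```python
from __future__ import annotations


def nodes_to_retry(run_results: dict) -> list[str]:
    """Return unique_ids of nodes that ended with an 'error' or 'fail' status."""
    retry_statuses = {"error", "fail"}
    return sorted(
        r["unique_id"]
        for r in run_results.get("results", [])
        if r.get("status") in retry_statuses
    )


def run_succeeded(run_results: dict) -> bool:
    """Treat a run as successful when no node errored or failed."""
    return not nodes_to_retry(run_results)


# Hypothetical, trimmed-down run_results.json payload.
sample_run_results = {
    "results": [
        {"unique_id": "model.shop.stg_orders", "status": "success"},
        {"unique_id": "model.shop.orders", "status": "error"},
        {"unique_id": "test.shop.not_null_orders_id", "status": "fail"},
        {"unique_id": "model.shop.customers", "status": "skipped"},
    ]
}

print(run_succeeded(sample_run_results))
print(nodes_to_retry(sample_run_results))
```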

Part 5: Transition to Self-Sustaining Schedule

Update the "Build and Test Models with New Source Data" schedule to be self-sustaining, allowing it to compare each subsequent run against its own previous successful execution. This makes the schedule more robust and efficient over time.

  1. From the Bolt home screen, click on the "Build and Test Models with New Source Data" schedule.

  2. Click "Edit" to modify the configuration.

  3. Update the Deferred Schedule to "self" so the schedule can compare against its own previous successful run.

  4. Command Settings: Reintroduce the full commands:

dbt source freshness 
dbt build --select source_status:fresher+ state:modified+ result:error+ result:fail+

Reintroducing the full command allows the schedule to compare each run against its own previous successful execution, providing a comprehensive rebuild process. This addresses changes in source data and model code, while retrying failed models to keep the data pipeline up-to-date and healthy over time.

  5. Click "Deploy" to save the changes.
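The full build command combines several selectors separated by spaces, which dbt treats as a union, and each trailing + extends the selection to everything downstream. The following Python sketch illustrates that idea on a toy dependency graph. The graph, the node names, and the three seed sets are hypothetical, and the traversal is a simplified stand-in for dbt's own node selection rather than its actual code.

```python
from __future__ import annotations

from collections import deque


def with_descendants(graph: dict[str, list[str]], seeds: set[str]) -> set[str]:
    """Return the seeds plus every node reachable downstream of them."""
    selected = set(seeds)
    queue = deque(seeds)
    while queue:
        node = queue.popleft()
        for child in graph.get(node, ()):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# Hypothetical DAG, expressed as parent -> children.
dag = {
    "source.raw.orders": ["stg_orders"],
    "source.raw.customers": ["stg_customers"],
    "source.raw.events": ["stg_events"],
    "stg_orders": ["orders"],
    "stg_customers": ["customers"],
    "stg_events": ["sessions"],
    "orders": ["revenue"],
    "customers": [],
    "sessions": [],
    "revenue": [],
}

fresher = {"source.raw.orders"}  # stands in for source_status:fresher
modified = {"customers"}         # stands in for state:modified
failed = {"sessions"}            # stands in for result:error and result:fail

# Union of the seed sets, then expand downstream (the trailing "+").
selected = with_descendants(dag, fresher | modified | failed)

# Sources are not built, so list only the models for readability.
models_to_build = sorted(n for n in selected if not n.startswith("source."))
print(models_to_build)
```

In this toy graph, stg_customers and stg_events are skipped: their sources have no new data, their code is unchanged, and they did not fail in the previous run.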

This multi-step setup ensures the "Build and Test Models with New Source Data" schedule has the necessary artifacts and configuration to intelligently rebuild only the models with fresher source data or code changes, optimizing compute and improving deployment efficiency.
