Re-executes the last dbt™ command from the point of failure

This template outlines how to configure Paradime's scheduler to retry failed dbt™ models or tests. By implementing this solution, you can create a resilient data pipeline that re-runs failed models and tests from previous executions.


Default Configuration

Schedule Settings

| Setting | Value | Explanation |
| --- | --- | --- |
| Schedule Type | Deferred | Defers to an existing production schedule so that failed runs can be retried from the point of failure |
| Schedule Name | dbt retry | Descriptive name that indicates the schedule's purpose |
| Deferred schedule | hourly run | Specifies the production schedule you want to re-run from the last point of failure |
| Git Branch | main | Uses your default production branch to ensure you're always running the latest approved code |
| Last run type (for comparison) | Last Run | Ensures the schedule defers to the last run that completed with errors |
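As a sketch, the settings above could be expressed in a `paradime_schedules.yml` file along the following lines. The field names here are illustrative assumptions, not the exact schema; refer to Paradime's scheduler documentation for the authoritative format.

```yaml
schedules:
  - name: dbt retry            # Schedule Name
    schedule: "OFF"            # no cron cadence (see Trigger Type below)
    git_branch: main           # default production branch
    deferred_schedule:
      enabled: true
      deferred_schedule_name: hourly run   # production schedule to re-run from failure
    commands:
      - dbt retry              # re-execute from the last point of failure
```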

Command Settings

The template uses a single command to re-run models and tests from the last point of failure:

  • dbt retry: Re-executes dbt™ models and tests from the last point of failure for the deferred schedule name set in the configuration.

This command ensures that all your models and tests that failed in the last run are re-executed.
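For context, `dbt retry` (available in dbt™ 1.6+) works by reading the `run_results.json` artifact from the previous invocation and re-executing only the nodes that failed or were skipped. A hypothetical local sequence looks like this:

```shell
# First run fails partway through; downstream nodes are skipped
dbt build

# Re-execute only the failed and skipped nodes from that run,
# based on the run_results.json artifact in the target/ directory
dbt retry
```

Nodes that already succeeded in the first run are not rebuilt, which is what makes the retry cheap relative to a full re-run.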

For custom command configurations, see Command Settings documentation.

Trigger Type

  • Type: Scheduled Run (Cron)

  • Cron Schedule: OFF (this schedule is not run on a cron cadence; it is triggered on demand, deferring to the last run of the production schedule that completed with errors)

For custom Trigger configurations, see Trigger Types documentation.

Notification Settings

  • Email Alerts:

    • Success: Confirms all models were re-built and tested successfully, letting you know your data pipeline is healthy

    • Failure: Immediately alerts you when models fail to build or tests fail, allowing quick response to issues

    • SLA Breach: Alerts when runs take longer than the set duration (default: 2 hours), helping identify performance degradation

For custom notification configurations, see Notification Settings documentation.

Use Cases

Primary Use Cases

  • Handling Intermittent Connection Issues: Automatically retry models that failed due to temporary connection problems with your data warehouse

  • Recovering from Resource Constraints: Retry models that failed due to memory limitations or query timeouts during peak usage times

  • Addressing Data Availability Delays: Recover from failures caused by source data not being available at the expected time

  • Managing Dependencies: Ensure downstream models get rebuilt after their dependencies are successfully retried

  • Production Recovery: Quickly restore production data pipelines after unexpected failures without manual intervention
