Deferral and State comparison

When configuring a schedule in Paradime, you can set its execution settings to defer to the state of a previous run and combine this with the state method or the source_status method.

To benefit from deferral and state comparison, your schedule configuration must use the dbt™️ run artifacts produced by a previous dbt™️ invocation.

Paradime lets you take advantage of this dbt™️ feature by setting the deferred_schedule_name configuration to the name of the schedule to defer to.

You can defer to the last invocation of the same schedule or to the last invocation of another schedule.

Additional schedule parameters available for deferred_schedule

| Parameter | Description | Example |
| --- | --- | --- |
| enabled | [Required] Set this to true to enable deferral for the schedule. | true |
| deferred_schedule_name | [Required] The name of the Bolt schedule whose most recent successful run supplies the manifest.json / sources.json / run_results.json used for state comparison. It can be another schedule or the same schedule (self-deferring). | hourly |
| successful_run_only | [Optional] By default, Paradime looks for the last successful run. Set this to false to defer to the last run of the selected schedule, irrespective of the run status. | false |
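As a rough sketch of how these parameters fit together, the fragment below shows one schedule that defers to its own previous run and one that defers to a different schedule. The schedule names `hourly_sources` and `daily` are hypothetical, and the other schedule parameters (cron, environment, notifications) are left out for brevity.

```yaml
schedules:
  # Option 1: self-deferral. The comparison uses the artifacts
  # from the last run of this same schedule.
  - name: hourly
    deferred_schedule:
      enabled: true
      deferred_schedule_name: hourly # same value as `name` above
    commands:
      - dbt source freshness
      - dbt build --select source_status:fresher+

  # Option 2: defer to another schedule (hypothetical name `daily`).
  # Assumption: `daily` also runs `dbt source freshness`, so that a
  # sources.json from it exists to compare against.
  - name: hourly_sources
    deferred_schedule:
      enabled: true
      deferred_schedule_name: daily
      # successful_run_only is omitted here, so the default applies
      # and only the last successful run of `daily` is considered.
    commands:
      - dbt source freshness
      - dbt build --select source_status:fresher+
```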

ℹ️ You can read more about the state method and the source_status method in the dbt™️ documentation.

The "source_status" method

Only supported by dbt Core™️ v1.1 or newer.

One element of job state is the source_status of a prior dbt™️ invocation. After executing dbt source freshness, dbt™️ creates the sources.json artifact which contains execution times and max_loaded_at dates for dbt™️ sources.

You can use the source_status method together with the deferred_schedule_name configuration in Paradime to compare the source freshness state of the current run with that of a previous run of the same schedule or of another schedule. Combined with a dbt run or dbt build command, this lets you run only the models downstream of sources that contain "fresher" data.

Example schedule configuration using source_status method

Below is an example of a Bolt schedule using deferred configuration in addition to the standard schedule configuration parameters.

To use this feature, set enabled: true under the deferred_schedule parameter and set deferred_schedule_name to the name of an existing Bolt schedule to defer to.

schedules:
  - name: hourly # the name of your schedule
    deferred_schedule:
      enabled: true # true to enable this schedule to use deferred state
      deferred_schedule_name: source_status # the name of the Bolt schedule whose previous run provides the manifest.json / sources.json / run_results.json for state comparison
      successful_run_only: false # [optional] by default Paradime looks for the last successful run; set this to false to defer to the last run of the selected schedule, irrespective of the run status
    schedule: "@hourly" # the schedule cron configuration
    environment: production # the environment used to run the schedule -> this is always production
    commands:
      - dbt source freshness # must be run again to compare current to previous state
      - dbt build --select source_status:fresher+
    owner_email: "john@acme.io" # the email of the schedule owner
    slack_on: # when a Slack notification is triggered; here, whenever the run completes, whether it passed or failed
      - passed
      - failed
    slack_notify: # the channel/user that will be notified
      - "#data-alerts"
      - "@john"
    email_notify: # the email addresses that will be notified
      - "john@acme.com"
      - "data_team@acme.com"
    email_on: # when an email notification is triggered; here, whenever the run completes, whether it passed or failed
      - passed
      - failed
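
The same deferral settings can also be paired with the state method mentioned at the top of this page. The variant below is a minimal sketch rather than a reference configuration: it assumes Paradime makes the deferred schedule's manifest.json available for the comparison, so no explicit --state path is passed, and it omits the notification settings shown above.

```yaml
schedules:
  - name: hourly
    deferred_schedule:
      enabled: true
      deferred_schedule_name: hourly # self-deferring: compare against this schedule's own previous run
    schedule: "@hourly"
    environment: production
    commands:
      # state:modified+ selects models that changed relative to the
      # deferred manifest.json, plus everything downstream of them
      - dbt build --select state:modified+
    owner_email: "john@acme.io"
```

Unlike the source_status example, this variant does not need a dbt source freshness step, because the comparison is made against the manifest and not against sources.json.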
