Using --defer in Paradime

The --defer feature in Paradime allows you to leverage production data and schemas during development, significantly speeding up your dbt™ workflow. This guide will walk you through using --defer, from basic usage to advanced features.

Basic Usage of --defer

With Paradime, you can continuously develop using production data and schema. When you enable defer to prod in a dbt™ run command, Paradime automatically fetches the latest manifest.json, so you always work with the most current production data and schema.

Prerequisites

  1. An existing Bolt Schedule

  2. An Enabled "Defer to Production" Schedule

How it works

When using --defer, dbt™ resolves ref calls based on two criteria:

  1. Is the referenced node included in the current run's model selection?

  2. Does the referenced node exist as a database object in your development environment?

If both answers are No, --defer resolves the ref() using the namespace from the state manifest of the specified schedule.
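The two-question rule above can be sketched in a few lines of Python. This is only an illustration of the decision logic, not Paradime's or dbt™'s actual implementation: the function name `resolve_ref_schema`, its parameters, and the sample model and schema names are assumptions made for the example.

```python
def resolve_ref_schema(node, selected, dev_objects, dev_schema, state_schemas):
    """Return the schema a ref(node) call would point at under --defer (sketch)."""
    in_selection = node in selected        # question 1: part of this run's selection?
    exists_in_dev = node in dev_objects    # question 2: already built in development?
    if not in_selection and not exists_in_dev:
        # Both answers are "No": use the namespace recorded in the state manifest.
        return state_schemas[node]
    return dev_schema


# Hypothetical inputs: only dim_customers is selected, nothing is built in dev yet.
selected = {"dim_customers"}
dev_objects = set()
state_schemas = {"stg_orders": "dbt_prod", "dim_customers": "dbt_prod"}

print(resolve_ref_schema("stg_orders", selected, dev_objects, "dbt_fabio", state_schemas))
# -> dbt_prod
print(resolve_ref_schema("dim_customers", selected, dev_objects, "dbt_fabio", state_schemas))
# -> dbt_fabio
```

In this sketch, stg_orders is deferred because it is neither selected nor present in development, while the selected model itself stays in the development schema.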

Using --defer in the Terminal

You can use --defer from the Code IDE's integrated terminal:

dbt run --select <model> --defer --schedule-name=<schedule_name>

Example: Deferred vs Standard Run

To illustrate the effect of using --defer, here is the compiled SQL produced by a deferred run:

dim_customers.sql
with orders as (

    select * from `dbt-demo-project`.`dbt_prod`.`stg_orders`

),

final as (

    select
        customer_id,
        min(order_date) as first_order,
        max(order_date) as most_recent_order,
        count(order_id) as number_of_orders
    from orders
    group by 1
)

select * from final

Notice how the deferred run resolves stg_orders in the production schema (dbt_prod). A standard run would resolve it in the development schema (dbt_fabio) instead.

Viewing the Deferred Schedule

After running a dbt™ command with --defer, you can view details of the production run used for deferral in the Integrated Terminal. The output includes a clickable URL to the Bolt UI for more information.

Advanced Use of --defer

Using --favor-state

The --favor-state flag provides additional control over how dbt™ resolves node references:

dbt run --select customer_orders --defer --favor-state --schedule-name=daily_run

This command tells dbt™ to prefer the state from the 'daily_run' schedule, even if the models exist in your current environment.
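As a rough sketch of how --favor-state changes the resolution rule, the Python below adds a `favor_state` switch to the same decision. It assumes, in line with how dbt™ describes the flag, that models included in the current selection still resolve to your development schema. The function name, parameters, and sample values are hypothetical and chosen only for illustration.

```python
def resolve_ref_schema(node, selected, dev_objects, dev_schema, state_schemas,
                       favor_state=False):
    """Sketch of ref resolution under --defer, optionally with --favor-state."""
    if node in selected:
        # Assumption: selected models are always built and read in development.
        return dev_schema
    if favor_state or node not in dev_objects:
        # With favor_state, the state manifest wins even if a dev copy exists.
        return state_schemas[node]
    return dev_schema


# Hypothetical inputs: stg_orders already exists in the development schema.
selected = {"customer_orders"}
dev_objects = {"stg_orders"}
state_schemas = {"stg_orders": "dbt_prod"}

print(resolve_ref_schema("stg_orders", selected, dev_objects, "dbt_fabio", state_schemas))
# -> dbt_fabio
print(resolve_ref_schema("stg_orders", selected, dev_objects, "dbt_fabio", state_schemas,
                         favor_state=True))
# -> dbt_prod
```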

The "source_status" Method

The source_status method allows you to run only models with dependencies on fresher source data. Here's an example configuration:

schedules:
  - name: hourly # the name of your schedule
    deferred_schedule:
      enabled: true # true to enable this schedule to use deferred state
      deferred_schedule_name: source_status # the name of the Bolt schedule used to look for the most recent successful run's manifest.json / sources.json / run_results.json for state comparison
      successful_run_only: false # [optional] by default Paradime looks for the last successful run. Set this to false to let Paradime defer to the last run of the selected schedule, irrespective of the run status
    schedule: "@hourly" # the schedule cron configuration
    environment: production #the environment used to run the schedule -> this is always production
    commands:
      - dbt source freshness # must be run again to compare current to previous state
      - dbt build --select source_status:fresher+
    owner_email: "john@acme.io" #the email of the schedule owner
    slack_on: # the configuration of when a notification is triggered. Here we want to send a notification when the run is completed either successfully or when failing
      - passed
      - failed
    slack_notify: # the channel/user that will be notified
      - "#data-alerts"
      - "@john"
    email_notify: # the email addresses that will be notified
      - "john@acme.com"
      - "data_team@acme.com"
    email_on: # the configuration of when a notification is triggered. Here we want to send a notification when the run is completed either successfully or when failing
      - passed
      - failed

This configuration runs dbt source freshness and then builds only models affected by fresher source data.
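To make the `source_status:fresher+` selection more concrete, here is a minimal Python sketch of the idea: compare each source's latest load time between the previous and current freshness results, then collect everything downstream of the sources that moved forward. The data structures, node names, and the choice to treat sources missing from the previous run as fresher are assumptions of this sketch, not a description of dbt™ internals.

```python
from collections import deque
from datetime import datetime


def fresher_sources(previous, current):
    """Sources whose max loaded-at timestamp advanced since the previous run."""
    return {
        name
        for name, loaded_at in current.items()
        # Assumption: a source absent from the previous run counts as fresher.
        if name not in previous or loaded_at > previous[name]
    }


def with_descendants(roots, children):
    """Breadth-first walk that returns the roots plus all downstream nodes."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in children.get(node, []):
            if child not in seen:
                seen.add(child)
                queue.append(child)
    return seen


# Hypothetical freshness results from the previous and the current run.
previous = {
    "source.raw.orders": datetime(2024, 1, 1, 8),
    "source.raw.customers": datetime(2024, 1, 1, 8),
}
current = {
    "source.raw.orders": datetime(2024, 1, 1, 9),     # new data arrived
    "source.raw.customers": datetime(2024, 1, 1, 8),  # unchanged
}
# Hypothetical DAG: parent -> children.
children = {
    "source.raw.orders": ["stg_orders"],
    "source.raw.customers": ["stg_customers"],
    "stg_orders": ["dim_customers"],
    "stg_customers": ["dim_customers"],
}

fresher = fresher_sources(previous, current)
print(sorted(with_descendants(fresher, children)))
# -> ['dim_customers', 'source.raw.orders', 'stg_orders']
```

Here stg_customers is skipped because its only source did not receive new data, which is the saving this kind of schedule aims for.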

Read more about the state method and the source_status method.

Additional schedule parameters available for deferred_schedule

When configuring a deferred schedule, you can use the following parameters in the deferred_schedule section:

| Parameter | Description | Example |
| --- | --- | --- |
| enabled | [Required] Set this to true to enable deferral for the schedule. | true |
| deferred_schedule_name | [Required] The name of the Bolt schedule used to look for the most recent successful run artifacts (manifest.json, sources.json, run_results.json) for state comparison. It can be another schedule or the same schedule name (self-deferring). | hourly |
| successful_run_only | [Optional] By default, Paradime will look for the last successful run. Set this to false to use the last run of the selected schedule, regardless of its status. | false |
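The effect of successful_run_only can be pictured as a filter applied before choosing the most recent run. The Python below is a sketch under stated assumptions: the run records, their `status` and `finished_at` fields, and the function `pick_deferred_run` are hypothetical and do not reflect Paradime's API.

```python
def pick_deferred_run(runs, successful_run_only=True):
    """Pick the run whose artifacts would be used for deferral (sketch)."""
    if successful_run_only:
        candidates = [run for run in runs if run["status"] == "passed"]
    else:
        candidates = list(runs)
    if not candidates:
        return None  # nothing to defer to
    # ISO 8601 timestamps in the same format and timezone sort chronologically.
    return max(candidates, key=lambda run: run["finished_at"])


# Hypothetical run history for the deferred schedule.
runs = [
    {"id": "run_1", "status": "passed", "finished_at": "2024-01-01T08:00:00Z"},
    {"id": "run_2", "status": "failed", "finished_at": "2024-01-01T09:00:00Z"},
]

print(pick_deferred_run(runs)["id"])
# -> run_1
print(pick_deferred_run(runs, successful_run_only=False)["id"])
# -> run_2
```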
