Schedules as Code

What are Paradime YAML-Based Schedules?

Paradime YAML schedules are configuration-as-code definitions, allowing you to define, version, and manage your data pipeline schedules directly within your dbt project repository. These schedules are configured in a single file named paradime_schedules.yml located in the root directory of your dbt project (alongside dbt_project.yml).

Prerequisites

To run YAML-based schedules, first connect your data warehouse to the Scheduler Environment.

File Location and Structure

your-dbt-project/
├── dbt_project.yml
├── paradime_schedules.yml    # All schedules must be defined here
├── models/
├── tests/
└── ...

Why Use YAML-Based Schedules?

Version Control
- Schedule configurations are tracked alongside your dbt models
- Review schedule modifications through Pull Requests
- Enforce team review processes for schedule changes
Infrastructure as Code
- Schedules are treated as code, not just UI configurations
- Easy replication across environments
- Enables automated deployment pipelines
Team Collaboration
- Simplified schedule review process
- Standard formatting and validation
- Documentation lives with the code

How are YAML-based schedules deployed?

Schedules are always read from the paradime_schedules.yml file on your default branch (usually main or master).

Automatic Refresh: Paradime checks for changes every 10 minutes.
Manual Refresh: For immediate updates, navigate to the Bolt interface and click Parse Schedules.

💡 Note: To update your schedules, make sure to merge your changes to the default branch first.

Configuration Reference

This section describes the YAML configuration format for scheduling and managing automated tasks. The configuration supports various execution modes including scheduled runs, trigger-based execution, deferred scheduling, CI/CD integration and API.

💡 Looking for complete examples? Jump to the Example: Complete Configuration section below.

Base Configuration

Every scheduler configuration must include these basic fields:

name: string               # Name of the schedule
description: string        # Job description
owner_email: string       # Email of the job owner
git_branch: string        # Target Git branch
environment: string       # Only "production" is supported
commands:                 # List of commands to execute
  - string
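As a rough sketch of how these fields fit together: each schedule is one entry in the top-level schedules list, as in the examples later on this page. The name, email address, and command below are placeholders rather than values from a real project, and a schedule value is included because every example on this page carries one.

```yaml
schedules:
  - name: nightly build                  # placeholder name
    description: "Nightly build of all dbt models"
    owner_email: "owner@example.com"     # placeholder address
    git_branch: main                     # branch the commands run against
    environment: production              # only "production" is supported
    commands:                            # executed in the order listed
      - dbt build
    schedule: '0 2 * * *'                # at 02:00 every day
```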

Execution Modes

1. Schedule-Triggered Execution

Basic scheduled execution using a cron expression:

schedule: '0 */2 * * *'     # Runs every 2 hours
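A few more cron expressions, for illustration only. Each line is an alternative value for the same schedule key, so only one would appear in a given schedule. The time zone in which they are evaluated is not stated on this page.

```yaml
# Alternatives for the schedule key; use exactly one per schedule.
schedule: '*/15 * * * *'        # every 15 minutes
# schedule: '0 6 * * 1-5'       # at 06:00, Monday through Friday
# schedule: '30 8 1 * *'        # at 08:30 on the first day of each month
# schedule: '0,30 6-23 * * *'   # at minute 0 and 30 of every hour from 06 through 23
```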

2. Run Completion Trigger

Triggers execution based on completion of another job:

schedule: 'OFF'
schedule_trigger:
  enabled: true
  schedule_name: string     # Name of the trigger schedule
  workspace_name: string    # Workspace containing the trigger schedule
  trigger_on:               # Events that trigger execution
    - passed
    - failed

3. Merge Trigger

Triggers execution on merge events:

schedule: 'OFF'
trigger_on_merge: true     # Enables merge-triggered execution

Requires GitHub integration.

4. Deferred Scheduling

Allows a schedule to use dbt defer, comparing against the artifacts of another schedule's run:

schedule: 'OFF'
deferred_schedule:
  enabled: true
  deferred_schedule_name: string  # Name of the deferred schedule
  successful_run_only: boolean

5. Turbo CI Configuration

Configuration for CI pipelines:

schedule: 'OFF'
turbo_ci:
  enabled: true
  deferred_schedule_name: string  # Name of the schedule for CI
  successful_run_only: boolean    # Whether to run only after successful executions
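For illustration, a filled-in Turbo CI schedule might look like the sketch below. It assumes the turbo_ci block sits alongside the base fields inside a schedules entry. The schedule names and email address are placeholders.

```yaml
schedules:
  - name: turbo ci                            # placeholder name
    description: "Build modified models for each pull request"
    owner_email: "owner@example.com"          # placeholder address
    git_branch: main
    environment: production
    commands:
      - dbt build --select state:modified+    # modified models and everything downstream
    schedule: 'OFF'                           # no cron expression for CI runs
    turbo_ci:
      enabled: true
      deferred_schedule_name: daily run       # placeholder: the schedule named for CI
      successful_run_only: true
```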

6. API Configuration

Basic configuration when triggering Bolt via API:

schedule: 'OFF'

For more details on Paradime APIs check our Developers guide.
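A minimal sketch of a schedule meant to be started only through the API, assuming the base fields shown earlier. The name and email address are placeholders.

```yaml
schedules:
  - name: api triggered run              # placeholder name
    description: "Runs only when started through the API"
    owner_email: "owner@example.com"     # placeholder address
    git_branch: main
    environment: production
    commands:
      - dbt run
    schedule: 'OFF'                      # no cron expression; nothing runs on a timer
```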

7. Suspended State

Configuration for suspended jobs:

suspended: true            # Indicates the job is suspended
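As a sketch, a suspended schedule could keep its cron expression in place. This assumes suspended sits at the same level as the other schedule fields, which this page implies but does not state outright. The name and email address are placeholders.

```yaml
schedules:
  - name: hourly refresh                 # placeholder name
    description: "Hourly refresh, currently paused"
    owner_email: "owner@example.com"     # placeholder address
    git_branch: main
    environment: production
    commands:
      - dbt run
    schedule: '0 * * * *'                # at minute 0 of every hour
    suspended: true                      # marks the job as suspended
```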

Notifications Configuration

Notifications can be configured for various events through multiple channels:

notifications:
  emails:                  # Email notifications
    - address: string      # Recipient email address
      events:              # Events that notify this recipient
        - passed           # Schedule completed successfully
        - failed           # Schedule completed with errors
        - sla              # SLA threshold exceeded

  microsoft_teams:         # Microsoft Teams notifications
    - channel: string      # Teams channel name
      events:              # Events that notify this channel
        - passed           # Schedule completed successfully
        - failed           # Schedule completed with errors
        - sla              # SLA threshold exceeded

  slack_channels:          # Slack notifications
    - channel: string      # Slack channel name
      events:              # Events that notify this channel
        - passed           # Schedule completed successfully
        - failed           # Schedule completed with errors
        - sla              # SLA threshold exceeded

sla_minutes: number        # SLA threshold in minutes
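The complete examples further down use email and Slack only, so here is a sketch that routes alerts to Microsoft Teams as well. The channel name, email address, and schedule name are placeholders.

```yaml
schedules:
  - name: daily run                      # placeholder name
    description: "Daily build of all dbt models"
    owner_email: "owner@example.com"     # placeholder address
    git_branch: main
    environment: production
    commands:
      - dbt build
    schedule: '0 10 * * *'               # at 10:00 every day
    sla_minutes: 90                      # threshold for the sla event, in minutes
    notifications:
      microsoft_teams:
        - channel: data-alerts           # placeholder Teams channel
          events:
            - failed
            - sla
      emails:
        - address: "owner@example.com"   # placeholder recipient
          events:
            - sla
```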

For Slack and MS Teams notifications, check our integrations guide.

Example: Complete Configuration

schedules:
  - name: daily run
    description: "Daily build of all dbt models"
    owner_email: [email protected] 
    environment: production
    git_branch: main  
    commands:
      - dbt run
      - dbt test
    schedule: '0 10 * * *'
    sla_minutes: 60
    notifications:
      emails:                 
        - address: [email protected] 
          events:
            - failed
            - sla
      slack_channels:
        - channel: data-team    
          events:            
            - passed         
            - failed         
        - channel: pipeline-monitoring    
          events:                    
            - failed         
            - sla

schedules:
  - name: "Finance reports update"
    description: "Update all finance models"
    owner_email: [email protected] 
    environment: production
    git_branch: main  
    commands:
      - dbt run --select tag:finance
    schedule: 'OFF'
    schedule_trigger:
      enabled: true
      schedule_name: "daily run"
      workspace_name: data-platform
      trigger_on:
        - passed
        - failed  
    sla_minutes: 60
    notifications:
      emails:                 
        - address: [email protected] 
          events:
            - failed
            - sla
      slack_channels:
        - channel: data-team    
          events:            
            - passed         
            - failed         
        - channel: pipeline-monitoring    
          events:                    
            - failed         
            - sla

schedules:
  - name: "On Merge run CD"
    description: "Continuous deployment run that deploys changes as soon as they are merged to the main branch"
    owner_email: [email protected] 
    environment: production
    git_branch: main  
    commands:
      - dbt run --select state:modified+
    schedule: 'OFF'
    trigger_on_merge: true 
    deferred_schedule:
      enabled: true
      deferred_schedule_name: "On Merge run CD"  
      successful_run_only: True     
    sla_minutes: 60
    notifications:
      emails:                 
        - address: [email protected] 
          events:
            - failed
      slack_channels:
        - channel: data-team    
          events:            
            - passed         
            - failed         
        - channel: pipeline-monitoring    
          events:                    
            - failed

schedules:
  - name: "High Frequency run"
    description: "Trigger a run and build models only when new data has landed in source tables"
    owner_email: [email protected] 
    environment: production
    git_branch: main  
    commands:
      - dbt source freshness
      - dbt build --select source_status:fresher+ state:modified+ result:error+ result:fail+
    schedule: '0,30 6-23 * * *'
    deferred_schedule:
      enabled: true
      deferred_schedule_name: "High Frequency run"
      successful_run_only: False     
    sla_minutes: 60
    notifications:
      emails:                 
        - address: [email protected] 
          events:
            - failed
            - sla
      slack_channels:
        - channel: data-team    
          events:            
            - passed         
            - failed         
        - channel: pipeline-monitoring    
          events:                    
            - failed         
            - sla

schedules:
  - name: "Turbo CI run"
    description: "Build modified models in a temporary schema when a Pull Request is opened"
    owner_email: [email protected] 
    environment: production
    git_branch: main  
    commands:
      - dbt build --select state:modified+
    schedule: 'OFF'
    turbo_ci:
      enabled: true
      deferred_schedule_name: "High Frequency run"
      successful_run_only: False     
    sla_minutes: 60
    notifications:
      emails:                 
        - address: [email protected] 
          events:
            - failed
            - sla
      slack_channels:
        - channel: data-team    
          events:            
            - passed         
            - failed         
        - channel: pipeline-monitoring    
          events:                    
            - failed         
            - sla

Best Practices

Schedule Format

- Use standard cron expressions for scheduling
  - ✅ Standard cron to define days 0-6
    - 10 * * * 0-6 : At minute 10 on every day-of-week from Sunday through Saturday.
  - ❌ Non-standard cron to define days 1-7
    - 10 * * * 1-7 : At minute 10 on every day-of-week from Monday through Sunday.
- Use 'OFF' to disable scheduled execution
- Use crontab.guru to validate your cron expressions

SLA Configuration

- Set sla_minutes based on job complexity
- Consider dependencies when setting the SLA
- Recommended minimum: 30 minutes

Notification Configuration

- Configure at least one notification channel
- Include critical events (failed, sla) in notifications
- Use team channels for collaborative workflows
- Make sure to set the Slack / MS Teams channel or email for system notifications. See our Notifications Settings guide.

Paradime schedules terminal commands

Before running any of the following commands, navigate to your dbt™️ project directory where paradime_schedules.yml is located.

| CLI command | Description |
| --- | --- |
| paradime schedule verify | Validate file format: checks paradime_schedules.yml for formatting errors and outputs the result. |
| paradime schedule run | Run schedules locally: runs all defined schedules in your local context (i.e., your development environment and your current branch). |
| paradime schedule run <schedule_name> | Run a selected schedule locally: runs the named schedule in your local context (i.e., your development environment and your current branch). |
| paradime schedule run --dry-run | Dry run: simulates all schedule executions without running dbt™️ models. |
| paradime schedule run --dry-run <schedule_name> | Dry run: simulates the named schedule's execution without running dbt™️ models. |

