Using Azure DevOps

With Paradime Turbo CI, you can run and test code changes to your dbt™️ project before merging them into your main branch, by configuring a Bolt schedule to run when a new pull request is opened against your dbt™️ repository.

Paradime builds the models affected by your changes into a temporary schema and runs the tests you have written against the changed models to validate that they still pass. When you open a pull request, the Paradime Turbo CI run status is shown in the pull request.

How Turbo CI works

When Turbo CI is configured, an Azure Pipeline triggers a Turbo CI schedule in Paradime, which executes your configured dbt™️ commands to run and test your dbt™️ changes.

When running, models are built in a temporary schema using the prefix paradime_turbo_ci_pr_ followed by the first eight characters of the commit SHA (e.g. paradime_turbo_ci_pr_6d8f55c0). This enables you to see the results of the changes in your pull request in your production data warehouse.

Pre-requisites

To use this feature, you need an additional production environment configured in Paradime, with the target set to ci.

ℹ️ Check our setup guide here based on your data warehouse provider.

Configure a custom schema for CI runs

Customize your dbt™️ schema generation at runtime so that, when a PR is opened, Paradime Turbo CI builds and tests your changed dbt™️ models in a temporary schema. To do this, update the dbt™️ generate_schema_name.sql macro.

In the macros folder of your dbt™️ project, create a macro called generate_schema_name.sql and use logic similar to the example below: when a Bolt run is executed with the ci target, your models are built in a schema that uses the paradime_turbo_ci_pr prefix.

{% macro generate_schema_name(custom_schema_name, node) -%}

    {%- set default_schema = target.schema -%}
    {%- if target.name == 'prod' -%}

        {%- if custom_schema_name is none -%}
            {{ default_schema }}
        {%- else -%}
            {{ custom_schema_name | trim }}
        {%- endif -%}

    {%- elif target.name == 'ci' -%}

        {{'PARADIME_TURBO_CI_PR'}}_{{ env_var('PARADIME_SCHEDULE_GIT_SHA', 'unknown_sha') [:8] }}_{{ default_schema | trim }}

    {%- else -%}

        {%- if custom_schema_name is none -%}

            {{ default_schema }}

        {%- else -%}

            {{ default_schema }}_{{ custom_schema_name | trim }}

        {%- endif -%}

    {%- endif -%}

{%- endmacro %}
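To make the branching in the macro easier to reason about, here is a minimal Python sketch that mirrors it. It is an illustration only, not something dbt™️ executes: the function name and parameters are hypothetical, and git_sha stands in for the value of the PARADIME_SCHEDULE_GIT_SHA environment variable.

```python
def generate_schema_name(target_name, default_schema, custom_schema_name=None,
                         git_sha="unknown_sha"):
    """Illustrative Python mirror of the generate_schema_name macro's branches."""
    if target_name == "prod":
        # Production: use the custom schema as-is when one is set.
        if custom_schema_name is None:
            return default_schema
        return custom_schema_name.strip()
    if target_name == "ci":
        # CI: fixed prefix, first 8 characters of the commit SHA, then the target schema.
        return f"PARADIME_TURBO_CI_PR_{git_sha[:8]}_{default_schema.strip()}"
    # Any other target: namespace custom schemas under the target schema.
    if custom_schema_name is None:
        return default_schema
    return f"{default_schema}_{custom_schema_name.strip()}"


print(generate_schema_name("ci", "analytics", git_sha="6d8f55c0a1b2c3d4"))
```

Note that in the ci branch the custom schema name is ignored, so all models changed in a pull request land in a single temporary schema.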


Deferral and State Comparison explained

When creating the Turbo CI job in Paradime, you can set your execution settings to defer to a previous run state by setting the deferred_manifest_schedule configuration to the name of the production Bolt schedule you want to defer to.

Paradime looks at the manifest.json from the specified schedule's most recent successful run (unless you set the optional parameter successful_run_only: false) to determine the set of new and modified dbt™️ resources. In your Turbo CI commands, you can then use the state:modified+ selector to run only the modified resources and their downstream dependencies.

A common example is:

dbt build --select state:modified+ --target ci

This runs and tests all modified models together with their downstream dependencies, which is useful for validating that tests still pass after changes to upstream dbt™️ models.
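The idea behind state:modified+ can be sketched in a few lines of Python. This is a deliberately simplified model, not how dbt™️ implements it: it assumes a node is "modified" when it is new or its checksum differs from the deferred run, whereas dbt™️ also compares other properties. The model names, checksums and the modified_plus helper are all hypothetical.

```python
from collections import deque


def modified_plus(deferred_checksums, pr_checksums, child_map):
    """Return new or changed nodes plus everything downstream of them."""
    # A node counts as modified if it is new or its checksum changed.
    modified = {
        node for node, checksum in pr_checksums.items()
        if deferred_checksums.get(node) != checksum
    }
    selected = set(modified)
    queue = deque(modified)
    # Breadth-first walk over child edges to collect downstream nodes.
    while queue:
        for child in child_map.get(queue.popleft(), []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


deferred = {"stg_orders": "a1", "orders": "b1", "customers": "c1"}
pull_request = {"stg_orders": "a2", "orders": "b1", "customers": "c1", "stg_payments": "d1"}
children = {"stg_orders": ["orders"], "stg_payments": ["orders"], "orders": []}

print(sorted(modified_plus(deferred, pull_request, children)))
# ['orders', 'stg_orders', 'stg_payments']
```

Here customers is untouched and has no modified parent, so it is not rebuilt in the CI schema.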

👉 To read more about state comparison check the dbt Core™️ documentation here.

1. Configuring Bolt Turbo CI job

Turbo CI in Paradime is configured in the same file as all your other Bolt schedules: paradime_schedules.yml. Setting up a Turbo CI job is very similar to setting up a normal Bolt schedule, with a few additional configurations.

A Turbo CI job differs from a Bolt schedule in that it involves:

  • The Bolt schedule to defer to

  • Commands that use the state:modified+ selector to build and test only new or modified dbt™️ models and their downstream dependencies. State comparison can only happen when a deferred schedule name is configured to compare state against.

  • Being triggered by a pull request opened in your Azure DevOps dbt™️ repository

Additional schedule parameters available for Turbo CI schedules

| Parameter | Description | Example |
| --- | --- | --- |
| enabled | [Required] Set this to true to enable deferral for the schedule. | true |
| deferred_manifest_schedule | [Required] The name of the Bolt schedule used to look for the most recent successful run's manifest.json / sources.json / run_results.json for state comparison. | hourly |
| successful_run_only | [Optional] By default, Paradime looks for the last successful run. Set this to false to let Paradime defer to the last run of the selected schedule, irrespective of the run status. | false |
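The effect of successful_run_only can be illustrated with a small Python sketch. This is a model of the documented behavior rather than Paradime's implementation: the pick_deferred_run helper, the run dictionaries and the FAILED status label are assumptions made for the example.

```python
def pick_deferred_run(runs, successful_run_only=True):
    """Pick the run whose artifacts would be used for state comparison.

    `runs` is ordered oldest to newest; each item has an "id" and a "status".
    """
    for run in reversed(runs):
        # With successful_run_only disabled, the latest run wins regardless of status.
        if not successful_run_only or run["status"] == "SUCCESS":
            return run
    return None


runs = [
    {"id": 101, "status": "SUCCESS"},
    {"id": 102, "status": "FAILED"},  # hypothetical status label
]

print(pick_deferred_run(runs)["id"])
print(pick_deferred_run(runs, successful_run_only=False)["id"])
```

With the default setting the sketch defers to run 101; with successful_run_only set to false it defers to run 102, even though that run did not succeed.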

Example Turbo CI configuration

paradime_schedules.yml
- name: turbo_ci # the name of your Turbo CI job
  turbo_ci:
    enabled: true # set to true to enable this Turbo CI job to run on pull requests
    deferred_manifest_schedule: hourly # the name of the Bolt schedule whose most recent successful run manifest.json is used for state comparison
    successful_run_only: false # [optional] by default Paradime uses the last successful run; set this to false to defer to the last run of the selected schedule, irrespective of the run status
  schedule: "OFF" # do not run on a schedule (this job is triggered by pull requests only)
  environment: production # the environment used to run the schedule -> this is always production
  commands:
    - dbt build --select state:modified+ --target ci # the dbt command to run when the pull request is opened
  owner_email: "john@acme.io" # the email of the Turbo CI job owner
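Before committing the configuration, it can help to sanity-check the entry. The following Python sketch assumes the schedule has already been loaded into a dictionary (for example with a YAML parser) and checks only the requirements described above. The check_turbo_ci_schedule helper is hypothetical and is not part of Paradime.

```python
def check_turbo_ci_schedule(schedule):
    """Return a list of problems found in one schedule entry (empty if none)."""
    problems = []
    turbo_ci = schedule.get("turbo_ci") or {}
    if turbo_ci.get("enabled") is not True:
        problems.append("turbo_ci.enabled must be true")
    if not turbo_ci.get("deferred_manifest_schedule"):
        problems.append("turbo_ci.deferred_manifest_schedule is required")
    commands = schedule.get("commands") or []
    # State comparison only helps if a command actually selects on state.
    if not any("state:modified" in command for command in commands):
        problems.append("no command uses a state:modified selector")
    return problems


schedule = {
    "name": "turbo_ci",
    "turbo_ci": {"enabled": True, "deferred_manifest_schedule": "hourly"},
    "schedule": "OFF",
    "environment": "production",
    "commands": ["dbt build --select state:modified+ --target ci"],
}

print(check_turbo_ci_schedule(schedule))
# []
```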

2. Create an Azure Pipeline

Generate API keys and find your workspace token

API keys are generated at the workspace level.

To be able to trigger Bolt using the API, you will first need to generate API keys for your workspace. Go to account settings and generate your API keys, and make sure to save the following in your password manager:

  • API key

  • API secret

  • API Endpoint

  • Workspace token

You will need these later when setting up the secrets in Azure Pipelines.

Related pages: Generate API keys, Company & Workspace token

Create an Azure Pipeline

Now you will need to create a new azure-pipeline.yml file in your dbt™️ repository. Copy the code block below and enter the values required.

Example Azure pipelines configuration file
paradime_turbo_ci.yml
trigger:
  - none

pr:
  - '*'

variables:
  - name: schedule_name
    value: <the schedule name set in Paradime> # example: turbo_ci
  - name: api_endpoint
    value: <the api endpoint generated in the previous step> #example https://api.paradime.io/api/v1/uany7ed234ovarzx/graphql
  - name: workspace_token
    value: <the workspace token generated in the previous step> #example 8p232d9mo4cvea9w
  - name: base_paradime_bolt_url
    value: <the Paradime URL of your instance, make sure to include /bolt/run_id/> # Example 'https://app.paradime.io/bolt/run_id/'
  - name: git_head
    value: $(System.PullRequest.SourceCommitId)
  - name: pythonVersion
    value: '3.8'

pool:
  vmImage: 'ubuntu-latest'


steps:
- task: UsePythonVersion@0
  inputs:
    versionSpec: $(pythonVersion)
    addToPath: true
  displayName: 'Set Python Version'

- script: |
    python -m pip install --upgrade pip
    pip install paradime-io
  displayName: 'Install dependencies'

- script: |
    export PYTHONUNBUFFERED=1
    python -c "
    import time
    from paradime import Paradime
  
    # Create a Paradime client with your API credentials
    paradime = Paradime(api_endpoint='${{variables.api_endpoint}}', api_key='$(API_KEY)', api_secret='$(API_SECRET)')
  
    # Trigger a run of the Bolt schedule and get the run ID
    run_id = paradime.bolt.trigger_run(schedule_name='${{variables.schedule_name}}', branch='${{variables.git_head}}')
    print(f'Triggered Bolt Run. View run details: ${{variables.base_paradime_bolt_url}}{run_id}?workspaceToken=${{variables.workspace_token}}')
     
    # Continuously check the run status
    while True:
      run_status = paradime.bolt.get_run_status(run_id)
      print(f'Run Status: {run_status}')
      if run_status != 'RUNNING':
        break  # Exit loop if status is anything other than RUNNING
      time.sleep(10)  # Wait for 10 seconds before checking again

    exit(0 if run_status == 'SUCCESS' else 1)
    "
  displayName: 'Trigger and Monitor Paradime Bolt Run'
  # Define a timeout for this step
  timeoutInMinutes: 60
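The polling loop embedded in the pipeline can also be written as a small reusable function, which makes it easier to test locally. The sketch below is one possible shape under stated assumptions: wait_for_run and its parameters are hypothetical, and get_status is any callable that returns the run status for a run ID (in the pipeline above, that role is played by paradime.bolt.get_run_status).

```python
import time


def wait_for_run(get_status, run_id, poll_seconds=10, timeout_seconds=3600,
                 sleep=time.sleep, clock=time.monotonic):
    """Poll until the run is no longer RUNNING and return its final status."""
    deadline = clock() + timeout_seconds
    while True:
        status = get_status(run_id)
        if status != "RUNNING":
            return status
        # Give up instead of polling forever if the run never finishes.
        if clock() >= deadline:
            raise TimeoutError(f"run {run_id} still RUNNING after {timeout_seconds}s")
        sleep(poll_seconds)


# Local dry run with a fake status source instead of the Paradime client.
fake_statuses = iter(["RUNNING", "RUNNING", "SUCCESS"])
final = wait_for_run(lambda run_id: next(fake_statuses), run_id=1, sleep=lambda s: None)
print(final)
# SUCCESS
```

In the pipeline script, the returned status would then decide the exit code, as the original loop does with exit(0 if run_status == 'SUCCESS' else 1). The explicit deadline is an addition in this sketch; the pipeline itself relies on timeoutInMinutes instead.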

Add the API key and secret to the Azure Pipeline variables

Next, add the API key and secret generated in the previous step to your Azure Pipeline variables.

Set the corresponding values using your credentials for the variable names:

  • API_KEY

  • API_SECRET
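As an alternative to interpolating the secrets directly into the script text, the script could read them from environment variables and fail fast when one is missing. The sketch below assumes the pipeline step maps the variables into the environment (Azure Pipelines does not expose secret variables to scripts unless they are mapped explicitly). The API_ENDPOINT variable name and the load_credentials helper are hypothetical.

```python
import os


def load_credentials(env=os.environ):
    """Collect the Paradime API settings from environment variables."""
    names = ("API_KEY", "API_SECRET", "API_ENDPOINT")
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Fail before any API call is attempted with incomplete credentials.
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    # Lower-cased keys match the keyword arguments used by the pipeline script.
    return {name.lower(): env[name] for name in names}


example_env = {
    "API_KEY": "example-key",
    "API_SECRET": "example-secret",
    "API_ENDPOINT": "https://example.invalid/graphql",
}
print(sorted(load_credentials(example_env)))
# ['api_endpoint', 'api_key', 'api_secret']
```

The returned keys line up with the api_endpoint, api_key and api_secret arguments passed to Paradime(...) in the pipeline script, so the dictionary could be unpacked into that call.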

3. Set up Git branch build validation

Finally, set up PR build validation for your dbt™️ repository to trigger your Azure Pipeline and run Turbo CI when a new PR is opened against your default branch.

View Turbo CI run Logs

As with other Bolt schedules, you can inspect the run logs for each Turbo CI job that runs when a pull request is opened.

Related page: View Bolt runs logs