Using GitLab

With Paradime Turbo CI, you can run and test code changes to your dbt™️ project before merging them into your main branch, by configuring a Bolt schedule to run when a new pull request is opened against your dbt™️ repository.

Paradime will build the models affected by your changes into a temporary schema and run the tests you have written against the changed models to validate that your tests are still passing. When you open a pull request, the Paradime Turbo CI run status is shown in the pull request.

How Turbo CI works

When Turbo CI is configured, a GitLab CI/CD pipeline triggers a Turbo CI schedule in Paradime, which executes your configured dbt™️ commands to run and test your dbt™️ changes.

When running, models are built in a temporary schema that uses the prefix paradime_turbo_ci_pr_ followed by the first eight characters of the commit SHA (e.g. paradime_turbo_ci_pr_6d8f55c0). This enables you to see the results of the changes in your pull request in your production data warehouse.

Pre-requisites

To use this feature, you need an additional production environment configured in Paradime, with the target set to ci.

ℹ️ Check our setup guide here based on your data warehouse provider.

Configure a custom schema for CI runs

Customize your dbt™️ schema generation at runtime so that, when a PR is opened, Paradime Turbo CI builds and tests your changed dbt™️ models in a temporary schema. To do this, update the dbt™️ generate_schema_name.sql macro.

In the macros folder of your dbt™️ project, create a macro file called generate_schema_name.sql and use logic similar to the example below. When a Bolt run is executed using the ci target, your models will be built in a schema that uses the paradime_turbo_ci_pr prefix.

{% macro generate_schema_name(custom_schema_name, node) -%}

    {%- set default_schema = target.schema -%}
    {%- if target.name == 'prod' -%}

        {%- if custom_schema_name is none -%}
            {{ default_schema }}
        {%- else -%}
            {{ custom_schema_name | trim }}
        {%- endif -%}

    {%- elif target.name == 'ci' -%}

        {{'PARADIME_TURBO_CI_PR'}}_{{ env_var('PARADIME_SCHEDULE_GIT_SHA', 'unknown_sha') [:8] }}_{{ default_schema | trim }}

    {%- else -%}

        {%- if custom_schema_name is none -%}

            {{ default_schema }}

        {%- else -%}

            {{ default_schema }}_{{ custom_schema_name | trim }}

        {%- endif -%}

    {%- endif -%}

{%- endmacro %}
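To make the branching easier to follow, here is a rough Python mirror of the macro above. It is only an illustration of the logic, not something dbt runs: the function name `resolve_schema` and its arguments are hypothetical stand-ins for `target.name`, `target.schema`, `custom_schema_name` and the `PARADIME_SCHEDULE_GIT_SHA` environment variable, and the schema names and full SHA in the example call are made up.

```python
from typing import Optional


def resolve_schema(
    target_name: str,
    default_schema: str,
    custom_schema_name: Optional[str] = None,
    git_sha: str = "unknown_sha",
) -> str:
    """Hypothetical mirror of generate_schema_name, for illustration only."""
    if target_name == "prod":
        # Production: use the custom schema as-is when one is set.
        if custom_schema_name is None:
            return default_schema
        return custom_schema_name.strip()

    if target_name == "ci":
        # Turbo CI: prefix + first 8 characters of the commit SHA + default schema.
        return f"PARADIME_TURBO_CI_PR_{git_sha[:8]}_{default_schema.strip()}"

    # Any other target: prefix the custom schema with the default schema.
    if custom_schema_name is None:
        return default_schema
    return f"{default_schema}_{custom_schema_name.strip()}"


# Example values are invented; only the first 8 SHA characters are used.
print(resolve_schema("ci", "analytics", "marts", git_sha="6d8f55c0a1b2c3d4"))
# prints PARADIME_TURBO_CI_PR_6d8f55c0_analytics
```

Note that in the ci branch the custom schema name is ignored, so all changed models in a given run land in a single temporary schema.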

Deferral and State Comparison explained

When creating the Turbo CI job in Paradime, you can set your execution settings to defer to a previous run state by adding, in the deferred_manifest_schedule configuration, the name of the Bolt production schedule you want to defer to.

Paradime will look at the manifest.json from the specified schedule's most recent successful run (unless you set the optional parameter successful_run_only: false) to determine the set of new and modified dbt™️ resources. In your Turbo CI commands, you can then use the state:modified+ selector to run only the modified resources and their downstream dependencies.

A common example is:

dbt build --select state:modified+ --target ci

This will run and test all modified models along with their downstream dependencies. It is useful for validating that tests are still passing after making changes to upstream dbt™️ models.
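As a rough mental model of what state:modified+ selects, the sketch below compares two simplified manifests and walks the dependency graph downstream. It is a toy illustration, not dbt's implementation: the dictionary layout (`checksum`, `depends_on`), the model names and the function name `select_modified_plus` are assumptions made for this example, and real manifests carry far more detail.

```python
from collections import deque
from typing import Dict, Set

# Toy manifest: node name -> {"checksum": str, "depends_on": [parent names]}
Manifest = Dict[str, dict]


def select_modified_plus(deferred: Manifest, current: Manifest) -> Set[str]:
    """Return new or changed nodes plus everything downstream of them."""
    # A node counts as modified when it is new or its checksum differs.
    modified = {
        name
        for name, node in current.items()
        if name not in deferred or deferred[name]["checksum"] != node["checksum"]
    }

    # Invert depends_on so we can walk from parents to children.
    children: Dict[str, Set[str]] = {name: set() for name in current}
    for name, node in current.items():
        for parent in node["depends_on"]:
            children.setdefault(parent, set()).add(name)

    # Breadth-first walk downstream from every modified node.
    selected = set(modified)
    queue = deque(modified)
    while queue:
        for child in children[queue.popleft()]:
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# State from the deferred schedule's last run (invented example data).
deferred = {
    "stg_orders": {"checksum": "a1", "depends_on": []},
    "stg_customers": {"checksum": "b1", "depends_on": []},
    "orders": {"checksum": "c1", "depends_on": ["stg_orders", "stg_customers"]},
}
# State of the pull request branch: only stg_orders was edited.
current = {
    "stg_orders": {"checksum": "a2", "depends_on": []},
    "stg_customers": {"checksum": "b1", "depends_on": []},
    "orders": {"checksum": "c1", "depends_on": ["stg_orders", "stg_customers"]},
}
print(sorted(select_modified_plus(deferred, current)))
# prints ['orders', 'stg_orders']
```

Here stg_customers is left out because it neither changed nor depends on a changed model, which is why a Turbo CI run is usually much smaller than a full build.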

👉 To read more about state comparison check the dbt Core™️ documentation here.

1. Configuring Bolt Turbo CI job

Turbo CI in Paradime is configured in the same file where all your other Bolt schedules are configured: paradime_schedules.yml. Setting up a Turbo CI job is very similar to setting up a normal Bolt schedule, with a couple of additional configurations.

A Turbo CI job differs from a Bolt schedule in that it involves:

  • The Bolt schedule to defer to

  • Commands that use the state:modified+ selector to build and test only new or modified dbt™️ models and their downstream dependencies. State comparison can only happen when a deferred schedule is configured to compare state against.

  • Being triggered by a pull request opened in your GitLab dbt™️ repository

Additional schedule parameters available for Turbo CI schedules

| Parameter | Description | Example |
| --- | --- | --- |
| enabled | [Required] Set this to true to enable deferral for the schedule. | true |
| deferred_manifest_schedule | [Required] The name of the Bolt schedule used to look for the most recent successful run's manifest.json / sources.json / run_results.json for state comparison. | hourly |
| successful_run_only | [Optional] By default, Paradime looks for the last successful run. Set this to false to let Paradime defer to the last run of the selected schedule, irrespective of the run status. | false |

Example Turbo CI configuration

paradime_schedules.yml
- name: turbo_ci # the name of your Turbo CI job
  turbo_ci:
    enabled: true # set to true to enable this Turbo CI job to run on pull requests
    deferred_manifest_schedule: hourly # the name of the Bolt schedule where the Turbo CI job will look for the most recent successful run's manifest.json for state comparison
    successful_run_only: false # [optional] by default Paradime looks for the last successful run; set this to false to defer to the last run of the selected schedule, irrespective of its status
  schedule: "OFF" # set the schedule configuration to not run on a schedule (to be used for PR only)
  environment: production # the environment used to run the schedule -> this is always production
  commands:
    - dbt build --select state:modified+ --target ci # the dbt command you want to run when the pull request is opened
  owner_email: "john@acme.io" # the email of the Turbo CI job owner
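Before committing the schedule, it can help to sanity-check the entry. The sketch below does this on a plain Python dictionary that mirrors the YAML above. The function `check_turbo_ci_schedule` and its rules are hypothetical and only encode what this page states (the two required turbo_ci keys, and a state: selector needing a deferred schedule); it is not Paradime's own validation.

```python
from typing import List


def check_turbo_ci_schedule(schedule: dict) -> List[str]:
    """Return a list of problems found in one schedule entry (empty if none)."""
    problems = []
    turbo_ci = schedule.get("turbo_ci") or {}

    # Both keys are marked [Required] in the parameter table.
    if turbo_ci.get("enabled") is not True:
        problems.append("turbo_ci.enabled must be true")
    if not turbo_ci.get("deferred_manifest_schedule"):
        problems.append("turbo_ci.deferred_manifest_schedule is missing")

    # State comparison needs a deferred schedule to compare against.
    uses_state = any("state:" in cmd for cmd in schedule.get("commands", []))
    if uses_state and not turbo_ci.get("deferred_manifest_schedule"):
        problems.append("state: selectors need a deferred schedule")

    return problems


# The example entry from this page, written as a Python dictionary.
example = {
    "name": "turbo_ci",
    "turbo_ci": {
        "enabled": True,
        "deferred_manifest_schedule": "hourly",
        "successful_run_only": False,
    },
    "schedule": "OFF",
    "environment": "production",
    "commands": ["dbt build --select state:modified+ --target ci"],
    "owner_email": "john@acme.io",
}
print(check_turbo_ci_schedule(example))
# prints []
```

If the file is loaded with a YAML parser first, the same function could be applied to each schedule entry in turn.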

2. Create a GitLab Pipeline

Generate API keys and find your workspace token

API keys are generated at a workspace level.

To trigger Bolt using the API, you first need to generate API keys for your workspace. Go to account settings, generate your API keys, and make sure to save the following in your password manager:

  • API key

  • API secret

  • API Endpoint

  • Workspace token

You will need these later when setting up the secrets in GitLab pipelines.

See also: Generate API keys, Company & Workspace token

Now create a new .gitlab-ci.yml file at the root of your dbt™️ repository. Copy the code block below and enter the required values.

Example GitLab pipelines configuration file
.gitlab-ci.yml
# Paradime variables
variables:
  SCHEDULE_NAME: <the schedule name set in Paradime> #example turbo_ci_job
  API_URL:  <the api endpoint generated in the previous step> #example https://api.paradime.io/api/v1/uany7ed234ovarzx/graphql
  WORKSPACE_TOKEN: <the workspace token generated in the previous step> #example 8p232d9mo4cvea9w
  BASE_PARADIME_BOLT_URL: 'https://app.paradime.io/bolt/run_id/'

stages:
  - paradime_turbo_ci

workflow:
  rules:
    - if: '$CI_PIPELINE_SOURCE == "merge_request_event"'

trigger_bolt_run:
  stage: paradime_turbo_ci
  image: ubuntu:latest
  script:
    - apt-get update && apt-get install -y curl jq # Install curl and jq before using them
    - |
      response=$(curl -s -X POST \
      -H "Content-Type: application/json" \
      -H "X-API-KEY: $API_KEY" \
      -H "X-API-SECRET: $API_SECRET" \
      -d "{
        \"query\": \"mutation trigger { triggerBoltRun(scheduleName: \\\"${SCHEDULE_NAME}\\\", branch: \\\"${CI_COMMIT_SHA}\\\") { runId } }\"
      }" "${API_URL}")
      echo "$response"

      runId=$(echo "$response" | jq -r '.data.triggerBoltRun.runId')
      if [ -z "$runId" ] || [ "$runId" = "null" ]; then
        exit 1
      fi

      echo "$runId" > runID.tmp # Save runId to a file to be read in the next step
      
    - |
      runId=$(cat runID.tmp)
      
      echo "${BASE_PARADIME_BOLT_URL}${runId}?workspaceToken=${WORKSPACE_TOKEN}"
      while :
      do
        response=$(curl -s -X POST \
        -H "Content-Type: application/json" \
        -H "X-API-KEY: ${API_KEY}" \
        -H "X-API-SECRET: ${API_SECRET}" \
        -d "{
          \"query\": \"{ boltRunStatus(runId: ${runId}) { state } }\"
        }" "${API_URL}")
        echo "$response"
        
        state=$(echo "$response" | jq -r '.data.boltRunStatus.state')
        
        if [ "$state" != "RUNNING" ]; then
          if [ "$state" != "SUCCESS" ]; then
            exit 1
          fi
          exit 0
        fi
        
        sleep 10
      done
  # Define a timeout for this job
  timeout: 60 minutes
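For readers who prefer to see the two API calls outside of shell quoting, here is a sketch in Python of the same trigger-then-poll flow. It reuses the GraphQL operations from the pipeline above (triggerBoltRun and boltRunStatus) and only the standard library. The helper names are hypothetical, and the network call is kept in one small function so that the rest can be checked offline.

```python
import json
import time
import urllib.request
from typing import Optional


def trigger_payload(schedule_name: str, commit_sha: str) -> dict:
    """Build the same GraphQL mutation the pipeline sends with curl."""
    query = (
        "mutation trigger { triggerBoltRun("
        f'scheduleName: "{schedule_name}", branch: "{commit_sha}"'
        ") { runId } }"
    )
    return {"query": query}


def status_payload(run_id: int) -> dict:
    """Build the query that asks for the state of a run."""
    return {"query": f"{{ boltRunStatus(runId: {run_id}) {{ state }} }}"}


def exit_code_for_state(state: Optional[str]) -> Optional[int]:
    """Mirror the shell logic: keep polling while RUNNING, else 0 or 1."""
    if state == "RUNNING":
        return None
    return 0 if state == "SUCCESS" else 1


def post(api_url: str, api_key: str, api_secret: str, payload: dict) -> dict:
    """POST a GraphQL payload with the API key and secret headers."""
    request = urllib.request.Request(
        api_url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X-API-KEY": api_key,
            "X-API-SECRET": api_secret,
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def run_turbo_ci(api_url: str, api_key: str, api_secret: str,
                 schedule_name: str, commit_sha: str) -> int:
    """Trigger the schedule, then poll every 10 seconds until it finishes."""
    triggered = post(api_url, api_key, api_secret,
                     trigger_payload(schedule_name, commit_sha))
    run = (triggered.get("data") or {}).get("triggerBoltRun") or {}
    run_id = run.get("runId")
    if run_id is None:
        return 1  # same outcome as the shell check on a missing runId
    while True:
        status = post(api_url, api_key, api_secret, status_payload(run_id))
        run_status = (status.get("data") or {}).get("boltRunStatus") or {}
        code = exit_code_for_state(run_status.get("state"))
        if code is not None:
            return code
        time.sleep(10)
```

As in the shell version, any final state other than SUCCESS is treated as a failure, so the merge request pipeline goes red whenever the Bolt run does not complete successfully.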

Add the API key and secret to the GitLab variables

Finally, add the API key and secret generated in the previous step to your GitLab CI/CD variables.

Set the following variable names to the corresponding values from your credentials:

  • API_KEY

  • API_SECRET
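A small preflight check can make a missing variable fail fast with a clear message rather than surface later as an opaque API error. The sketch below is one hypothetical way to do that in Python: the variable names are the ones used on this page, while the function `missing_variables` and the example values are made up for illustration.

```python
from typing import Iterable, List, Mapping

# Variables used by the pipeline: two secrets plus the values in .gitlab-ci.yml.
REQUIRED = ("API_KEY", "API_SECRET", "API_URL", "SCHEDULE_NAME", "WORKSPACE_TOKEN")


def missing_variables(env: Mapping[str, str],
                      required: Iterable[str] = REQUIRED) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not env.get(name, "").strip()]


# Invented example environment with an empty secret and two unset variables.
example_env = {
    "API_KEY": "key",
    "API_SECRET": "",
    "API_URL": "https://example.invalid/graphql",
}
print(missing_variables(example_env))
# prints ['API_SECRET', 'SCHEDULE_NAME', 'WORKSPACE_TOKEN']
```

In a pipeline job you would pass `os.environ` instead of the example dictionary and stop the job when the returned list is not empty.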

View Turbo CI run logs

As with other Bolt schedules, you can inspect the run logs for each Turbo CI job that runs when a pull request is opened.

See also: View Bolt runs logs
