GitLab

With Paradime Turbo CI, you can run and test code changes to your dbt™️ project before merging them into your main branch, by configuring a Bolt schedule to run when a new pull request (a merge request in GitLab) is opened against your dbt™️ repository.

Paradime builds the models affected by your changes into a temporary schema and runs the tests you have written against the changed models to validate that they still pass. When you open a pull request, the Paradime Turbo CI run status is shown in the pull request.

How Turbo CI works

When Turbo CI is configured, a GitLab CI/CD pipeline triggers a Turbo CI schedule in Paradime, which executes your configured dbt™️ commands to run and test your dbt™️ changes.

While the schedule runs, models are built in a temporary schema that uses the prefix paradime_turbo_ci_pr_ followed by the first eight characters of the commit SHA (e.g. paradime_turbo_ci_pr_6d8f55c0). This lets you see the results of the changes in your pull request in your production data warehouse.

Pre-requisites

To use this feature, you need an additional Bolt environment configured in Paradime, with the target set to ci.

ℹ️ Check our setup guide here based on your data warehouse provider.

Configure a custom schema for CI runs

Customize your dbt™️ schema generation at runtime so that, when a PR is opened, Paradime Turbo CI builds and tests your changed dbt™️ models in a temporary schema. To do this, update the dbt™️ generate_schema_name.sql macro.

In the macros folder of your dbt™️ project, create a macro file called generate_schema_name.sql and use logic similar to the example below: when a Bolt run is executed with the ci target, models are built in a schema that uses the paradime_turbo_ci_pr prefix.

{% macro generate_schema_name(custom_schema_name, node) -%}

    {%- set default_schema = target.schema -%}
    {%- if target.name == 'prod' -%}

        {%- if custom_schema_name is none -%}
            {{ default_schema }}
        {%- else -%}
            {{ custom_schema_name | trim }}
        {%- endif -%}

    {%- elif target.name == 'ci' -%}

        {{'PARADIME_TURBO_CI_PR'}}_{{ env_var('PARADIME_SCHEDULE_GIT_SHA', 'unknown_sha') [:8] }}_{{ default_schema | trim }}

    {%- else -%}

        {%- if custom_schema_name is none -%}

            {{ default_schema }}

        {%- else -%}

            {{ default_schema }}_{{ custom_schema_name | trim }}

        {%- endif -%}

    {%- endif -%}

{%- endmacro %}
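
To make the branching easier to follow, the short Python sketch below mirrors the macro's logic outside dbt™️. It is only an illustration, not something dbt™️ or Paradime runs: resolve_schema and its parameters are hypothetical names, and the sample values (analytics, marts, the SHA) are made up for the example.

```python
def resolve_schema(target_name, default_schema, custom_schema_name=None, git_sha="unknown_sha"):
    """Mirror the branching of the generate_schema_name macro (illustrative only)."""
    custom = custom_schema_name.strip() if custom_schema_name is not None else None

    if target_name == "prod":
        # Production: use the custom schema as-is when one is set.
        return default_schema if custom is None else custom

    if target_name == "ci":
        # Turbo CI: one temporary schema per commit; the custom schema is ignored.
        return f"PARADIME_TURBO_CI_PR_{git_sha[:8]}_{default_schema.strip()}"

    # Any other target (for example a development target): prefix with the default schema.
    return default_schema if custom is None else f"{default_schema}_{custom}"


print(resolve_schema("ci", "analytics", "marts", git_sha="6d8f55c0a1b2c3d4"))
print(resolve_schema("prod", "analytics", "marts"))
print(resolve_schema("dev", "analytics", "marts"))
```

With these sample inputs the three calls print PARADIME_TURBO_CI_PR_6d8f55c0_analytics, marts, and analytics_marts. Note that under the ci target every model lands in the same per-commit schema, regardless of any custom schema set on the model.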

Deferral and State Comparison explained

When creating the Turbo CI job in Paradime, you can configure its execution settings to defer to a previous run state: set the deferred_schedule_name configuration to the name of the Bolt production schedule you want to defer to.

Paradime looks at the manifest.json from the most recent successful run of the specified schedule (unless you set the optional parameter successful_run_only: false) to determine the set of new and modified dbt™️ resources. In your Turbo CI commands, you can then use the state:modified+ selector to run only the modified resources and their downstream dependencies.

A common example is:

dbt build --select state:modified+ --target ci

This runs and tests all modified models together with their downstream dependencies, which is useful for validating that tests still pass after changes to upstream dbt™️ models.
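
As a rough mental model of what state:modified+ selects, the Python sketch below compares two simplified "manifests" and then walks downstream. It is a deliberately reduced illustration under stated assumptions: each manifest is just a dict of node name to checksum, child_map is a dict of node name to direct children, and modified_plus is a hypothetical helper. The real dbt™️ comparison reads a much richer manifest.json and considers more than file checksums.

```python
from collections import deque


def modified_plus(deferred_nodes, pr_nodes, child_map):
    """Return nodes that are new or changed in the PR, plus everything downstream of them."""
    # A node counts as modified when it is missing from the deferred state
    # or when its checksum differs from the deferred state.
    changed = {name for name, checksum in pr_nodes.items()
               if deferred_nodes.get(name) != checksum}

    # Breadth-first walk over the child map to collect downstream dependencies.
    selected = set(changed)
    queue = deque(changed)
    while queue:
        node = queue.popleft()
        for child in child_map.get(node, []):
            if child not in selected:
                selected.add(child)
                queue.append(child)
    return selected


# Made-up example: stg_orders was edited and new_model was added in the PR.
deferred = {"stg_orders": "a1", "orders": "b1", "customers": "c1"}
pull_request = {"stg_orders": "a2", "orders": "b1", "customers": "c1", "new_model": "d1"}
children = {"stg_orders": ["orders"], "orders": [], "customers": [], "new_model": []}

print(sorted(modified_plus(deferred, pull_request, children)))
```

In this example orders is selected even though its own code is unchanged, because it sits downstream of the edited stg_orders model; customers is left out entirely.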

👉 To read more about state comparison check the dbt Core™️ documentation here.

1. Configuring Bolt Turbo CI job

Setting up your Turbo CI Bolt schedule is very similar to setting up a normal Bolt schedule, with a couple of additional configurations.

A Turbo CI job differs from a regular Bolt schedule in that it involves:

A Bolt schedule to defer to
Commands that use the state:modified+ selector to build and test only new or modified dbt™️ models and their downstream dependencies. State comparison can only happen when a deferred schedule name is configured to compare state against.
A trigger: a pull request opened in your GitLab dbt™️ repository

Check our template:

Test Code Changes On Pull Requests

2. Create a GitLab Pipeline

Generate API keys and find your workspace token

API keys are generated at the workspace level.

To trigger Bolt using the API, you first need to generate API keys for your workspace. Go to account settings, generate your API keys, and save the following values in your password manager:

API key
API secret
API Endpoint

You will need these later when setting up the secrets in GitLab pipelines.

API Keys

Next, create a new .gitlab-ci.yml file at the root of your dbt™️ repository. Copy the code block below and enter the required values.

Example GitLab pipelines configuration file

.gitlab-ci.yml

# Define the stages of the pipeline - in this case, just one stage for CI
stages:
 - paradime_turbo_ci

# Main job definition
paradime_turbo_ci:
 # Specify which stage this job belongs to
 stage: paradime_turbo_ci
 
 # Use Python 3.11 as the base Docker image for this job
 image: python:3.11
 
 # Define when this pipeline should be triggered
 rules:
   # This job will run:
   # - When a merge request event occurs (PR opened or updated)
   # - 'when: always' ensures it runs even if previous stages failed
   - if: $CI_PIPELINE_SOURCE == "merge_request_event"
     when: always
 
 # Define environment variables needed for the job
 variables:
   # These variables should be configured in GitLab CI/CD settings
   PARADIME_API_KEY: ${PARADIME_API_KEY}         # API key for Paradime authentication
   PARADIME_API_SECRET: ${PARADIME_API_SECRET}   # API secret for Paradime authentication
   PARADIME_API_ENDPOINT: ${PARADIME_API_ENDPOINT} # Paradime API endpoint URL
   PARADIME_SCHEDULE_NAME: "turbo_ci_run"        # Name of the Paradime schedule to run
 
 # Commands to run before the main script
 before_script:
   # Install the Paradime Python SDK
   - pip install paradime-io==4.7.1  # Check for latest version of the Paradime Python SDK on https://github.com/paradime-io/paradime-python-sdk/releases
 
 # Main script to execute
 script: |
   # Run the Paradime bolt schedule with:
   # - The specified schedule name from variables
   # - The current commit SHA (CI_COMMIT_SHA is a GitLab CI predefined variable)
   # - The --wait flag to make the pipeline wait for completion
   paradime bolt run "$PARADIME_SCHEDULE_NAME" --branch ${CI_COMMIT_SHA} --wait
 
 # Set a timeout of 60 minutes for this job
 timeout: 60 minutes

Add the API key and credentials to the GitLab variables

Finally, add the API key and credentials generated in the previous step to your GitLab CI/CD variables.

Using your credentials, set the corresponding values for these variable names:

PARADIME_API_KEY
PARADIME_API_SECRET
PARADIME_API_ENDPOINT
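
If a variable is missing or empty, the pipeline will typically fail only once the Paradime command runs. As an optional safeguard, a small pre-flight check along the lines of the Python sketch below could report the problem earlier. This is a hypothetical helper, not part of the Paradime Python SDK; missing_variables and REQUIRED_VARIABLES are names chosen for this example.

```python
# Variable names the pipeline expects to find in the job environment.
REQUIRED_VARIABLES = ("PARADIME_API_KEY", "PARADIME_API_SECRET", "PARADIME_API_ENDPOINT")


def missing_variables(env):
    """Return the required variable names that are unset or blank in env."""
    return [name for name in REQUIRED_VARIABLES if not env.get(name, "").strip()]


# A plain dict stands in for the job environment here; the values are placeholders.
example_env = {"PARADIME_API_KEY": "example-key", "PARADIME_API_SECRET": ""}
print(missing_variables(example_env))
```

In a real job you would pass os.environ instead of the example dict and stop the job when the returned list is not empty.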

View Turbo CI run Logs

As with other Bolt schedules, you can inspect the run logs for each Turbo CI job that runs when a pull request is opened.

Viewing Run Log History


