Column-Level Lineage Diff
Overview
The Column-Level Lineage Diff Analysis feature in Paradime enables users to understand the blast radius of their changes directly within pull requests (PRs). By leveraging field-level lineage, this CI check identifies changes to columns in your dbt™ models and creates a report for all impacted downstream objects. This includes renaming or removing columns and changes to the underlying logic of columns in your dbt™ models.
When a PR is opened in GitHub, an automated comment is generated listing all downstream nodes. This allows users to understand the changes introduced at a column level and assess the potential impact on downstream dbt™ models, BI dashboards, and other downstream elements.
Key Features
Field-Level Lineage: Identify changes to columns in your dbt™ models and generate a detailed report of all impacted downstream objects.
Automated Comments: Receive automated comments in your PRs listing all downstream dbt™ models and BI nodes affected by the changes.
Impact Assessment: Understand what nodes and other elements might be impacted by the changes introduced in the PR.
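To make the idea of a "blast radius" concrete, the sketch below walks a small, hypothetical column-level lineage graph and lists everything downstream of a changed column. The graph, the node names, and the `downstream` helper are illustrative assumptions, not Paradime's implementation or API.

```python
from collections import deque

# Hypothetical column-level lineage: each key feeds the nodes listed as its value.
LINEAGE = {
    "stg_customers.customer_name": ["dim_customers.customer_name"],
    "dim_customers.customer_name": [
        "fct_orders.customer_name",
        "dashboard:customer_overview",
    ],
    "fct_orders.customer_name": ["dashboard:revenue_by_customer"],
}


def downstream(lineage, changed):
    """Return every node reachable from the changed column, in breadth-first order."""
    visited = {changed}
    impacted = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for child in lineage.get(node, []):
            if child not in visited:
                visited.add(child)
                impacted.append(child)
                queue.append(child)
    return impacted


impacted = downstream(LINEAGE, "stg_customers.customer_name")
for node in impacted:
    print(node)
```

In this toy graph, a change to one staging column reaches two dbt™ models and two BI dashboards, which is the kind of list the automated PR comment is meant to surface.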
Use cases
Assess all downstream nodes impacted by changes, both within a dbt™ project and in downstream applications (for example, BI tools).
For Data Mesh architectures, see how changes in your current project impact other projects.
Tutorial
Prerequisites
To use the Column-Level Lineage Diff Analysis feature, ensure the following prerequisites are met:
Git Integration: Install the Paradime GitHub app and authorize access to the dbt™ repository used in Paradime or use alternative methods based on your Git Provider. See setup instructions.
Production Connection: Add a production connection with access to your sources and models generated when running production jobs. This allows Paradime to run information schema queries and build field-level lineage. See connection guide for instructions based on your data warehouse provider.
Have at least one Bolt Schedule configured. This is required to generate field-level lineage for your dbt™ project. See Bolt Scheduler for configuration.
Setup Instructions

Troubleshooting
Unable to find public GitHub email address
If a user's GitHub account is not configured correctly, the user will see the comment below in the pull request when opening a PR:

To fix this issue, make sure the user opening the pull request has completed the GitHub setup in Paradime.

GitLab
1. Generate API keys
API keys are generated at a workspace level.
To trigger Bolt using the API, you first need to generate API keys for your workspace. Go to account settings, generate your API keys, and save the following values in your password manager:
API key
API secret
API Endpoint
You will need these later when setting up the secrets in GitLab.
2. Create a Pipeline
Now you will need to create a new .gitlab-ci.yml file in your dbt™️ repository. Copy the code below.
3. Create a GitLab Access Token
You will need to create a GitLab access token with the api scope so that comments can be generated when a merge request is opened.
Create a group access token (Only available on GitLab Premium Tier or Higher)
Navigate to your GitLab project
Go to Settings > Access Tokens in the left sidebar
Enter a name for your token (e.g., "API Comment Access")
Select the appropriate scopes:
api (for general API access)
For comments specifically, you need at least the api scope, which includes the ability to write comments.
Create a GitLab personal access token
If you use a GitLab personal access token, we suggest creating a new user called "Paradime" and attaching the access token to this user.
Log in to your GitLab account
Go to your user profile > Preferences > Access Tokens (or navigate to https://gitlab.com/-/profile/personal_access_tokens)
Enter a name for your token (e.g., "API Comment Access")
Select the appropriate scopes:
api (for general API access)
For comments specifically, you need at least the api scope, which includes the ability to write comments.
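To illustrate why the api scope is needed, here is a minimal sketch of how a token could be used to post a comment (a "note") on a merge request through GitLab's REST API. It assumes the job runs in a merge request pipeline where GitLab's predefined variables CI_API_V4_URL, CI_PROJECT_ID, and CI_MERGE_REQUEST_IID are available. The `build_comment_request` helper and the example values are hypothetical; this is not how Paradime itself posts the report, and the request is only built, not sent.

```python
import json
import urllib.request


def build_comment_request(env, body):
    """Build (but do not send) a request that posts a note to a merge request."""
    url = (
        f"{env['CI_API_V4_URL']}/projects/{env['CI_PROJECT_ID']}"
        f"/merge_requests/{env['CI_MERGE_REQUEST_IID']}/notes"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({"body": body}).encode("utf-8"),
        headers={
            # The token must carry the api scope for this call to be accepted.
            "PRIVATE-TOKEN": env["GITLAB_TOKEN"],
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Example values only; in a real pipeline these would come from os.environ.
example_env = {
    "CI_API_V4_URL": "https://gitlab.com/api/v4",
    "CI_PROJECT_ID": "123",
    "CI_MERGE_REQUEST_IID": "7",
    "GITLAB_TOKEN": "example-token",
}
request = build_comment_request(example_env, "Lineage diff report placeholder")
print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would actually send it; a token without write access to the API would be rejected at that point.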
4. Add the API key and Credential in the GitLab variables
Finally, add the API key and credentials generated in the previous steps, together with the GitLab token, as variables for your GitLab CI/CD pipelines.
Set the corresponding values using your credentials for the variable names:
PARADIME_API_KEY
PARADIME_API_SECRET
PARADIME_API_ENDPOINT
GITLAB_TOKEN
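A quick way to catch a missing or misspelled variable is to check for all of them at the start of the pipeline job. The sketch below is an optional, illustrative pre-flight check; the `missing_variables` helper is hypothetical and not part of Paradime's tooling, and only the variable names come from this guide.

```python
import os

# Variable names as listed in this guide; the check itself is illustrative.
REQUIRED_VARIABLES = [
    "PARADIME_API_KEY",
    "PARADIME_API_SECRET",
    "PARADIME_API_ENDPOINT",
    "GITLAB_TOKEN",
]


def missing_variables(env, required=REQUIRED_VARIABLES):
    """Return the required variable names that are unset or empty in env."""
    return [name for name in required if not env.get(name)]


missing = missing_variables(os.environ)
if missing:
    print("Missing CI/CD variables: " + ", ".join(missing))
else:
    print("All required CI/CD variables are set.")
```

The same check can be reused for other Git providers by swapping the last entry for the provider-specific token name.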

Azure Pipelines
1. Generate API keys
API keys are generated at a workspace level.
To trigger Bolt using the API, you first need to generate API keys for your workspace. Go to account settings, generate your API keys, and save the following values in your password manager:
API key
API secret
API Endpoint
You will need these later when setting up the secrets in Azure Pipelines.
2. Create an Azure Pipeline
Now you will need to create a new azure-pipeline.yml file in your dbt™️ repository. Copy the code below.
3. Add the API keys and Credentials in the Azure Pipeline variables
Finally, add the API key and credentials generated in step 1 as variables in Azure Pipelines.
Set the corresponding values using your credentials for the variable names:
PARADIME_API_KEY
PARADIME_API_SECRET
PARADIME_API_ENDPOINT
4. Update Build Service User Permissions
You will need to grant the "Contribute to pull requests" permission to the build service user so that comments can be generated when a pull request is opened.
Go to Project Settings in Azure DevOps
Navigate to "Repositories" under Repos
Select your repository
Click "Security"
Find "Project Collection Build Service" or "[Project Name] Build Service"
Set "Contribute to pull requests" to allowed


BitBucket
1. Generate API keys
API keys are generated at a workspace level.
To trigger Bolt using the API, you first need to generate API keys for your workspace. Go to account settings, generate your API keys, and save the following values in your password manager:
API key
API secret
API Endpoint
You will need these later when setting up the secrets in BitBucket.
2. Create a BitBucket Pipeline
Now you will need to create a new bitbucket-pipelines.yml file in your dbt™️ repository. Copy the code block below.
3. Generate BitBucket Access Token
You will need to create a BitBucket access token with the pull_request:write permission so that comments can be generated when a pull request is opened.
Go to bitbucket.org and sign in:
Select your BitBucket Repository
Click "Repository Settings" in the left sidebar
In Settings, scroll to "Security" section
Select "Access tokens"
Create New Access Token
Click "Create repository access token"
Enter the following details:
Name: Paradime
Permissions: Select Pull requests: Write

4. Add the API keys and Credentials in the BitBucket Pipeline variables
Finally, add the API key and credentials generated in step 1, together with the BitBucket access token, as variables in BitBucket Pipelines.
Set the corresponding values using your credentials for the variable names:
PARADIME_API_KEY
PARADIME_API_SECRET
PARADIME_API_ENDPOINT
BITBUCKET_ACCESS_TOKEN
Lineage Diff Feature - Supported Use Cases
The lineage diff feature analyzes changes in dbt models to track structural modifications that affect downstream dependencies.
The lineage diff feature focuses on structural changes to SELECT statements that affect the schema and column availability for downstream models. It does not track logic changes, data transformations, or modifications to non-SELECT clauses.
Supported Changes
SQL Structural Changes
The lineage diff feature detects and tracks the following structural modifications:
Column renaming: When a column is renamed in a SELECT statement
Column removal: When a column is removed from a SELECT statement
Column addition: When a new column is added to a SELECT statement
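The three supported change types can be pictured as a comparison between the column list a model's SELECT statement produced before and after the change. The sketch below is only an illustration under stated assumptions: the `diff_columns` helper, the column names other than customer_name and full_name, and the explicit `renames` hint are all hypothetical, and this is not how Paradime computes the diff internally. A rename cannot be told apart from a removal plus an addition by names alone, so the sketch takes the rename mapping as an input.

```python
def diff_columns(before, after, renames=None):
    """Classify structural changes between two SELECT column lists."""
    renames = renames or {}
    before_set, after_set = set(before), set(after)
    # Keep only rename hints that match what actually changed.
    renamed = {
        old: new
        for old, new in renames.items()
        if old in before_set and old not in after_set and new in after_set
    }
    removed = sorted(before_set - after_set - set(renamed))
    added = sorted(after_set - before_set - set(renamed.values()))
    return {"renamed": renamed, "removed": removed, "added": added}


# Hypothetical column lists for one model, before and after a PR.
before = ["customer_id", "customer_name", "signup_date"]
after = ["customer_id", "full_name", "email"]

changes = diff_columns(before, after, renames={"customer_name": "full_name"})
print(changes)
```

Here customer_name is reported as renamed to full_name, signup_date as removed, and email as added, matching the three categories listed above.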
Example - Supported Changes
Non-Supported Changes
Structural Changes to Non-SELECT Statements
WHERE clause modifications: Changes to filtering conditions
JOIN modifications: Adding, removing, or changing JOIN conditions
GROUP BY changes: Modifications to grouping logic
ORDER BY changes: Changes to sorting logic
Structural Changes Used in Non-SELECT Contexts
Even if a change is structural (like renaming a column), the lineage diff feature does not track usage in:
WHERE clauses: Column references in filtering conditions
JOIN conditions: Column references in table joins
GROUP BY clauses: Column references in grouping logic
ORDER BY clauses: Column references in sorting logic
Data Changes
Column calculation changes: Modifications to how a column value is computed
NULL handling changes: Changes in NULL value treatment
Data type transformations: Changes that affect data representation but not structure
Example - Non-Supported Changes
Note: Even though customer_name was structurally renamed to full_name, the lineage diff feature only tracks this change in the SELECT statement itself, not its usage in the ORDER BY clause.