Lineage Diff

The Lineage Diff module lets you interact with Paradime's Lineage Diff feature to trigger, generate, and return column-level lineage reports.

Trigger a Lineage Diff report and wait until the report is returned

Triggers a lineage diff report for the specified parameters and waits until generation is complete.

user_email (str): The email of the user triggering the report (pull request author).

pull_request_number (int): The number of the pull request.

repository_name (str): The full name of the repository. E.g. "paradime-io/jaffle-shop".

base_commit_sha (str): The SHA of the base commit.

head_commit_sha (str): The SHA of the head commit.

changed_file_paths (List[str]): A list of file paths that have changed in the pull request.

# First party modules
from paradime import Paradime

# Create a Paradime client with your API credentials
paradime = Paradime(api_endpoint="API_ENDPOINT", api_key="API_KEY", api_secret="API_SECRET")

# Replace these values with the details of your pull request; they identify the two commits to compare
BASE_COMMIT_SHA = "476a32a3d6dbba0bf1fcf5fac9a7b4225a752e8a"
HEAD_COMMIT_SHA = "f6465321d1a18d60b633e68aa369b4c4e6454f9d"
CHANGED_FILES = ["dbt/models/marts/core/order_items.sql", "dbt/models/marts/core/new_model.sql"]
PULL_REQUEST_NUMBER = 24
REPO_NAME = "paradime-io/jaffle-shop"
USER_EMAIL = "john@acme.io"

# Trigger lineage diff report and wait until completed
report = paradime.lineage_diff.trigger_report_and_wait(
    base_commit_sha=BASE_COMMIT_SHA,
    head_commit_sha=HEAD_COMMIT_SHA,
    changed_file_paths=CHANGED_FILES,
    pull_request_number=PULL_REQUEST_NUMBER,
    repository_name=REPO_NAME,
    user_email=USER_EMAIL
)
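The changed_file_paths argument does not have to be typed by hand; it can be derived from the two commit SHAs. The following is a minimal sketch, assuming the script runs inside a local checkout of the repository, that both commits are available locally, and that git is on the PATH. The helper names parse_changed_files and list_changed_files are hypothetical and are not part of the Paradime SDK.

```python
import subprocess
from typing import List


def parse_changed_files(diff_output: str) -> List[str]:
    """Turn the output of `git diff --name-only` into a list of paths."""
    # One path per line; skip blank lines such as the trailing newline.
    return [line.strip() for line in diff_output.splitlines() if line.strip()]


def list_changed_files(base_sha: str, head_sha: str, repo_dir: str = ".") -> List[str]:
    """Ask git which files differ between two commits (hypothetical helper)."""
    result = subprocess.run(
        ["git", "diff", "--name-only", base_sha, head_sha],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if git fails
    )
    return parse_changed_files(result.stdout)
```

The list returned by list_changed_files(BASE_COMMIT_SHA, HEAD_COMMIT_SHA) could then be passed as changed_file_paths in the call above. Whether the report expects every changed file or only dbt model files is not stated here, so filter the list if needed.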

Report object details

The report object exposes the following attributes:

  • message: A string containing details of the report status.

  • status: The status of the report request.

  • url: A string containing the URL of the Bolt schedule that compiles the two git branches.

  • uuid: A unique identifier for the report.

  • result_json: Contains the result in JSON format.

  • result_markdown: Provides the result in Markdown format.
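As an illustration of how these attributes might be used, the sketch below writes the Markdown and JSON results to disk, named after the report's uuid. It makes several assumptions: result_markdown is a string, result_json is either a JSON string or a JSON-serializable object, and either may be empty if the report did not complete. The function name save_report and the default output directory are hypothetical and not part of the Paradime SDK.

```python
import json
from pathlib import Path
from typing import Any, List


def save_report(report: Any, out_dir: str = "lineage_diff") -> List[Path]:
    """Write a report's Markdown and JSON results to files (hypothetical helper)."""
    target = Path(out_dir)
    target.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []

    # Assumption: result_markdown is a plain string, possibly empty or None.
    markdown = getattr(report, "result_markdown", None)
    if markdown:
        md_path = target / f"{report.uuid}.md"
        md_path.write_text(markdown, encoding="utf-8")
        written.append(md_path)

    # Assumption: result_json is a JSON string or a JSON-serializable object.
    result_json = getattr(report, "result_json", None)
    if result_json:
        text = result_json if isinstance(result_json, str) else json.dumps(result_json)
        json_path = target / f"{report.uuid}.json"
        json_path.write_text(text, encoding="utf-8")
        written.append(json_path)

    return written
```

After trigger_report_and_wait returns, calling save_report(report) would produce up to two files. Printing report.status and report.message first is a simple way to see why a result is missing.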
