BigQuery cost connection

IP RESTRICTIONS

Make sure your firewall allows traffic from the Paradime IPs that correspond to the data location you selected.

👉 See also: Paradime IP addresses.

BigQuery setup guide

1. Creating a Service Account User in BigQuery

Follow these steps to create a new service account user for Paradime and grant the required permissions at the project level:

  1. Go to Google Cloud Console and navigate to IAM & Admin:

    • Select your project from the project selector at the top.

    • Go to the "IAM & Admin" section.

  2. Create Service Account:

    • Click on "Service Accounts".

    • Click "Create Service Account".

    • Enter the Service Account name (e.g., paradime-radar-user).

    • Click "Create and Continue".

  3. Assign Roles:

    • Assign the necessary roles to the service account:

      • BigQuery Resource Viewer

      • BigQuery Job User

  4. Complete and Save:

    • Click "Done" after assigning the roles.

    • The service account will now be created with the assigned permissions.

  5. Generate Key:

    • Go to the "Keys" tab and click "Add Key" -> "Create New Key".

    • Choose JSON format and save the key file securely. (We will need this later to connect BigQuery to Paradime.)
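
If you prefer the command line, the steps above can be sketched with the gcloud CLI. This is a minimal sketch, assuming your-project-id as a placeholder project ID and the paradime-radar-user name from the example above:

# Create the service account
gcloud iam service-accounts create paradime-radar-user \
    --project=your-project-id

# Grant the two project-level roles
SA_EMAIL="paradime-radar-user@your-project-id.iam.gserviceaccount.com"
gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:${SA_EMAIL}" --role="roles/bigquery.resourceViewer"
gcloud projects add-iam-policy-binding your-project-id \
    --member="serviceAccount:${SA_EMAIL}" --role="roles/bigquery.jobUser"

# Generate the JSON key and store it securely
gcloud iam service-accounts keys create paradime-key.json \
    --iam-account="${SA_EMAIL}"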


2. Create the BigQuery Dataset

Make sure to name the BigQuery Dataset paradime_cost_analytics.

To set up your BigQuery Project for Paradime, follow these steps:

  1. Go to Google Cloud Console, navigate to BigQuery and select your project.

  2. Create a Dataset:

    • Name: paradime_cost_analytics

    • Select the appropriate Dataset region (this will be needed for the Paradime connection setup).

  3. Grant Access:

    • Assign the BigQuery User and BigQuery Data Editor roles to the service account user created previously, ensuring these permissions are at the Dataset level.
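
The same dataset setup as a bq CLI sketch (your-project-id and the US location are placeholders). Dataset-level grants are easiest in the console UI; via the CLI, one route is to edit the dataset's access list:

# Create the dataset in your chosen region (US is only an example)
bq --location=US mk --dataset your-project-id:paradime_cost_analytics

# Export the dataset definition, add the service account with the
# BigQuery User and BigQuery Data Editor roles to the "access" array,
# then apply the update
bq show --format=prettyjson your-project-id:paradime_cost_analytics > dataset.json
bq update --source dataset.json your-project-id:paradime_cost_analytics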

Connecting multiple BigQuery Projects to Paradime?

If you plan to connect multiple BigQuery Projects to Paradime, follow these steps for each project:

1. Grant Access to the BigQuery Project

Using the existing service account you created in step 1, Creating a Service Account User in BigQuery, complete the following:

  1. Go to Google Cloud Console and navigate to IAM & Admin.

  2. Select your project from the project selector at the top.

  3. Go to the "IAM & Admin" section and select "IAM".

  4. Assign Roles:

    1. Select "Grant Access".

    2. Assign the following roles to the service account created initially:

      1. BigQuery Resource Viewer

      2. BigQuery Job User

2. Create the BigQuery Dataset in Each Project

As in step 2, Create the BigQuery Dataset, complete the following:

  1. Go to Google Cloud Console, navigate to BigQuery, and select your project.

  2. Create a Dataset:

    1. Name: paradime_cost_analytics

    2. Ensure the Paradime Dataset region is the same across all BigQuery projects.

  3. Grant Access:

    1. Assign the BigQuery User and BigQuery Data Editor roles to the service account user created previously.

    2. Ensure these permissions are granted at the Dataset level.
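
Scripted across several projects, the two steps above might look like this sketch (the project IDs and service account email are placeholders):

SA_EMAIL="paradime-radar-user@your-main-project.iam.gserviceaccount.com"

for PROJECT in dev-project staging-project production-project; do
    # Project-level roles for the existing service account
    for ROLE in roles/bigquery.resourceViewer roles/bigquery.jobUser; do
        gcloud projects add-iam-policy-binding "$PROJECT" \
            --member="serviceAccount:${SA_EMAIL}" --role="$ROLE"
    done
    # Same dataset name, and the same region in every project
    bq --location=US mk --dataset "${PROJECT}:paradime_cost_analytics"
done

Remember that the dataset-level grants (BigQuery User and BigQuery Data Editor) still need to be applied in each project, as described above.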


3. Create and configure a GCS Bucket

Leave the rest of the Bucket configuration at the default options unless your organization requires otherwise. Ensure the GCS Bucket region is the same as the Dataset region.

  1. Navigate to Cloud Storage and select "Buckets".

    1. Click "Create".

    2. Name your GCS bucket (e.g., paradime-bucket).

    3. Select your Bucket region (e.g., US).

    4. Click on "Create" to complete creating your GCS Bucket.

  2. Set lifecycle policy

    1. Go to the "Lifecycle" tab for your GCS Bucket.

    2. Select "Add Rule" -> "Delete Object" action.

    3. Set "Age" to 2 days and click "Create" to set the policy.

  3. Grant the Paradime service account user access to the GCS Bucket

    1. Go to the "Permissions" tab for your GCS Bucket.

    2. Add the Paradime Service Account user email as a "New Principal" and assign the Storage Admin role.
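
The bucket setup as a gsutil sketch, reusing the paradime-bucket name and US region from the examples above (the service account email is a placeholder):

# Create the bucket in the same region as the dataset
gsutil mb -l US gs://paradime-bucket

# Lifecycle policy: delete objects older than 2 days
cat > lifecycle.json <<'EOF'
{
  "rule": [
    {"action": {"type": "Delete"}, "condition": {"age": 2}}
  ]
}
EOF
gsutil lifecycle set lifecycle.json gs://paradime-bucket

# Grant the Paradime service account Storage Admin on the bucket
gsutil iam ch \
    serviceAccount:paradime-radar-user@your-project-id.iam.gserviceaccount.com:roles/storage.admin \
    gs://paradime-bucket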


4. dbt™️ project setup guide

To enable Paradime to enrich your BigQuery queries with additional metadata, you will need to create a new dbt™️ macro called get_query_comment.sql in the macros folder of your project.

  1. Create the Macro (get_query_comment.sql)

get_query_comment.sql
{% macro get_query_comment(node, extra = {}) %}
    {%- set comment_dict = extra -%}
    {%- do comment_dict.update(
        app='dbt',
        dbt_version=dbt_version,
        project_name=project_name,
        target_name=target.name,
        target_database=target.database,
        target_schema=target.schema,
        invocation_id=invocation_id
    ) -%}

    {%- if node is not none -%}
        {%- do comment_dict.update(
            node_name=node.name,
            node_alias=node.alias,
            node_package_name=node.package_name,
            node_original_file_path=node.original_file_path,
            node_database=node.database,
            node_schema=node.schema,
            node_id=node.unique_id,
            node_resource_type=node.resource_type,
            node_meta=node.config.meta,
            node_tags=node.tags,
            full_refresh=flags.FULL_REFRESH,
            which=flags.WHICH,
        ) -%}

        {%- if flags.INVOCATION_COMMAND -%}
            {%- do comment_dict.update(
                invocation_command=flags.INVOCATION_COMMAND
            ) -%}
        {%- endif -%}

        {%- if node.resource_type != ('seed') -%} {# Otherwise this throws an error saying 'Seeds cannot depend on other nodes.' #}
            {%- if node.refs is defined -%}
                {% set refs = [] %}
                {% for ref in node.refs %}
                    {%- if dbt_version >= '1.5.0' -%}
                        {%- do refs.append(ref.name) -%}
                    {%- else -%}
                        {%- do refs.append(ref[0]) -%}
                    {%- endif -%}
                {% endfor %}
                {%- do comment_dict.update(
                    node_refs=refs | unique | list
                ) -%}
            {%- endif -%}
        {%- endif -%}
        {%- if node.resource_type == 'model' -%}
            {%- do comment_dict.update(
                materialized=node.config.materialized,
            ) -%}
        {%- endif -%}
    {%- endif -%}

    {# only if using dbt Cloud #}

    {%- if env_var('DBT_CLOUD_PROJECT_ID', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_project_id=env_var('DBT_CLOUD_PROJECT_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_JOB_ID', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_job_id=env_var('DBT_CLOUD_JOB_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_RUN_ID', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_run_id=env_var('DBT_CLOUD_RUN_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_RUN_REASON_CATEGORY', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_run_reason_category=env_var('DBT_CLOUD_RUN_REASON_CATEGORY')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_RUN_REASON', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_run_reason=env_var('DBT_CLOUD_RUN_REASON')
    ) -%}
    {%- endif -%}

    {# only if using the Bolt scheduler #}
    {%- if env_var('PARADIME_SCHEDULE_NAME', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_name=env_var('PARADIME_SCHEDULE_NAME')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_RUN_ID', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_run_id=env_var('PARADIME_SCHEDULE_RUN_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_RUN_START_DTTM', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_run_start_dttm=env_var('PARADIME_SCHEDULE_RUN_START_DTTM')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_TRIGGER', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_trigger=env_var('PARADIME_SCHEDULE_TRIGGER')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_GIT_SHA', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_git_sha=env_var('PARADIME_SCHEDULE_GIT_SHA')
    ) -%}
    {%- endif -%}

    {{ return(tojson(comment_dict)) }}
{% endmacro %}

  2. Update your dbt_project.yml file

This step ensures that with each dbt™️ run, the query comment is appended to every query executed in BigQuery.

dbt_project.yml
query-comment:
  comment: '{{ get_query_comment(node) }}'
  append: true
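
With the macro and the query-comment config in place, every query dbt™️ issues to BigQuery is suffixed with a JSON comment that Paradime can parse. For reference, the appended comment looks roughly like this (values are illustrative):

select id, amount from my_model
/* {"app": "dbt", "dbt_version": "1.7.0", "project_name": "analytics",
    "target_name": "prod", "node_id": "model.analytics.my_model", ...} */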

5. Configure Cost connection in Paradime

From the Account Settings page of your Paradime workspace, add your new BigQuery Cost connection using the following details:

  • BigQuery Service Account JSON

  • BigQuery Project IDs

  • Dataset Location for the Dataset used by Paradime

  • GCS Bucket Name used by Paradime

If connecting multiple projects, separate each BigQuery Project ID with a comma and no spaces.

✅ dev-project,staging-project,production-project

❌ dev-project, staging-project, production-project
