# BigQuery cost connection

{% hint style="warning" %}
**IP RESTRICTIONS**

Make sure to allow traffic from one of the Paradime IPs in your firewall depending on the data location selected.

👉 See also: [Paradime IP addresses](/app-help/developers/ip-restrictions.md).
{% endhint %}

## BigQuery setup guide

### 1. Creating a Service Account User in BigQuery

Follow these steps to create a new service account user for Paradime and grant the required permissions at the project level:

1. **Go to Google Cloud Console and**[ **navigate to IAM & Admin**](https://console.cloud.google.com/iam-admin/serviceaccounts):
   * Select your project from the project selector at the top.
   * Go to the "IAM & Admin" section.
2. **Create Service Account**:
   * Click on "Service Accounts".
   * Click "Create Service Account".
   * Enter the Service Account name (e.g., `paradime-radar-user`).
   * Click "Create and Continue".
3. **Assign Roles**:
   * Assign the necessary roles to the service account:
     * **`BigQuery Resource Viewer`**
     * **`BigQuery Job User`**
4. **Complete and Save**:
   * Click "Done" after assigning the roles.
   * The service account will now be created with the assigned permissions.
5. **Generate Key**:
   * Go to the "Keys" tab and click "Add Key" -> "Create New Key".
   * Choose JSON format and save the key file securely. (We will need this later to connect BigQuery to Paradime.)

{% @arcade/embed url="<https://app.arcade.software/share/He26zpXuXQ5hOiK5eAgo>" flowId="He26zpXuXQ5hOiK5eAgo" %}

***

### 2. Create the BigQuery Dataset

{% hint style="danger" %}
Make sure to name the BigQuery Dataset as **`paradime_cost_analytics`**
{% endhint %}

To set up your BigQuery Project for Paradime, follow these steps:

1. **Go to Google Cloud Console,** [**navigate to BigQuery**](https://console.cloud.google.com/bigquery) **and select your project.**
2. **Create a Dataset**:
   * Name: `paradime_cost_analytics`
   * Select the appropriate Dataset region (this will be needed for the Paradime connection setup).
3. **Grant Access**:
   * Assign the `BigQuery User` and `BigQuery Data Editor` roles to the service account user created previously, ensuring these permissions are at the Dataset level.

{% @arcade/embed url="<https://app.arcade.software/share/ria8cQB7SqmLezvqKUjq>" flowId="ria8cQB7SqmLezvqKUjq" %}

### 3. Setting up a Logs Routing Sink

{% hint style="warning" %}
To create a log sink in Google Cloud, you need the "**Logs Configuration Writer**" (roles/logging.configWriter) IAM role on the project where you're creating the sink.
{% endhint %}

{% hint style="danger" %}
Make sure to use **`paradime_cost_analytics`** as the destination dataset for the Logs Routing Sink.
{% endhint %}

To set up your Logs Routing Sink for Paradime, follow these steps:

1. In the Google Cloud console, go to the [Log Router page](https://console.cloud.google.com/logs/router). If you use the search bar to find this page, then select the result whose subheading is **Logging**.
2. Select the Google Cloud project in which the log entries that you want to route originate.
3. Select **Create sink**.
4. In the **Sink details** panel, enter the following details:
   * **Sink name**: Provide an identifier for the sink; note that after you create the sink, you can't rename the sink but you can delete it and create a new sink.
   * **Sink description** (optional): Describe the purpose or use case for the sink.
5. In the **Sink destination** panel, select the sink service and destination by using the **Select sink service** menu.
   1. Select **BigQuery dataset**: Select the data set name **`paradime_cost_analytics`** (created in the previous step) to receive the routed log entries.
   2. Select the option ot partition tables
6. Go to the **Choose logs to include in sink** panel and specify the log entries to include:

```sql
resource.type="bigquery_resource"
protoPayload.methodName="jobservice.jobcompleted"
```

7. Select **Create sink**.

{% @arcade/embed url="<https://app.arcade.software/share/RCwlT3KRAIAjqE7oTRZt>" flowId="RCwlT3KRAIAjqE7oTRZt" %}

{% hint style="info" %}

#### Set the partition expiration <a href="#partition-expiration" id="partition-expiration"></a>

After the Logs Router Sink table has been created, you can setup an expiration for partitions older than 7 days.

```sql
ALTER TABLE paradime_cost_analytics.cloudaudit_googleapis_com_data_access
  SET OPTIONS (
    -- Sets partition expiration to 7 days
    partition_expiration_days = 7);
```

{% endhint %}

<details>

<summary>Connecting multiple BigQuery Projects to Paradime?</summary>

If you plan to connect multiple BigQuery Projects to Paradime, follow these steps for each project:

#### 1. Grant Access to the BigQuery Project

Using the existing service account you from step 1, [Creating a Service Account](#id-1.-creating-a-service-account-user-in-bigquery), complete the following:

1. **Go to Google Cloud Console** and [navigate to **IAM & Admin**](https://console.cloud.google.com/projectselector2/iam-admin/iam?supportedpurview=project).
2. **Select your project** from the project selector at the top.
3. Go to the **"IAM & Admin"** section and select **"IAM"**.
4. **Assign Roles**:
   1. Select **"Grant Access"**.
   2. Assign the following roles to the service account created initially:
      1. **`BigQuery Resource Viewer`**
      2. **`BigQuery Job User`**

#### 2. Create the BigQuery Dataset in Each Project

Similar to step 2, [Create the BigQuery Dataset](#id-2.-create-the-bigquery-dataset), complete the following:

1. **Go to Google Cloud Console**, [navigate to **BigQuery**](https://console.cloud.google.com/bigquery), and select your project.
2. **Create a Dataset**:
   1. Name: `paradime_cost_analytics`
   2. Ensure the Paradime Dataset region is the same across all BigQuery projects.
3. **Grant Access**:
   1. Assign the **BigQuery User** and **BigQuery Data Editor** roles to the service account user created previously.
   2. Ensure these permissions are granted at the **Dataset level**.

#### 3. Setting up a Logs Routing Sink for each BigQuery Project

Similar to step 3, [Setting up a Logs Routing Sink](#id-3.-setting-up-a-logs-routing-sink), complete the same process again, for each BigQuery Project to route Logs in their respective BigQuery Dataset in Each Project called `paradime_cost_analytics`

</details>

***

### 4.. Create and configure a GCS Bucket

{% hint style="warning" %}
Leave the rest of the Bucket configuration using the default options unless required by your organization. **Ensure the GCS Bucket region is the same as the Dataset region.**
{% endhint %}

1. **Navigate to Cloud Storage and select Buckets**
   1. Click "Create".
   2. Name your [GCS bucket](https://console.cloud.google.com/storage/create-bucket) (e.g., `paradime-bucket`).
   3. Select your Bucket region (e.g., `US`).
   4. Click on "Create" to complete creating your GCS Bucket.
2. **Set lifecycle policy**
   1. Go to the "Lifecycle" tab for your GCS Bucket.
   2. Select "Add Rule" -> "Delete Object" action.
   3. Set "Age" to **2 days** and click "Create" to set the policy.
3. **Grant the Paradime service account user access to the GCS Bucket**
   1. Go to the "Permissions" tab for your GCS Bucket.
   2. Add the Paradime Service Account user email as a "New Principal" and assign the **`Storage Admin`** role.

{% @arcade/embed url="<https://app.arcade.software/share/E3h5wTGT6mojit3Qu8gT>" flowId="E3h5wTGT6mojit3Qu8gT" %}

***

### 5. dbt™ project setup guide

To enable Paradime to enrich your BigQuery queries with additional metadata you will need to create a new dbt™️ macro called `get_query_comment.sql` in the macros folder of your project.

1. Create the Macro (`get_query_comment.sql`)

<details>

<summary>get_query_comment.sql dbt™️ macro</summary>

{% code title="get\_query\_comment.sql " lineNumbers="true" %}

```sql
{% macro get_query_comment(node, extra = {}) %}
    {%- set comment_dict = extra -%}
    {%- do comment_dict.update(
        app='dbt',
        dbt_version=dbt_version,
        project_name=project_name,
        target_name=target.name,
        target_database=target.database,
        target_schema=target.schema,
        invocation_id=invocation_id
    ) -%}

    {%- if node is not none -%}
        {%- do comment_dict.update(
            node_name=node.name,
            node_alias=node.alias,
            node_package_name=node.package_name,
            node_original_file_path=node.original_file_path,
            node_database=node.database,
            node_schema=node.schema,
            node_id=node.unique_id,
            node_resource_type=node.resource_type,
            node_meta=node.config.meta,
            node_tags=node.tags,
            full_refresh=flags.FULL_REFRESH,
            which=flags.WHICH,
        ) -%}

        {%- if flags.INVOCATION_COMMAND -%}
            {%- do comment_dict.update(
                invocation_command=flags.INVOCATION_COMMAND
            ) -%}
        {%- endif -%}

        {%- if node.resource_type != ('seed') -%} {# Otherwise this throws an error saying 'Seeds cannot depend on other nodes.' #}
            {%- if node.refs is defined -%}
                {% set refs = [] %}
                {% for ref in node.refs %}
                    {%- if (dbt_version.split('.')[0] | int > 1)
                       or (dbt_version.split('.')[0] | int == 1 and dbt_version.split('.')[1] | int >= 5) -%}
                        {%- do refs.append(ref.name) -%}
                    {%- else -%}
                        {%- do refs.append(ref[0]) -%}
                    {%- endif -%}
                {% endfor %}
                {%- do comment_dict.update(
                    node_refs=refs | unique | list
                ) -%}
            {%- endif -%}
        {%- endif -%}
        {%- if node.resource_type == 'model' -%}
            {%- do comment_dict.update(
                materialized=node.config.materialized,
            ) -%}
        {%- endif -%}
    {%- endif -%}

--- only if using dbt cloud

    {%- if env_var('DBT_CLOUD_PROJECT_ID', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_project_id=env_var('DBT_CLOUD_PROJECT_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_JOB_ID', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_job_id=env_var('DBT_CLOUD_JOB_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_RUN_ID', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_run_id=env_var('DBT_CLOUD_RUN_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_RUN_REASON_CATEGORY', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_run_reason_category=env_var('DBT_CLOUD_RUN_REASON_CATEGORY')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_RUN_REASON', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_run_reason=env_var('DBT_CLOUD_RUN_REASON')
    ) -%}
    {%- endif -%}

-- only if using bolt scheduler
    {%- if env_var('PARADIME_SCHEDULE_NAME', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_name=env_var('PARADIME_SCHEDULE_NAME')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_RUN_ID', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_run_id=env_var('PARADIME_SCHEDULE_RUN_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_RUN_START_DTTM', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_run_start_dttm=env_var('PARADIME_SCHEDULE_RUN_START_DTTM')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_TRIGGER', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_trigger=env_var('PARADIME_SCHEDULE_TRIGGER')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_GIT_SHA', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_git_sha=env_var('PARADIME_SCHEDULE_GIT_SHA')
    ) -%}
    {%- endif -%}

    {{ return(tojson(comment_dict)) }}
{% endmacro %}
```

{% endcode %}

</details>

2. Update your dbt\_project.yml file

This step ensures that with each dbt™️ run, the query comment is appended to the query running in BigQuery.

{% code title="dbt\_project.yml" %}

```yaml
query-comment:
  comment: '{{ get_query_comment(node) }}'
  append: true
```

{% endcode %}

***

### 6. Configure Cost connection in Paradime

To connect Paradime to BigQuery for cost tracking, follow these steps:

#### **How to Add a BigQuery Cost Connection**

1. Click **"Settings"** in the **top menu bar** of the Paradime interface.
2. Click **"Connections"** in the **left sidebar**.
3. Click **"Add New"** next to the **Radar Environment**.
4. Select **"BigQuery"** as the connection type.

#### **Enter the Required Details**

Once you have selected **BigQuery**, provide the following details to complete the setup:

* **BigQuery Service Account JSON**
* **BigQuery Project IDs**
* **Dataset Location** for the dataset used by Paradime
* **GCS Bucket Name** used by Paradime

{% hint style="info" %}
**Important Note:**

If connecting **multiple projects**, separate **each BigQuery Project ID** using **commas (`,`)** without spaces:\
✅ `dev-project,staging-project,production-project`\
❌ `dev-project, staging-project, production-project`
{% endhint %}

{% @arcade/embed url="<https://app.arcade.software/share/kA9x21sAKnKfbFVvN95J>" flowId="kA9x21sAKnKfbFVvN95J" %}


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.paradime.io/app-help/documentation/settings/connections/cost-connection/bigquery-cost-connection.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
