# BigQuery cost connection

{% hint style="warning" %}
**IP RESTRICTIONS**

Make sure your firewall allows traffic from the Paradime IP addresses for the data location you selected.

👉 See also: [Paradime IP addresses](https://docs.paradime.io/app-help/developers/ip-restrictions).
{% endhint %}

## BigQuery setup guide

### 1. Creating a Service Account User in BigQuery

Follow these steps to create a new service account user for Paradime and grant the required permissions at the project level:

1. **Go to the Google Cloud Console and** [**navigate to IAM & Admin**](https://console.cloud.google.com/iam-admin/serviceaccounts):
   * Select your project from the project selector at the top.
   * Go to the "IAM & Admin" section.
2. **Create Service Account**:
   * Click on "Service Accounts".
   * Click "Create Service Account".
   * Enter the Service Account name (e.g., `paradime-radar-user`).
   * Click "Create and Continue".
3. **Assign Roles**:
   * Assign the necessary roles to the service account:
     * **`BigQuery Resource Viewer`**
     * **`BigQuery Job User`**
4. **Complete and Save**:
   * Click "Done" after assigning the roles.
   * The service account will now be created with the assigned permissions.
5. **Generate Key**:
   * Go to the "Keys" tab and click "Add Key" -> "Create New Key".
   * Choose JSON format and save the key file securely. (We will need this later to connect BigQuery to Paradime.)

{% @arcade/embed flowId="He26zpXuXQ5hOiK5eAgo" url="<https://app.arcade.software/share/He26zpXuXQ5hOiK5eAgo>" %}

***

### 2. Create the BigQuery Dataset

{% hint style="danger" %}
Make sure to name the BigQuery Dataset as **`paradime_cost_analytics`**
{% endhint %}

To set up your BigQuery Project for Paradime, follow these steps:

1. **Go to the Google Cloud Console,** [**navigate to BigQuery**](https://console.cloud.google.com/bigquery)**, and select your project.**
2. **Create a Dataset**:
   * Name: `paradime_cost_analytics`
   * Select the appropriate Dataset region (this will be needed for the Paradime connection setup).
3. **Grant Access**:
   * Assign the `BigQuery User` and `BigQuery Data Editor` roles to the service account user created previously, ensuring these permissions are at the Dataset level.

{% @arcade/embed flowId="ria8cQB7SqmLezvqKUjq" url="<https://app.arcade.software/share/ria8cQB7SqmLezvqKUjq>" %}

### 3. Setting up a Logs Routing Sink

{% hint style="warning" %}
To create a log sink in Google Cloud, you need the "**Logs Configuration Writer**" (roles/logging.configWriter) IAM role on the project where you're creating the sink.
{% endhint %}

{% hint style="danger" %}
Make sure to use **`paradime_cost_analytics`** as the destination dataset for the Logs Routing Sink.
{% endhint %}

To set up your Logs Routing Sink for Paradime, follow these steps:

1. In the Google Cloud console, go to the [Log Router page](https://console.cloud.google.com/logs/router). If you use the search bar to find this page, then select the result whose subheading is **Logging**.
2. Select the Google Cloud project in which the log entries that you want to route originate.
3. Select **Create sink**.
4. In the **Sink details** panel, enter the following details:
   * **Sink name**: Provide an identifier for the sink. You can't rename a sink after creating it, but you can delete it and create a new one.
   * **Sink description** (optional): Describe the purpose or use case for the sink.
5. In the **Sink destination** panel, select the sink service and destination using the **Select sink service** menu.
   1. Select **BigQuery dataset**: choose the dataset **`paradime_cost_analytics`** (created in the previous step) to receive the routed log entries.
   2. Select the option to use partitioned tables.
6. Go to the **Choose logs to include in sink** panel and specify the log entries to include:

```sql
resource.type="bigquery_resource"
protoPayload.methodName="jobservice.jobcompleted"
```

7. Select **Create sink**.

{% @arcade/embed flowId="RCwlT3KRAIAjqE7oTRZt" url="<https://app.arcade.software/share/RCwlT3KRAIAjqE7oTRZt>" %}

{% hint style="info" %}

### Set the partition expiration <a href="#partition-expiration" id="partition-expiration"></a>

After the Logs Router Sink table has been created, you can set an expiration so that partitions older than 7 days are removed automatically.

```sql
ALTER TABLE paradime_cost_analytics.cloudaudit_googleapis_com_data_access
  SET OPTIONS (
    -- Sets partition expiration to 7 days
    partition_expiration_days = 7);
```

{% endhint %}

<details>

<summary>Connecting multiple BigQuery Projects to Paradime?</summary>

If you plan to connect multiple BigQuery Projects to Paradime, follow these steps for each project:

### 1. Grant Access to the BigQuery Project

Using the existing service account you created in step 1, [Creating a Service Account](#id-1.-creating-a-service-account-user-in-bigquery), complete the following:

1. **Go to Google Cloud Console** and [navigate to **IAM & Admin**](https://console.cloud.google.com/projectselector2/iam-admin/iam?supportedpurview=project).
2. **Select your project** from the project selector at the top.
3. Go to the **"IAM & Admin"** section and select **"IAM"**.
4. **Assign Roles**:
   1. Select **"Grant Access"**.
   2. Assign the following roles to the service account created initially:
      1. **`BigQuery Resource Viewer`**
      2. **`BigQuery Job User`**

### 2. Create the BigQuery Dataset in Each Project

Similar to step 2, [Create the BigQuery Dataset](#id-2.-create-the-bigquery-dataset), complete the following:

1. **Go to Google Cloud Console**, [navigate to **BigQuery**](https://console.cloud.google.com/bigquery), and select your project.
2. **Create a Dataset**:
   1. Name: `paradime_cost_analytics`
   2. Ensure the Paradime Dataset region is the same across all BigQuery projects.
3. **Grant Access**:
   1. Assign the **BigQuery User** and **BigQuery Data Editor** roles to the service account user created previously.
   2. Ensure these permissions are granted at the **Dataset level**.

### 3. Setting up a Logs Routing Sink for each BigQuery Project

As in step 3, [Setting up a Logs Routing Sink](#id-3.-setting-up-a-logs-routing-sink), repeat the same process for each BigQuery Project, routing its logs to that project's `paradime_cost_analytics` dataset.

</details>

***

### 4. Create and configure a GCS Bucket

{% hint style="warning" %}
Leave the rest of the Bucket configuration at the default options unless your organization requires otherwise. **Ensure the GCS Bucket region is the same as the Dataset region.**
{% endhint %}

1. **Navigate to Cloud Storage and select Buckets**
   1. Click "Create".
   2. Name your [GCS bucket](https://console.cloud.google.com/storage/create-bucket) (e.g., `paradime-bucket`).
   3. Select your Bucket region (e.g., `US`).
   4. Click on "Create" to complete creating your GCS Bucket.
2. **Set lifecycle policy**
   1. Go to the "Lifecycle" tab for your GCS Bucket.
   2. Select "Add Rule" -> "Delete Object" action.
   3. Set "Age" to **2 days** and click "Create" to set the policy.
3. **Grant the Paradime service account user access to the GCS Bucket**
   1. Go to the "Permissions" tab for your GCS Bucket.
   2. Add the Paradime Service Account user email as a "New Principal" and assign the **`Storage Admin`** role.

{% @arcade/embed flowId="E3h5wTGT6mojit3Qu8gT" url="<https://app.arcade.software/share/E3h5wTGT6mojit3Qu8gT>" %}

***

### 5. dbt™ project setup guide

To enable Paradime to enrich your BigQuery queries with additional metadata, you will need to create a new dbt™️ macro called `get_query_comment.sql` in the macros folder of your project.

1. Create the Macro (`get_query_comment.sql`)

<details>

<summary>get_query_comment.sql dbt™️ macro</summary>

{% code title="get\_query\_comment.sql " lineNumbers="true" %}

```sql
{% macro get_query_comment(node, extra = {}) %}
    {%- set comment_dict = extra -%}
    {%- do comment_dict.update(
        app='dbt',
        dbt_version=dbt_version,
        project_name=project_name,
        target_name=target.name,
        target_database=target.database,
        target_schema=target.schema,
        invocation_id=invocation_id
    ) -%}

    {%- if node is not none -%}
        {%- do comment_dict.update(
            node_name=node.name,
            node_alias=node.alias,
            node_package_name=node.package_name,
            node_original_file_path=node.original_file_path,
            node_database=node.database,
            node_schema=node.schema,
            node_id=node.unique_id,
            node_resource_type=node.resource_type,
            node_meta=node.config.meta,
            node_tags=node.tags,
            full_refresh=flags.FULL_REFRESH,
            which=flags.WHICH,
        ) -%}

        {%- if flags.INVOCATION_COMMAND -%}
            {%- do comment_dict.update(
                invocation_command=flags.INVOCATION_COMMAND
            ) -%}
        {%- endif -%}

        {%- if node.resource_type != ('seed') -%} {# Otherwise this throws an error saying 'Seeds cannot depend on other nodes.' #}
            {%- if node.refs is defined -%}
                {% set refs = [] %}
                {% for ref in node.refs %}
                    {%- if (dbt_version.split('.')[0] | int > 1)
                       or (dbt_version.split('.')[0] | int == 1 and dbt_version.split('.')[1] | int >= 5) -%}
                        {%- do refs.append(ref.name) -%}
                    {%- else -%}
                        {%- do refs.append(ref[0]) -%}
                    {%- endif -%}
                {% endfor %}
                {%- do comment_dict.update(
                    node_refs=refs | unique | list
                ) -%}
            {%- endif -%}
        {%- endif -%}
        {%- if node.resource_type == 'model' -%}
            {%- do comment_dict.update(
                materialized=node.config.materialized,
            ) -%}
        {%- endif -%}
    {%- endif -%}

    {# The fields below are only populated when running in dbt Cloud #}

    {%- if env_var('DBT_CLOUD_PROJECT_ID', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_project_id=env_var('DBT_CLOUD_PROJECT_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_JOB_ID', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_job_id=env_var('DBT_CLOUD_JOB_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_RUN_ID', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_run_id=env_var('DBT_CLOUD_RUN_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_RUN_REASON_CATEGORY', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_run_reason_category=env_var('DBT_CLOUD_RUN_REASON_CATEGORY')
    ) -%}
    {%- endif -%}

    {%- if env_var('DBT_CLOUD_RUN_REASON', False) -%}
    {%- do comment_dict.update(
        dbt_cloud_run_reason=env_var('DBT_CLOUD_RUN_REASON')
    ) -%}
    {%- endif -%}

    {# The fields below are only populated when running on the Paradime Bolt scheduler #}
    {%- if env_var('PARADIME_SCHEDULE_NAME', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_name=env_var('PARADIME_SCHEDULE_NAME')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_RUN_ID', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_run_id=env_var('PARADIME_SCHEDULE_RUN_ID')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_RUN_START_DTTM', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_run_start_dttm=env_var('PARADIME_SCHEDULE_RUN_START_DTTM')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_TRIGGER', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_trigger=env_var('PARADIME_SCHEDULE_TRIGGER')
    ) -%}
    {%- endif -%}

    {%- if env_var('PARADIME_SCHEDULE_GIT_SHA', False) -%}
    {%- do comment_dict.update(
        paradime_schedule_git_sha=env_var('PARADIME_SCHEDULE_GIT_SHA')
    ) -%}
    {%- endif -%}

    {{ return(tojson(comment_dict)) }}
{% endmacro %}
```

{% endcode %}

</details>

2. Update your `dbt_project.yml` file

This step ensures that with each dbt™️ run, the query comment is appended to the query running in BigQuery.

{% code title="dbt\_project.yml" %}

```yaml
query-comment:
  comment: '{{ get_query_comment(node) }}'
  append: true
```

{% endcode %}
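
With `append: true`, dbt™️ renders the macro's JSON output and appends it as a block comment to every query it submits to BigQuery. Purely as an illustration (the model name and field values below are made up), a compiled query ends up looking roughly like this:

```sql
select id, status, amount
from `analytics-project.analytics.stg_orders`
/* {"app": "dbt", "dbt_version": "1.7.4", "project_name": "analytics",
    "target_name": "prod", "node_id": "model.analytics.stg_orders",
    "node_resource_type": "model", "materialized": "table"} */
```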

***

### 6. Configure Cost connection in Paradime

To connect Paradime to BigQuery for cost tracking, follow these steps:

#### **How to Add a BigQuery Cost Connection**

1. Click **"Settings"** in the **top menu bar** of the Paradime interface.
2. Click **"Connections"** in the **left sidebar**.
3. Click **"Add New"** next to the **Radar Environment**.
4. Select **"BigQuery"** as the connection type.

#### **Enter the Required Details**

Once you have selected **BigQuery**, provide the following details to complete the setup:

* **BigQuery Service Account JSON**
* **BigQuery Project IDs**
* **Dataset Location** for the dataset used by Paradime
* **GCS Bucket Name** used by Paradime

{% hint style="info" %}
**Important Note:**

If connecting **multiple projects**, separate **each BigQuery Project ID** using **commas (`,`)** without spaces:\
✅ `dev-project,staging-project,production-project`\
❌ `dev-project, staging-project, production-project`
{% endhint %}

{% @arcade/embed flowId="kA9x21sAKnKfbFVvN95J" url="<https://app.arcade.software/share/kA9x21sAKnKfbFVvN95J>" %}
