# Hooks & Operational Tasks

Hooks in dbt allow you to execute SQL statements or custom logic at specific points in your dbt workflow. They're powerful tools for automating operational tasks, managing permissions, and enhancing your data pipeline with custom functionality.

### What Are Hooks?

Hooks are snippets of SQL that run at predefined moments during the dbt execution process. They help automate operational tasks like:

* Granting permissions on objects
* Setting table properties or attributes
* Adding comments or metadata
* Creating indexes or optimizing data structures
* Executing database-specific operations

There are four main types of hooks in dbt:

| Hook Type      | Execution Timing                                      | Scope               | Use Cases                              |
| -------------- | ----------------------------------------------------- | ------------------- | -------------------------------------- |
| `pre-hook`     | Before a model, seed, or snapshot is built            | Model-specific      | Data validation, temporary table setup |
| `post-hook`    | After a model, seed, or snapshot is built             | Model-specific      | Adding indexes, granting permissions   |
| `on-run-start` | At the start of dbt commands (run, build, test, etc.) | Project/folder wide | Session configuration, global setup    |
| `on-run-end`   | At the end of dbt commands (run, build, test, etc.)   | Project/folder wide | Cleanup, notifications, audit logging  |

***

### Using Model-Level Hooks

**Pre-hooks and Post-hooks**

You can define pre-hooks and post-hooks directly in your model SQL files using the `config()` function:

```sql
-- models/customers.sql
{{ 
  config(
    materialized='table',
    pre_hook="DELETE FROM {{ this }} WHERE customer_id < 0",
    post_hook="GRANT SELECT ON {{ this }} TO ROLE analyst"
  ) 
}}

SELECT * FROM {{ ref('stg_customers') }}
```

In this example:

* The pre-hook deletes invalid records before the model builds
* The post-hook grants select permissions after the model is created

**Multiple Hooks**

You can specify multiple hooks as a list:

```sql
{{ 
  config(
    post_hook=[
      "GRANT SELECT ON {{ this }} TO ROLE analyst",
      "ANALYZE TABLE {{ this }}",
      "ALTER TABLE {{ this }} ADD COMMENT '{{ doc('customers_table_description') }}'"
    ]
  ) 
}}
```

***

### Project and Folder-Level Hooks

You can configure hooks that apply to groups of models in your `dbt_project.yml` file:

```yaml
models:
  my_project:
    # Hooks for all models
    +post_hook: "GRANT SELECT ON {{ this }} TO ROLE reporter"
    
    # Hooks for specific folders
    marts:
      +post_hook: "ANALYZE TABLE {{ this }}"
    
    staging:
      +pre_hook: "SET query_tag = 'staging_models'"
```

**On-Run-Start and On-Run-End Hooks**

These hooks run once at the beginning or end of your dbt run:

```yaml
on-run-start:
  - "SET timezone = 'America/Los_Angeles'"
  - "SET query_tag = 'dbt_run_{{ run_started_at.strftime('%Y%m%d_%H%M%S') }}'"

on-run-end:
  - "CALL audit.log_dbt_run('{{ run_started_at }}', '{{ invocation_id }}')"
```

{% hint style="info" %}

#### **Hook Context Variables**

In hooks, you can access several useful context variables:

* `{{ this }}` - The relation being built (table/view)
* `{{ target }}` - Information about the current target database
* `{{ run_started_at }}` - Timestamp when the run started
* `{{ invocation_id }}` - Unique ID for the current dbt run
  {% endhint %}

***

### Using Macros in Hooks

You can make your hooks more reusable by calling macros:

```sql
-- models/customers.sql
{{ 
  config(
    post_hook="{{ grant_select(this, 'analyst') }}"
  ) 
}}

SELECT * FROM {{ ref('stg_customers') }}
```

With a corresponding macro defined:

```sql
-- macros/grant_select.sql
{% macro grant_select(relation, role) %}
    grant select on {{ relation }} to role {{ role }};
{% endmacro %}
```

This approach allows you to centralize and reuse your hook logic across multiple models.

You can also configure hooks with macros in your YAML files:

```yaml
# models/schema.yml
models:
  - name: customers
    config:
      post_hook: "{{ grant_select(this, 'analyst') }}"
      
# dbt_project.yml
models:
  my_project:
    +post_hook: "{{ grant_select(this, 'reporter') }}"
```

***

### Common Hook Use Cases

**Permission Management**

A common use for hooks is automating permission grants:

```sql
-- Granting multiple permissions
{{ 
  config(
    post_hook=[
      "GRANT SELECT ON {{ this }} TO ROLE analyst",
      "GRANT SELECT ON {{ this }} TO ROLE reporter"
    ]
  ) 
}}
```

**Database-Specific Operations**

Perform operations specific to your data warehouse:

```sql
-- Snowflake example
{{ 
  config(
    post_hook="ALTER TABLE {{ this }} SET DATA_RETENTION_TIME_IN_DAYS = 90"
  ) 
}}

-- Redshift example
{{ 
  config(
    post_hook="VACUUM {{ this }}"
  ) 
}}
```

**Environment-Specific Actions**

Apply different hooks based on your deployment environment:

```sql
{{ 
  config(
    post_hook=
      {% if target.name == 'prod' %}
        "GRANT SELECT ON {{ this }} TO ROLE business_users"
      {% else %}
        "GRANT SELECT ON {{ this }} TO ROLE dbt_developers"
      {% endif %}
  ) 
}}
```

***

### Operations with run-operation

Operations are a way to execute standalone macros using the `run-operation` command. This is useful for administrative tasks that you want to run on demand, rather than as part of a model build.

**Creating an Operation Macro**

To create an operation, define a macro that performs the desired actions:

```sql
-- macros/grant_select.sql
{% macro grant_select(role) %}
    {% set sql %}
        grant usage on schema {{ target.schema }} to role {{ role }};
        grant select on all tables in schema {{ target.schema }} to role {{ role }};
        grant select on all views in schema {{ target.schema }} to role {{ role }};
    {% endset %}

    {% do run_query(sql) %}
    {% do log("Privileges granted", info=True) %}
{% endmacro %}
```

Note two important points:

1. The SQL is defined within a `{% set sql %}` block
2. The `run_query()` function is used to actually execute the SQL

**Running an Operation**

To run this operation from the command line:

```bash
dbt run-operation grant_select --args '{role: reporter}'
```

This would grant select privileges on all tables in your schema to the 'reporter' role.

{% hint style="info" %}

#### **Key Difference Between Hooks and Operations**

* **Hooks** are automatically executed at specific times during dbt runs
* **Operations** are explicitly run on-demand using the `run-operation` command
* With operations, you must use `run_query()` or a statement block to execute the SQL
  {% endhint %}

***

### Operation Examples

**Refreshing a Snowflake Pipe**

```sql
{% macro refresh_pipe(pipe_name) %}
    {% set sql %}
        ALTER PIPE {{ pipe_name }} REFRESH;
    {% endset %}
    
    {% do run_query(sql) %}
    {% do log("Pipe refreshed: " ~ pipe_name, info=True) %}
{% endmacro %}
```

**Creating Multiple Objects**

```sql
{% macro setup_monitoring() %}
    {% set sql %}
        CREATE SCHEMA IF NOT EXISTS {{ target.schema }}_monitor;
        
        CREATE TABLE IF NOT EXISTS {{ target.schema }}_monitor.audit_log (
            event_time TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(),
            event_type VARCHAR(100),
            model_name VARCHAR(100),
            duration_seconds FLOAT
        );
    {% endset %}
    
    {% do run_query(sql) %}
    {% do log("Monitoring setup complete", info=True) %}
{% endmacro %}
```

**Passing Complex Arguments**

You can pass complex arguments to operations:

```bash
dbt run-operation create_test_data --args '{"schema": "analytics", "tables": ["customers", "orders"], "row_count": 1000}'
```

With a corresponding macro:

```sql
{% macro create_test_data(schema, tables, row_count) %}
    {% for table in tables %}
        {% set sql %}
            INSERT INTO {{ schema }}.{{ table }} /* Generate test data SQL here */
        {% endset %}
        {% do run_query(sql) %}
        {% do log("Generated " ~ row_count ~ " rows for " ~ schema ~ "." ~ table, info=True) %}
    {% endfor %}
{% endmacro %}
```

***

### Best Practices

#### Hooks

| Best Practice                          | Description                                                           |
| -------------------------------------- | --------------------------------------------------------------------- |
| **Use macros for repeated hook logic** | Create reusable macro functions instead of duplicating hook SQL.      |
| **Keep hooks focused**                 | Each hook should do one thing well.                                   |
| **Consider hook execution order**      | Remember that project-level hooks run before/after model-level hooks. |
| **Be careful with transactions**       | Understand your database's transaction behavior with hooks.           |
| **Test hooks in development**          | Verify hook behavior before deploying to production.                  |

#### Operations

| Best Practice                                    | Description                                               |
| ------------------------------------------------ | --------------------------------------------------------- |
| **Always use `run_query()` or statement blocks** | Operations must explicitly execute SQL.                   |
| **Add logging**                                  | Use `log()` to provide feedback about operation progress. |
| **Handle errors gracefully**                     | Consider `try/except` patterns for complex operations.    |
| **Document operation parameters**                | Make it clear what arguments your operations accept.      |
| **Use for administrative tasks**                 | Operations are perfect for one-off maintenance tasks.     |

{% hint style="info" %}

#### **When to Use Hooks vs. Operations**

Use **hooks** when you need to:

* Execute SQL automatically at specific points in your dbt workflow
* Apply consistent actions across multiple models
* Implement pre/post processing that's tightly coupled to models

Use **operations** when you need to:

* Run administrative tasks on-demand
* Perform one-off database maintenance
* Execute complex logic that doesn't fit into the model build process
* Create setup/teardown scripts for your environment
  {% endhint %}

By mastering hooks and operations, you can significantly extend dbt's capabilities and automate many aspects of database administration and data pipeline management.
