Hooks & Operational Tasks

Hooks in dbt allow you to execute SQL statements or custom logic at specific points in your dbt workflow. They're powerful tools for automating operational tasks, managing permissions, and enhancing your data pipeline with custom functionality.

What Are Hooks?

Hooks are snippets of SQL that run at predefined moments during the dbt execution process. They help automate operational tasks like:

Granting permissions on objects
Setting table properties or attributes
Adding comments or metadata
Creating indexes or optimizing data structures
Executing database-specific operations

There are four main types of hooks in dbt:

Hook Type

Execution Timing

Scope

Use Cases

pre-hook

Before a model, seed, or snapshot is built

Model-specific

Data validation, temporary table setup

post-hook

After a model, seed, or snapshot is built

Model-specific

Adding indexes, granting permissions

on-run-start

At the start of dbt commands (run, build, test, etc.)

Project/folder wide

Session configuration, global setup

on-run-end

At the end of dbt commands (run, build, test, etc.)

Project/folder wide

Cleanup, notifications, audit logging

Using Model-Level Hooks

Pre-hooks and Post-hooks

You can define pre-hooks and post-hooks directly in your model SQL files using the config() function:

-- models/customers.sql
{{ 
  config(
    materialized='table',
    pre_hook="DELETE FROM {{ this }} WHERE customer_id < 0",
    post_hook="GRANT SELECT ON {{ this }} TO ROLE analyst"
  ) 
}}

SELECT * FROM {{ ref('stg_customers') }}

In this example:

The pre-hook deletes invalid records before the model builds
The post-hook grants select permissions after the model is created

Multiple Hooks

You can specify multiple hooks as a list:

{{ 
  config(
    post_hook=[
      "GRANT SELECT ON {{ this }} TO ROLE analyst",
      "ANALYZE TABLE {{ this }}",
      "ALTER TABLE {{ this }} ADD COMMENT '{{ doc('customers_table_description') }}'"
    ]
  ) 
}}

Project and Folder-Level Hooks

You can configure hooks that apply to groups of models in your dbt_project.yml file:

models:
  my_project:
    # Hooks for all models
    +post_hook: "GRANT SELECT ON {{ this }} TO ROLE reporter"
    
    # Hooks for specific folders
    marts:
      +post_hook: "ANALYZE TABLE {{ this }}"
    
    staging:
      +pre_hook: "SET query_tag = 'staging_models'"

On-Run-Start and On-Run-End Hooks

These hooks run once at the beginning or end of your dbt run:

on-run-start:
  - "SET timezone = 'America/Los_Angeles'"
  - "SET query_tag = 'dbt_run_{{ run_started_at.strftime('%Y%m%d_%H%M%S') }}'"

on-run-end:
  - "CALL audit.log_dbt_run('{{ run_started_at }}', '{{ invocation_id }}')"

Hook Context Variables

In hooks, you can access several useful context variables:

{{ this }} - The relation being built (table/view)
{{ target }} - Information about the current target database
{{ run_started_at }} - Timestamp when the run started
{{ invocation_id }} - Unique ID for the current dbt run

Using Macros in Hooks

You can make your hooks more reusable by calling macros:

-- models/customers.sql
{{ 
  config(
    post_hook="{{ grant_select(this, 'analyst') }}"
  ) 
}}

SELECT * FROM {{ ref('stg_customers') }}

With a corresponding macro defined:

-- macros/grant_select.sql
{% macro grant_select(relation, role) %}
    grant select on {{ relation }} to role {{ role }};
{% endmacro %}

This approach allows you to centralize and reuse your hook logic across multiple models.

You can also configure hooks with macros in your YAML files:

# models/schema.yml
models:
  - name: customers
    config:
      post_hook: "{{ grant_select(this, 'analyst') }}"
      
# dbt_project.yml
models:
  my_project:
    +post_hook: "{{ grant_select(this, 'reporter') }}"

Common Hook Use Cases

Permission Management

A common use for hooks is automating permission grants:

-- Granting multiple permissions
{{ 
  config(
    post_hook=[
      "GRANT SELECT ON {{ this }} TO ROLE analyst",
      "GRANT SELECT ON {{ this }} TO ROLE reporter"
    ]
  ) 
}}

Database-Specific Operations

Perform operations specific to your data warehouse:

-- Snowflake example
{{ 
  config(
    post_hook="ALTER TABLE {{ this }} SET DATA_RETENTION_TIME_IN_DAYS = 90"
  ) 
}}

-- Redshift example
{{ 
  config(
    post_hook="VACUUM {{ this }}"
  ) 
}}

Environment-Specific Actions

Apply different hooks based on your deployment environment:

{{ 
  config(
    post_hook=
      {% if target.name == 'prod' %}
        "GRANT SELECT ON {{ this }} TO ROLE business_users"
      {% else %}
        "GRANT SELECT ON {{ this }} TO ROLE dbt_developers"
      {% endif %}
  ) 
}}

Operations with run-operation

Operations are a way to execute standalone macros using the run-operation command. This is useful for administrative tasks that you want to run on demand, rather than as part of a model build.

Creating an Operation Macro

To create an operation, define a macro that performs the desired actions:

-- macros/grant_select.sql
{% macro grant_select(role) %}
    {% set sql %}
        grant usage on schema {{ target.schema }} to role {{ role }};
        grant select on all tables in schema {{ target.schema }} to role {{ role }};
        grant select on all views in schema {{ target.schema }} to role {{ role }};
    {% endset %}

    {% do run_query(sql) %}
    {% do log("Privileges granted", info=True) %}
{% endmacro %}

Note two important points:

The SQL is defined within a {% set sql %} block
The run_query() function is used to actually execute the SQL

Running an Operation

To run this operation from the command line:

dbt run-operation grant_select --args '{role: reporter}'

This would grant select privileges on all tables in your schema to the 'reporter' role.

Key Difference Between Hooks and Operations

Hooks are automatically executed at specific times during dbt runs
Operations are explicitly run on-demand using the run-operation command
With operations, you must use run_query() or a statement block to execute the SQL

Operation Examples

Refreshing a Snowflake Pipe

{% macro refresh_pipe(pipe_name) %}
    {% set sql %}
        ALTER PIPE {{ pipe_name }} REFRESH;
    {% endset %}
    
    {% do run_query(sql) %}
    {% do log("Pipe refreshed: " ~ pipe_name, info=True) %}
{% endmacro %}

Creating Multiple Objects

{% macro setup_monitoring() %}
    {% set sql %}
        CREATE SCHEMA IF NOT EXISTS {{ target.schema }}_monitor;
        
        CREATE TABLE IF NOT EXISTS {{ target.schema }}_monitor.audit_log (
            event_time TIMESTAMP_NTZ DEFAULT CURRENT_TIMESTAMP(),
            event_type VARCHAR(100),
            model_name VARCHAR(100),
            duration_seconds FLOAT
        );
    {% endset %}
    
    {% do run_query(sql) %}
    {% do log("Monitoring setup complete", info=True) %}
{% endmacro %}

Passing Complex Arguments

You can pass complex arguments to operations:

dbt run-operation create_test_data --args '{"schema": "analytics", "tables": ["customers", "orders"], "row_count": 1000}'

With a corresponding macro:

{% macro create_test_data(schema, tables, row_count) %}
    {% for table in tables %}
        {% set sql %}
            INSERT INTO {{ schema }}.{{ table }} /* Generate test data SQL here */
        {% endset %}
        {% do run_query(sql) %}
        {% do log("Generated " ~ row_count ~ " rows for " ~ schema ~ "." ~ table, info=True) %}
    {% endfor %}
{% endmacro %}

Best Practices

Hooks

Best Practice

Description

Use macros for repeated hook logic

Create reusable macro functions instead of duplicating hook SQL.

Keep hooks focused

Each hook should do one thing well.

Consider hook execution order

Remember that project-level hooks run before/after model-level hooks.

Be careful with transactions

Understand your database's transaction behavior with hooks.

Test hooks in development

Verify hook behavior before deploying to production.

Operations

Best Practice

Description

Always use run_query() or statement blocks

Operations must explicitly execute SQL.

Add logging

Use log() to provide feedback about operation progress.

Handle errors gracefully

Consider try/except patterns for complex operations.

Document operation parameters

Make it clear what arguments your operations accept.

Use for administrative tasks

Operations are perfect for one-off maintenance tasks.

When to Use Hooks vs. Operations

Use hooks when you need to:

Execute SQL automatically at specific points in your dbt workflow
Apply consistent actions across multiple models
Implement pre/post processing that's tightly coupled to models

Use operations when you need to:

Run administrative tasks on-demand
Perform one-off database maintenance
Execute complex logic that doesn't fit into the model build process
Create setup/teardown scripts for your environment

By mastering hooks and operations, you can significantly extend dbt's capabilities and automate many aspects of database administration and data pipeline management.

PreviousCustom Tests NextPackages

Last updated 4 months ago

Was this helpful?