Packages

dbt packages allow you to import pre-built models, macros, and tests into your project, helping you solve common data modeling challenges without reinventing the wheel. This guide explains how to install, use, configure, and maintain dbt packages.

What Are dbt Packages?

A dbt package is a standalone dbt project that you import into your own project. It can contain models, macros, tests, and other resources, all of which become available to your project once the package is installed.

Packages enable you to:

  • Leverage community-contributed solutions

  • Standardize transformations across projects

  • Import specialized functionality for specific data sources

  • Apply consistent testing patterns

  • Avoid reinventing solutions for common problems


Adding Packages to Your Project

Using packages in your dbt project is a simple three-step process:

  1. Create a packages.yml file in your project root (next to your dbt_project.yml)

  2. Define the packages you want to use

  3. Run dbt deps to install the packages

Basic Package Configuration

# packages.yml
packages:
  - package: dbt-labs/dbt_utils
    version: 1.1.1
  
  - package: calogica/dbt_expectations
    version: 0.8.5

When you run dbt deps, dbt installs these packages into a dbt_packages/ directory in your project. Projects created with dbt init list this directory in .gitignore, so the installed package code is not committed alongside your own.
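Because each installed package is itself a dbt project, one way to confirm what dbt deps installed is to look for subdirectories of dbt_packages/ that contain a dbt_project.yml. The following Python sketch does that; the helper name installed_packages is hypothetical, and it only inspects the filesystem rather than calling dbt.

```python
from pathlib import Path


def installed_packages(project_dir):
    """List subdirectories of dbt_packages/ that contain a dbt_project.yml."""
    packages_dir = Path(project_dir) / "dbt_packages"
    if not packages_dir.is_dir():
        return []  # dbt deps has not been run in this project yet
    return sorted(
        child.name
        for child in packages_dir.iterdir()
        if child.is_dir() and (child / "dbt_project.yml").is_file()
    )


# Run from the project root (the directory that holds dbt_project.yml).
print(installed_packages("."))
```

An empty list means either that no packages are installed or that dbt deps has not been run yet.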


Package Installation Methods

dbt supports several methods for specifying package sources, depending on where your package is stored.

Hub Packages (Recommended)

The simplest way to install packages is from the dbt Hub:

packages:
  - package: dbt-labs/snowplow
    version: 0.7.3

You can also specify version ranges using semantic versioning:

packages:
  - package: dbt-labs/snowplow
    version: [">=0.7.0", "<0.8.0"]

This approach is recommended because dbt can resolve duplicate dependencies among Hub packages automatically, for example when two packages you install both depend on dbt_utils.
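To make the range semantics concrete, here is a minimal Python sketch of how a list of constraints such as [">=0.7.0", "<0.8.0"] narrows the set of acceptable versions. It is illustrative only: the helper names parse_version and satisfies are hypothetical, pre-release tags are not handled, and this is not dbt's actual resolver.

```python
import re


def parse_version(text):
    """Turn '0.7.3' into the tuple (0, 7, 3) so versions compare in order."""
    return tuple(int(part) for part in text.split("."))


# Comparison operators supported by this sketch.
_OPS = {
    ">=": lambda a, b: a >= b,
    "<=": lambda a, b: a <= b,
    ">": lambda a, b: a > b,
    "<": lambda a, b: a < b,
    "=": lambda a, b: a == b,
}


def satisfies(version, constraints):
    """Return True if `version` meets every constraint in the list."""
    candidate = parse_version(version)
    for constraint in constraints:
        match = re.fullmatch(r"(>=|<=|>|<|=)?\s*(\d+(?:\.\d+)*)", constraint.strip())
        if match is None:
            raise ValueError(f"unsupported constraint: {constraint!r}")
        op = match.group(1) or "="  # a bare version means an exact match
        if not _OPS[op](candidate, parse_version(match.group(2))):
            return False
    return True


spec = [">=0.7.0", "<0.8.0"]
for v in ["0.6.9", "0.7.0", "0.7.3", "0.8.0"]:
    print(v, satisfies(v, spec))
```

With this range, 0.7.0 and 0.7.3 are accepted, while 0.6.9 and 0.8.0 fall outside it.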

Git Packages

For packages stored in Git repositories:

packages:
  - git: "https://github.com/dbt-labs/dbt-utils.git"
    revision: 0.9.2

The revision parameter can be:

  • A branch name

  • A tag name

  • A specific commit (40-character hash)
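If you want to check which revisions in a project are pinned to a commit rather than to a branch or tag name, a simple test is whether the value is a full 40-character hexadecimal string. The Python sketch below assumes SHA-1 commit hashes; the helper name is_full_commit_hash is hypothetical.

```python
import re

# A full SHA-1 commit hash is exactly 40 hexadecimal characters.
_FULL_SHA1 = re.compile(r"[0-9a-fA-F]{40}")


def is_full_commit_hash(revision):
    """True only for a full 40-character hexadecimal commit hash."""
    return _FULL_SHA1.fullmatch(revision) is not None


for rev in ["main", "0.9.2", "a" * 40, "abc1234"]:
    kind = "commit hash" if is_full_commit_hash(rev) else "branch or tag name"
    print(f"{rev!r}: {kind}")
```

The distinction matters because a branch moves as new commits land, whereas a commit hash always resolves to the same code.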

Local Packages

For packages on your local filesystem:

packages:
  - local: relative/path/to/package

This is useful for testing package changes or working with monorepos.

Package Versioning Best Practices

  • Always pin package versions in production projects

  • Use semantic versioning ranges to accept compatible minor and patch releases while excluding the next major version

  • Test package updates thoroughly before deploying to production

  • Beginning with dbt v1.7, running dbt deps automatically pins packages by creating a package-lock.yml file


Using Package Functionality

Once installed, you can use the resources from packages in your project.

Using Package Macros

Call macros from the package in your models:

-- Using dbt_utils.generate_surrogate_key
SELECT
    {{ dbt_utils.generate_surrogate_key(['customer_id', 'order_date']) }} as order_sk,
    customer_id,
    order_date,
    amount
FROM {{ ref('stg_orders') }}
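To show what a surrogate key amounts to, the following Python sketch mimics the general idea: cast each field to text, substitute a placeholder for nulls, join the parts with a separator, and hash the result with MD5. The "-" separator and the null placeholder string are assumptions based on dbt_utils 1.x. The real macro generates SQL, and how values are cast to text depends on your warehouse, so treat this as an approximation rather than a reference implementation.

```python
import hashlib

# Assumed defaults (based on dbt_utils 1.x); verify against your installed version.
NULL_PLACEHOLDER = "_dbt_utils_surrogate_key_null_"
SEPARATOR = "-"


def surrogate_key(values):
    """Hash a list of field values into one deterministic key."""
    parts = [NULL_PLACEHOLDER if v is None else str(v) for v in values]
    return hashlib.md5(SEPARATOR.join(parts).encode("utf-8")).hexdigest()


# Hypothetical row: customer_id = 42, order_date = 2024-01-15
order_sk = surrogate_key([42, "2024-01-15"])
print(order_sk)
```

The same inputs always produce the same 32-character key, and a null is hashed differently from an empty string because of the placeholder.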

Using Package Tests

Apply tests provided by packages in your schema files:

models:
  - name: customers
    columns:
      - name: email
        tests:
          - dbt_expectations.expect_column_values_to_match_regex:
              regex: '^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$'
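Before adding a regex test, it can help to try the pattern against a few sample values. The Python sketch below applies the same pattern with the standard re module. Regex dialects differ between warehouses, so this only approximates what the test will do in your database, and the sample addresses are made up.

```python
import re

# Same pattern as in the schema file above.
EMAIL_PATTERN = re.compile(r'^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$')


def passes(value):
    """True if the value would satisfy the pattern."""
    return EMAIL_PATTERN.search(value) is not None


samples = [
    "ada@example.com",
    "first.last+tag@mail.example.org",
    "no-at-sign.example.com",
    "user@localhost",
]
for value in samples:
    print(value, passes(value))
```

The first two samples match. The third has no @ sign, and the fourth has no dot-separated top-level domain, so both would be reported as failures.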

Referencing Package Models

Reference models from packages using the standard ref function:

-- Reference a model from a package
SELECT * FROM {{ ref('snowplow', 'snowplow_page_views') }}

In the two-argument form of ref, the package name comes first and the model name second. Including the package name is optional, but it makes explicit which project the model comes from.


Configuring Packages

Many packages allow you to configure their behavior using variables in your dbt_project.yml file:

# dbt_project.yml

vars:
  # Configure the snowplow package
  snowplow:
    'snowplow:timezone': 'America/New_York'
    'snowplow:page_ping_frequency': 10
    'snowplow:events': "{{ ref('sp_base_events') }}"

# Override package configurations
models:
  snowplow:
    +schema: snowplow_models

You can also override materializations, schemas, or other configurations defined in the package.


Popular Packages

Here are some widely used packages that can enhance your dbt projects:

Package           Purpose                          Key Features
dbt-utils         General utilities                Cross-database macros, SQL helpers, schema tests
dbt-expectations  Data quality testing             Advanced testing functions inspired by Great Expectations
dbt-date          Date/time functionality          Date spine generation, fiscal periods, holiday calendars
dbt-audit-helper  Auditing and comparison          Model comparison, reconciliation helpers
codegen           Code generation                  Auto-generate source definitions and base models
dbt-meta-testing  Documentation and test coverage  Checks that models meet documentation and test coverage requirements


Working with Private Packages

For organizations with internal packages, dbt supports several methods for authentication.

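Native Private Packages

The private key relies on a Git provider integration that is already configured and authorized for your account (a dbt Cloud feature), so no credential appears in packages.yml: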

packages:
  - private: dbt-labs/internal-package
    provider: "github"  # Specify if you have multiple git providers configured

Git Token Method

For HTTPS authentication with a token:

packages:
  - git: "https://{{env_var('GIT_CREDENTIAL')}}@github.com/dbt-labs/internal-package.git"

Environment Variables

dbt reads env_var values when it loads packages.yml, so the variable must be set in the environment where dbt deps runs, whether that is your local shell or your CI/CD pipeline. Keep the token itself out of version control.
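The substitution that env_var performs in the Git URL can be sketched in Python as follows. The helper git_url_with_token is hypothetical and the token value is a placeholder. The point is the fail-fast behavior: if the variable is missing, the error surfaces before any clone is attempted.

```python
import os


def git_url_with_token(env_name, host_and_path, environ=os.environ):
    """Build an HTTPS Git URL, failing early if the credential is missing."""
    token = environ.get(env_name)
    if not token:
        raise RuntimeError(f"Environment variable {env_name} is not set")
    return f"https://{token}@{host_and_path}"


# A fake environment keeps the example self-contained; never print real tokens.
fake_env = {"GIT_CREDENTIAL": "example-token"}
url = git_url_with_token(
    "GIT_CREDENTIAL", "github.com/dbt-labs/internal-package.git", fake_env
)
print(url)
```

In a real pipeline the variable would come from your CI/CD secret store rather than a dictionary in code.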

SSH Key Method (Command Line)

For command-line users with SSH authentication:

packages:
  - git: "git@github.com:dbt-labs/internal-package.git"

Package Maintenance

Updating Packages

To update packages:

  1. Change the version/revision in packages.yml

  2. Run dbt deps to install the updated packages

  3. Test the changes thoroughly before deploying

Uninstalling Packages

To remove a package:

  1. Delete it from your packages.yml file

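  2. Run dbt clean to remove the installed packages (this works when dbt_packages is listed under clean-targets in dbt_project.yml, as it is in projects created with dbt init)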

  3. Run dbt deps to reinstall remaining packages


Advanced Package Techniques

Handling Package Conflicts

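When using multiple packages, you might encounter naming conflicts. You can resolve these by:

  1. Using fully-qualified references, which always call the named package's macro:

    {{ dbt_utils.generate_surrogate_key(['id']) }}

  2. Defining your own macro of the same name in your project and calling it without a package prefix:

    {% macro generate_surrogate_key(field_list) %}
        {# Your custom implementation #}
    {% endmacro %}

A macro defined this way does not change what dbt_utils.generate_surrogate_key does. To replace a package macro that uses adapter.dispatch, add a dispatch entry to dbt_project.yml that places your project ahead of the package in the search order.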

Subdirectory Configuration

For packages nested in subdirectories (e.g., in monorepos):

packages:
  - git: "https://github.com/dbt-labs/dbt-labs-experimental-features"
    subdirectory: "materialized-views"
