Working with Tags in your dbt™️ Project

Overview

Tags in dbt™️ are powerful metadata identifiers that can be applied to various resources in your dbt™️ project. They enable flexible resource selection and help organize your project components. This guide covers how to implement and use tags effectively in your dbt™️ projects.

Implementation Methods

1. Config Property

The config property allows you to apply tags directly to individual resources using a config block in your SQL files.

{{ config(
    tags=["finance", "daily_refresh"]
) }}

select ...

2. Config Block in dbt_project.yml

Tags can be applied at various levels in your dbt_project.yml file, affecting multiple resources hierarchically.

models:
  project_name:
    +tags: "base_tag"  # Single tag
    
    staging:
      +tags:            # Multiple tags
        - "staging"
        - "raw_data"
    
    marts:
      +tags:
        - "mart"
        - "business_logic"

Key Features

Tag Inheritance and Hierarchy

Tags are additive and accumulate hierarchically. Resources inherit tags from their parent folders and can have additional tags applied directly.

Example inheritance:

models/
  staging/
    customers.sql   → Tags: ["base_tag", "staging", "raw_data"]
    orders.sql      → Tags: ["base_tag", "staging", "raw_data"]
  marts/
    dim_customer.sql → Tags: ["base_tag", "mart", "business_logic"]
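
The accumulation rule can be pictured with a short Python sketch. It is only a model of the behaviour described above, not dbt's implementation: `FOLDER_TAGS` and `resolve_tags` are hypothetical names that mirror the dbt_project.yml example, and the model-level `finance` tag is an assumed addition made through a config block.

```python
from pathlib import PurePosixPath

# Hypothetical mirror of the +tags entries in the dbt_project.yml example.
# The empty key stands for the project level.
FOLDER_TAGS = {
    "": ["base_tag"],
    "staging": ["staging", "raw_data"],
    "marts": ["mart", "business_logic"],
}


def resolve_tags(model_path, model_tags=()):
    """Collect tags from the project level down to the model, without duplicates."""
    resolved = list(FOLDER_TAGS[""])
    prefix = ""
    # Walk each parent folder of the model, from outermost to innermost.
    for folder in PurePosixPath(model_path).parent.parts:
        prefix = f"{prefix}/{folder}" if prefix else folder
        for tag in FOLDER_TAGS.get(prefix, []):
            if tag not in resolved:
                resolved.append(tag)
    # Tags set directly on the model are added last; nothing is overwritten.
    for tag in model_tags:
        if tag not in resolved:
            resolved.append(tag)
    return resolved


print(resolve_tags("staging/customers.sql"))
print(resolve_tags("marts/dim_customer.sql", model_tags=["finance"]))
```

The point of the sketch is that a lower level never replaces the tags set above it; it only appends to them.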

Resource Type Support

Tags can be applied to multiple resource types:

  1. Models

models:
  <resource-path>:
    +tags: <string> | [<string>]
  2. Snapshots

snapshots:
  <resource-path>:
    +tags: <string> | [<string>]
  3. Seeds

seeds:
  <resource-path>:
    +tags: <string> | [<string>]
  4. Other Resources (in schema.yml)

version: 2

sources:
  - name: source_name
    tags: ['source_tag']
    
exposures:
  - name: exposure_name
    tags: ['exposure_tag']

models:
  - name: model_name
    columns:
      - name: column_name
        tags: ['pii']

CLI Usage

Basic Selection

# Run models with specific tag
dbt run --select tag:daily

# Run seeds with specific tag
dbt seed --select tag:marketing

# Run snapshots with specific tag
dbt snapshot --select tag:weekly

Advanced Selection

# Combine multiple tags, run all models with tag 'daily' or 'critical'
dbt run --select tag:daily tag:critical

# Exclude specific tags
dbt run --select tag:daily --exclude tag:deprecated

# Intersect selectors with a comma and no spaces: run models tagged both 'finance' and 'staging'
dbt run --select tag:finance,tag:staging
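
The set logic behind these commands can be sketched in Python. This is an illustration under stated assumptions rather than how dbt resolves selectors internally: the `MODEL_TAGS` mapping and the model names in it are hypothetical.

```python
# Hypothetical project: model name -> tags applied to it.
MODEL_TAGS = {
    "stg_orders": {"staging", "daily"},
    "stg_payments": {"staging", "finance", "daily"},
    "fct_revenue": {"finance", "critical"},
    "old_report": {"daily", "deprecated"},
}


def tagged(tag):
    """Return the set of model names that carry the given tag."""
    return {name for name, tags in MODEL_TAGS.items() if tag in tags}


# --select tag:daily tag:critical  (space-separated arguments: union)
union = tagged("daily") | tagged("critical")

# --select tag:daily --exclude tag:deprecated  (set difference)
excluded = tagged("daily") - tagged("deprecated")

# --select tag:finance,tag:staging  (comma-separated, no spaces: intersection)
intersection = tagged("finance") & tagged("staging")

print(sorted(union))
print(sorted(excluded))
print(sorted(intersection))
```

Space-separated arguments widen the selection, a comma narrows it, and `--exclude` removes matches after the selection has been made.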

Best Practices

  1. Consistent Naming: Use consistent tag naming conventions across your project

# Good
+tags: ["daily_refresh", "finance_data"]

# Avoid
+tags: ["Daily", "Finance", "financial-data"]
  2. Documentation: Document your tagging strategy in your project's README

## Tags
- daily_refresh: Models that refresh daily
- finance_data: Contains financial information
- pii: Contains personally identifiable information
  3. Granular Tags: Use specific tags rather than overly broad ones

# Good
+tags: ["customer_metrics", "daily_refresh"]

# Too broad
+tags: ["metrics", "regular"]
  4. Layer-Based Tags: Consider using tags to denote your data modeling layers

models:
  project_name:
    staging:
      +tags: ["bronze_layer"]
    intermediate:
      +tags: ["silver_layer"]
    marts:
      +tags: ["gold_layer"]
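
A naming convention such as the one under Consistent Naming is easier to keep when it is checked automatically. The following is a minimal sketch that assumes the convention is lowercase snake_case; the pattern and the helper name `nonconforming_tags` are illustrative choices, not part of dbt.

```python
import re

# Assumed convention: lowercase snake_case, e.g. "daily_refresh".
TAG_PATTERN = re.compile(r"[a-z][a-z0-9]*(_[a-z0-9]+)*")


def nonconforming_tags(tags):
    """Return the tags that do not match the assumed snake_case convention."""
    return [tag for tag in tags if not TAG_PATTERN.fullmatch(tag)]


print(nonconforming_tags(["daily_refresh", "finance_data"]))
print(nonconforming_tags(["Daily", "Finance", "financial-data"]))
```

A check like this could run in CI against the tags collected from your project files, so that a stray `Daily` or `financial-data` is caught before it reaches a selector.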

Common Use Cases

  1. Refresh Schedules

models:
  project_name:
    hourly:
      +tags: ["hourly_refresh"]
    daily:
      +tags: ["daily_refresh"]
  2. Data Classification

models:
  project_name:
    restricted:
      +tags: ["contains_pii"]
    public:
      +tags: ["public_data"]
  3. Testing Strategies

models:
  project_name:
    critical:
      +tags: ["critical_path", "requires_alert"]

Troubleshooting

  1. Tag Inheritance Issues

    • Check parent folder configurations in dbt_project.yml

    • Verify tag syntax and formatting

    • Use dbt ls --select tag:<tag_name> to list the resources that actually carry a tag

  2. Selection Syntax

    • Ensure tag names match exactly (case-sensitive)

    • Check for proper quoting in command line arguments

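    • Verify that the tag exists in the project with dbt ls --select tag:<tag_name>; an empty result usually means a typo or a tag that was never applied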
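
For larger projects, the checks above can be scripted. The sketch below assumes that `dbt ls --output json` prints one JSON object per line and that each object includes `name`, `resource_type`, and `tags` keys; confirm this against your dbt version before relying on it. The helper `models_missing_tag` and the sample data are hypothetical.

```python
import json


def models_missing_tag(ls_output, expected_tag):
    """Return the names of models in the listing that lack expected_tag."""
    missing = []
    for line in ls_output.splitlines():
        line = line.strip()
        if not line.startswith("{"):
            continue  # skip blank lines and any non-JSON log lines
        node = json.loads(line)
        if node.get("resource_type") != "model":
            continue  # only models are checked in this sketch
        if expected_tag not in node.get("tags", []):
            missing.append(node["name"])
    return missing


# Hypothetical sample standing in for captured output of `dbt ls --output json`.
SAMPLE = "\n".join([
    '{"name": "stg_orders", "resource_type": "model", "tags": ["staging", "daily_refresh"]}',
    '{"name": "stg_payments", "resource_type": "model", "tags": ["staging"]}',
    '{"name": "country_codes", "resource_type": "seed", "tags": []}',
])

print(models_missing_tag(SAMPLE, "daily_refresh"))
```

In practice you would pipe the real command output into the function instead of using `SAMPLE`.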

Remember that tags are a powerful organizational tool, but they should be used thoughtfully and consistently across your project for maximum benefit.
