Working with Tags
Tags in dbt are powerful metadata labels that can be applied to various resources in your project. They enable flexible model selection, improved workflow management, and better project organization.
What Are Tags?
Tags are simple text labels you can assign to models, sources, snapshots, and other dbt resources. They help you organize, categorize, and select specific subsets of your project for execution or documentation.
By leveraging tags, you can:
Group models logically – Categorize models based on refresh schedule, function, or ownership
Control execution – Run or exclude specific sets of models
Optimize CI/CD pipelines – Target models for incremental builds and tests
Improve project maintainability – Standardize workflows across teams
How to Apply Tags
Tags can be applied in two primary ways: directly in model files or in your project configuration file.
1. Defining Tags in a Model File
Tags can be assigned directly within a SQL model by calling the config() function at the top of the file, for example {{ config(tags=["finance", "daily_refresh"]) }}.
This assigns both the "finance" and "daily_refresh" tags to this specific model.
2. Defining Tags in dbt_project.yml
Tags can also be applied at the project level, affecting entire folders or groups of models:
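A minimal sketch of folder-level tagging might look like the following. The project name my_project and the staging and marts/finance folders are illustrative assumptions, not taken from the original page:

```yaml
# dbt_project.yml (fragment)
models:
  my_project:                # hypothetical project name; must match `name:` in this file
    staging:
      +tags: ["staging"]     # applied to every model under models/staging/
    marts:
      finance:
        +tags: ["finance", "daily_refresh"]   # applied to models/marts/finance/
```

Every model under a configured folder picks up that folder's tags without any change to the model files themselves.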
Tag Inheritance
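Models inside a folder inherit the tags of their parent folders. Tags accumulate down the hierarchy: tags set on a subfolder or on an individual model are added to the inherited ones rather than replacing them. This creates a hierarchical tagging system that is easy to maintain.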
Example:
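The original example is not reproduced here, so the following is only one layout that would produce the inherited tags discussed in this section. The project name my_project and the raw subfolder are assumptions:

```yaml
# dbt_project.yml (fragment)
models:
  my_project:                # hypothetical project name
    staging:
      +tags: ["staging"]     # inherited by everything under models/staging/
      raw:                   # hypothetical subfolder: models/staging/raw/
        +tags: ["raw_data"]  # models here carry both "staging" and "raw_data"
```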
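Individual models can add to inherited tags. For example, if a model inherits "staging" and "raw_data" from its folders and sets {{ config(tags=["critical"]) }} itself, it ends up with the tags ["staging", "raw_data", "critical"].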
Applying Tags to Different Resource Types
Tags can be applied to various dbt resource types:
Snapshots
Seeds
Sources
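As a rough sketch, snapshots and seeds can be tagged in dbt_project.yml in the same way as models. The project name and the tag names below are placeholders chosen for illustration:

```yaml
# dbt_project.yml (fragment)
snapshots:
  my_project:                     # hypothetical project name
    +tags: ["snapshot"]           # tag every snapshot in the project

seeds:
  my_project:
    +tags: ["reference_data"]     # tag every seed in the project
```

Sources are declared in a properties YAML file rather than in model folders, and their tags are usually set on the source or table entry in that file. The exact key placement can differ between dbt versions, so check the documentation for your release.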
Using Tags for Selection
Once tags are defined, you can use them with dbt commands to select specific resources.
Selection Examples
dbt run --select tag:daily_refresh – Run only models with the daily_refresh tag
dbt run --select tag:daily_refresh tag:critical – Run models with either the daily_refresh or the critical tag
dbt run --select tag:daily_refresh --exclude tag:deprecated – Run models with the daily_refresh tag, excluding those tagged deprecated
dbt run --select staging,tag:finance – Run all models tagged finance in the staging folder
dbt run --select tag:critical+ – Run critical models and their downstream dependencies
Tag Selection Patterns
tag:name – All resources with this tag (example: dbt run --select tag:nightly)
tag:name1 tag:name2 – Resources with either tag (example: dbt run --select tag:nightly tag:critical)
tag:name+ – Tagged resources and their downstream dependencies (example: dbt run --select tag:base+)
+tag:name – Tagged resources and their upstream dependencies (example: dbt run --select +tag:reporting)
--exclude tag:name – Everything except resources with this tag (example: dbt run --exclude tag:deprecated)
Best Practices for Using Tags
Use Consistent Naming Conventions
Standardized naming improves clarity and prevents confusion.
Document Your Tagging Strategy
Clearly define tag meanings in your project's documentation.
Use Granular Tags
Avoid broad, generic tags. Instead, use precise labels for better control.
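To illustrate, a broad tag such as "finance" could be split into narrower ones per subject area. The folder layout and tag names in this sketch are hypothetical:

```yaml
# dbt_project.yml (fragment)
models:
  my_project:                           # hypothetical project name
    marts:
      finance:
        revenue:
          +tags: ["finance_revenue"]    # narrower than a blanket "finance" tag
        billing:
          +tags: ["finance_billing"]
```

With tags this specific, a run such as dbt run --select tag:finance_billing touches only the billing models.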
Tag Models by Layer
Use tags to represent data modeling layers in your project.
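One possible way to tag by layer, assuming the common staging, intermediate, and marts folder layout (the project and folder names are assumptions, not part of the original page):

```yaml
# dbt_project.yml (fragment)
models:
  my_project:                    # hypothetical project name
    staging:
      +tags: ["staging"]         # source-aligned cleanup models
    intermediate:
      +tags: ["intermediate"]    # reusable transformation steps
    marts:
      +tags: ["marts"]           # business-facing models
```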
Common Use Cases for Tags
Refresh Scheduling
Define tags based on refresh frequency for better execution control:
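A sketch of frequency-based tags could look like this. The folder names and the hourly_refresh tag are illustrative; daily_refresh is the tag used in the selection examples earlier in this page:

```yaml
# dbt_project.yml (fragment)
models:
  my_project:                        # hypothetical project name
    marts:
      events:
        +tags: ["hourly_refresh"]    # rebuilt every hour
      finance:
        +tags: ["daily_refresh"]     # rebuilt once a day
```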
Then, in your orchestration tool, schedule a separate run for each tag, for example dbt run --select tag:daily_refresh once a day.
Data Classification
Differentiate datasets based on sensitivity or access level:
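For instance, sensitivity levels might be expressed as tags like the ones below. The folder names and the pii, restricted, and public tags are hypothetical labels, and dbt itself attaches no special meaning to them:

```yaml
# dbt_project.yml (fragment)
models:
  my_project:                         # hypothetical project name
    marts:
      customers:
        +tags: ["pii", "restricted"]  # contains personal data
      public_reporting:
        +tags: ["public"]             # safe to share broadly
```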
This helps implement appropriate security controls and auditing.
Testing Strategy
Prioritize critical models in testing workflows:
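A minimal sketch, assuming the most important models live in a marts/core folder (a hypothetical layout) and using the critical tag from the selection examples:

```yaml
# dbt_project.yml (fragment)
models:
  my_project:                 # hypothetical project name
    marts:
      core:
        +tags: ["critical"]   # models whose tests should run most often
```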
Run tests on critical models more frequently, for example with dbt test --select tag:critical.
Troubleshooting Tag Issues
If dbt isn't selecting resources correctly based on tags, consider these troubleshooting steps:
Tag Inheritance Issues
Verify the parent folder configurations in dbt_project.yml: the resource path must start with your project name and match the folder names under models/.
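The following sketch marks the places where a folder-level tag configuration most often goes wrong. The names my_project and staging are placeholders:

```yaml
# dbt_project.yml (fragment)
name: my_project          # hypothetical project name

models:
  my_project:             # must match `name:` above, otherwise nothing below applies
    staging:              # must match the folder name under models/
      +tags: ["staging"]  # the + prefix marks this key as a config, not a subfolder
```

If a model is still missing an expected tag, check the indentation as well, since a +tags key nested one level too deep or too shallow applies to a different folder.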
Selection Syntax Errors
Ensure tag names match exactly, since matching is case-sensitive: for example, tag:Finance will not select a model tagged finance.
Using dbt ls to Validate Tags
Use the dbt ls command to check which models have specific tags, for example dbt ls --select tag:daily_refresh.
By effectively implementing a tagging strategy, you can organize your dbt project more efficiently, streamline your workflows, and gain better control over how your transformations are executed.