Working with Tags in Your dbt™ Project
Tags in dbt™ provide a flexible way to categorize and manage models, allowing teams to organize resources, optimize execution workflows, and selectively run models. By applying tags, users can efficiently filter and execute transformations within their dbt projects.
Why Use Tags?
Using tags helps teams streamline model selection and execution while improving project organization. Tags are particularly useful for:
Grouping related models – Helps organize models into logical categories.
Controlling execution workflows – Selectively run specific models or exclude others.
Improving maintainability – Easily adjust execution without modifying code.
Supporting data classification – Differentiate datasets (e.g., PII, financial data).
Simplifying CI/CD automation – Automate dbt runs by tagging critical models.
Defining Tags in dbt™
Tags can be applied in multiple ways within a dbt project.
Applying Tags in Model Configurations
Tags can be set directly within a model's SQL file using the config() function.
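As a minimal sketch, the comment below shows what such a config() call might look like, followed by the equivalent declaration in a YAML properties file. The model name fct_orders, the file paths, and the tag names are hypothetical.

```yaml
# In the model's SQL file (e.g. models/fct_orders.sql, a hypothetical path),
# the call would sit at the top of the file:
#   {{ config(tags=["finance", "nightly"]) }}
#
# Equivalent declaration in a properties file (e.g. models/schema.yml):
models:
  - name: fct_orders            # hypothetical model name
    config:
      tags: ["finance", "nightly"]
```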
Applying Tags in dbt_project.yml
Tags can be set at the directory level in dbt_project.yml, applying them to all models within that directory.
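A directory-level configuration might look like the sketch below. The project name my_project and the folder and tag names are assumptions made for illustration.

```yaml
# dbt_project.yml
models:
  my_project:                   # hypothetical project name
    finance:                    # applies to every model under models/finance/
      +tags: "finance"
    marketing:                  # applies to every model under models/marketing/
      +tags: ["marketing", "weekly"]
```

A single tag can be given as a string, and several tags as a list.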
Tag Inheritance and Resource Application
Tags are inherited from parent folders and accumulate down the hierarchy, which simplifies large-scale tag management.
How Tag Inheritance Works
Tags applied at the folder level automatically apply to all models inside that folder.
If a model has its own tag defined, it adds to the inherited tags rather than replacing them.
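The accumulation described above can be sketched with nested folders. The project, folder, and tag names here are hypothetical.

```yaml
# dbt_project.yml
models:
  my_project:                   # hypothetical project name
    +tags: "core"               # every model in the project
    marts:
      +tags: "marts"            # every model under models/marts/
      finance:
        +tags: ["finance", "nightly"]

# A model in models/marts/finance/ would carry: core, marts, finance, nightly,
# plus any tags set in the model's own configuration.
```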
Applying Tags to Other Resource Types
Tags can also be applied to snapshots, seeds, and sources to improve organization and execution flexibility.
Tagging Snapshots
Snapshots track historical data. Apply tags to snapshots in dbt_project.yml for better resource selection.
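As a sketch, tags for snapshots go under the snapshots: key. The project, folder, and tag names below are illustrative assumptions.

```yaml
# dbt_project.yml
snapshots:
  my_project:                   # hypothetical project name
    +tags: "snapshot"           # every snapshot in the project
    orders:                     # snapshots under snapshots/orders/
      +tags: "daily"
```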
Tagging Seeds
For preloaded datasets, assign tags to seeds to categorize static data.
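A seed configuration could follow the same pattern, as in this sketch. The seed name country_codes and the tag names are hypothetical.

```yaml
# dbt_project.yml
seeds:
  my_project:                   # hypothetical project name
    +tags: "reference"          # every seed in the project
    country_codes:              # a single seed, seeds/country_codes.csv
      +tags: "geo"
```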
Tagging Sources
Apply tags to external data sources within sources.yml for better organization.
This lets you select tagged sources in dbt commands, for example to list them or to run the models that depend on them.
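A tagged source definition might look like the sketch below, with assumed command-line usage shown as comments. The source, table, and tag names are hypothetical. Depending on your dbt version, tags may need to be nested under a config: key rather than set as a top-level key, so check the documentation for the release you run.

```yaml
# models/staging/sources.yml (hypothetical path)
sources:
  - name: raw_payments          # hypothetical source name
    tags: ["external", "finance"]   # source-level tags
    tables:
      - name: transactions
        tags: ["pii"]           # table-level tag
      - name: refunds

# Assumed usage from the command line:
#   dbt ls  --select "tag:pii" --resource-type source   # list tagged sources
#   dbt run --select "tag:finance+"                     # run models downstream of them
```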
Using Tags in dbt Commands
Once tags are defined, you can use them to execute specific resources.
Basic Selection
Run only models, seeds, or snapshots with a specific tag.
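The comments below sketch single-tag selection on the command line. The YAML underneath saves the same selection as a named selector in selectors.yml, an optional dbt feature. The tag name nightly and the selector name are hypothetical.

```yaml
# Command-line form:
#   dbt run      --select "tag:nightly"
#   dbt seed     --select "tag:nightly"
#   dbt snapshot --select "tag:nightly"

# selectors.yml (project root): the same selection as a named selector
selectors:
  - name: nightly_resources     # hypothetical selector name
    description: "Everything tagged nightly"
    definition:
      method: tag
      value: nightly

# Used as: dbt run --selector nightly_resources
```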
Advanced Selection
Combine multiple tags, exclude tags, or mix selection criteria.
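As a sketch of combined criteria, the comments show command-line forms and the YAML expresses one of them as a named selector. All tag names, the path, and the selector name are assumptions for illustration.

```yaml
# Command-line forms:
#   dbt run --select "tag:nightly tag:finance"        # union: either tag (space-separated)
#   dbt run --select "tag:nightly,tag:finance"        # intersection: both tags (comma-separated)
#   dbt run --select "tag:nightly" --exclude "tag:deprecated"
#   dbt run --select "tag:nightly,path:models/marts"  # tag combined with a path criterion

# selectors.yml: nightly AND finance, minus anything tagged deprecated
selectors:
  - name: nightly_finance       # hypothetical selector name
    definition:
      intersection:
        - method: tag
          value: nightly
        - method: tag
          value: finance
        - exclude:
            - method: tag
              value: deprecated
```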
Best Practices for Using Tags
To ensure an effective tagging strategy, follow these best practices:
1. Use Consistent Naming Conventions
Standardized naming improves clarity and prevents confusion.
2. Document Your Tagging Strategy
Clearly define tag meanings in your project's documentation.
3. Use Granular Tags
Avoid broad, generic tags. Instead, use precise labels for better control.
4. Tag Models by Layer
Use tags to represent data modeling layers in your project.
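One way to tag by layer is a tag per top-level folder, as sketched here. The staging, intermediate, and marts layout is an assumed project structure, not a requirement.

```yaml
# dbt_project.yml
models:
  my_project:                   # hypothetical project name
    staging:
      +tags: "staging"
    intermediate:
      +tags: "intermediate"
    marts:
      +tags: "marts"

# A single layer can then be run on its own, e.g.:
#   dbt run --select "tag:staging"
```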
Common Use Cases for Tags
Refresh Scheduling
Define tags based on refresh frequency for better execution control.
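A refresh-frequency scheme could be sketched as follows, with each scheduled job selecting one tag. The folder names and the hourly, daily, and weekly tags are hypothetical.

```yaml
# dbt_project.yml
models:
  my_project:                   # hypothetical project name
    marts:
      operations:
        +tags: "hourly"
      finance:
        +tags: "daily"
      archive:
        +tags: "weekly"

# Each scheduler job would then target one frequency, e.g.:
#   dbt run --select "tag:hourly"
```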
Data Classification
Differentiate datasets based on sensitivity or access level.
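Classification tags can be attached per model in a properties file, as in this sketch. The model names and the pii, financial, and restricted tags are assumptions.

```yaml
# models/marts/schema.yml (hypothetical path)
models:
  - name: dim_customers         # hypothetical model name
    config:
      tags: ["pii"]
  - name: fct_revenue           # hypothetical model name
    config:
      tags: ["financial", "restricted"]

# Assumed usage:
#   dbt ls  --select  "tag:pii"   # list everything classified as PII
#   dbt run --exclude "tag:pii"   # run everything except PII models
```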
Testing Strategy
Prioritize critical models in testing workflows.
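A sketch of this approach tags a model as critical and then targets that tag in the test command. The model, column, and tag names are hypothetical. Newer dbt versions prefer the key data_tests: over tests:.

```yaml
# models/marts/schema.yml (hypothetical path)
models:
  - name: fct_orders            # hypothetical model name
    config:
      tags: ["critical"]
    columns:
      - name: order_id
        tests:                  # `data_tests:` in newer dbt versions
          - unique
          - not_null

# With default selection behavior, this should run the tests attached
# to models tagged critical:
#   dbt test --select "tag:critical"
```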
Troubleshooting Tag Issues
If dbt isn't selecting resources correctly based on tags, consider these troubleshooting steps:
Tag Inheritance Issues
Verify parent folder configurations in dbt_project.yml.
Use dbt ls to confirm applied tags.
Selection Syntax Errors
Ensure tag names match exactly (case-sensitive).