Testing Data Quality
Why Testing Matters in dbt™
Ensuring data quality is a critical part of any data transformation pipeline. Inaccurate, incomplete, or inconsistent data can lead to incorrect business decisions. dbt™ provides built-in testing capabilities to help catch issues early, enforce data integrity, and maintain confidence in your transformations.
Benefits of dbt™ Testing
Ensures Data Integrity – Prevents duplicates, null values, and referential mismatches.
Validates Business Logic – Confirms that data meets expected criteria.
Catches Issues Early – Detects errors before they affect downstream analytics.
Automates Quality Checks – Reduces the need for manual data validation.
Supports Collaboration – Helps teams align on data expectations.
Types of dbt™ Tests
dbt™ provides two main types of tests: generic tests (pre-built) and singular tests (user-defined).
1. Generic Tests (Built-in)
dbt™ includes four generic tests that can be applied to columns in your models.
not_null
Ensures a column contains no NULL values.
unique
Ensures a column has no duplicate values.
accepted_values
Validates that column values are within a specified list.
relationships
Ensures referential integrity between tables.
2. Singular Tests
If built-in tests are not enough, dbt™ allows users to define singular tests using SQL and Jinja. These are specific, one-off tests that are written as SQL queries that return failing records.
Adding Tests to Your dbt™ Models
Tests in dbt™ are typically defined in YAML files, inside a .yml
file for a model.
Example: Model and Corresponding Test File
If you have a model named orders
, its schema file containing tests should follow this structure:
Inside schema.yml
, you can define the tests for the orders
model:
How This Works:
not_null
: Ensurescustomer_id
is always populated.unique
: Prevents duplicateemail
addresses.accepted_values
: Ensuresstatus
is only'active'
,'inactive'
, or'pending'
.relationships
: Verifies that everycountry_id
incustomers
exists in thecountries
table.
Defining Singular Tests
If your data quality needs are more specific, you can create singular tests using SQL.
Example: Singular Test for Negative Order Totals
If this query returns any rows, the test fails, indicating an issue with negative order amounts.
Running Tests in dbt™
dbt™ makes it easy to run tests on demand or as part of a CI/CD pipeline.
Run All Tests in Your Project
Run Tests on a Specific Model
Run a Single Test
Filter Tests by a Tag
Troubleshooting Test Failures
If a test fails, dbt™ provides a log with details. Here's how to investigate issues:
Use
dbt debug
to check for project configuration issues.Check the generated SQL by inspecting the
target/compiled
directory.Store failing rows for review using
store_failures: true
in your test configuration.Filter test runs with
dbt test --select tag:critical
to focus on high-priority data issues.
Running tests frequently and resolving failures quickly keeps your data pipeline reliable and trustworthy.
Best Practices for dbt™ Testing
Test All Critical Fields – Apply
unique
,not_null
, andrelationships
tests on key columns.Use a Mix of Generic and Singular Tests – Generic tests catch common issues, while singular tests validate business rules.
Write Clear, Descriptive Test Names – Make it easy to understand test failures.
Integrate Tests into CI/CD Pipelines – Run tests automatically before deploying changes.
Review and Update Tests Regularly – Data models evolve, and tests should too.
Advanced Testing Concepts
Unit Testing
For more complex testing needs, dbt™ also supports unit testing concepts to validate transformation logic. Unit tests allow you to verify that specific transformations produce expected outputs given controlled inputs. More details on unit testing can be found in our Unit Testing documentation.
Last updated
Was this helpful?