Add and Run dbt™ tests

Testing is a crucial part of any data transformation pipeline. dbt provides built-in testing capabilities to ensure the quality and reliability of your data models.

Purpose

Adding and running tests in dbt serves several important functions:

Validates the integrity of your data transformations
Ensures that your models meet expected criteria
Catches errors early in the development process
Provides confidence in the reliability of your data pipeline
Facilitates collaboration by setting clear expectations for data quality

Key Components

Built-in Generic Tests

dbt provides four generic tests out of the box:

not_null: Ensures a column contains no null values
unique: Ensures every value in a column is distinct (no duplicates)
accepted_values: Validates that all values in a column are within a specified list
relationships: Checks referential integrity, meaning every value in the column must exist in a given column of another model

Implementing Tests in YAML

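Generic tests are declared in the YAML properties files that sit alongside your models, under the tests key of a column. From dbt 1.8 onward the same key can also be written as data_tests, and tests remains supported. Here's an example: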

version: 2

models:
  - name: customers
    columns:
      - name: customer_id
        tests:
          - unique
          - not_null
      - name: email
        tests:
          - unique
      - name: status
        tests:
          - accepted_values:
              values: ['active', 'inactive', 'pending']
      - name: country_id
        tests:
          - relationships:
              to: ref('countries')
              field: id

Custom Test Names

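By default dbt generates a test's name from the test type, the model, the column, and the arguments. To override it, set the name property on the test: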

models:
  - name: orders
    columns:
      - name: status
        tests:
          - accepted_values:
              name: valid_order_status
              values: ['placed', 'shipped', 'completed', 'returned']

Alternative Test Definition Format

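For complex tests, you can use an alternative format in which name and test_name are explicit keys and the test's arguments and config sit at the same level: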

models:
  - name: orders
    columns:
      - name: status
        tests:
          - name: valid_order_status
            test_name: accepted_values
            values: ['placed', 'shipped', 'completed', 'returned']
            config:
              where: "order_date = current_date"
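
For comparison, the same test can be sketched in the standard nested format used in the earlier examples. This is a minimal illustration of the equivalence, not a recommendation of one form over the other; it reuses the model, column, and values shown above.

```yaml
models:
  - name: orders
    columns:
      - name: status
        tests:
          # Generic test name is the key; its arguments are nested beneath it
          - accepted_values:
              name: valid_order_status
              values: ['placed', 'shipped', 'completed', 'returned']
              config:
                # Only rows matching this condition are tested
                where: "order_date = current_date"
```

In both forms the where config limits the test to rows that match the condition, so here only orders dated today are checked.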

Running Tests

To run all tests in your project:

dbt test

To run only the tests attached to a specific model, pass its name to --select (the older --models flag is deprecated):

dbt test --select model_name

Best Practices

Implement tests for all critical data quality assumptions
Use a combination of generic and custom tests for comprehensive coverage
Write clear, descriptive names for your tests
Include tests that validate your business logic, not just data integrity
Run tests frequently, ideally as part of your CI/CD pipeline
Review and update tests as your data models evolve
Document the purpose and expected outcomes of your tests
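
As one way to apply the naming and documentation practices above, the sketch below pairs each tested column with a description and gives the parameterized test an explicit name. The order_id column and all description texts are hypothetical and only illustrate the pattern.

```yaml
version: 2

models:
  - name: orders
    description: "One row per customer order."  # hypothetical description
    columns:
      - name: order_id  # hypothetical column, not taken from the examples above
        description: "Primary key; expected to be present and unique."
        tests:
          - unique
          - not_null
      - name: status
        description: "Lifecycle state of the order."
        tests:
          - accepted_values:
              # Descriptive name that appears in test output
              name: valid_order_status
              values: ['placed', 'shipped', 'completed', 'returned']
```

Keeping the description next to the tests makes it easier to tell, when a test fails, which assumption about the data was violated.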

Remember, thorough testing is key to building reliable data pipelines and maintaining trust in your data.
