Build your first dbt™ model

A dbt model is a SQL select statement that transforms your data. Building your first dbt model is a crucial step in creating a data transformation pipeline using dbt.

Purpose

Creating dbt models serves several important functions:

  • Transforms raw data into a more usable format

  • Implements business logic in SQL

  • Creates reusable and modular data transformations

  • Enables version control of your data transformations

Key Components

Creating a Model File

To create a new model, add a .sql file in your models directory. For example:

-- models/stg_customers.sql

SELECT 
    id,
    first_name,
    last_name,
    email
FROM {{ source('jaffle_shop', 'raw_customers') }}

Using the source() Function

The source() function references the sources defined in your sources.yml file:

version: 2

sources:
  - name: jaffle_shop
    database: raw
    schema: jaffle_shop
    tables:
      - name: raw_customers
      - name: raw_orders

In the model, source('jaffle_shop', 'raw_customers') refers to the 'raw_customers' table within the 'jaffle_shop' source. dbt uses this to construct the full table name (e.g., raw.jaffle_shop.raw_customers).

Running Your Model

Execute your model with the following dbt command:

dbt run --models stg_customers

This compiles the SQL and materializes the results in your data warehouse.

Best Practices

  1. Ensure sources are properly defined in sources.yml before referencing them in models

  2. Use meaningful names for your models that reflect their purpose

  3. Keep models focused on a single task or transformation

  4. Comment your SQL to explain complex transformations

  5. Use dbt's testing functionality to ensure data quality in your models

  6. Regularly review and update your models as your data needs evolve

Remember, effective dbt models start with properly defined sources and make intelligent use of dbt's source() function to reference your raw data.

Last updated