# Introduction

dbt (data build tool) changes how data teams work by bringing software engineering practices to data transformation. It enables analysts and engineers to build reliable, modular, and tested data pipelines using simple SQL.

### What is dbt?

dbt is a transformation framework that works with your existing data warehouse. You write SQL SELECT statements, and dbt handles the complexity of turning them into tables and views while managing dependencies between models.

```sql
-- A simple dbt model example
SELECT 
    orders.id as order_id,
    orders.status,
    customers.name as customer_name,
    customers.email
FROM raw_data.orders
JOIN raw_data.customers ON orders.customer_id = customers.id
WHERE orders.status != 'cancelled'
```

This SQL becomes a fully-managed transformation with version control, testing, and documentation.
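
Because dbt manages dependencies between models, one model can build on another using the `ref()` function, and dbt runs them in the correct order. A minimal sketch, assuming the model above is saved as `stg_orders.sql` (the downstream model name is illustrative):

```sql
-- models/marts/customer_order_counts.sql (hypothetical downstream model)
-- ref() declares a dependency on stg_orders; dbt resolves it to the
-- correct schema and table name and builds stg_orders first.
SELECT
    customer_name,
    COUNT(*) as order_count
FROM {{ ref('stg_orders') }}
GROUP BY customer_name
```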

***

### Why dbt Matters

Modern data teams face growing complexity:

| Challenge                 | Impact                                       | dbt Solution                                 |
| ------------------------- | -------------------------------------------- | -------------------------------------------- |
| Scattered transformations | Inconsistent business logic, duplicated work | Centralized repository of transformations    |
| Manual SQL processes      | Errors, slow iterations, impossible to audit | Automated, repeatable transformation runs    |
| No testing                | Data quality issues affecting decisions      | Built-in testing framework                   |
| Poor documentation        | Knowledge silos, difficult onboarding        | Auto-generated, always-current documentation |

***

### The ELT Approach

Modern data workflows have shifted from ETL (Extract, Transform, Load) to ELT (Extract, Load, Transform), and dbt is specifically built for the "T" in ELT.

In the ELT approach:

1. Raw data lands in your warehouse without transformation
2. dbt runs SQL transformations directly in the warehouse
3. Analysts work with cleaned, tested data models

This approach leverages your warehouse's processing power and keeps all transformations in a single, manageable location.

***

### How dbt Works in Practice

Let's walk through a typical dbt workflow:

1. **Define a Source**: Tell dbt where your raw data lives

   ```yaml
   # sources.yml
   version: 2

   sources:
     - name: raw_data
       tables:
         - name: orders
         - name: customers
   ```
2. **Create a Model**: Write SQL that transforms this data

   ```sql
   -- models/staging/stg_orders.sql
   SELECT
     id as order_id,
     customer_id,
     status,
     created_at,
     -- Additional business logic here
     CASE 
       WHEN status = 'shipped' THEN 'completed'
       WHEN status = 'processing' THEN 'in_progress'
       ELSE 'other'
     END as order_status_normalized
   FROM {{ source('raw_data', 'orders') }}
   ```
3. **Add Tests**: Ensure data quality

   ```yaml
   # schema.yml
   version: 2

   models:
     - name: stg_orders
       columns:
         - name: order_id
           tests:
             - unique
             - not_null
   ```
4. **Run dbt**: Transform and test your data

   ```bash
   dbt run
   dbt test
   ```
5. **Document & Share**: Automatically generate documentation

   ```bash
   dbt docs generate
   dbt docs serve
   ```
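
Under the hood, `dbt run` wraps each model's SELECT in the DDL for its configured materialization. For `stg_orders` with the default `view` materialization, the compiled statement looks roughly like this (the `analytics` target schema is an assumption here, and exact syntax varies by warehouse adapter):

```sql
-- Approximate SQL dbt executes for stg_orders as a view;
-- {{ source('raw_data', 'orders') }} compiles to the real table name
CREATE VIEW analytics.stg_orders AS (
    SELECT
      id as order_id,
      customer_id,
      status,
      created_at,
      CASE
        WHEN status = 'shipped' THEN 'completed'
        WHEN status = 'processing' THEN 'in_progress'
        ELSE 'other'
      END as order_status_normalized
    FROM raw_data.orders
);
```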

***

### The dbt Ecosystem

dbt fits into a modern data stack alongside other specialized tools:

* **Extraction tools** (Fivetran, Airbyte) bring data to your warehouse
* **dbt** transforms this data into analytics-ready models
* **BI tools** (Tableau, Looker) visualize the transformed data

This modular approach allows each tool to focus on what it does best, creating a more maintainable data platform.

***

### Key Benefits

* **Write just SQL**: No new language to learn
* **Version-controlled transformations**: Track changes with Git
* **Automated testing**: Ensure data quality
* **Self-documenting models**: Always up-to-date documentation
* **Development workflows**: Build and test locally before deploying
* **Modular design**: Reusable patterns and dependencies

By centralizing transformations in dbt, data teams can build more reliable data pipelines, collaborate more effectively, and spend more time on analysis instead of maintenance.

{% hint style="info" %}
**Want to start using dbt in Paradime for free?**

Check out our [Paradime 101 guide](https://docs.paradime.io/app-help/guides/paradime-101) to set up your first dbt project!
{% endhint %}
