# Data Contract Checks with DinoAI & .dinorules

Use DinoAI and `.dinorules` to automatically enforce schema integrity whenever dbt™ code is generated or updated — before any file is touched.

{% hint style="info" %}
**What this does:** DinoAI runs a scoped column-level lineage scan on every structural change. The scan completes before DinoAI writes a single line of code and surfaces the breaking changes it finds within the scanned window as a structured impact report.

dbt™ model contracts catch type mismatches at compile time, but they can't detect a downstream model referencing a renamed column. DinoAI + `.dinorules` fills that gap automatically.
{% endhint %}

### How the scoped scan works

Instead of walking the full DAG, DinoAI uses a **column-level blast radius**: one level upstream to confirm the input schema is still compatible, and two levels downstream to catch the models most likely to break — but only if they actually *select* the changed column.

```
[ Source / seed ]  →  [ Changed model ]  →  [ +1 child ]  →  [ +2 child ]  →  [ Further downstream ]
     -1 upstream           trigger             scanned          scanned            NOT scanned ⚠
```

{% hint style="warning" %}
**Scoped by design:** the report always notes that only -1 / +2 was checked. If your change is in a widely shared staging model, manually verify anything beyond +2 before merging.
{% endhint %}

{% hint style="info" %}
**Column-aware only:** at each level DinoAI checks whether the downstream model's `SELECT` actually references the changed column. Models that `ref()` the changed model but don't use the column are skipped — no false positives.
{% endhint %}
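The scoped scan can be pictured as a depth-limited walk over the lineage graph. The sketch below is only an illustration under stated assumptions, not DinoAI's implementation: `scoped_scan`, the `children` map, the `selected_columns` map, and the `rpt_f1__season_summary` model are hypothetical, and the sketch assumes every child within two levels is visited while only those that select the changed column are flagged.

```python
from collections import deque


def scoped_scan(changed_model, column, children, selected_columns, max_depth=2):
    """Return {model: depth} for downstream models within max_depth that select `column`."""
    flagged = {}
    seen = {changed_model}
    queue = deque([(changed_model, 0)])
    while queue:
        model, depth = queue.popleft()
        if depth == max_depth:
            continue  # models further downstream are not scanned
        for child in children.get(model, []):
            if child in seen:
                continue
            seen.add(child)
            # Column-aware: a child that only ref()s the model is visited but not flagged.
            if column in selected_columns.get(child, set()):
                flagged[child] = depth + 1
            queue.append((child, depth + 1))
    return flagged


# Hypothetical lineage metadata (assumption: not read from a real dbt project).
children = {
    "stg_f1__constructor_standings": ["int_f1__race_results_standings"],
    "int_f1__race_results_standings": ["fct_f1__constructor_race_result"],
    "fct_f1__constructor_race_result": ["rpt_f1__season_summary"],
}
selected_columns = {
    "int_f1__race_results_standings": {"constructor_id", "wins"},
    "fct_f1__constructor_race_result": {"constructor_id"},
    "rpt_f1__season_summary": {"wins"},
}

result = scoped_scan("stg_f1__constructor_standings", "wins", children, selected_columns)
print(result)
```

In this sketch the hypothetical `rpt_f1__season_summary` model sits at +3, so it is never visited even though it selects `wins`. That is exactly the residual risk the scoped scan notice is meant to call out.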

***

### What triggers a check

A data contract check is automatically triggered whenever any of the following occur:

| Trigger                                                | Example                                                     |
| ------------------------------------------------------ | ----------------------------------------------------------- |
| Column renamed, removed, or type/logic changed         | `race_id` → `race_key` in `stg_f1__races.sql`               |
| Source schema modified                                 | Editing any `*_sources.yml` file                            |
| Staging or intermediate output columns altered         | Adding or removing a column from `stg_f1__account.sql`      |
| Mart grain, primary key, or key metric columns changed | Changing the grain of `fct_f1__constructor_race_result.sql` |
| dbt™ test added, removed, or modified on a key field   | Removing `not_null` from `constructor_id` in a `.yml` file  |
| Model deleted or renamed                               | Renaming `f1__account_master.sql`                           |

{% hint style="info" %}
The check is never skipped. It always runs before any file is generated or modified.
{% endhint %}
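To make the column-level triggers concrete, the sketch below compares a model's output columns before and after an edit. It is a hypothetical illustration: `diff_columns`, `triggers_check`, and the `race_name` column are assumptions, and how DinoAI actually detects a structural change is not documented here.

```python
def diff_columns(before, after):
    """Compare two {column: type} mappings and report structural differences."""
    removed = sorted(set(before) - set(after))
    added = sorted(set(after) - set(before))
    retyped = sorted(c for c in set(before) & set(after) if before[c] != after[c])
    return {"removed": removed, "added": added, "retyped": retyped}


def triggers_check(diff):
    """Any added, removed, or retyped column counts as a triggering change."""
    return any(diff.values())


# Hypothetical before/after schemas for stg_f1__races (assumption).
before = {"race_id": "integer", "race_name": "varchar"}
after = {"race_key": "integer", "race_name": "varchar"}

diff = diff_columns(before, after)
print(diff, triggers_check(diff))
```

A rename shows up here as one removal plus one addition, which is why a purely set-based comparison cannot distinguish `race_id` → `race_key` from dropping one column and adding an unrelated one.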

***

### What gets scanned

For every triggering change, the following are checked within the -1 / +2 window:

**-1 upstream**\
Confirm the direct parent model or source still exposes the expected column with a compatible type.

**+1 downstream (column-aware)**\
Find all direct children that `ref()` the changed model and check whether their `SELECT` uses the changed column. Skip those that don't.

**+2 downstream (column-aware)**\
Repeat the same column-aware check one level further. Only flag models where the column actually propagates.

**YAML documentation files**\
Verify that column entries in `_<model_name>.yml` still match the model's actual output at the changed layer.

**Tests**\
Confirm no tests on the changed column are orphaned within the scanned range.

{% hint style="info" %}
If the changed model references a seed within the -1 / +2 window, seed schema compatibility is also checked.
{% endhint %}
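The YAML and test checks both reduce to comparing names against the model's actual output columns. The following is a minimal sketch, assuming the column lists have already been extracted. `find_orphans` and the sample data are hypothetical and not part of DinoAI or dbt.

```python
def find_orphans(model_columns, yaml_columns, tests):
    """Return YAML column entries and tests that point at columns the model no longer outputs.

    tests is a list of (column_name, test_name) tuples.
    """
    live = set(model_columns)
    orphaned_docs = [c for c in yaml_columns if c not in live]
    orphaned_tests = [(c, t) for c, t in tests if c not in live]
    return orphaned_docs, orphaned_tests


# Hypothetical state after renaming `wins` to `total_race_wins` in the model only.
model_columns = ["constructor_id", "total_race_wins"]
yaml_columns = ["constructor_id", "wins"]
tests = [("constructor_id", "not_null"), ("wins", "not_null")]

orphaned_docs, orphaned_tests = find_orphans(model_columns, yaml_columns, tests)
print(orphaned_docs, orphaned_tests)
```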

***

### Add the rule to .dinorules

In your project root, open or [create `.dinorules`](/app-help/documentation/dino-ai/dino-rules.md) and add the following block:

```yaml
Data Contract Checks:
  # Triggers — runs before DinoAI writes any file
  - TRIGGER: Column renamed, removed, or type/logic changed in any SQL or YAML file
  - TRIGGER: Source schema modified in *_sources.yml
  - TRIGGER: Staging or intermediate output columns altered
  - TRIGGER: Mart grain, primary key, or key metric columns changed
  - TRIGGER: dbt test added, removed, or modified on a key field
  - TRIGGER: Model deleted or renamed

  # Scan scope — column-level lineage, -1 upstream and +2 downstream only
  - UPSTREAM: Check -1 parent model or source for input schema compatibility
  - DOWNSTREAM: Scan +1 and +2 children using column-level lineage only
      Only flag a downstream model if its SELECT actually references the changed column
      Skip models that ref() the changed model but do not use the column
  - YAML: Verify _<model_name>.yml column entries at the changed layer
  - TESTS: Confirm no tests on the changed column are orphaned within scan range
  - SEEDS: Check seed schema compatibility if referenced within the scan window

  # Reporting
  - Always produce a Data Contract Impact Report after the scan
  - Always include a scoped scan notice:
      "Scan limited to -1 / +2 column-level lineage.
       Verify models beyond +2 manually if this column is widely consumed."
  - If no assets are affected: "No downstream contract violations detected."

  # Enforcement
  - Never skip — complete the scan before generating or modifying any files
  - Prefer column aliasing over renaming for backward compatibility
  - Flag PK / grain changes on mart models as HIGH severity
  - Add ⚠️ Breaking Contract Changes section to PR description for breaking changes
```

{% hint style="info" %}
Commit `.dinorules` to version control alongside your dbt™ project. Every team member using DinoAI will automatically pick up the same contract enforcement rules.
{% endhint %}

***

### Reading the impact report

After every scan DinoAI produces a structured impact report. The format is always the same:

```markdown
## Data Contract Impact Report

### Triggering Change
- `wins` renamed to `total_race_wins` in models/staging/stg_f1__constructor_standings.sql

### Scan Scope
-1 upstream · +1 column-aware · +2 column-aware

### Affected Assets
| Level | Asset | File Path | Impact |
|-------|-------|-----------|--------|
| -1 | f1_raw.constructor_standings | models/staging/_stg_f1__sources.yml | Source column `wins` confirmed present ✓ |
| +1 | int_f1__race_results_standings | models/intermediate/int_f1__race_results_standings.sql | Selects `wins` — must update to `total_race_wins` |
| +2 | fct_f1__constructor_race_result | models/marts/fct_f1__constructor_race_result.sql | Does not select `wins` directly — no change needed ✓ |
| —  | _stg_f1__constructor_standings.yml | models/staging/_stg_f1__constructor_standings.yml | Column entry `wins` orphaned — update to `total_race_wins` |

### Action Required
- Update `int_f1__race_results_standings.sql`: replace `wins` with `total_race_wins`
- Update `_stg_f1__constructor_standings.yml`: rename column entry
- Re-run `dbt test -s int_f1__race_results_standings`

⚠ Scan limited to -1 / +2 column-level lineage.
  If `wins` is consumed beyond +2, verify those models manually before merging.
```

{% hint style="info" %}
If no downstream assets are affected, the report will explicitly state: **"No downstream contract violations detected."**
{% endhint %}
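If you want to post-process or reproduce the Affected Assets table in your own tooling, the sketch below shows one way to render it from plain tuples. `render_affected_assets` is a hypothetical helper, not a DinoAI API. The column headers and the empty-result message are copied from the report format shown above.

```python
def render_affected_assets(rows):
    """Render (level, asset, file_path, impact) tuples as a Markdown table."""
    if not rows:
        return "No downstream contract violations detected."
    lines = [
        "| Level | Asset | File Path | Impact |",
        "|-------|-------|-----------|--------|",
    ]
    for level, asset, path, impact in rows:
        lines.append(f"| {level} | {asset} | {path} | {impact} |")
    return "\n".join(lines)


rows = [
    ("+1", "int_f1__race_results_standings",
     "models/intermediate/int_f1__race_results_standings.sql",
     "Selects `wins`; must update to `total_race_wins`"),
    ("+2", "fct_f1__constructor_race_result",
     "models/marts/fct_f1__constructor_race_result.sql",
     "Does not select `wins` directly; no change needed"),
]

table = render_affected_assets(rows)
print(table)
```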

***

### Severity levels

| Severity      | When it applies                                                                          | Required action                                                                                    |
| ------------- | ---------------------------------------------------------------------------------------- | -------------------------------------------------------------------------------------------------- |
| 🟢 **LOW**    | New column added — no downstream consumers affected within scan window                   | Document in PR description                                                                         |
| 🟡 **MEDIUM** | Non-key column renamed or removed — one or more +1 / +2 models select the changed column | Update affected consumers; prefer aliasing; include impact report in PR description                |
| 🔴 **HIGH**   | PK or mart grain changed                                                                 | Explicit reviewer sign-off + `⚠️ Breaking Contract Changes` section in PR + manual check beyond +2 |

{% hint style="warning" %}
For HIGH severity changes the scoped scan still applies — there is no automatic full-DAG walk. The scoped scan notice and reviewer sign-off requirement handle the residual risk for models beyond +2.
{% endhint %}
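The severity table can be read as a small decision rule. The sketch below is one hedged interpretation: `Severity` and `classify_severity` are hypothetical names, and the fallback to LOW for combinations the table does not list (for example, a rename with no consumers in the scan window) is an assumption.

```python
from enum import Enum


class Severity(Enum):
    LOW = "LOW"
    MEDIUM = "MEDIUM"
    HIGH = "HIGH"


def classify_severity(changes_key_or_grain, affected_consumers):
    """Map a change to a severity level following the table above.

    changes_key_or_grain: True if a primary key or mart grain changed.
    affected_consumers: +1 / +2 models that select the changed column.
    """
    if changes_key_or_grain:
        return Severity.HIGH
    if affected_consumers:
        return Severity.MEDIUM
    return Severity.LOW  # assumption: unlisted cases fall back to LOW


print(classify_severity(False, ["int_f1__race_results_standings"]))
```

Checking the key or grain condition first means a primary-key change is HIGH even when no consumer in the scan window selects the column, consistent with the sign-off requirement for those changes.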

***

### Enforcement rules

| Rule                       | Detail                                                                                                                            |
| -------------------------- | --------------------------------------------------------------------------------------------------------------------------------- |
| **Never skip**             | The scan runs on every triggering event before any file is touched                                                                |
| **Column-aware only**      | At +1 and +2, only flag models whose `SELECT` actually uses the changed column; skip models that merely `ref()` the changed model |
| **Prefer aliasing**        | Keep the old column name as an alias while introducing the new one to preserve backward compatibility                             |
| **Always note scan scope** | The report must state that only -1 / +2 was checked and prompt manual verification further downstream                             |
| **HIGH severity sign-off** | PK / grain changes require explicit reviewer sign-off; scoped scan still applies                                                  |

***

### End-to-end workflow

The following example walks through renaming `wins` → `total_race_wins` in `stg_f1__constructor_standings.sql`.

{% stepper %}
{% step %}
**Make the change using an alias**

Keep the old column name as a backward-compat alias so +1 / +2 consumers keep working while the rename rolls out:

```sql
-- stg_f1__constructor_standings.sql
wins AS total_race_wins,
wins AS wins  -- backward-compat alias; remove after consumers are updated
```
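To see why the alias keeps existing consumers working, you can try the same projection against a throwaway table. The sketch below uses Python's built-in `sqlite3` module purely as a stand-in for your warehouse. The `constructor_standings` table and its sample row are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE constructor_standings (constructor_id INTEGER, wins INTEGER)")
conn.execute("INSERT INTO constructor_standings VALUES (1, 8)")

# Same projection as the staging model: new name plus backward-compat alias.
cursor = conn.execute(
    "SELECT constructor_id, wins AS total_race_wins, wins AS wins "
    "FROM constructor_standings"
)
column_names = [description[0] for description in cursor.description]
row = cursor.fetchone()
conn.close()

print(column_names, row)
```

Both `total_race_wins` and `wins` appear in the output, so a consumer that still selects `wins` keeps resolving until the alias is removed.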

{% endstep %}

{% step %}
**DinoAI detects the trigger and runs the scoped scan**

Before writing any file, DinoAI:

1. Checks `-1` for input schema compatibility
2. Walks `+1` column-aware and flags `int_f1__race_results_standings` because it selects `wins`
3. Walks `+2` column-aware and skips `fct_f1__constructor_race_result` because it does not select `wins` directly

{% endstep %}

{% step %}
**Review the impact report and scope notice**

Check every flagged asset. Note the scoped scan warning at the bottom of the report. If the changed column is used in widely shared models, manually check anything beyond +2 before proceeding.
{% endstep %}

{% step %}
**DinoAI generates or updates the files**

With the scan complete, DinoAI writes the SQL and YAML changes. HIGH severity items pause generation and prompt for explicit confirmation before continuing.
{% endstep %}

{% step %}
**Ask DinoAI to run dbt™ tests and raise the PR**

Once the files are generated, ask DinoAI to run the tests and create the PR in one go:

```
Run dbt run and dbt test on all models affected by this change, then raise a PR.
Use the Data Contract Impact Report from this session as the content for the
## Data Contract Check section of the PR description.
```

The impact report belongs in the PR description under `## Data Contract Check`, and the PR needs at least one reviewer before merging.

DinoAI will:

1. Run `dbt run` and `dbt test` on all affected models and surface any failures before the PR is opened
2. Create the PR with the correct `[type]:` title based on the change made
3. Populate the PR description with the impact report — including the scan scope, affected assets table, action items, and the scoped scan notice — under `## Data Contract Check`
4. Request a reviewer if one is configured in your repo settings
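If you prefer to run the commands yourself, the affected models from the impact report can be turned into `dbt run` and `dbt test` invocations. This is a minimal sketch: `build_dbt_commands` is a hypothetical helper, and the exact commands DinoAI issues are not documented here. The `--select` flag with space-separated model names is standard dbt node selection.

```python
import shlex


def build_dbt_commands(models):
    """Build `dbt run` and `dbt test` command lines scoped to the given models."""
    selection = sorted(set(models))  # de-duplicate and keep output stable
    return [
        shlex.join(["dbt", command, "--select", *selection])
        for command in ("run", "test")
    ]


commands = build_dbt_commands([
    "stg_f1__constructor_standings",
    "int_f1__race_results_standings",
    "stg_f1__constructor_standings",
])
for command in commands:
    print(command)
```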

{% hint style="warning" %}
For HIGH severity changes, DinoAI will add a `⚠️ Breaking Contract Changes` section to the PR description automatically and pause for your confirmation before opening the PR.
{% endhint %}
{% endstep %}
{% endstepper %}


