# dbt™️ Modifiers

## `generate-missing-sources`

{% hint style="info" %}
**What it does**

If any source is missing, this hook tries to create it.

**When to use it**

You are too lazy to define schemas manually :D.
{% endhint %}

**Arguments**

`--manifest`: Location of the `manifest.json` file, usually `target/manifest.json`. This file contains a full representation of the dbt project. **Default: `target/manifest.json`**.\
`--schema-file`: Location of the `schema.yml` file where new source tables should be created.

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: generate-missing-sources
        args: ["--schema-file", "models/schema.yml", "--"]
```

⚠️ Do not forget to include `--` as the last argument. Otherwise `pre-commit` cannot separate the list of files from the args.

**Requirements**

|                    Model exists in `manifest.json`*1*                   | Model exists in `catalog.json` *2* |
| :---------------------------------------------------------------------: | :--------------------------------: |
| ❌ Not needed since this hook tries to generate even non-existent source |            ❌ Not needed            |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `SQL` files.
* SQL is parsed to find all sources.
* If the source exists in the manifest, nothing is done.
* If not, a new source is created in the specified `schema-file` and the hook fails.
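
The steps above can be sketched in a few lines of Python. This is only an illustration, not the hook's actual implementation: the regular expression, the helper names `find_sources` and `missing_sources`, and the sample manifest are assumptions. The sketch relies only on the `sources` section of `manifest.json`, whose entries carry `source_name` and `name`.

```python
import re

# Matches {{ source('src', 'tbl') }} with single or double quotes (assumed pattern).
SOURCE_CALL = re.compile(
    r"""\{\{\s*source\(\s*['"]([^'"]+)['"]\s*,\s*['"]([^'"]+)['"]\s*\)\s*\}\}"""
)


def find_sources(sql: str) -> set:
    """Return every (source_name, table_name) pair referenced in the SQL."""
    return set(SOURCE_CALL.findall(sql))


def missing_sources(sql: str, manifest: dict) -> set:
    """Return referenced sources that the manifest does not define."""
    known = {
        (node["source_name"], node["name"])
        for node in manifest.get("sources", {}).values()
    }
    return find_sources(sql) - known


# Hypothetical manifest fragment and model SQL, for illustration only.
manifest = {
    "sources": {
        "source.shop.raw.orders": {"source_name": "raw", "name": "orders"},
    }
}
sql = """
select * from {{ source('raw', 'orders') }} o
join {{ source("raw", "customers") }} c on c.id = o.customer_id
"""

missing = missing_sources(sql, manifest)
# A non-empty result is where the hook would write the new source and fail.
exit_code = 1 if missing else 0
```

Here `raw.customers` is the only pair reported as missing, because `raw.orders` is already present in the manifest.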

**Known limitations**

The source "envelope" has to exist in the specified `schema-file`, something like this:

```yaml
version: 2
sources:
- name: <source_name>
```

Otherwise, it is not possible to automatically generate a new source table.

Unfortunately, this hook breaks your formatting.

***

## `unify-column-description`

{% hint style="info" %}
**What it does**

Unify column descriptions across all models.

**When to use it**

You want the descriptions of the same columns to be the same. E.g. in two of your models, you have `customer_id` with the description `This is customer_id`, but there is one model where column `customer_id` has the description `Something else`.

This hook finds discrepancies between column descriptions and replaces them, so that all `customer_id` columns end up with the description `This is customer_id`.
{% endhint %}

**Arguments**

`--ignore`: Columns whose descriptions should not be checked for differences.

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: unify-column-description
        args: ["--ignore", "<column_name>", "--"]
```

⚠️ Do not forget to include `--` as the last argument. Otherwise `pre-commit` cannot separate the list of files from the args.

**Requirements**

|          Model exists in `manifest.json` *1*          | Model exists in `catalog.json` *2* |
| :---------------------------------------------------: | :--------------------------------: |
| ❌ Not needed since this hook is using only yaml files |            ❌ Not needed            |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `YAML` files.
* From those files columns are parsed and compared.
* If one column name has more than one (not empty) description, the description with the most occurrences is taken and the hook fails.
* If it is not possible to decide which description is dominant, no changes are made.
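
The "dominant description" rule can be illustrated with a short Python sketch. It is an assumption about how such a vote might be computed, not the hook's own code, and `dominant_description` is a hypothetical helper name.

```python
from collections import Counter
from typing import Optional


def dominant_description(descriptions: list) -> Optional[str]:
    """Return the most frequent non-empty description, or None on a tie."""
    counts = Counter(d for d in descriptions if d)
    if not counts:
        return None  # the column has no description anywhere
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # no clear winner, so the files are left untouched
    return ranked[0][0]


# Descriptions of `customer_id` collected from three models.
found = ["This is customer_id", "This is customer_id", "Something else"]
winner = dominant_description(found)
```

With two votes against one, `winner` is `This is customer_id`. If each description occurred once, the function would return `None` and nothing would be rewritten.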

**Known limitations**

If it is not possible to decide which description is dominant, no changes are made.

***

## `replace-script-table-names`

{% hint style="info" %}
**What it does**

Replace table names with `source` or `ref` macros in the script.

**When to use it**

You run and debug your `SQL` in an editor that does not know the `source` or `ref` macros. So every time you copy the script from the editor into your `dbt` project, you need to rewrite all table names to `source` or `ref`. That's boring and error-prone. This hook replaces all table names with macros for you.
{% endhint %}

**Arguments**

`--manifest`: Location of the `manifest.json` file, usually `target/manifest.json`. This file contains a full representation of the dbt project. **Default: `target/manifest.json`**.

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: replace-script-table-names
```

⚠️ If you pass arguments, do not forget to include `--` as the last one. Otherwise `pre-commit` cannot separate the list of files from the args.

**Requirements**

| Model exists in `manifest.json` *1* | Model exists in `catalog.json` *2* |
| :---------------------------------: | :--------------------------------: |
|                ✅ Yes                |            ❌ Not needed            |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `SQL` files.
* SQL is parsed and table names are found.
* First, it tries to find the table name in models and replaces it with `ref`.
* Then it tries to find the table in sources and replaces it with `source`.
* If nothing is found, it creates an unknown `source` as `source('<schema_name>', '<table_name>')`.
* If the script contains only `ref` and `source` macros, the hook succeeds.
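
The lookup order above (models first, then sources, then an unknown `source`) can be sketched as follows. This is a simplified illustration under stated assumptions: table names are detected with a naive `from`/`join` regular expression rather than a real SQL parser, and the `models` and `sources` lookup dictionaries stand in for what the hook reads from `manifest.json`. All names are hypothetical.

```python
import re

# Naive detection of `schema.table` after FROM or JOIN (assumption, not a SQL parser).
TABLE = re.compile(r"\b(from|join)\s+(\w+)\.(\w+)", re.IGNORECASE)


def replace_table_names(sql: str, models: dict, sources: dict) -> str:
    """Swap raw table names for ref()/source() macros."""

    def to_macro(match):
        keyword, schema, table = match.groups()
        if (schema, table) in models:
            # 1. Known model: use ref.
            macro = "{{ ref('%s') }}" % models[(schema, table)]
        elif (schema, table) in sources:
            # 2. Known source: use source.
            source_name, table_name = sources[(schema, table)]
            macro = "{{ source('%s', '%s') }}" % (source_name, table_name)
        else:
            # 3. Unknown table: fall back to source('<schema_name>', '<table_name>').
            macro = "{{ source('%s', '%s') }}" % (schema, table)
        return f"{keyword} {macro}"

    return TABLE.sub(to_macro, sql)


# Hypothetical lookups keyed by (schema, table).
models = {("analytics", "orders"): "orders"}
sources = {("raw", "customers"): ("raw_data", "customers")}

sql = "select * from analytics.orders o join raw.customers c on 1=1 join tmp.x on 1=1"
rewritten = replace_table_names(sql, models, sources)
```

Running the result through the same function again changes nothing, which mirrors the last step: a script that contains only macros passes.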

***

## `generate-model-properties-file`

{% hint style="info" %}
**What it does**

Generate a model properties file if it does not exist.

**When to use it**

You have models without a properties file and do not want to write one manually.
{% endhint %}

**Arguments**

`--manifest`: Location of the `manifest.json` file, usually `target/manifest.json`. This file contains a full representation of the dbt project. **Default: `target/manifest.json`**.\
`--catalog`: Location of the `catalog.json` file, usually `target/catalog.json`. dbt uses this file to render information like column types and table statistics into the docs site. dbt-checkpoint uses it for column operations. **Default: `target/catalog.json`**.\
`--properties-file`: Location of the file where new model properties should be generated. The suffix has to be `yml` or `yaml`. The path can also include the `{database}`, `{schema}`, `{name}` and `{alias}` variables. E.g. `/models/{schema}/{name}.yml` for model `foo.bar` creates the properties file `/models/foo/bar.yml`. If the path already exists, the properties are appended.

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: generate-model-properties-file
        args: ["--properties-file", "models/{schema}/{name}.yml", "--"]
```

⚠️ Do not forget to include `--` as the last argument. Otherwise `pre-commit` cannot separate the list of files from the args.

**Requirements**

| Model exists in `manifest.json` *1* | Model exists in `catalog.json` *2* |
| :---------------------------------: | :--------------------------------: |
|                ✅ Yes                |                ✅ Yes               |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `SQL` files.
* The model name is obtained from the `SQL` file name.
* The manifest is scanned for a model.
* The catalog is scanned for a model.
* If the model does not have `patch_path` in the manifest, the new schema is written to the specified path. The hook fails.
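
As a rough illustration of the last step and of the `--properties-file` template, the sketch below checks `patch_path` and fills in the path variables. It assumes the model entry from the manifest exposes `database`, `schema`, `name`, `alias` and `patch_path`. The helper names are hypothetical and this is not the hook's actual code.

```python
def needs_properties_file(model: dict) -> bool:
    """A model without `patch_path` has no properties file yet."""
    return not model.get("patch_path")


def properties_path(template: str, model: dict) -> str:
    """Fill {database}, {schema}, {name} and {alias} into the path template."""
    return template.format(
        database=model["database"],
        schema=model["schema"],
        name=model["name"],
        alias=model["alias"],
    )


# Hypothetical manifest entry for model `foo.bar`.
model = {
    "database": "analytics",
    "schema": "foo",
    "name": "bar",
    "alias": "bar",
    "patch_path": None,
}

if needs_properties_file(model):
    target = properties_path("models/{schema}/{name}.yml", model)
```

For this entry `target` is `models/foo/bar.yml`, matching the example given for `--properties-file`.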

**Known limitations**

Unfortunately, this hook breaks your formatting in the written file.

***

## `remove-script-semicolon`

{% hint style="info" %}
**What it does**

Remove the semicolon at the end of the script.

**When to use it**

You are too lazy or forgetful to delete one character at the end of the script.
{% endhint %}

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: remove-script-semicolon
```

**Requirements**

| Model exists in `manifest.json` *1* | Model exists in `catalog.json` *2* |
| :---------------------------------: | :--------------------------------: |
|             ❌ Not needed            |            ❌ Not needed            |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `SQL` files.
* If the file contains a semicolon at the end of the file, it is removed and the hook fails.
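
A minimal sketch of this check, assuming trailing whitespace after the semicolon is ignored and dropped together with it (the real hook may treat whitespace differently). `remove_trailing_semicolon` is a hypothetical helper name.

```python
def remove_trailing_semicolon(sql: str):
    """Return (new_sql, changed); `changed` is True when a semicolon was removed."""
    stripped = sql.rstrip()
    if stripped.endswith(";"):
        # Drop the semicolon; the hook would then rewrite the file and fail.
        return stripped[:-1], True
    return sql, False


fixed, changed = remove_trailing_semicolon("select 1 as id;\n")
```

A file that does not end with a semicolon is returned unchanged with `changed` set to `False`, which corresponds to the hook passing.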

