# dbt™️ Modifiers

## `generate-missing-sources`

{% hint style="info" %}
**What it does**

If a source referenced in your SQL is missing, this hook tries to create it.

**When to use it**

You are too lazy to define schemas manually :D.
{% endhint %}

**Arguments**

`--manifest`: Location of the `manifest.json` file. Usually `target/manifest.json`. This file contains a full representation of the dbt project. **Default: `target/manifest.json`**.\
`--schema-file`: Location of the `schema.yml` file where new source tables should be created.

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: generate-missing-sources
        args: ["--schema-file", "models/schema.yml", "--"]
```

⚠️ Do not forget to include `--` as the last argument. Otherwise `pre-commit` cannot separate the list of files from the args.

**Requirements**

|                   Model exists in `manifest.json` *1*                    | Model exists in `catalog.json` *2* |
| :----------------------------------------------------------------------: | :--------------------------------: |
| ❌ Not needed since this hook tries to generate even non-existent sources |            ❌ Not needed            |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `SQL` files.
* SQL is parsed to find all sources.
* If the source exists in the manifest, nothing is done.
* If not, a new source is created in specified `schema-file` and the hook fails.
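
The steps above can be sketched in Python. This is a minimal illustration, not the hook's actual implementation: the regular expression and the function names `find_sources` and `missing_sources` are assumptions, and the manifest is reduced to a `sources` mapping whose entries carry `source_name` and `name`.

```python
import re

# Assumed pattern: matches {{ source('schema', 'table') }} with either quote style.
SOURCE_RE = re.compile(
    r"""\{\{\s*source\(\s*['"]([^'"]+)['"]\s*,\s*['"]([^'"]+)['"]\s*\)\s*\}\}"""
)


def find_sources(sql):
    """Return the set of (source_name, table_name) pairs used in a SQL script."""
    return set(SOURCE_RE.findall(sql))


def missing_sources(sql, manifest):
    """Return the sources referenced in `sql` that the manifest does not contain."""
    known = {
        (node["source_name"], node["name"])
        for node in manifest.get("sources", {}).values()
    }
    return find_sources(sql) - known


# Hypothetical, heavily trimmed manifest and script.
manifest = {
    "sources": {
        "source.demo.raw.orders": {"source_name": "raw", "name": "orders"},
    }
}
sql = (
    "select * from {{ source('raw', 'orders') }} "
    "join {{ source('raw', 'customers') }} using (id)"
)
missing = missing_sources(sql, manifest)
print(missing)  # {('raw', 'customers')}
```

A non-empty result corresponds to the case where the hook writes the new source tables into the `schema-file` and fails.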

**Known limitations**

The source "envelope" has to exist in the specified `schema-file`, for example:

```yaml
version: 2
sources:
- name: <source_name>
```

Otherwise, it is not possible to automatically generate a new source table.

Unfortunately, this hook breaks the formatting of the file it writes to.

***

## `unify-column-description`

{% hint style="info" %}
**What it does**

Unify column descriptions across all models.

**When to use it**

You want the descriptions of the same columns to be the same. E.g. in two of your models, you have `customer_id` with the description `This is customer_id`, but there is one model where column `customer_id` has the description `Something else`.

This hook finds discrepancies between column descriptions and replaces them. As a result, all `customer_id` columns will have the description `This is customer_id`.
{% endhint %}

**Arguments**

`--ignore`: Columns whose descriptions should not be checked for differences.

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: unify-column-description
        args: ["--ignore", "foo", "--"]  # "foo" is a placeholder column name
```

⚠️ Do not forget to include `--` as the last argument. Otherwise `pre-commit` cannot separate the list of files from the args.

**Requirements**

|          Model exists in `manifest.json` *1*          | Model exists in `catalog.json` *2* |
| :---------------------------------------------------: | :--------------------------------: |
| ❌ Not needed since this hook is using only yaml files |            ❌ Not needed            |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `YAML` files.
* From those files columns are parsed and compared.
* If one column name has more than one (not empty) description, the description with the most occurrences is taken and the hook fails.
* If it is not possible to decide which description is dominant, no changes are made.
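
The selection rule above can be sketched with `collections.Counter`. This is an illustrative sketch only: the helper names `dominant_description` and `unify`, and the representation of columns as `(name, description)` pairs, are assumptions rather than the hook's real code.

```python
from collections import Counter


def dominant_description(descriptions):
    """Return the most frequent non-empty description, or None on a tie."""
    counts = Counter(d for d in descriptions if d)
    if not counts:
        return None
    ranked = counts.most_common(2)
    if len(ranked) == 2 and ranked[0][1] == ranked[1][1]:
        return None  # no dominant description, so nothing is changed
    return ranked[0][0]


def unify(columns):
    """Map each column with conflicting descriptions to its dominant one."""
    by_name = {}
    for name, description in columns:
        by_name.setdefault(name, []).append(description)

    replacements = {}
    for name, descriptions in by_name.items():
        distinct = {d for d in descriptions if d}
        winner = dominant_description(descriptions)
        if len(distinct) > 1 and winner is not None:
            replacements[name] = winner
    return replacements


# Hypothetical columns collected from several YAML files.
columns = [
    ("customer_id", "This is customer_id"),
    ("customer_id", "This is customer_id"),
    ("customer_id", "Something else"),
    ("order_id", "Order key"),
    ("order_id", "Primary key of orders"),
]
print(unify(columns))  # {'customer_id': 'This is customer_id'}
```

In this example `order_id` has two descriptions with one occurrence each, so no dominant description exists and it is left untouched.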

**Known limitations**

If it is not possible to decide which description is dominant, no changes are made.

***

## `replace-script-table-names`

{% hint style="info" %}
**What it does**

Replace table names with `source` or `ref` macros in the script.

**When to use it**

You are running and debugging your `SQL` in an editor that does not know the `source` or `ref` macros. So every time you copy the script from the editor into your `dbt` project, you need to rewrite all table names to `source` or `ref`. That's boring and error-prone. This hook replaces all table names with macros for you.
{% endhint %}

**Arguments**

`--manifest`: Location of the `manifest.json` file. Usually `target/manifest.json`. This file contains a full representation of the dbt project. **Default: `target/manifest.json`**.

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: replace-script-table-names
```

⚠️ If you pass any arguments (e.g. `--manifest`), do not forget to include `--` as the last argument. Otherwise `pre-commit` cannot separate the list of files from the args.

**Requirements**

| Model exists in `manifest.json` *1* | Model exists in `catalog.json` *2* |
| :---------------------------------: | :--------------------------------: |
|                ✅ Yes                |            ❌ Not needed            |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `SQL` files.
* SQL is parsed and table names are found.
* First, it tries to find the table name among models and replaces it with `ref`.
* Then it tries to find the table among sources and replaces it with `source`.
* If nothing is found, it creates an unknown `source` as `source('<schema_name>', '<table_name>')`.
* If the script contains only `ref` and `source` macros, the hook succeeds.
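
The lookup order above can be sketched as follows. This is a simplified sketch under stated assumptions: `to_macro` is a hypothetical helper, models are given as a set of names, and sources as a set of `(schema, table)` pairs; the real hook reads these from the manifest.

```python
def to_macro(schema, table, models, sources):
    """Return (macro, known) for a `schema.table` name.

    Models are checked first, then sources. A table found in neither is
    still rewritten to a `source` macro, but flagged as unknown.
    """
    if table in models:
        return f"{{{{ ref('{table}') }}}}", True
    known = (schema, table) in sources
    return f"{{{{ source('{schema}', '{table}') }}}}", known


# Hypothetical project contents.
models = {"orders"}
sources = {("raw", "events")}

print(to_macro("analytics", "orders", models, sources))
print(to_macro("raw", "events", models, sources))
print(to_macro("raw", "clicks", models, sources))
```

The third call shows the fallback: `raw.clicks` matches neither a model nor a source, so it becomes `source('raw', 'clicks')` and is reported as unknown.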

***

## `generate-model-properties-file`

{% hint style="info" %}
**What it does**

Generate a model properties file if one does not exist.

**When to use it**

You do not want to write a properties file by hand for every new model. This hook generates one from the manifest and the catalog for each changed model that does not have a properties file yet.
{% endhint %}

**Arguments**

`--manifest`: Location of the `manifest.json` file. Usually `target/manifest.json`. This file contains a full representation of the dbt project. **Default: `target/manifest.json`**.\
`--catalog`: Location of the `catalog.json` file. Usually `target/catalog.json`. dbt uses this file to render information like column types and table statistics into the docs site. In dbt-checkpoint it is used for column operations. **Default: `target/catalog.json`**.\
`--properties-file`: Location of the file where new model properties should be generated. The suffix has to be `yml` or `yaml`. It can also include the `{database}`, `{schema}`, `{name}` and `{alias}` variables. E.g. `/models/{schema}/{name}.yml` for model `foo.bar` will create the properties file in `/models/foo/bar.yml`. If the path already exists, properties are appended.
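
The variable substitution in `--properties-file` behaves like ordinary string templating. A minimal sketch, assuming a hypothetical helper `properties_path` (the hook's internal function may differ):

```python
def properties_path(template, database, schema, name, alias):
    """Expand the {database}, {schema}, {name} and {alias} variables."""
    if not template.endswith((".yml", ".yaml")):
        raise ValueError("properties file suffix has to be yml or yaml")
    return template.format(
        database=database, schema=schema, name=name, alias=alias
    )


# Model `foo.bar`: schema "foo", name "bar" (database and alias are made up).
path = properties_path("/models/{schema}/{name}.yml", "analytics", "foo", "bar", "bar")
print(path)  # /models/foo/bar.yml
```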

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: generate-model-properties-file
        args: ["--properties-file", "models/{schema}/{name}.yml", "--"]
```

⚠️ Do not forget to include `--` as the last argument. Otherwise `pre-commit` cannot separate the list of files from the args.

**Requirements**

| Model exists in `manifest.json` *1* | Model exists in `catalog.json` *2* |
| :---------------------------------: | :--------------------------------: |
|                ✅ Yes                |                ✅ Yes               |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `SQL` files.
* The model name is obtained from the `SQL` file name.
* The manifest is scanned for a model.
* The catalog is scanned for a model.
* If the model does not have `patch_path` in the manifest, the new schema is written to the specified path. The hook fails.
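
The flow above can be sketched as follows. This is a rough sketch, not the hook's code: `model_properties` is a hypothetical helper, the manifest and catalog are trimmed to the few keys used here, and the shape of the generated properties is an assumption.

```python
from pathlib import Path


def model_properties(sql_path, manifest, catalog):
    """Return new properties for the model, or None if nothing should be written."""
    name = Path(sql_path).stem  # the model name comes from the SQL file name
    for unique_id, node in manifest.get("nodes", {}).items():
        if node.get("resource_type") != "model" or node.get("name") != name:
            continue
        if node.get("patch_path"):
            return None  # the model already has a properties file
        columns = catalog.get("nodes", {}).get(unique_id, {}).get("columns", {})
        return {
            "version": 2,
            "models": [
                {"name": name, "columns": [{"name": col} for col in columns]}
            ],
        }
    return None  # model not found in the manifest


# Hypothetical, heavily trimmed manifest and catalog.
manifest = {
    "nodes": {
        "model.demo.bar": {"resource_type": "model", "name": "bar", "patch_path": None},
        "model.demo.baz": {
            "resource_type": "model",
            "name": "baz",
            "patch_path": "demo://models/schema.yml",
        },
    }
}
catalog = {"nodes": {"model.demo.bar": {"columns": {"id": {}, "amount": {}}}}}

props = model_properties("models/foo/bar.sql", manifest, catalog)
print(props)
```

A non-`None` result corresponds to the case where the hook writes the properties to the `--properties-file` path and fails.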

**Known limitations**

Unfortunately, this hook breaks your formatting in the written file.

***

## `remove-script-semicolon`

{% hint style="info" %}
**What it does**

Remove the semicolon at the end of the script.

**When to use it**

You are too lazy or forgetful to delete one character at the end of the script.
{% endhint %}

**Example**

```yaml
repos:
  - repo: https://github.com/dbt-checkpoint/dbt-checkpoint
    rev: v1.0.0
    hooks:
      - id: remove-script-semicolon
```

**Requirements**

| Model exists in `manifest.json` *1* | Model exists in `catalog.json` *2* |
| :---------------------------------: | :--------------------------------: |
|             ❌ Not needed            |            ❌ Not needed            |

1 It means that you need to run `dbt parse` before running this hook (dbt >= 1.5).\
2 It means that you need to run `dbt docs generate` before running this hook.

**How it works**

* Hook takes all changed `SQL` files.
* If the file contains a semicolon at the end of the file, it is removed and the hook fails.
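
The check can be sketched in a few lines. This is a minimal sketch assuming a hypothetical helper `remove_trailing_semicolon`; how the real hook treats trailing whitespace is not specified here, so this version simply drops it along with the semicolon.

```python
def remove_trailing_semicolon(sql):
    """Return (new_sql, changed); `changed` is True when a semicolon was removed."""
    stripped = sql.rstrip()
    if stripped.endswith(";"):
        return stripped[:-1], True
    return sql, False


print(remove_trailing_semicolon("select 1;\n"))  # ('select 1', True)
print(remove_trailing_semicolon("select 1\n"))   # ('select 1\n', False)
```

A `changed` value of `True` corresponds to the case where the hook rewrites the file and fails.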
