# Anomaly tests parameters

## Introduction

Elementary data anomaly detection tests monitor specific metrics and compare recent values to historical data to detect significant changes that may indicate data reliability issues. This page outlines the parameters available for configuring these tests.

{% hint style="info" %}
If your dataset doesn't have a timestamp column representing the creation time of each row, it's highly recommended to add one (e.g. `date_added_dttm`). This allows Elementary to create time buckets and filter the table effectively.
{% endhint %}

## Parameters overview

Below is a list of all parameters available for each type of test provided by Elementary.

### Common Parameters for All Anomaly Detection Tests

<table><thead><tr><th width="253">Parameters</th><th>Parameters Config</th></tr></thead><tbody><tr><td><a href="#timestamp_column"><strong>timestamp_column</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">timestamp_column: column name
</code></pre></td></tr><tr><td><a href="#where_expression"><strong>where_expression</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">where_expression: sql expression
</code></pre></td></tr><tr><td><a href="#anomaly_sensitivity"><strong>anomaly_sensitivity</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">anomaly_sensitivity: [int]
</code></pre></td></tr><tr><td><a href="#anomaly_direction"> <strong>anomaly_direction</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">anomaly_direction: [both | spike | drop]
</code></pre></td></tr><tr><td><a href="#ignore_small_changes"><strong>ignore_small_changes</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">ignore_small_changes: 
  spike_failure_percent_threshold: int 
  drop_failure_percent_threshold: int
</code></pre></td></tr><tr><td><a href="#anomaly_exclude_metrics"><strong>anomaly_exclude_metrics</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">anomaly_exclude_metrics: [SQL expression]
</code></pre></td></tr></tbody></table>

### Anomaly Detection Tests With Timestamp Column

<table><thead><tr><th width="258">Parameters</th><th>Parameters Config</th></tr></thead><tbody><tr><td><a href="#training_period"><strong>training_period</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">training_period: int
  period: [hour | day | week | month]
  count: int
</code></pre></td></tr><tr><td><a href="#detection_period"><strong>detection_period</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">detection_period: int
  period: [hour | day | week | month]
  count: int
</code></pre></td></tr><tr><td><a href="#time_bucket"><strong>time_bucket</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">time_bucket:
  period: [hour | day | week | month]
  count: int
</code></pre></td></tr><tr><td><a href="#seasonality"><strong>seasonality</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">seasonality: day_of_week
</code></pre></td></tr><tr><td><a href="#detection_delay"><strong>detection_delay</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">detection_delay:
  period: [hour | day | week | month]
  count: int
</code></pre></td></tr></tbody></table>

### Volume Anomaly Tests

<table><thead><tr><th width="268">Parameters</th><th>Parameters Config</th></tr></thead><tbody><tr><td><a href="#fail_on_zero"><strong>fail_on_zero</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">fail_on_zero: [true | false]
</code></pre></td></tr></tbody></table>

### All columns anomalies test

<table><thead><tr><th width="276">Parameters</th><th>Parameters Config</th></tr></thead><tbody><tr><td><a href="#column_anomalies"><strong>column_anomalies</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">column_anomalies: column monitors list
</code></pre></td></tr><tr><td><a href="#exclude_prefix"><strong>exclude_prefix</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">exclude_prefix: string
</code></pre></td></tr><tr><td><a href="#exclude_regexp"><strong>exclude_regexp</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">exclude_regexp: regex
</code></pre></td></tr></tbody></table>

### Dimension anomalies test

<table><thead><tr><th width="286">Parameters</th><th>Parameters Config</th></tr></thead><tbody><tr><td><a href="#dimensions"><strong>dimensions</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">dimensions: sql expression
</code></pre></td></tr></tbody></table>

### Event freshness anomalies

<table><thead><tr><th width="288">Parameters</th><th>Parameters Config</th></tr></thead><tbody><tr><td><a href="#event_timestamp_column"><strong>event_timestamp_column</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">event_timestamp_column: column name
</code></pre></td></tr><tr><td><a href="#update_timestamp_column"><strong>update_timestamp_column</strong></a></td><td><pre class="language-yaml"><code class="lang-yaml">update_timestamp_column: column name
</code></pre></td></tr></tbody></table>

### **Example configurations**

{% tabs %}
{% tab title="properties.yml" %}

```yml
version: 2

models:
  - name: <model_name>
    config:
      elementary:
        timestamp_column: < model timestamp column >
    tests: <here you will add elementary monitors as tests>

  - name: <your model with no timestamp>
    ## if no timestamp is configured, elementary will monitor without time filtering
    tests: <here you will add elementary monitors as tests>
```

{% endtab %}

{% tab title="properties.yml example" %}

```yml
version: 2

models:
  - name: login_events
    config:
      elementary:
        timestamp_column: updated_at
    tests:
      - elementary.freshness_anomalies:
          tags: ["elementary"]
      - elementary.all_columns_anomalies:
          tags: ["elementary"]

  - name: users
    ## if no timestamp is configured, elementary will monitor without time filtering
    tests:
      - elementary.volume_anomalies:
          tags: ["elementary"]
```

{% endtab %}

{% tab title="sources\_properties.yml" %}

```yml
sources:
  - name: < some name >
    database: < database >
    schema: < schema >
    tables:
      - name: < table_name >
        ## sources don't have config, so elementary config is placed under 'meta'
        meta:
          elementary:
            timestamp_column: < source timestamp column >
        tests: <here you will add elementary monitors as tests>
```

{% endtab %}

{% tab title="sources\_properties.yml example" %}

```yml
sources:
  - name: "my_non_dbt_table"
    database: "raw_events"
    schema: "product"
    tables:
      - name: "raw_product_login_events"
        ## sources don't have config, so elementary config is placed under 'meta'
        meta:
          elementary:
            timestamp_column: "loaded_at"
        tests:
          - elementary.volume_anomalies
          - elementary.all_columns_anomalies:
              column_anomalies:
                - null_count
                - missing_count
                - zero_count
        columns:
          - name: user_id
            tests:
              - elementary.column_anomalies
```

{% endtab %}
{% endtabs %}

***

## Parameters details

Below are the configuration details for each parameter.

### timestamp\_column

`timestamp_column: [column name]`

{% hint style="info" %}
If your data set has a timestamp column that represents the creation time of each row, it is highly recommended to configure it as the `timestamp_column`.
{% endhint %}

Elementary anomaly detection tests will use this column to create time buckets and filter the table. The best column for this is an `updated_at`/`created_at`/`loaded_at` timestamp for each row (a date type also works).

* When you specify a `timestamp_column`, each test run splits the data into buckets according to the timestamp in this column, calculates the metric for each bucket, and checks for anomalies between these buckets. This also means that if the table has enough historical data, the test can start working right away.
* When you do not specify a `timestamp_column`, each test run calculates the metric for all of the data in the table and compares it to the metric from previous runs. This also means it will take `training_period` days for the test to start working, as it needs that time to collect the necessary metrics.

If undefined, default is null (no time buckets).

{% hint style="info" %}
*Default: `none`*

*Relevant tests: All anomaly detection tests*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          timestamp_column: created_at
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        timestamp_column: updated_at
```

{% endtab %}

{% tab title="Source" %}

```yaml
sources:
  - name: my_non_dbt_tables
    schema: raw
    tables:
      - name: source_table
        meta:
          elementary:
            timestamp_column: loaded_at
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yaml
vars:
  timestamp_column: loaded_at
```

{% endtab %}
{% endtabs %}

***

### where\_expression

`where_expression: [sql expression]`

Filter the tested data using a valid sql expression.

{% hint style="info" %}
*Default: `None`*

*Relevant tests: All anomaly detection tests*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          where_expression: "user_name != 'test'"
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        where_expression: "loaded_at is not null"
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  where_expression: "loaded_at > '2022-01-01'"
```

{% endtab %}
{% endtabs %}

***

### anomaly\_sensitivity

`anomaly_sensitivity: [int]`

Configuration to define how the expected range is calculated. A sensitivity of 3 means that the expected range is within 3 standard deviations of the training set average. A smaller sensitivity narrows this range, so more values are potentially flagged as anomalies; larger values widen the range and reduce the number of anomalies. For example, with a training-set average of 1,000 and a standard deviation of 50, `anomaly_sensitivity: 3` yields an expected range of roughly 850–1,150.

{% hint style="info" %}
*Default: `3`*

*Relevant tests: All anomaly detection tests*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          anomaly_sensitivity: 2.5

      - elementary.all_columns_anomalies:
          column_anomalies:
            - null_count
            - missing_count
            - zero_count
          anomaly_sensitivity: 4
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        anomaly_sensitivity: 3.5
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  anomaly_sensitivity: 3
```

{% endtab %}
{% endtabs %}

***

### anomaly\_direction

`anomaly_direction: both | spike | drop`

By default, data points are compared to the expected range, and the test checks whether they fall below or above it. For some data monitors, you might only want to flag anomalies above the range and not below it, or vice versa. For example, when monitoring for freshness, we only want to detect data delays and not data that is “early”. The `anomaly_direction` configuration sets the direction of the expected range, and can be set to `both`, `spike` or `drop`.
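
For the freshness use case above, a minimal sketch that flags only delays (model name hypothetical, and assuming the monitored freshness metric spikes when data arrives late):

```yml
models:
  - name: login_events
    tests:
      - elementary.freshness_anomalies:
          anomaly_direction: spike
```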

{% hint style="info" %}
*Default: `both`*

*Supported values: `both`, `spike`, `drop`*

*Relevant tests: All anomaly detection tests*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          anomaly_direction: drop

      - elementary.all_columns_anomalies:
          column_anomalies:
            - null_count
            - missing_count
            - zero_count
          anomaly_direction: spike
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        anomaly_direction: drop
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  anomaly_direction: both
```

{% endtab %}
{% endtabs %}

***

### training\_period

```yaml
training_period:
  period: < time period > # supported periods: day, week, month
  count: < number of periods >
```

The maximal timeframe for which the test will collect data. This timeframe includes both the training period and the detection period. If a detection delay is defined, the whole timeframe is shifted back accordingly.

{% hint style="info" %}
*Default: `14 days`*

*Relevant tests: Anomaly detection tests with `timestamp_column`*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          training_period:
            period: day
            count: 30
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        training_period:
          period: week
          count: 1
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  training_period:
    period: month
    count: 1
```

{% endtab %}
{% endtabs %}

<details>

<summary><span data-gb-custom-inline data-tag="emoji" data-code="1f4a1">💡</span><strong>How it works</strong></summary>

The `training_period` param only works for tests that have `timestamp_column` configuration.

It works differently according to the table materialization:

* **Regular tables and views** - The values for the full `training_period` are calculated on each run.
* **Incremental models and sources** - The values for the full `training_period` are calculated on the first test run and on full refresh. Subsequent test runs only calculate the values for the `detection_period`.

**Changes from default:**

* **Full time buckets** - Elementary will increase the `training_period` automatically to ensure full time buckets. For example, if the test's `time_bucket` is `period: week` and a 14-day `training_period` starts on a Tuesday, the test will collect 2 more days back to complete a full week (starting on Sunday). See the sketch after this list.
* **Seasonality training set** - If seasonality is configured, Elementary will increase the `training_period` automatically to ensure there are enough training set values to calculate an anomaly. For example, if the test's `seasonality` is `day_of_week`, `training_period` will be increased to ensure enough Sundays, Mondays, Tuesdays, etc. to calculate an anomaly for each.
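
A minimal sketch of the first case (model name hypothetical): with weekly buckets, the 14-day `training_period` below may be extended backwards so it covers only full weeks.

```yml
models:
  - name: weekly_metrics
    tests:
      - elementary.volume_anomalies:
          time_bucket:
            period: week
            count: 1
          training_period:
            period: day
            count: 14
```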

**The impact of changing `training_period`**

If you **increase `training_period`** your test training set will be larger. This means a larger sample size for calculating the expected range, which should make the test less sensitive to outliers. This means less chance of false positive anomalies, but also less sensitivity so anomalies have a higher threshold.

If you **decrease `training_period`** your test training set will be smaller. This means a smaller sample size for calculating the expected range, which might make the test more sensitive to outliers. This means more chance of false positive anomalies, but also more sensitivity as anomalies have a lower threshold.

</details>

***

### detection\_period

```yaml
detection_period:
  period: < time period > # supported periods: day, week, month
  count: < number of periods >
```

Configuration to define the detection period. If `detection_period` is set to 2 days, only data points from the last 2 days are included in the detection period and can be flagged as anomalous. If it is set to 7 days, the detection period will be 7 days long.

For incremental models, this is also the period for re-calculating metrics. If metrics for buckets in the detection period were already calculated, Elementary will overwrite them. This is done to monitor recent backfills of data, if there were any. Adjust this configuration according to your data delays.

{% hint style="info" %}
*Default: `2 days`*

*Relevant tests: Anomaly detection tests with `timestamp_column`*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          detection_period:
            period: day
            count: 30
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        detection_period:
          period: month
          count: 1
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  detection_period:
    period: week
    count: 2
```

{% endtab %}
{% endtabs %}

<details>

<summary><span data-gb-custom-inline data-tag="emoji" data-code="1f4a1">💡</span><strong>How it works</strong></summary>

The `detection_period` param only works for tests that have `timestamp_column` configuration.

It works differently according to the table materialization:

* **Regular tables and views** - `detection_period` defines the detection period.
* **Incremental models and sources** - `detection_period` defines the detection period, and the period for which metrics will be re-calculated.

</details>

***

### time\_bucket

```yaml
time_bucket:
  period: < time period > # supported periods: hour, day, week, month
  count: < number of periods >
```

This configuration controls the duration of the time buckets.

To calculate how data changes over time and detect issues, we split the data into consistent time buckets. For example, if we use a daily time bucket (period=`day`, count=`1`) and monitor for row count anomalies, we will count new rows per day.

Depending on the nature of your data, it may make sense to modify this parameter. For example, if you want to detect volume anomalies in an hourly resolution, you should set the time bucket to period=`hour` and count=`1`.
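
For instance, a minimal sketch of that hourly setup (model and column names hypothetical):

```yml
models:
  - name: hourly_events
    config:
      elementary:
        timestamp_column: loaded_at
    tests:
      - elementary.volume_anomalies:
          time_bucket:
            period: hour
            count: 1
```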

{% hint style="info" %}
*Default: daily buckets. `time_bucket: {period: day, count: 1}`*

*Relevant tests: Anomaly detection tests with `timestamp_column`*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          time_bucket:
            period: day
            count: 2
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        time_bucket:
          period: hour
          count: 4
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  time_bucket:
    period: hour
    count: 12
```

{% endtab %}
{% endtabs %}

<details>

<summary><span data-gb-custom-inline data-tag="emoji" data-code="1f4a1">💡</span><strong>How it works</strong></summary>

* The `training_period` and `detection_period` of the test might be extended to ensure full time buckets (for example, a full week, Sunday-Saturday).
* Weekly buckets start on the day configured as the start of the week in the data warehouse.

</details>

***

### seasonality

`seasonality: day_of_week | hour_of_day | hour_of_week`

Some data sets have patterns that repeat over a time period and are expected; this is the normal behavior of these data sets. When we try to detect outliers from the normal and expected range, ignoring these patterns might cause false positives or make us miss anomalies. The `seasonality` configuration is used to overcome this challenge and account for expected patterns.

**Supported seasonality configurations:**

* `day_of_week` - Uses the same day of week as a training set for each daily bucket (compares Sunday to previous Sundays, Monday to previous Mondays, etc.).
* `hour_of_day` - Uses the same hour as a training set for each hourly bucket (for example, compares 10:00-11:00AM to 10:00-11:00AM on previous days, instead of to any previous hour).
* `hour_of_week` - Uses the same hour and day of week as a training set for each hourly bucket (for example, compares 10:00-11:00AM on Sunday to 10:00-11:00AM on previous Sundays).

**Use case:**

Many data sets have lower volume over the weekend, and higher volume over the week days. This means that the expected range for different days of the week is different. The `day_of_week` seasonality uses the same day of week as a training set for each daily time bucket data point. The expected range for Monday will be based on a training set of previous Mondays, and so on.

{% hint style="info" %}
*Default: `none`*

*Supported values: `day_of_week`, `hour_of_day`, `hour_of_week`*

*Relevant tests: Anomaly detection tests with `timestamp_column` and 1 day `time_bucket`*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          seasonality: day_of_week
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        seasonality: day_of_week
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  seasonality: day_of_week
```

{% endtab %}
{% endtabs %}

<details>

<summary><span data-gb-custom-inline data-tag="emoji" data-code="1f4a1">💡</span><strong>How it works</strong></summary>

* The test will compare the value of a bucket to previous buckets with the same seasonality attribute, not to the adjacent previous data points.
* The `training_period` of the test will be extended by default to ensure a minimal training set. When `seasonality: day_of_week` is configured, `training_period` is multiplied by 7 by default, as in the sketch below.
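
A minimal sketch of a test this applies to (model name hypothetical):

```yml
models:
  - name: daily_signups
    tests:
      - elementary.volume_anomalies:
          seasonality: day_of_week
          # With the default 14-day training_period multiplied by 7,
          # the training set effectively spans ~98 days, so each day
          # of the week has enough historical samples.
```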

</details>

***

### column\_anomalies

`column_anomalies: [column monitors list]`

Select which monitors to activate as part of the test.

{% hint style="info" %}
*Default: default monitors*

*Relevant tests: `all_columns_anomalies`, `column_anomalies`*

*Configuration level: test*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.column_anomalies:
          column_anomalies:
            - null_count
            - missing_count
            - average
```

{% endtab %}
{% endtabs %}

**Default monitors by type:**

| Data quality metric  | Column Type |
| -------------------- | ----------- |
| `null_count`         | any         |
| `null_percent`       | any         |
| `min_length`         | string      |
| `max_length`         | string      |
| `average_length`     | string      |
| `missing_count`      | string      |
| `missing_percent`    | string      |
| `min`                | numeric     |
| `max`                | numeric     |
| `average`            | numeric     |
| `zero_count`         | numeric     |
| `zero_percent`       | numeric     |
| `standard_deviation` | numeric     |
| `variance`           | numeric     |

**Opt-in monitors by type:**

| Data quality metric | Column Type |
| ------------------- | ----------- |
| `sum`               | numeric     |
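
Opt-in monitors are not activated by default; to use one, list it explicitly under `column_anomalies`. A minimal sketch (model and column names hypothetical):

```yml
models:
  - name: transactions
    columns:
      - name: amount
        tests:
          - elementary.column_anomalies:
              column_anomalies:
                - sum
                - average
```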

***

### exclude\_prefix

`exclude_prefix: [string]`

A parameter for the `all_columns_anomalies` test only, which lets you exclude columns from the test based on a prefix match.

{% hint style="info" %}
*Default: `None`*

*Relevant tests: `all_columns_anomalies`*

*Configuration level: test*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.all_columns_anomalies:
          exclude_prefix: "id_"
```

{% endtab %}
{% endtabs %}

***

### exclude\_regexp

`exclude_regexp: [regex]`

A parameter for the `all_columns_anomalies` test only, which lets you exclude columns from the test based on a regular expression match.

{% hint style="info" %}
*Default: `None`*

*Relevant tests: `all_columns_anomalies`*

*Configuration level: test*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.all_columns_anomalies:
          exclude_regexp: ".*SDC$"
```

{% endtab %}
{% endtabs %}

***

### dimensions

`dimensions: [list of SQL expressions]`

Configuration for the `dimension_anomalies` test. The test counts rows grouped by a given column, columns, or any valid SQL select expression. Under `dimensions` you configure the group-by expression.

This test monitors the frequency of values in the configured dimension over time, and alerts on unexpected changes in the distribution. It is best to configure it on low-cardinality fields.

{% hint style="info" %}
*Default: `None`*

*Relevant tests: `dimension_anomalies`*

*Configuration level: test*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: model_name
    config:
      elementary:
        timestamp_column: updated_at
    tests:
      - elementary.dimension_anomalies:
          dimensions:
            - device_os
            - device_browser
```

{% endtab %}
{% endtabs %}
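
Since `dimensions` accepts any valid SQL select expression, a dimension can also be an expression rather than a plain column name. A hedged sketch (model and column names hypothetical):

```yml
models:
  - name: website_events
    tests:
      - elementary.dimension_anomalies:
          dimensions:
            - "case when country = 'US' then 'domestic' else 'international' end"
```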

***

### event\_timestamp\_column

`event_timestamp_column: [column name]`

Configuration for the test `event_freshness_anomalies`. This test complements the `freshness_anomalies` test and is primarily intended for data that is updated in a continuous / streaming fashion.

The test can work in a couple of modes:

* If only an `event_timestamp_column` is supplied, the test measures over time the difference between the current timestamp (“now”) and the most recent event timestamp (see the sketch after this list).
* If both an `event_timestamp_column` and an `update_timestamp_column` are provided, the test measures over time the difference between these two columns.
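
A minimal sketch of the first mode, where only the event timestamp is configured (model and column names hypothetical):

```yml
models:
  - name: clickstream_events
    tests:
      - elementary.event_freshness_anomalies:
          event_timestamp_column: "event_time"
```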

{% hint style="info" %}
*Default: `None`*

*Relevant tests: `event_freshness_anomalies`*

*Configuration level: test*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.event_freshness_anomalies:
          event_timestamp_column: "event_timestamp"
          update_timestamp_column: "created_at"
```

{% endtab %}
{% endtabs %}

***

### update\_timestamp\_column

`update_timestamp_column: [column name]`

Configuration for the test `event_freshness_anomalies`. This test complements the `freshness_anomalies` test and is primarily intended for data that is updated in a continuous / streaming fashion.

The test can work in a couple of modes:

* If only an `event_timestamp_column` is supplied, the test measures over time the difference between the current timestamp (“now”) and the most recent event timestamp.
* If both an `event_timestamp_column` and an `update_timestamp_column` are provided, the test measures over time the difference between these two columns.

{% hint style="info" %}
*Default: `None`*

*Relevant tests: `event_freshness_anomalies`*

*Configuration level: test*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.event_freshness_anomalies:
          event_timestamp_column: "event_timestamp"
          update_timestamp_column: "created_at"
```

{% endtab %}
{% endtabs %}

***

### ignore\_small\_changes

```yaml
ignore_small_changes:
  spike_failure_percent_threshold: [int]
  drop_failure_percent_threshold: [int]
```

If defined, an anomaly test will fail only if both of the following conditions hold:

* The z-score of the metric within the detection period is anomalous.
* One of the following holds:
  * The metric within the detection period is higher than the mean value in the training period by at least `spike_failure_percent_threshold` percent, if defined.
  * The metric within the detection period is lower than the mean value in the training period by at least `drop_failure_percent_threshold` percent, if defined.

These settings help in situations where your metrics are stable, so even small changes produce high z-scores and get flagged as anomalies. For example, with a training-period mean of 1,000 rows and `spike_failure_percent_threshold: 10`, a bucket of 1,050 rows would not fail even if its z-score is anomalous.

If undefined, the default is null for both spike and drop.

{% hint style="info" %}
*Default: `none`*

*Relevant tests: All anomaly detection tests*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          ignore_small_changes:
            spike_failure_percent_threshold: 2
            drop_failure_percent_threshold: 50
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        ignore_small_changes:
          spike_failure_percent_threshold: 2
```

{% endtab %}

{% tab title="Source" %}

```yml
sources:
  - name: my_non_dbt_tables
    schema: raw
    tables:
      - name: source_table
        meta:
          elementary:
            ignore_small_changes:
              drop_failure_percent_threshold: 50
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  ignore_small_changes:
    spike_failure_percent_threshold: 10
```

{% endtab %}
{% endtabs %}

***

### fail\_on\_zero

`fail_on_zero: true/false`

When set to `true`, the test will fail if a metric value of zero occurs within the detection period. If undefined, the default is `false`.

{% hint style="info" %}
*Default: `false`*

*Relevant tests: All anomaly detection tests*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          fail_on_zero: true
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        fail_on_zero: true
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  fail_on_zero: true
```

{% endtab %}
{% endtabs %}

***

### detection\_delay

```yaml
detection_delay:
  period: < time period > # supported periods: hour, day, week, month
  count: < number of periods >
```

The duration by which to shift the detection period back. This is useful when the latest data should be excluded from the test, for example due to scheduling issues where the test runs before the table has been populated. The detection delay is the period of time to ignore, right after the detection period. For example, with a 2-day `detection_period` and a 1-day `detection_delay`, the test inspects data from 3 days ago up to 1 day ago.

{% hint style="info" %}
*Default: `0`*

*Relevant tests: Anomaly detection tests with `timestamp_column`*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          detection_delay:
            period: day
            count: 1
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        detection_delay:
          period: day
          count: 1
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  detection_delay:
    period: day
    count: 1
```

{% endtab %}
{% endtabs %}

<details>

<summary><span data-gb-custom-inline data-tag="emoji" data-code="1f4a1">💡</span><strong>How it works</strong></summary>

The `detection_delay` param only works for tests that have `timestamp_column` configuration. It does not affect the other duration parameters, like `detection_period` or `training_period`.

</details>

***

### anomaly\_exclude\_metrics

`anomaly_exclude_metrics: [SQL where expression on fields metric_date / metric_time_bucket / metric_value]`

By default, data points are compared to all of the data points in the training set. Using this parameter, you can exclude metrics from the training set to improve the test's accuracy.

The filter can be configured using SQL `where` expression syntax on the following fields:

1. `metric_date` - The date of the relevant bucket (even if the bucket is not daily).
2. `metric_time_bucket` - The exact time bucket.
3. `metric_value` - The value of the metric.

{% hint style="info" %}
*Supported values: valid SQL where expression on the columns metric\_date / metric\_time\_bucket / metric\_value*

*Relevant tests: All anomaly detection tests*
{% endhint %}

#### Example configuration:

{% tabs %}
{% tab title="Test" %}

```yml
models:
  - name: this_is_a_model
    tests:
      - elementary.volume_anomalies:
          anomaly_exclude_metrics: metric_value < 10

      - elementary.all_columns_anomalies:
          column_anomalies:
            - null_count
            - missing_count
            - zero_count
          anomaly_exclude_metrics: metric_time_bucket >= '2023-10-01 06:00:00' and metric_time_bucket <= '2023-10-01 07:00:00'
```

{% endtab %}

{% tab title="Model" %}

```yml
models:
  - name: this_is_a_model
    config:
      elementary:
        anomaly_exclude_metrics: metric_date = '2023-10-01'
```

{% endtab %}

{% tab title="dbt\_project.yml" %}

```yml
vars:
  anomaly_exclude_metrics: metric_date = '2023-10-01'
```

{% endtab %}
{% endtabs %}

