Volume anomalies

The elementary.volume_anomalies test monitors the row count of your table per time bucket. If no timestamp_column is configured, the test counts the total rows in the table instead.

How it works

  1. Data is split into time buckets (daily by default, configurable with the time_bucket field).

  2. Row count is computed per bucket for the last training_period days (14 days by default).

  3. The test compares the row count of each bucket within the detection period (the last 2 days by default, configurable with the detection_period field) to the row counts of the earlier time buckets.

  4. The test only runs on completed time buckets. For example, with daily buckets, a test run in the middle of today would treat yesterday as the most recent complete bucket and leave out today's partial bucket.

  5. If any anomalies are detected during the detection period, the test will fail.
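
As a rough illustration, the defaults described in the steps above could be written out explicitly as follows. The model name and timestamp column are hypothetical, and the period/count layout for training_period and detection_period is assumed to mirror time_bucket, so confirm it against the documentation for your Elementary version.

```yaml
models:
  - name: orders  # hypothetical model name
    tests:
      - elementary.volume_anomalies:
          timestamp_column: created_at  # hypothetical column
          time_bucket:        # step 1: daily buckets (the default)
            period: day
            count: 1
          training_period:    # step 2: assumed period/count shape, 14 days (the default)
            period: day
            count: 14
          detection_period:   # step 3: assumed period/count shape, 2 days (the default)
            period: day
            count: 2
```

Since these values only restate the defaults, omitting the three period blocks should give the same behavior.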

Configuration

models:
  - name: < model name >
    tests:
      - elementary.volume_anomalies:
          timestamp_column: < timestamp column >
          where_expression: < sql expression >
          time_bucket: # Daily by default
            period: < time period >
            count: < number of periods >
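
A filled-in version of the template above might look like the sketch below. The model name, column name, and filter are hypothetical placeholders, and hour is assumed to be an accepted value for period.

```yaml
models:
  - name: page_views  # hypothetical model name
    tests:
      - elementary.volume_anomalies:
          timestamp_column: viewed_at               # hypothetical column
          where_expression: "is_internal = false"   # hypothetical SQL filter
          time_bucket:
            period: hour   # assumed to be a supported period
            count: 6       # one bucket per 6 hours
```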

Test configuration

No configuration is mandatory, but configuring a timestamp_column is highly recommended.
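
Because nothing is mandatory, the bare form of the test should be valid. A minimal sketch, with a hypothetical model name:

```yaml
models:
  - name: orders  # hypothetical model name
    tests:
      # No arguments: with no timestamp_column, the test counts total table rows
      - elementary.volume_anomalies
```

Adding timestamp_column to this lets the test count rows per time bucket rather than across the whole table.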
