Volume anomalies
The elementary.volume_anomalies test monitors the row count of your table over time, per time bucket. If configured without a timestamp_column, it will count total table rows.
How it works
Data is split into time buckets (daily by default, configurable with the time_bucket field).
Row count is computed per bucket for the last training_period days (14 days by default).
The test compares the row count of each bucket within the detection period (last 2 days by default, configured as detection_period) to the row count of previous time buckets.
The test only runs on completed time buckets. For example, with daily buckets, a test run in the middle of today would only count yesterday as a complete bucket.
If any anomalies are detected during the detection period, the test will fail.
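As an illustration, the defaults described above can be written out explicitly. This is a minimal sketch: the model name and timestamp column are hypothetical placeholders, and the values simply restate the documented defaults (daily buckets, a 14-day training period, a 2-day detection period).

```yaml
models:
  - name: my_model                    # hypothetical model name
    tests:
      - elementary.volume_anomalies:
          timestamp_column: updated_at   # hypothetical column name
          time_bucket:                   # one bucket per day (the default)
            period: day
            count: 1
          training_period:               # 14 days by default
            period: day
            count: 14
          detection_period:              # last 2 days by default
            period: day
            count: 2
```

Because these are the defaults, omitting time_bucket, training_period, and detection_period should give the same behavior; spelling them out mainly makes the test's time windows visible to readers of the project.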
Configuration
models:
  - name: < model name >
    tests:
      - elementary.volume_anomalies:
          timestamp_column: < timestamp column >
          where_expression: < sql expression >
          time_bucket: # Daily by default
            period: < time period >
            count: < number of periods >
Test configuration
No configuration is mandatory; however, it is highly recommended to configure a timestamp_column.
tests:
  - elementary.volume_anomalies:
      timestamp_column: column name
      where_expression: sql expression
      anomaly_sensitivity: int
      anomaly_direction: [both | spike | drop]
      detection_period:
        period: [hour | day | week | month]
        count: int
      training_period:
        period: [hour | day | week | month]
        count: int
      time_bucket:
        period: [hour | day | week | month]
        count: int
      seasonality: day_of_week
      fail_on_zero: [true | false]
      ignore_small_changes:
        spike_failure_percent_threshold: int
        drop_failure_percent_threshold: int
      detection_delay:
        period: [hour | day | week | month]
        count: int
      anomaly_exclude_metrics: [SQL expression]
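For a concrete picture, here is a sketch of a filled-in configuration using a subset of the fields listed above. The model name, column name, filter expression, and all numeric values are hypothetical. The comments describe the intent suggested by each parameter's name, which is an assumption and should be checked against the parameter reference.

```yaml
models:
  - name: orders                          # hypothetical model name
    tests:
      - elementary.volume_anomalies:
          timestamp_column: created_at    # hypothetical column name
          where_expression: "status != 'test'"   # hypothetical filter
          anomaly_direction: drop         # assumed: only flag drops in row count
          fail_on_zero: true              # assumed: fail when a bucket has no rows
          time_bucket:                    # 6-hour buckets instead of daily
            period: hour
            count: 6
          detection_period:               # look for anomalies in the last day
            period: day
            count: 1
          ignore_small_changes:
            drop_failure_percent_threshold: 10   # assumed: ignore drops under 10%
```

Fields that are left out, such as training_period here, fall back to their defaults.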