Dimension anomalies
The elementary.dimension_anomalies test counts rows grouped by given dimensions (columns/expressions). It monitors the frequency of values in the configured dimension over time and alerts on unexpected changes in the distribution. This test is best configured on low-cardinality fields.
How it works
If timestamp_column is configured, the distribution is collected per time_bucket. If not, the test counts the total rows per dimension.
The test alerts on unexpected changes in the distribution of dimension values over time.
models:
  - name: < model name >
    config:
      elementary:
        timestamp_column: < timestamp column >
    tests:
      - elementary.dimension_anomalies:
          dimensions: < columns or sql expressions of columns >
          # optional - configure a where expression to refine the dimension monitoring
          where_expression: < sql expression >
          time_bucket: # Daily by default
            period: < time period >
            count: < number of periods >

Test configuration
tests:
  - elementary.dimension_anomalies:
      dimensions: sql expression
      timestamp_column: column name
      where_expression: sql expression
      anomaly_sensitivity: int
      anomaly_direction: [both | spike | drop]
      detection_period:
        period: [hour | day | week | month]
        count: int
      training_period:
        period: [hour | day | week | month]
        count: int
      time_bucket:
        period: [hour | day | week | month]
        count: int
      seasonality: day_of_week
      detection_delay:
        period: [hour | day | week | month]
        count: int
      ignore_small_changes:
        spike_failure_percent_threshold: int
        drop_failure_percent_threshold: int
      anomaly_exclude_metrics: [SQL expression]

Important Notes
Required configuration: dimensions
The test is best suited for low-cardinality fields.
If timestamp_column is not configured, the test will monitor without time filtering.
Tags can be used to run elementary tests on a dedicated run.
Severity can be optionally changed in the config section.
The where_expression can be used to refine the scope of dimension monitoring.
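As an illustration of the notes above, a filled-in configuration might look like the following sketch. The model name login_events, the columns loaded_at and event_type, the list form of dimensions, the tag name, and all parameter values are assumptions chosen for the example. They are not defaults or recommendations.

```yaml
models:
  - name: login_events                  # hypothetical model name
    config:
      elementary:
        timestamp_column: loaded_at     # hypothetical timestamp column
    tests:
      - elementary.dimension_anomalies:
          # required: a low-cardinality column (hypothetical), written here as a list
          dimensions:
            - event_type
          # optional: narrow the rows that are monitored
          where_expression: "event_type is not null"
          # optional: collect the distribution per one-day bucket
          time_bucket:
            period: day
            count: 1
          anomaly_direction: both
          config:
            # standard dbt test configs: tag the test and lower its severity
            tags: ["elementary"]
            severity: warn
```

With a tag like this in place, the tagged tests could be run on their own using dbt's tag selector, for example dbt test --select tag:elementary.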