Freshness anomalies

The elementary.freshness_anomalies test monitors the freshness of your table over time, measured as the expected time between data updates.

How it works

  1. Data is split into time buckets (daily by default, configurable with the time_bucket field).

  2. The maximum freshness value is computed per bucket for the last training_period (14 days by default).

  3. The test compares the freshness of each bucket within the detection period (last 2 days by default, controlled by the detection_period var) to the freshness of previous time buckets.

  4. If any anomalies are detected during the detection period, the test will fail.

models:
  - name: < model name >
    tests:
      - elementary.freshness_anomalies:
          timestamp_column: < timestamp column > # Mandatory
          where_expression: < sql expression >
          time_bucket: # Daily by default
            period: < time period >
            count: < number of periods >
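
As an illustration, the template above might be filled in as follows. This is a minimal sketch: the model name orders, the column loaded_at, and the filter expression are hypothetical placeholders, and the hourly bucket is only one possible choice.

```yaml
version: 2

models:
  - name: orders                       # hypothetical model name
    tests:
      - elementary.freshness_anomalies:
          timestamp_column: loaded_at  # hypothetical column; this field is mandatory
          where_expression: "loaded_at is not null"  # hypothetical filter
          time_bucket:                 # overrides the daily default
            period: hour
            count: 1
```

With this configuration, freshness would be evaluated per one-hour bucket instead of per day; the training and detection periods keep their defaults.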

Test configuration

tests:
  - elementary.freshness_anomalies:
      timestamp_column: column name
      where_expression: sql expression
      anomaly_sensitivity: int
      detection_period:
        period: [hour | day | week | month]
        count: int
      training_period:
        period: [hour | day | week | month]
        count: int
      time_bucket:
        period: [hour | day | week | month]
        count: int
      detection_delay:
        period: [hour | day | week | month]
        count: int
      ignore_small_changes:
        spike_failure_percent_threshold: int
        drop_failure_percent_threshold: int
      anomaly_exclude_metrics: [SQL expression]

Notes:

  • Required Configuration: timestamp_column

  • Default configuration: anomaly_direction is set to spike, so the test alerts only on delays (longer-than-expected gaps between updates).
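
To make the defaults described on this page concrete, the sketch below spells them out explicitly. The model name events and the column updated_at are hypothetical, and the anomaly_sensitivity value is an illustrative assumption, not a documented default. The remaining optional fields are omitted.

```yaml
version: 2

models:
  - name: events                       # hypothetical model name
    tests:
      - elementary.freshness_anomalies:
          timestamp_column: updated_at # hypothetical column; required
          anomaly_direction: spike     # default noted above: alert only on delays
          anomaly_sensitivity: 3       # illustrative value (assumption)
          time_bucket:                 # daily buckets (default per this page)
            period: day
            count: 1
          training_period:             # 14 days (default per this page)
            period: day
            count: 14
          detection_period:            # last 2 days (default per this page)
            period: day
            count: 2
```

Because every value shown for time_bucket, training_period, and detection_period matches the defaults stated above, this configuration should behave like one that sets only timestamp_column; writing them out is mainly useful as a starting point for tuning.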
