Event freshness anomalies


Monitors the freshness of event data over time, measured as the time each event takes to load: the gap between when the event actually occurs (the event timestamp) and when it is loaded to the database (the update timestamp).

This test complements the freshness_anomalies test and is primarily intended for data that is updated in a continuous or streaming fashion.

The test works in one of two modes:

  • If only an event_timestamp_column is supplied, the test measures over time the difference between the current timestamp (“now”) and the most recent event timestamp.

  • If both an event_timestamp_column and an update_timestamp_column are provided, the test will measure over time the difference between these two columns.

  models:
    - name: < model name >
      tests:
        - elementary.event_freshness_anomalies:
            event_timestamp_column: < timestamp column > # Mandatory
            update_timestamp_column: < timestamp column > # Optional
            where_expression: < sql expression >
            time_bucket: # Daily by default
              period: < time period >
              count: < number of periods >
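
As a rough illustration of the two modes described above, the sketch below configures the test once with only an event timestamp and once with both timestamps. The model names (login_events, purchase_events) and column names (occurred_at, loaded_at) are hypothetical placeholders, not names from Elementary or from any real project.

```yaml
models:
  # Mode 1: only the event timestamp is supplied, so the test tracks the
  # gap between "now" and the most recent event timestamp.
  - name: login_events                # hypothetical model name
    tests:
      - elementary.event_freshness_anomalies:
          event_timestamp_column: occurred_at   # hypothetical column

  # Mode 2: both timestamps are supplied, so the test tracks the
  # difference between the two columns.
  - name: purchase_events             # hypothetical model name
    tests:
      - elementary.event_freshness_anomalies:
          event_timestamp_column: occurred_at   # hypothetical column
          update_timestamp_column: loaded_at    # hypothetical column
          time_bucket:                # hourly buckets instead of the daily default
            period: hour
            count: 1
```

The second form is likely the better fit when the table records its own load time, since the measurement then does not depend on when the test happens to run.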

Test configuration

Required configuration: event_timestamp_column

Default configuration: anomaly_direction: spike, so the test alerts only on delays.

  - elementary.event_freshness_anomalies:
      event_timestamp_column: column name
      update_timestamp_column: column name
      where_expression: sql expression
      anomaly_sensitivity: int
      detection_period:
        period: [hour | day | week | month]
        count: int
      training_period:
        period: [hour | day | week | month]
        count: int
      time_bucket:
        period: [hour | day | week | month]
        count: int
      seasonality: day_of_week
      detection_delay:
        period: [hour | day | week | month]
        count: int
      ignore_small_changes:
        spike_failure_percent_threshold: int
        drop_failure_percent_threshold: int
      anomaly_exclude_metrics: [SQL expression]
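
For a sense of how the options might combine in practice, here is a minimal sketch with concrete values filled in. The model name, column names, filter expression, and every numeric value are illustrative assumptions, not recommended settings; only the parameter names come from the configuration described on this page.

```yaml
models:
  - name: purchase_events             # hypothetical model name
    tests:
      - elementary.event_freshness_anomalies:
          event_timestamp_column: occurred_at        # hypothetical column
          update_timestamp_column: loaded_at         # hypothetical column
          where_expression: "event_type = 'purchase'"  # hypothetical filter
          anomaly_direction: spike    # stated default: alert only on delays
          anomaly_sensitivity: 3      # illustrative value
          time_bucket:
            period: day
            count: 1
          seasonality: day_of_week
```

Setting anomaly_direction explicitly is redundant with the default noted above; it is shown only to make the intent visible in the config.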
