How to Detect Data Drifts Over Time
One of Mona's core capabilities is detecting drifts in data over time.
This is done with the AverageDrift verse.
Here is an example of configuring Mona to detect data drifts:
{
  "stanzas": {
    "general": {
      "metrics": [
        "output_score",
        "confidence_interval"
      ],
      "segment_by": [
        "customer_id",
        "social_media_platform"
      ],
      "verses": [
        {
          "type": "AverageDrift",
          "min_anomaly_level": 0.25,
          "min_segment_size": 5000,
          "target_set_period": "3w",
          "benchmark_set_period": "8w"
        }
      ]
    }
  }
}
This verse configures Mona to look for drifts in the average of each of the configured metrics (output_score and confidence_interval) within each customer_id and each social_media_platform. It compares the last 3 weeks (the target set) to the 8 weeks preceding them (the benchmark set). A drift is reported only if it has an anomaly level of at least 0.25 (i.e., the difference between the target and benchmark averages, normalized by the overall standard deviation) and the segment has more than 5000 instances.
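The anomaly level and the two thresholds described above can be illustrated with a minimal Python sketch. This is not Mona's implementation: the function names are hypothetical, and it assumes that the "overall" standard deviation is the population standard deviation of the target and benchmark values combined, and that the size threshold applies to the target set. Mona's exact definitions may differ.

```python
from statistics import mean, pstdev


def anomaly_level(target, benchmark):
    """Difference between the target and benchmark averages, normalized by
    the standard deviation of all values.

    Assumption: "overall" means target and benchmark values combined.
    """
    overall_std = pstdev(list(target) + list(benchmark))
    if overall_std == 0:
        # All values are identical, so there is no drift to measure.
        return 0.0
    return (mean(target) - mean(benchmark)) / overall_std


def is_drift(target, benchmark, min_anomaly_level=0.25, min_segment_size=5000):
    """Return True if the segment is large enough and its average drifted
    by at least min_anomaly_level (in either direction)."""
    # Assumption: the size threshold is checked against the target set.
    if len(target) < min_segment_size:
        return False
    return abs(anomaly_level(target, benchmark)) >= min_anomaly_level
```

For example, `is_drift(target_scores, benchmark_scores)` would be called once per metric and per segment, where `target_scores` holds the metric values from the last 3 weeks and `benchmark_scores` the values from the 8 weeks before that (both names are placeholders).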
When Mona finds anomalies that meet these criteria, it generates a new insight on the insights page and triggers an alert. (If email alerting is set up, an email is sent to the user.)
As an example of a new insight based on this verse, consider one generated for the metric "output_score" for a specific customer_id. It shows a drift from an average value of 0.08 to 0.11, which corresponds to an anomaly level of 0.33.
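As a quick sanity check on these numbers, and assuming the anomaly level is the difference between the averages divided by the overall standard deviation as described earlier, the reported values imply an overall standard deviation of roughly 0.09. The variable names below are illustrative, and the reported anomaly level is presumably rounded, so the result is approximate.

```python
benchmark_avg = 0.08   # average over the benchmark set
target_avg = 0.11      # average over the target set
anomaly = 0.33         # anomaly level reported in the insight

# anomaly = (target_avg - benchmark_avg) / overall_std, solved for overall_std
implied_std = (target_avg - benchmark_avg) / anomaly
print(round(implied_std, 3))  # prints 0.091
```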