Drift Detection

The first verse type we will use is "AverageDrift". This verse type configures Mona to find segments in which a metric's average differs significantly between a target data set and a benchmark data set.

In our case, we want to track the average of the following metrics:
"credit_score", "offered_amount", "approved_amount", "credit_label_delta" and "offered_approved_delta_normalized".
We want to track these metrics in different segments of our data, using several segmentations, such as: "occupation", "purpose", "stage", "model_version" and "city".

Create your first Stanza and add your first Verse

Let's define this verse via Mona's config GUI.
Each verse you define in Mona must be inside a stanza. Each stanza groups verses with similar parameters and can hold as many verses as needed. So let's first create our stanza.

On the configurations page, under the "Stanzas" tab, click on "Add stanza". Now you can name the stanza. We will call it "general".

Now let's create our verse. Under the "Verses" tab, click on the add button.
On the left, you will see a list of all possible verse types. Let's choose AverageDrift.

The verse window is divided into categories, each grouping a set of params that can be defined in the verse. The first category is "Basic", and it holds all the basic params needed to configure the verse.

📘

All verse params have a default value, which is used unless you override it with another value.

Define "Basic" verse params

For each of the following params, click on "override" and set the value shown:

metrics - "credit_score", "offered_amount", "approved_amount", "credit_label_delta", "offered_approved_delta_normalized".
segment_by - "occupation", "purpose", "stage", "model_version", "city".
min_anomaly_level - 0.2.
min_segment_size_fraction - 0.02.
trend_directions - "desc".

All verses added or edited in the GUI are reflected in the configuration JSON file, which can be downloaded on the configurations page.
Here is how this verse will be defined in our JSON config file:

{
  "stanzas": {
    "general": {
      "verses": [
        {
          "type": "AverageDrift",
          "trend_directions": [
            "desc"
          ],
          "metrics": [
            "credit_score",
            "offered_amount",
            "approved_amount",
            "credit_label_delta",
            "offered_approved_delta_normalized"
          ],
          "segment_by": [
            "occupation",
            "purpose",
            "stage",
            "model_version",
            "city"
          ],
          "target_set_period": "2w",
          "benchmark_set_period": "6w",
          "min_anomaly_level": 0.2,
          "min_segment_size_fraction": 0.02
        }
      ]
    }
  }
}
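Since the configuration can be downloaded as JSON, a short script can confirm that a verse ended up in the expected stanza. The following is a minimal sketch, not part of Mona's tooling: the helper name `find_verses` is hypothetical, and the config is embedded as a string (trimmed to a few fields) so the example runs on its own. In practice you would load the downloaded file instead.

```python
import json

# Same structure as the config above, trimmed to a few fields for brevity.
CONFIG_TEXT = """
{
  "stanzas": {
    "general": {
      "verses": [
        {
          "type": "AverageDrift",
          "trend_directions": ["desc"],
          "metrics": ["credit_score", "offered_amount"],
          "segment_by": ["occupation", "city"],
          "min_anomaly_level": 0.2,
          "min_segment_size_fraction": 0.02
        }
      ]
    }
  }
}
"""


def find_verses(config, verse_type):
    """Return (stanza_name, verse) pairs for every verse of the given type.

    Hypothetical helper for illustration; it only walks the JSON structure.
    """
    matches = []
    for stanza_name, stanza in config.get("stanzas", {}).items():
        for verse in stanza.get("verses", []):
            if verse.get("type") == verse_type:
                matches.append((stanza_name, verse))
    return matches


config = json.loads(CONFIG_TEXT)
for stanza_name, verse in find_verses(config, "AverageDrift"):
    print(stanza_name, verse["metrics"], verse["min_anomaly_level"])
```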

With this verse configuration, we are overriding some of the default values of AverageDrift (the verse type, defined under "type"). Because "trend_directions" is set to "desc", we are looking only for statistically significant decreases, not increases, in the average of the given "metrics". The search covers every value of each segmentation field in "segment_by", as well as any intersection of those values.
For time periods, we are using the default values: 2 weeks for the "target" data set and 6 weeks for the "benchmark" data set.
The "min_anomaly_level" param defines a drift as a change in averages between the benchmark and target sets of at least 0.2 standard deviations.
The "min_segment_size_fraction" param will filter out segments that are smaller than 2% of the data.
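To make the roles of these params concrete, here is a rough sketch of how such a check could be computed. It is an illustration only and not Mona's actual algorithm: the function names, the choice of the benchmark segment's standard deviation as the yardstick, measuring segment size against the target set, and the `max_fields` limit on intersections are all assumptions.

```python
from itertools import combinations
from statistics import mean, pstdev


def matches(row, segment):
    """True if the row has the segment's value for every segment field."""
    return all(row.get(field) == value for field, value in segment.items())


def average_drift(target, benchmark, metric, segment_by,
                  min_anomaly_level=0.2, min_segment_size_fraction=0.02,
                  max_fields=2):
    """Flag segments whose average of `metric` decreased ("desc" only)."""
    findings = []
    for n_fields in range(1, max_fields + 1):
        for fields in combinations(segment_by, n_fields):
            # Every value combination seen in the target set is a candidate segment.
            combos = sorted({tuple(row[f] for f in fields) for row in target})
            for combo in combos:
                segment = dict(zip(fields, combo))
                target_vals = [r[metric] for r in target if matches(r, segment)]
                bench_vals = [r[metric] for r in benchmark if matches(r, segment)]
                # Skip segments smaller than the given fraction of the target data.
                if len(target_vals) < min_segment_size_fraction * len(target):
                    continue
                if not bench_vals:
                    continue
                # Assumption: the benchmark segment's spread is the yardstick.
                spread = pstdev(bench_vals)
                if spread == 0:
                    continue
                # Positive level means the target average is lower (a decrease).
                level = (mean(bench_vals) - mean(target_vals)) / spread
                if level >= min_anomaly_level:
                    findings.append(
                        (segment, mean(bench_vals), mean(target_vals), level))
    return findings


# Toy data: city "A" drops in the target set, city "B" rises.
benchmark = [{"city": c, "credit_score": s}
             for c in ("A", "B") for s in (0.5, 0.6, 0.7, 0.6)]
target = ([{"city": "A", "credit_score": s} for s in (0.4, 0.5)]
          + [{"city": "B", "credit_score": s} for s in (0.7, 0.7)])

for segment, bench_avg, target_avg, level in average_drift(
        target, benchmark, "credit_score", ["city"]):
    print(segment, round(bench_avg, 2), round(target_avg, 2), round(level, 2))
```

Only city "A" is reported here: city "B" changed too, but its average went up, so a "desc"-only check ignores it.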

📘

Low thresholds

Note that we are setting thresholds such as min_anomaly_level and min_segment_size_fraction to low values in order to get insights. After getting insights, users can raise the thresholds to keep only the relevant and significant ones.

Save new verse and stanza

Once all params have been defined, click on "Add verse", and then "Add stanza".

Once this is defined and saved in the config, Mona will start searching for anomalies that match these parameters. When done, new insights will be generated on the insights page.
You can configure Mona to send notifications on new insights via Email, Slack, PagerDuty, and more. We will go over this in the next chapters.

Check new insights

AverageDrift insights that match these params will look like this:

This insight shows a drift in the average of "credit_score" when looking at the segment "purpose": "education" and "city": "CA_Pasadena". The drift is a decline from 0.58 to 0.46 and has an anomaly level of 0.41.
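As a quick check of these numbers, the arithmetic below assumes that the anomaly level is the change in averages expressed in standard deviations, as described for "min_anomaly_level". Under that assumption, the reported values imply a standard deviation of roughly 0.29 for "credit_score" in this segment. The variable names are illustrative, and which standard deviation Mona uses is not stated here.

```python
benchmark_avg = 0.58      # average before the decline
target_avg = 0.46         # average after the decline
anomaly_level = 0.41      # as reported on the insight
min_anomaly_level = 0.2   # threshold set in the verse

drop = benchmark_avg - target_avg
# Assumption: anomaly_level = drop / standard deviation.
implied_std = drop / anomaly_level

print(round(drop, 2), round(implied_std, 2))
print(anomaly_level >= min_anomaly_level)  # the insight passes the threshold
```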

Clicking on the insight card opens the single insight page, which shows additional information about this anomaly.

In addition to the data shown in the insight card, this page also shows the distribution of values for "credit_score".

More information on how to read an insight can be found here

