How to Consider Different Baselines for Different Monitoring Use-Cases

Some data segments can skew the distributions of your metrics (e.g., data coming from pilot customers or from a beta environment). In such cases, Mona allows you to include specific data segments in the baseline or exclude them from it, so that insights are generated only from calculations on relevant data.

Exclude Segments

For example, let's say we have a language detection model (see more [here](doc:how-to-create-relevant-metrics-for-classification-models#multi-class-classification)) which determines the language a text message is written in. Let's suppose that we are only trying to optimize our model's performance on texts longer than 50 characters.

However, the data exported to Mona includes all text messages, regardless of their number of characters.

If we use all the data as a baseline for our verses, the text messages with under 50 characters (which have less accurate predictions and lower scores, and which we generally aren't trying to optimize for) will skew the data, which in turn will affect the insights generated by Mona.

To exclude these messages when monitoring the relevant language detection metrics, use the "exclude_segments" param, which is supported by all of Mona's verses and is defined in the following way:

{
  "YOUR-USER-ID": {
    "TEXT": {
      "fields": {...},
      "stanzas": {
        "name_of_stanza": {
          "exclude_segments": [
            {
              "num_of_characters": [
                {
                  "min_value": 0,
                  "max_value": 50
                }
              ]
            }
          ],
          "metrics": [
            "model_selected_language_score"
          ],
          "segment_by": [
            "country_of_text",
            "customer-id"
          ],
          "verses": [
            {
              "type": "AverageDrift",
              "min_anomaly_level": 0.25,
              "target_set_period": "14d",
              "benchmark_set_period": "46d"
            }
          ]
        }
      }
    }
  }
}
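
To make the effect of this exclusion concrete, here is a minimal plain-Python sketch of what removing a numeric segment from a baseline amounts to. It is an illustration only, not Mona's implementation: the helper names (`in_range`, `matches_segment`, `exclude_segments`) and the sample records are hypothetical, and the sketch assumes that "min_value" is inclusive and "max_value" is exclusive, which this page does not specify.

```python
# Illustration only: approximates what excluding a segment from a baseline
# means. Helper names and sample records are hypothetical, not Mona's API.

def in_range(value, min_value, max_value):
    # Assumption: min_value is inclusive, max_value is exclusive.
    return min_value <= value < max_value


def matches_segment(record, segment):
    # Assumption: a record belongs to a segment when, for every field in the
    # segment, the record's value falls into at least one listed range.
    return all(
        any(in_range(record[field], r["min_value"], r["max_value"]) for r in ranges)
        for field, ranges in segment.items()
    )


def exclude_segments(records, excluded):
    # Keep only the records that match none of the excluded segments.
    return [
        record
        for record in records
        if not any(matches_segment(record, segment) for segment in excluded)
    ]


# Made-up sample data using the field names from the config above.
records = [
    {"num_of_characters": 12, "model_selected_language_score": 0.41},
    {"num_of_characters": 80, "model_selected_language_score": 0.93},
    {"num_of_characters": 140, "model_selected_language_score": 0.89},
]
excluded = [{"num_of_characters": [{"min_value": 0, "max_value": 50}]}]

baseline = exclude_segments(records, excluded)
scores = [r["model_selected_language_score"] for r in baseline]
average = sum(scores) / len(scores)
print(len(baseline), round(average, 2))
```

In this made-up sample, the short message with the low score is dropped, so the baseline average is computed only from the two longer messages.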

Baseline Segment

Another way of choosing the specific data we want to check is to determine what data the baseline will consist of to begin with.
For example, let's say we only want to check which language is detected in text messages that were sent in France. We can use the "baseline_segment" param to that end in the following way:

{
  "YOUR-USER-ID": {
    "TEXT": {
      "fields": {...},
      "stanzas": {
        "name_of_stanza": {
          "baseline_segment": {
            "country_of_text": [
              {
                "value": "France"
              }
            ]
          },
          "metrics": [
            "model_selected_language_score"
          ],
          "segment_by": [
            "model_selected_language"
          ],
          "verses": [
            {
              "type": "AverageDrift",
              "min_anomaly_level": 0.25,
              "target_set_period": "14d",
              "benchmark_set_period": "46d"
            }
          ]
        }
      }
    }
  }
}

In this use-case, only insights with France as their baseline will be generated.
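
As a rough illustration of the idea, the following plain-Python sketch restricts a set of records to a baseline segment and then groups the remaining records by the "segment_by" field. It is not Mona's implementation: `matches_value_segment` and the sample records are hypothetical, and the sketch assumes a record belongs to the segment when each listed field equals one of the listed values.

```python
from collections import defaultdict

# Illustration only: approximates restricting a baseline to one segment.
# The helper name and sample records are hypothetical, not Mona's API.


def matches_value_segment(record, segment):
    # Assumption: every field in the segment must equal one of its listed values.
    return all(
        record.get(field) in {item["value"] for item in allowed}
        for field, allowed in segment.items()
    )


# Made-up sample data using the field names from the config above.
records = [
    {"country_of_text": "France", "model_selected_language": "French",
     "model_selected_language_score": 0.95},
    {"country_of_text": "France", "model_selected_language": "English",
     "model_selected_language_score": 0.60},
    {"country_of_text": "Germany", "model_selected_language": "German",
     "model_selected_language_score": 0.91},
]
baseline_segment = {"country_of_text": [{"value": "France"}]}

# Only records from France form the baseline.
baseline = [r for r in records if matches_value_segment(r, baseline_segment)]

# Within that baseline, group scores by the "segment_by" field.
scores_by_language = defaultdict(list)
for record in baseline:
    scores_by_language[record["model_selected_language"]].append(
        record["model_selected_language_score"]
    )

print(dict(scores_by_language))
```

The record from Germany never enters the comparison, so any per-language segment is measured only against other messages sent in France.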

Always Segment Baseline By

In some cases, we might want to consider different segments completely separately due to real-world considerations.

For example, let's suppose that we are training a different model per customer, and that therefore mixing data from different customers for measuring language detection performance metrics makes no sense.

To that end, we can use a param called "always_segment_baseline_by" in the following manner:

{
  "YOUR-USER-ID": {
    "TEXT": {
      "fields": {...},
      "stanzas": {
        "name_of_stanza": {
          "always_segment_baseline_by": [
            "customer-id"
          ],
          "metrics": [
            "model_selected_language_score"
          ],
          "segment_by": [
            "model_selected_language"
          ],
          "verses": [
            {
              "type": "AverageDrift",
              "min_anomaly_level": 0.25,
              "target_set_period": "14d",
              "benchmark_set_period": "46d"
            }
          ]
        }
      }
    }
  }
}

Now each generated insight will only consider data from a single customer, with that customer's specific "customer-id" value as its baseline.
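
Conceptually, this is similar to splitting the data into one independent baseline per customer before any comparison is made. The plain-Python sketch below shows that idea only. It is not Mona's implementation, and the helper `split_baselines` and the sample records are hypothetical.

```python
from collections import defaultdict

# Illustration only: approximates keeping a separate baseline per value of
# the listed fields. The helper and sample records are hypothetical.


def split_baselines(records, fields):
    # Group records by the tuple of values they hold for the given fields.
    baselines = defaultdict(list)
    for record in records:
        key = tuple(record[field] for field in fields)
        baselines[key].append(record)
    return dict(baselines)


# Made-up sample data using the field names from the config above.
records = [
    {"customer-id": "acme", "model_selected_language": "French",
     "model_selected_language_score": 0.9},
    {"customer-id": "acme", "model_selected_language": "English",
     "model_selected_language_score": 0.7},
    {"customer-id": "globex", "model_selected_language": "French",
     "model_selected_language_score": 0.5},
]

baselines = split_baselines(records, ["customer-id"])

# Each customer's average score is computed from that customer's data alone.
averages = {
    key: sum(r["model_selected_language_score"] for r in group) / len(group)
    for key, group in baselines.items()
}
print(averages)
```

Because the groups never mix, a low score for one customer cannot shift the baseline that another customer's segments are compared against.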