SegmentSizeDrift

Description

'Segment Size Drift' verses configure Mona to find segments whose size (the number of contexts in the segment, either absolute and normalized by the length of the time period it represents, or relative to the size of the baseline segment) differs significantly between a target data set and a benchmark data set.

To decide whether such a signal should be created for a given segment, two methods are supported: Time-series Drift and Overall Drift. The mode is selected with the time_resolution param, and the anomaly level is calculated differently in each mode (see min_anomaly_level below).

{
  "stanzas": {
    "stanza_name": {
      "verses": [
        {
          "type": "SegmentSizeDrift",
          "segment_baseline_by": [
            "company_id",
            "country"
          ],
          "segment_by": [
            "detected_language"
          ],
          "min_anomaly_level": 0.1,
          "trend_directions": [
            "desc"
          ],
          "time_resolution": "",
          "use_relative_sizes": true,
          "normalize_relative_size_drift": false
          "target_set_period": "7d",
          "benchmark_set_period": "28d"
        }
      ]
    }
  }
}

In this example, the SegmentSizeDrift verse is configured to search for a statistically significant drop (per "trend_directions") in the size of any specific "detected_language" segment, within each combination of "company_id" and "country" values serving as the baseline.
This verse looks for changes between a "target" data set from the last 7 days and a "benchmark" data set from the 28 days prior to that.
Note that this verse type requires no "metrics" param.

With "time_resolution", "use_relative_sizes" and "normalize_relative_size_drift" set as shown above, a "min_anomaly_level" of 0.1 causes the verse to produce insights only when a segment's relative size (its share of the baseline segment) in the target period differs from its relative size in the benchmark period by at least 0.1.
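For example, if contexts with a given "detected_language" made up 25% of a specific company's traffic in a given country during the benchmark period but only 12% during the target period, the anomaly level would be 0.25 - 0.12 = 0.13, which passes the 0.1 threshold and, being a drop ("desc"), would produce an insight (assuming the segment also passes the size thresholds).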

Basic Params

benchmark_set_period
Time period for the benchmark set. By default this is the period just before the target set period. Expected format is a positive integer followed by a unit, where the unit can be "d" (days) or "w" (weeks); e.g., "1d" means a 1-day period.
Type: TimePeriodOrEmpty | Default: 6w
{ "benchmark_set_period": "50d" }
cadence
The cadence for evaluation of this verse. Only the following cadences are valid: Minutes: 1m, 5m, 10m, 15m, 20m, 30m. Hours: 1h, 2h, 3h, 4h, 6h, 8h, 12h. Days: 1d, 2d, 3d, 4d, 5d, 6d. Weeks: 1w, 2w, 3w, 4w, 5w.
Type: Cadence | Default: 1d
{ "cadence": "6h" }
default_urgency
The urgency class for insights created using this verse. Currently supports two values: "normal" (default) and "high". If set to "normal", specific thresholds for "high" urgency can be set using the parameters prefixed with "high_urgency_". If set to "high", the "high_urgency_"-prefixed threshold parameters are not considered at all, since all insights of this verse are considered as having "high" urgency.
Type: Urgency | Default: normal
{ "default_urgency": "high" }
description
Verse description.
Type: str | Default: (none)
{ "description": "searches for asc drifts in output_score" }
metrics
Relevant metrics to search for anomalies on in this verse; relevant only for verse types that search for anomalies in metric behavior.
Type: MetricsList | Default: ()
{ "metrics": [ "top_score", "delta_top_to_second_score" ] }
min_anomaly_level
Sets the threshold for the minimal anomaly level for which an insight will be generated. The anomaly level in this verse is: in Time-series Drift mode, the difference between the average of the target time series and the average of the benchmark time series, normalized by the standard deviation of the joined target and benchmark time series; in Overall Drift mode, the difference between the target set size and the benchmark set size (normalized by the benchmark set size if normalize_relative_size_drift is true).
Type: PositiveFloat | Default: 0.3
{ "min_anomaly_level": 0.5 }
min_segment_size
Minimal segment size for the united benchmark+target segments.
Type: PositiveInt | Default: 100
{ "min_segment_size": 100 }
min_segment_size_fraction
Minimal segment size, as a fraction of the baseline segment, which a segment must have in order to be considered in the search.
Type: InclusiveFraction | Default: 0
{ "min_segment_size_fraction": 0.05 }
name
(Required) The name of the verse. Note that a verse's name must be different from the names of other verses in the same stanza.
Type: str | Default: None
{ "name": "confidence_outliers" }
segment_by
The dimensions to use to segment the data in order to search for anomalies. This list must be a sublist of all arc class' dimensions. Limiting the possible values of a specific segmentation field on which insights can be generated can be done using the "avoid_values" and the "include_only_values" keys in the segmentation JSON object, as seen in the example.
Type: SegmentationsList | Default: ()
{ "segment_by": [ "city", "bot_id", {"name": "provider-code", "avoid_values": ["zoom"]}, {"name": "selected-language", "avoid_values": ["eng", "spa"]}, {"name": "country", "include_only_values": ["jpn"]} ] }
target_set_period
Time period for the target set, ending on the day of the latest available data. Format detailed in common/util.py's get_time_period_for_string.
Type: TimePeriodOrEmpty | Default: 2w
{ "target_set_period": "1w" }
trend_directions
A list of allowed anomaly trend directions: either 'asc' for ascending (anomalies in which the found value is LARGER THAN the relevant benchmark) or 'desc' for descending (anomalies in which the found value is SMALLER THAN the relevant benchmark).
Type: TrendDirections | Default: ('asc', 'desc')
{ "trend_directions": [ "asc" ] }
use_relative_sizes
If true, use each segment's size relative to its baseline segment. If false, use the segment's absolute size normalized by the length of the time period.
Type: bool | Default: False
{ "use_relative_sizes": true }

Advanced Misc Params

avoid_same_field_for_segment_and_metric
If True, insights would not be created for segments based on the same field as the given metric.
Type: bool | Default: True
{ "avoid_same_field_for_segment_and_metric": false }
cookbook
Instructions on how to read an insight generated by this verse. Expected format is Markdown.
Type: Cookbook | Default: (none)
{ "cookbook": "Use **this param** to add instructions using [markdown](https://daringfireball.net/projects/markdown/syntax) syntax on how to read insights generated from this `verse`, and what should the insight recipient do with it." }
create_extra_adjacent_signals
If set to true (default), Mona will create new signals from existing signals with adjacent numeric segments. For example, if there are two signals defined on 1 <= x < 2 and 2 <= x < 3, Mona will automatically create a new signal on 1 <= x < 3. This allows the later clustering algorithm to create an insight with the most relevant segment for its main signal.
Type: bool | Default: True
{ "create_extra_adjacent_signals": false }
disabled
If set to True - this verse won't be used when searching for new insights.
Type: bool | Default: False
{ "disabled": true }
expire_after
Insights detected by this verse will continue to be considered active for at least this amount of time after the last time they were detected.
Type: TimePeriodOrEmpty | Default: 3d
{ "expire_after": "2d" }
normalize_relative_size_drift
If False, the anomaly level will be the difference (and not the ratio) between the relative sizes of the target and the benchmark. For example, if the relative size of the benchmark is 0.04 and the target is 0.06, then when normalize_relative_size_drift is true (the default) the anomaly level will be 0.5 (growth of 50%), while when it is false the anomaly level will be 0.02 (a 2-percentage-point difference in the segment's share of the baseline segment). Note: this can be set to False only when use_relative_sizes is set to true and time_resolution is empty.
Type: bool | Default: True
{ "use_relative_sizes": true, "time_resolution": "", "normalize_relative_size_drift": false }
relevant_data_time_buffer
Adds an end-time buffer to the insight generation. For example - If this param's value is "1d", then insights are generated for a day before the latest received data. This is useful for processes in which it takes a specific period of time to get all the healthy monitoring data in place.
Type: TimePeriodOrEmpty | Default: (empty)
{ "relevant_data_time_buffer": "1d" }
timestamp_field_name
The field that is used as the time dimension for insight generation.
Type: TimestampField | Default: timestamp
{ "timestamp_field_name": "run_end_time" }
timezone
The timezone used to aggregate daily data points. Accepts any IANA time zone ID (https://en.wikipedia.org/wiki/List_of_tz_database_time_zones).
Type: Timezone | Default: UTC
{ "timezone": "Asia/Hong_Kong" }

Advanced Score Calculation Params

score_anomaly_level_exponent
An exponent to put on the anomaly level in the score after multiplying it by the given multiplier.
Type: float | Default: 1
{ "score_anomaly_level_exponent": 0.5 }
score_anomaly_level_multiplier
Multiplier for an anomaly level to use before using the exponent.
Type: float | Default: 1
{ "score_anomaly_level_multiplier": 1.2 }
score_segment_size_exponent
An exponent to put on the segment's size (or relative size) in the combined score. If score_segment_size_log_base is not 0, the exponent is applied before the logarithm.
Type: float | Default: 0.5
{ "score_segment_size_exponent": 1.5 }
score_segment_size_log_base
Changes the log base used for the segment's size (or relative size) in the combined score; set to 0 to remove the log altogether. Unless it is 0, this value must be larger than 1.
Type: float | Default: 0
{ "score_segment_size_log_base": 5 }
score_use_segment_absolute_size
If true, use the segment's absolute size in the combined score, otherwise use the segment's size relative to its baseline (fraction).
Type: bool | Default: True
{ "score_use_segment_absolute_size": false }

Anomaly Thresholds Params

high_urgency_min_anomaly_level
Threshold for separating between "high" and "normal" urgency insights with regards to min_anomaly_level. See "min_anomaly_level" param for more details on the functionality of this param.
Type: PositiveFloatOrNone | Default: None
{ "high_urgency_min_anomaly_level": 1.5 }
high_urgency_min_score
Threshold for separating between "high" and "normal" urgency insights with regards to min_score. See "min_score" param for more details on the functionality of this param.
Type: FloatOrNone | Default: None
{ "high_urgency_min_score": 20 }
min_anomaly_level
Sets the threshold for the minimal anomaly level for which an insight will be generated. The anomaly level in this verse is: in Time-series Drift mode, the difference between the average of the target time series and the average of the benchmark time series, normalized by the standard deviation of the joined target and benchmark time series; in Overall Drift mode, the difference between the target set size and the benchmark set size (normalized by the benchmark set size if normalize_relative_size_drift is true).
Type: PositiveFloat | Default: 0.3
{ "min_anomaly_level": 0.5 }
min_score
The minimal score for a signal to be considered as an anomaly.
Type: float | Default: 0
{ "min_score": 4 }

Data Filtering Params

avoid_segmenting_on_missing
When true, insights will not be generated for segments which are (partially or fully) defined by a missing field.
Type: bool | Default: False
{ "avoid_segmenting_on_missing": true }
baseline_segment
The baseline segment of this verse. This segment defines "the world" as far as this verse is concerned. Only data from this segment will be considered when finding insights.
Type: Segment | Default: {}
{ "baseline_segment": { "model_version": [ { "value": "V1" } ] } }
benchmark_baseline_segment
Benchmark baseline segment. This segment is intersected with any data we search for in the benchmark segments.
Type: Segment | Default: {}
{ "benchmark_baseline_segment": { "model_version": [ { "value": "V2" } ] } }
enhance_exclude_segments
If True, exclude segments added at any level of configuration (the verse, the stanza, or the stanzas_global_defaults) are ADDED to the excluded segments of higher-level defaults, if any exist. For example, if stanzas_global_defaults has a single excluded segment of {dimensionA: MISSING} and the stanza (or verse) has a single excluded segment of {dimensionB: 0}, then when enhance_exclude_segments is True the excluded segments will include both {dimensionA: MISSING} and {dimensionB: 0} and either one will be filtered. Otherwise (if enhance_exclude_segments is False), the verse-level definition overrides the higher levels and only {dimensionB: 0} is excluded.
Type: bool | Default: False
{ "enhance_exclude_segments": true }
exclude_segments
Segments to exclude from the baseline of this verse. Data in these segments will not be included anywhere in the search - neither in the tested segments nor in any benchmarks used to find the anomalies. Note that whether this param overrides exclude_segments definitions at other levels is decided by enhance_exclude_segments.
Type: SegmentsList | Default: ()
{ "exclude_segments": [ { "text_length": [ { "min_value": 0, "max_value": 100 } ] } ] }
target_baseline_segment
Target baseline segment. This segment is intersected with any data we search for in the tested segments.
Type: Segment | Default: {}
{ "target_baseline_segment": { "model_version": [ { "value": "V1" } ] } }

Related Anomalies Params

avoid_related_anomalies_for
A list of fields to avoid checking for correlated anomalies to the main anomaly in a generated insight. See "find_related_anomalies_for" for further details.
Type: MetricsList | Default: ()
{ "avoid_related_anomalies_for": ["delta_top_to_second_score"] }
find_related_anomalies_for
A list of fields to check for correlated anomalies to the main anomaly in a generated insight. These correlated anomalies might help with understanding the possible cause of an insight. Leave empty to search in all fields.
Type: MetricsList | Default: ()
{ "find_related_anomalies_for": ["sentiment_score", "confidence_interval"] }
related_anomalies_min_correlation
Minimal Pearson correlation between the metric on which an anomaly was found and another metric with an anomaly on the same segment, below which Mona will not use the other metric as a related anomaly.
Type: NonNegativeFloat | Default: 0.3
{ "related_anomalies_min_correlation": 0.5 }

Required Params

name
(Required) The name of the verse. Note that a verse's name must be different from the names of other verses in the same stanza.
Type: str | Default: None
{ "name": "confidence_outliers" }

Segmentation Params

always_segment_baseline_by
A list of dimensions to always segment the baseline segment by. This is useful when separating the world into completely unrelated parts - e.g., when you have a different model developed for each customer and there's no need to look for insights across different customers. Limiting the possible values of a specific segmentation field on which insights can be generated can be done using the "avoid_values" and the "include_only_values" keys in the segmentation JSON object, as seen in the example.
Type: SegmentationsList | Default: ()
{ "always_segment_baseline_by": [ "country", {"name": "city", "avoid_values": ["Tel Aviv"]}, ] }
avoid_segmenting_on_missing
When true, insights will not be generated for segments which are (partially or fully) defined by a missing field.
Type: bool | Default: False
{ "avoid_segmenting_on_missing": true }
max_segment_baseline_by_depth
The maximum number of fields Mona should combine for segmenting the baseline (if "segment_baseline_by" fields given).
Type: PositiveInt | Default: 2
{ "max_segment_baseline_by_depth": 3 }
max_segment_by_depth
The maximum number of fields Mona should combine to create sub-segments to search in. Baseline segment fields and parent fields are "free", and are not counted for depth. Notice this parameter has an exponential effect on performance and should be kept within SLAs.
Type: PositiveInt | Default: 2
{ "max_segment_by_depth": 3 }
min_segment_baseline_by_depth
The minimum number of fields Mona should combine for segmenting the baseline (if "segment_baseline_by" fields given).
Type: NonNegativeInt | Default: 0
{ "min_segment_baseline_by_depth": 1 }
min_segment_by_depth
The minimum number of fields Mona should combine to create sub-segments to search in.
Type: NonNegativeInt | Default: 0
{ "min_segment_by_depth": 1 }
segment_baseline_by
A list of dimensions to potentially segment the baseline segment by. Limiting the possible values of a specific segmentation field on which insights can be generated can be done using the "avoid_values" and the "include_only_values" keys in the segmentation JSON object.
Type: SegmentationsList | Default: ()
{ "segment_baseline_by": [ "model_version" ] }
segment_by
The dimensions to use to segment the data in order to search for anomalies. This list must be a sublist of all arc class' dimensions. Limiting the possible values of a specific segmentation field on which insights can be generated can be done using the "avoid_values" and the "include_only_values" keys in the segmentation JSON object, as seen in the example.
Type: SegmentationsList | Default: ()
{ "segment_by": [ "city", "bot_id", {"name": "provider-code", "avoid_values": ["zoom"]}, {"name": "selected-language", "avoid_values": ["eng", "spa"]}, {"name": "country", "include_only_values": ["jpn"]} ] }

Size Thresholds Params

baseline_min_segment_size
Minimal segment size for the baseline segment.
Type: PositiveFloat | Default: 1
{ "baseline_min_segment_size": 100 }
benchmark_baseline_min_segment_size
Minimal segment size for the benchmark baseline segment.
Type: PositiveFloat | Default: 1
{ "benchmark_baseline_min_segment_size": 100 }
benchmark_max_segment_size
Maximal benchmark segment size in number of records. Leave empty to not have such a threshold.
Type: PositiveIntOrNone | Default: None
{ "benchmark_max_segment_size": 1000 }
benchmark_max_segment_size_fraction
Maximal benchmark segment size as a fraction of the baseline segment. Leave empty to not have such a threshold.
Type: NonInclusiveFractionOrNone | Default: None
{ "benchmark_max_segment_size_fraction": 0.2 }
benchmark_min_segment_size
Minimal benchmark segment size in number of records.
Type: PositiveInt | Default: 100
{ "benchmark_min_segment_size": 50 }
benchmark_min_segment_size_fraction
Minimal benchmark segment size as a fraction of the baseline segment.
Type: InclusiveFraction | Default: 0
{ "benchmark_min_segment_size_fraction": 0.05 }
high_urgency_baseline_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to baseline_min_segment_size. See "baseline_min_segment_size" param for more details on the functionality of this param.
Type: PositiveFloatOrNone | Default: None
{ "high_urgency_baseline_min_segment_size": 1000 }
high_urgency_benchmark_baseline_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to benchmark_baseline_min_segment_size. See "benchmark_baseline_min_segment_size" param for more details on the functionality of this param.
Type: PositiveFloatOrNone | Default: None
{ "high_urgency_benchmark_baseline_min_segment_size": 1000 }
high_urgency_benchmark_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to benchmark_min_segment_size. See "benchmark_min_segment_size" param for more details on the functionality of this param.
Type: PositiveIntOrNone | Default: None
{ "high_urgency_benchmark_min_segment_size": 500 }
high_urgency_benchmark_min_segment_size_fraction
Threshold for separating between "high" and "normal" urgency insights with regards to benchmark_min_segment_size_fraction. See "benchmark_min_segment_size_fraction" param for more details on the functionality of this param.
Type: InclusiveFractionOrNone | Default: None
{ "high_urgency_benchmark_min_segment_size_fraction": 0.3 }
high_urgency_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to min_segment_size. See "min_segment_size" param for more details on the functionality of this param.
Type: PositiveIntOrNone | Default: None
{ "high_urgency_min_segment_size": 1000 }
high_urgency_min_segment_size_fraction
Threshold for separating between "high" and "normal" urgency insights with regards to min_segment_size_fraction. See "min_segment_size_fraction" param for more details on the functionality of this param.
Type: InclusiveFractionOrNone | Default: None
{ "high_urgency_min_segment_size_fraction": 0.2 }
high_urgency_target_baseline_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to target_baseline_min_segment_size. See "target_baseline_min_segment_size" param for more details on the functionality of this param.
Type: PositiveFloatOrNone | Default: None
{ "high_urgency_target_baseline_min_segment_size": 500 }
high_urgency_target_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to target_min_segment_size. See "target_min_segment_size" param for more details on the functionality of this param.
Type: PositiveIntOrNone | Default: None
{ "high_urgency_target_min_segment_size": 200 }
high_urgency_target_min_segment_size_fraction
Threshold for separating between "high" and "normal" urgency insights with regards to target_min_segment_size_fraction. See "target_min_segment_size_fraction" param for more details on the functionality of this param.
Type: InclusiveFractionOrNone | Default: None
{ "high_urgency_target_min_segment_size_fraction": 0.2 }
max_segment_size
Maximal segment size; bigger segments won't be considered in the search. Leave empty to not have such a threshold.
Type: PositiveIntOrNone | Default: None
{ "max_segment_size": 10000 }
max_segment_size_fraction
Maximal segment size, as a fraction of the baseline segment, for a segment to be considered. Leave empty to not have such a threshold.
Type: NonInclusiveFractionOrNone | Default: None
{ "max_segment_size_fraction": 0.2 }
min_exist_freq
The minimum fraction of the time-series frames in which the segment must appear. See the example in time_series_verse.py.
Type: InclusiveFraction | Default: 0.25
{ "min_exist_freq": 0.5 }
min_segment_size
Minimal segment size for the united benchmark+target segments.
Type: PositiveInt | Default: 100
{ "min_segment_size": 100 }
min_segment_size_fraction
Minimal segment size, as a fraction of the baseline segment, which a segment must have in order to be considered in the search.
Type: InclusiveFraction | Default: 0
{ "min_segment_size_fraction": 0.05 }
target_baseline_min_segment_size
Minimal segment size for the target baseline segment.
Type: NonNegativeFloat | Default: 0
{ "target_baseline_min_segment_size": 0.8 }
target_max_segment_size
Maximal target segment size in number of records. Leave empty to not have such a threshold.
Type: PositiveIntOrNone | Default: None
{ "target_max_segment_size": 10000 }
target_max_segment_size_fraction
Maximal target segment size as a fraction of the baseline segment. Leave empty to not have such a threshold.
Type: NonInclusiveFractionOrNone | Default: None
{ "target_max_segment_size_fraction": 0.2 }
target_min_segment_size
Minimal target segment size in number of records.
Type: NonNegativeInt | Default: 0
{ "target_min_segment_size": 200 }
target_min_segment_size_fraction
Minimal target segment size as a fraction of the baseline segment.
Type: InclusiveFraction | Default: 0
{ "target_min_segment_size_fraction": 0.05 }

Time Related Params

benchmark_set_period
Time period for the benchmark set. By default this is the period just before the target set period. Expected format is a positive integer followed by a unit, where the unit can be "d" (days) or "w" (weeks); e.g., "1d" means a 1-day period.
Type: TimePeriodOrEmpty | Default: 6w
{ "benchmark_set_period": "50d" }
benchmark_set_period_type
Sets the end time of the benchmark set period. Supports "previous_to_target" (the benchmark ends when the target starts) and "latest" (both end on the same date).
Type: BenchmarkSetPeriod | Default: previous_to_target
{ "benchmark_set_period_type": "latest" }
cadence
The cadence for evaluation of this verse. Only the following cadences are valid: Minutes: 1m, 5m, 10m, 15m, 20m, 30m. Hours: 1h, 2h, 3h, 4h, 6h, 8h, 12h. Days: 1d, 2d, 3d, 4d, 5d, 6d. Weeks: 1w, 2w, 3w, 4w, 5w.
Type: Cadence | Default: 1d
{ "cadence": "6h" }
expire_after
Insights detected by this verse will continue to be considered active for at least this amount of time after the last time they were detected.
Type: TimePeriodOrEmpty | Default: 3d
{ "expire_after": "2d" }
relevant_data_time_buffer
Adds an end-time buffer to the insight generation. For example - If this param's value is "1d", then insights are generated for a day before the latest received data. This is useful for processes in which it takes a specific period of time to get all the healthy monitoring data in place.
Type: TimePeriodOrEmpty | Default: (empty)
{ "relevant_data_time_buffer": "1d" }
target_set_period
Time period for the target set, ending on the day of the latest available data. Format detailed in common/util.py's get_time_period_for_string.
Type: TimePeriodOrEmpty | Default: 2w
{ "target_set_period": "1w" }
time_resolution
If empty, Overall Drift mode is used. Otherwise, Time-series mode is used, and this is the time resolution for building the time series of both the target and the benchmark set, on which the difference is measured. Expected format is a positive integer followed by a unit, where the unit can be "d" (days) or "w" (weeks); e.g., "1d" means a 1-day period.
Type: TimeResolutionOrEmpty | Default: 1d
{ "time_resolution": "1w" }
timestamp_field_name
The field that is used as the time dimension for insight generation.
Type: TimestampField | Default: timestamp
{ "timestamp_field_name": "run_end_time" }
timezone
The timezone used to aggregate daily data points. Accepts any IANA time zone ID (https://en.wikipedia.org/wiki/List_of_tz_database_time_zones).
Type: Timezone | Default: UTC
{ "timezone": "Asia/Hong_Kong" }

Urgency Params

default_urgency
The urgency class for insights created using this verse. Currently supports two values: "normal" (default) and "high". If set to "normal", specific thresholds for "high" urgency can be set using the parameters prefixed with "high_urgency_". If set to "high", the "high_urgency_"-prefixed threshold parameters are not considered at all, since all insights of this verse are considered as having "high" urgency.
Type: Urgency | Default: normal
{ "default_urgency": "high" }
high_urgency_baseline_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to baseline_min_segment_size. See "baseline_min_segment_size" param for more details on the functionality of this param.
Type: PositiveFloatOrNone | Default: None
{ "high_urgency_baseline_min_segment_size": 1000 }
high_urgency_benchmark_baseline_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to benchmark_baseline_min_segment_size. See "benchmark_baseline_min_segment_size" param for more details on the functionality of this param.
Type: PositiveFloatOrNone | Default: None
{ "high_urgency_benchmark_baseline_min_segment_size": 1000 }
high_urgency_benchmark_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to benchmark_min_segment_size. See "benchmark_min_segment_size" param for more details on the functionality of this param.
Type: PositiveIntOrNone | Default: None
{ "high_urgency_benchmark_min_segment_size": 500 }
high_urgency_benchmark_min_segment_size_fraction
Threshold for separating between "high" and "normal" urgency insights with regards to benchmark_min_segment_size_fraction. See "benchmark_min_segment_size_fraction" param for more details on the functionality of this param.
Type: InclusiveFractionOrNone | Default: None
{ "high_urgency_benchmark_min_segment_size_fraction": 0.3 }
high_urgency_min_anomaly_level
Threshold for separating between "high" and "normal" urgency insights with regards to min_anomaly_level. See "min_anomaly_level" param for more details on the functionality of this param.
Type: PositiveFloatOrNone | Default: None
{ "high_urgency_min_anomaly_level": 1.5 }
high_urgency_min_score
Threshold for separating between "high" and "normal" urgency insights with regards to min_score. See "min_score" param for more details on the functionality of this param.
Type: FloatOrNone | Default: None
{ "high_urgency_min_score": 20 }
high_urgency_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to min_segment_size. See "min_segment_size" param for more details on the functionality of this param.
Type: PositiveIntOrNone | Default: None
{ "high_urgency_min_segment_size": 1000 }
high_urgency_min_segment_size_fraction
Threshold for separating between "high" and "normal" urgency insights with regards to min_segment_size_fraction. See "min_segment_size_fraction" param for more details on the functionality of this param.
Type: InclusiveFractionOrNone | Default: None
{ "high_urgency_min_segment_size_fraction": 0.2 }
high_urgency_require_all_criteria
Decides whether to use an 'AND' or an 'OR' condition between all high_urgency threshold params.
Type: bool | Default: True
{ "high_urgency_require_all_criteria": false }
high_urgency_target_baseline_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to target_baseline_min_segment_size. See "target_baseline_min_segment_size" param for more details on the functionality of this param.
Type: PositiveFloatOrNone | Default: None
{ "high_urgency_target_baseline_min_segment_size": 500 }
high_urgency_target_min_segment_size
Threshold for separating between "high" and "normal" urgency insights with regards to target_min_segment_size. See "target_min_segment_size" param for more details on the functionality of this param.
Type: PositiveIntOrNone | Default: None
{ "high_urgency_target_min_segment_size": 200 }
high_urgency_target_min_segment_size_fraction
Threshold for separating between "high" and "normal" urgency insights with regards to target_min_segment_size_fraction. See "target_min_segment_size_fraction" param for more details on the functionality of this param.
Type: InclusiveFractionOrNone | Default: None
{ "high_urgency_target_min_segment_size_fraction": 0.2 }

Visuals and Enrichments Params

field_vectors
Lists metric vectors for the front end to show on an insight card of this verse. A value in this field can either be a string (in which case the string should correspond to a kapi_vector name in the config) or an array (in which case the array is treated as an ad-hoc kapi vector defined specifically for this verse).
Type: FieldVectorsList | Default: ()
{ "field_vectors": [ "field_vector_group_1", "field_vector_group_2", "field_vector_group_3" ] }
investigate_no_drill
Dictates the link to the investigations page that is added to found insights. If True, the link points to the investigations page with a drilldown to the segment that was found. If False, the link points to the investigations page without a drilldown, but with the found segment selected so it can be compared with a benchmark of a higher level.
Type: bool | Default: False
{ "investigate_no_drill": true }
time_resolution
If empty, Overall Drift mode is used. Otherwise, Time-series mode is used, and this is the time resolution for building the time series of both the target and the benchmark set, on which the difference is measured. Expected format is a positive integer followed by a unit, where the unit can be "d" (days) or "w" (weeks); e.g., "1d" means a 1-day period.
Type: TimeResolutionOrEmpty | Default: 1d
{ "time_resolution": "1w" }

Wizard Params

benchmark_set_period
Time period for the benchmark set. By default this is the period just before the target set period. Expected format is a positive integer followed by a unit, where the unit can be "d" (days) or "w" (weeks); e.g., "1d" means a 1-day period.
Type: TimePeriodOrEmpty | Default: 6w
{ "benchmark_set_period": "50d" }
cadence
The cadence for evaluation of this verse. Only the following cadences are valid: Minutes: 1m, 5m, 10m, 15m, 20m, 30m. Hours: 1h, 2h, 3h, 4h, 6h, 8h, 12h. Days: 1d, 2d, 3d, 4d, 5d, 6d. Weeks: 1w, 2w, 3w, 4w, 5w.
Type: Cadence | Default: 1d
{ "cadence": "6h" }
default_urgency
The urgency class for insights created using this verse. Currently supports two values: "normal" (default) and "high". If set to "normal", specific thresholds for "high" urgency can be set using the parameters prefixed with "high_urgency_". If set to "high", the "high_urgency_"-prefixed threshold parameters are not considered at all, since all insights of this verse are considered as having "high" urgency.
Type: Urgency | Default: normal
{ "default_urgency": "high" }
metrics
Relevant metrics to search for anomalies on in this verse; relevant only for verse types that search for anomalies in metric behavior.
Type: MetricsList | Default: ()
{ "metrics": [ "top_score", "delta_top_to_second_score" ] }
min_anomaly_level
Sets the threshold for the minimal anomaly level for which an insight will be generated. The anomaly level in this verse is: in Time-series Drift mode, the difference between the average of the target time series and the average of the benchmark time series, normalized by the standard deviation of the joined target and benchmark time series; in Overall Drift mode, the difference between the target set size and the benchmark set size (normalized by the benchmark set size if normalize_relative_size_drift is true).
Type: PositiveFloat | Default: 0.3
{ "min_anomaly_level": 0.5 }
min_segment_size
Minimal segment size for the united benchmark+target segments.
Type: PositiveInt | Default: 100
{ "min_segment_size": 100 }
min_segment_size_fraction
Minimal segment size, as a fraction of the baseline segment, which a segment must have in order to be considered in the search.
Type: InclusiveFraction | Default: 0
{ "min_segment_size_fraction": 0.05 }
name
(Required) The name of the verse. Note that a verse's name must be different from the names of other verses in the same stanza.
Type: str | Default: None
{ "name": "confidence_outliers" }
segment_by
The dimensions to use to segment the data in order to search for anomalies. This list must be a sublist of all arc class' dimensions. Limiting the possible values of a specific segmentation field on which insights can be generated can be done using the "avoid_values" and the "include_only_values" keys in the segmentation JSON object, as seen in the example.
Type: SegmentationsList | Default: ()
{ "segment_by": [ "city", "bot_id", {"name": "provider-code", "avoid_values": ["zoom"]}, {"name": "selected-language", "avoid_values": ["eng", "spa"]}, {"name": "country", "include_only_values": ["jpn"]} ] }
target_set_period
Time period for the target set, ending on the day of the latest available data. Format detailed in common/util.py's get_time_period_for_string.
Type: TimePeriodOrEmpty | Default: 2w
{ "target_set_period": "1w" }
time_resolution
If empty, Overall Drift mode is used. Otherwise, Time-series mode is used, and this is the time resolution for building the time series of both the target and the benchmark set, on which the difference is measured. Expected format is a positive integer followed by a unit, where the unit can be "d" (days) or "w" (weeks); e.g., "1d" means a 1-day period.
Type: TimeResolutionOrEmpty | Default: 1d
{ "time_resolution": "1w" }
trend_directions
A list of allowed anomaly trend directions: either 'asc' for ascending (anomalies in which the found value is LARGER THAN the relevant benchmark) or 'desc' for descending (anomalies in which the found value is SMALLER THAN the relevant benchmark).
Type: TrendDirections | Default: ('asc', 'desc')
{ "trend_directions": [ "asc" ] }
use_relative_sizes
If true, use each segment's size relative to its baseline segment. If false, use the segment's absolute size normalized by the length of the time period.
Type: bool | Default: False
{ "use_relative_sizes": true }