The Dataset

In general, Mona can be configured to monitor data from all types of AI models and their business environments, and it can track any field that is useful for assessing the behavior of the entire AI system. This can include metadata fields (e.g., geographical information), model inputs (features), model outputs (e.g., the credit score), and business outcomes or ground truth labels when those exist (e.g., whether a loan was paid back on time).

The Tutorial Dataset:

Training and Test
Contain all data fields for credit analysis. These include metadata fields such as the requester's occupation, city, state, loan purpose, and the amounts of money offered and approved in the application. The data also contains several numerical input features, system metadata such as the model version and the stage (train or test), the model output (credit_score), and the ground truth label used for training and testing.

Example training data

{
  "timestamp": 1611835417000,
  "occupation": "education",
  "city": "Fayetteville",
  "state": "Arkansas",
  "id": "dfa3353a692e7dc4d8d48102c4ff345f_train",
  "purpose": "Car insurance",
  "credit_score": 0.5731902999954619,
  "loan_taken": true,
  "offered_amount": 4334,
  "feature_0": 2654996.459602782,
  "feature_1": 2.5483001715549025,
  "feature_2": 727,
  "feature_3": 6775,
  "feature_4": 0.4792214869935133,
  "feature_5": 1.5258190130021834,
  "feature_6": 23,
  "feature_7": 1.030829156176713,
  "feature_8": 1337.4128221798737,
  "feature_9": 39.95245825137266,
  "stage": "train",
  "return_until": 1613045017000,
  "label": 0,
  "model_version": "v1"
}
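
The field categories described above can be made concrete with a small helper. This is only a sketch: the category names, the `group_fields` function, and the "other" bucket for fields the description does not classify (such as `id` or `timestamp`) are assumptions made for illustration and are not part of Mona.

```python
# Groupings follow the description of the tutorial dataset; the names
# METADATA_FIELDS, SYSTEM_FIELDS and group_fields are hypothetical.
METADATA_FIELDS = {
    "occupation", "city", "state", "purpose",
    "offered_amount", "approved_amount",
}
SYSTEM_FIELDS = {"model_version", "stage"}


def group_fields(record):
    """Split one record into the field categories described in the text."""
    groups = {
        "metadata": {}, "features": {}, "system": {},
        "output": {}, "label": {}, "other": {},
    }
    for name, value in record.items():
        if name in METADATA_FIELDS:
            groups["metadata"][name] = value
        elif name.startswith("feature_"):
            groups["features"][name] = value
        elif name in SYSTEM_FIELDS:
            groups["system"][name] = value
        elif name == "credit_score":
            groups["output"][name] = value
        elif name == "label":
            groups["label"][name] = value
        else:
            # Fields the description does not assign to a category.
            groups["other"][name] = value
    return groups


# A shortened version of the example training record.
record = {
    "occupation": "education",
    "city": "Fayetteville",
    "id": "dfa3353a692e7dc4d8d48102c4ff345f_train",
    "credit_score": 0.5731902999954619,
    "feature_2": 727,
    "stage": "train",
    "label": 0,
    "model_version": "v1",
}
groups = group_fields(record)
print(sorted(groups["metadata"]))  # ['city', 'occupation']
```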

Example test data

{
  "timestamp": 1611842438000,
  "occupation": "education",
  "city": "Phoenix",
  "state": "Arizona",
  "id": "7b026dd53a16519ef421e6c1f4ffa8ca_test",
  "purpose": "Home services",
  "credit_score": 0.024648770897898548,
  "loan_taken": true,
  "offered_amount": 17597,
  "feature_0": 2816812.026366597,
  "feature_1": 0.5520508993135393,
  "feature_2": 601,
  "feature_3": 12003,
  "feature_4": 0.02760948701383037,
  "feature_6": 7,
  "feature_7": 8.160327989686898,
  "stage": "test",
  "return_until": 1613052038000,
  "label": 0,
  "model_version": "v1"
}

Inference
Contains the same fields, except for the ground truth label.

Example inference data

{
  "timestamp": 1611930089000,
  "occupation": "technology",
  "city": "Newark",
  "state": "New Jersey",
  "id": "2f719b2936f06da8d7c79ab9c6c39923",
  "purpose": "Home construction",
  "credit_score": 0.021553255527124074,
  "loan_taken": true,
  "offered_amount": 8700,
  "approved_amount": 4900,
  "feature_0": 3391470.0843921946,
  "feature_1": 0.843949135322643,
  "feature_2": 590,
  "feature_3": 13736,
  "feature_4": 0.2720732559814525,
  "feature_5": 0.9777986328271696,
  "feature_6": 10,
  "feature_7": 5.429877032113464,
  "feature_8": 1223.328832340112,
  "feature_9": 131.82714905405732,
  "stage": "inference",
  "return_until": 1613139689000,
  "model_version": "v1"
}
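
Since training and test records carry a ground truth label and inference records do not, a quick sanity check before exporting data might look like the sketch below. The function name `label_matches_stage` is hypothetical, and the rule it encodes is only what this page states about the tutorial dataset.

```python
def label_matches_stage(record):
    """Return True if the presence of "label" fits the record's stage.

    Assumes the tutorial dataset's convention: "train" and "test"
    records include a label, "inference" records do not.
    """
    stage = record.get("stage")
    has_label = "label" in record
    if stage in ("train", "test"):
        return has_label
    if stage == "inference":
        return not has_label
    raise ValueError(f"unexpected stage: {stage!r}")


train_record = {"stage": "train", "label": 0, "credit_score": 0.57}
inference_record = {"stage": "inference", "credit_score": 0.02}

print(label_matches_stage(train_record))      # True
print(label_matches_stage(inference_record))  # True
```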

Feedback
Contains whether the loan was paid back on time. These messages have the same ids as the inference data, which allows Mona to merge them even if they are exported at different times and from different places.

Example feedback data

{
  "id": "02e47fa712920244da330b26e0a58604",
  "loan_paid_back": false,
  "timestamp": 1611929491000
}
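
To illustrate what merging by id means, here is a minimal sketch in plain Python. Mona performs this merge itself; the `merge_feedback` function and the short ids below are hypothetical, and the code is not a description of how Mona implements it.

```python
def merge_feedback(inference_records, feedback_records):
    """Attach loan_paid_back to the inference record with the same id."""
    # Copy each record so the caller's input is left unchanged.
    merged = {rec["id"]: dict(rec) for rec in inference_records}
    for feedback in feedback_records:
        target = merged.get(feedback["id"])
        if target is not None:
            target["loan_paid_back"] = feedback["loan_paid_back"]
    return list(merged.values())


# Hypothetical, shortened records; real ids look like the examples above.
inference = [
    {"id": "a1", "credit_score": 0.02, "stage": "inference"},
    {"id": "b2", "credit_score": 0.61, "stage": "inference"},
]
feedback = [
    {"id": "a1", "loan_paid_back": False, "timestamp": 1611929491000},
]

merged = merge_feedback(inference, feedback)
print(merged[0]["loan_paid_back"])     # False
print("loan_paid_back" in merged[1])   # False
```

Records without matching feedback are returned as they were, which mirrors the fact that feedback can arrive later than the inference data.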

NOTE: Mona can accept any JSON message, so your dataset might look completely different from the fake example dataset used in this tutorial.
