1/38 Model management using Vertex AI

In this lesson, we describe model monitoring.

Vertex AI model monitoring is a service that helps you manage the performance of your models.

Model monitoring lets you detect drift in data quality, identify skew in training versus serving data, monitor feature attribution, and use the UI to visualize monitoring metrics.

Let’s walk through an example.

First, to create a model deployment monitoring job by using the Cloud Console, you need to either create an endpoint or edit an existing endpoint to enable monitoring.

In the navigation pane, select Endpoints.

In this example, three existing endpoints have monitoring disabled.

To enable model monitoring for the existing endpoint credit_risk, simply select the endpoint and then click Settings. On the Edit endpoint page, select Model monitoring and toggle the button to enable monitoring for the endpoint.

Note that you can monitor the tabular and custom models deployed to this endpoint for feature drift, training-serving skew, and other objectives that help you understand how your model is performing compared to real-world data.

Also note that settings in this step apply to all models deployed to the endpoint.

After you enable monitoring and enter a display name for the monitoring job, you must define the size of the time window that is monitored each time the job runs. This window must be defined in hours.

Monitoring frequency determines how often a deployed model’s inputs are monitored for skew or drift.

At the specified frequency, a monitoring job runs and performs monitoring on the recently logged inputs.

Monitoring frequency also determines the timespan, or monitoring window size, of logged data that is analyzed in each monitoring run.

In the Google Cloud Console, you can see the time when each monitoring job runs and also visualize the data analyzed in each job.

The minimum granularity is one hour.
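To make the relationship between frequency and window size concrete, here is a minimal sketch. It assumes that each run analyzes the inputs logged since the previous run, so a frequency of six hours produces back-to-back six-hour windows. The function name `monitoring_windows` and the dates are hypothetical and are not part of Vertex AI.

```python
from datetime import datetime, timedelta


def monitoring_windows(start, frequency_hours, runs):
    """Return (window_start, window_end) pairs for consecutive monitoring runs.

    Assumption for illustration: each run covers exactly the span since the
    previous run, so the window size equals the monitoring frequency.
    """
    if frequency_hours < 1:
        raise ValueError("the minimum granularity is one hour")
    step = timedelta(hours=frequency_hours)
    return [(start + i * step, start + (i + 1) * step) for i in range(runs)]


# Hypothetical schedule: four runs, one every six hours.
windows = monitoring_windows(datetime(2024, 1, 1, 0, 0), frequency_hours=6, runs=4)
for window_start, window_end in windows:
    print(window_start.isoformat(), "->", window_end.isoformat())
```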

Model monitoring requires you to provide an email address to which alerts are sent.

Model monitoring sends an email alert each time an alerting threshold is crossed, each time skew or drift detection is set up, and each time an existing model monitoring job configuration is updated.

The email alerts contain pertinent information, including the time at which the monitoring job ran, the name of the feature that has skew or drift, the alerting threshold, and the recorded statistical distance measure.

Sampling rate defines a percentage of the prediction input data that should be sampled when the monitoring job runs.

This parameter controls the fraction of the incoming prediction requests that are logged and analyzed for monitoring purposes.

This is an optional parameter.

If you don’t configure this parameter, the model monitoring service logs all prediction requests.
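The effect of the sampling rate can be sketched in a few lines. This is only an illustration of random sampling, not the service's actual logging code: the helper `should_log` and the request dictionaries are hypothetical. A rate of `None` stands for the unconfigured case, in which every request is logged.

```python
import random


def should_log(rng, sample_rate=None):
    """Decide whether one prediction request is logged for monitoring."""
    if sample_rate is None:
        # Parameter not configured: log every request.
        return True
    if not 0.0 < sample_rate <= 1.0:
        raise ValueError("sample_rate must be greater than 0 and at most 1")
    # rng.random() is in [0.0, 1.0), so a rate of 1.0 always logs.
    return rng.random() < sample_rate


rng = random.Random(0)
requests = [{"request_id": i} for i in range(10_000)]
logged = [r for r in requests if should_log(rng, sample_rate=0.2)]
print(f"logged {len(logged)} of {len(requests)} requests")
```

With a rate of 0.2, roughly one request in five ends up in the logged set, which reduces the volume of data each monitoring run has to analyze.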

The input schema has two input fields: one is for the prediction input schema, and the other is for an analysis input schema.

The prediction input schema defines the format of a single instance used in prediction.

If the schema is not set, the monitoring job will generate the prediction schema from collected prediction requests.

The analysis input schema describes the format of a single instance that TensorFlow Data Validation analyzes.

If this schema is not set, the monitoring job will generate the analysis schema from collected prediction requests.

Next, you need to select the monitoring objective, which in this case is training-prediction skew detection, and the target field.

In training-prediction skew detection, skew is calculated between the feature distributions of the prediction input data and the training data.

In this example, we chose to monitor the skew from a managed dataset called credit_risk_2.

The name of the target column, default, is entered into the target field.

Alert thresholds are optional.

You can specify an alert threshold value for each feature that will be used to trigger alerts.

Values range from zero to one.

If an alert threshold is not set, default thresholds will be used.
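The per-feature threshold logic described here can be sketched as follows. The feature names and distance values are made up for illustration, and the fallback value of 0.3 is an assumption used to stand in for the service default; check the current documentation before relying on it.

```python
ASSUMED_DEFAULT_THRESHOLD = 0.3  # assumption for illustration only


def features_to_alert(distances, thresholds=None, default=ASSUMED_DEFAULT_THRESHOLD):
    """Return one alert record per feature whose distance exceeds its threshold.

    distances: feature name -> recorded distance score (0 to 1).
    thresholds: optional feature name -> alert threshold (0 to 1).
    Features without an explicit threshold fall back to `default`.
    """
    thresholds = thresholds or {}
    alerts = []
    for feature, distance in distances.items():
        threshold = thresholds.get(feature, default)
        if not 0.0 <= threshold <= 1.0:
            raise ValueError(f"threshold for {feature!r} must be between 0 and 1")
        if distance > threshold:
            alerts.append(
                {"feature": feature, "threshold": threshold, "distance": distance}
            )
    return alerts


# Hypothetical distance scores from one monitoring run.
distances = {"age": 0.05, "income": 0.42, "loan_purpose": 0.25}
alerts = features_to_alert(distances, thresholds={"loan_purpose": 0.2})
for alert in alerts:
    print(alert)
```

Here `income` trips the fallback threshold and `loan_purpose` trips its explicit one, while `age` stays below the fallback and produces no alert.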

After configuring your monitoring settings, you will receive an email message stating that you are now using the Vertex AI Model Monitoring service.

The email informs you that your request to set up drift or skew detection for the prediction endpoint is noted, and that incoming prediction requests will be sampled and logged for analysis.

Here, monitoring is now enabled.

When a feature is monitored for training-serving skew or prediction drift, model monitoring computes the statistical distribution of the latest feature values seen in production.

This statistical distribution is then compared against a baseline distribution by computing a distance score to determine how similar the production feature values are to the baseline.

When the distance score between the two statistical distributions exceeds a certain threshold, model monitoring identifies that as skew or drift.

Model monitoring uses different baselines for skew detection and drift detection.

For skew detection, the baseline is the statistical distribution of the feature’s values in the training data.

For drift detection, the baseline is the statistical distribution of the feature’s values seen in production in the recent past.
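The difference between the two baselines can be sketched as a small selection function. This is an illustration only: `select_baseline` is a hypothetical helper, and it assumes that "the recent past" means the production window immediately before the one being analyzed.

```python
def select_baseline(objective, training_values, production_windows):
    """Pick the baseline feature values for one monitoring run.

    production_windows: list of value lists, oldest first; the last entry is
    the window currently being analyzed.
    """
    if objective == "skew":
        # Skew detection compares production against the training data.
        return training_values
    if objective == "drift":
        # Drift detection compares production against earlier production data.
        # Assumption: use the window just before the current one.
        if len(production_windows) < 2:
            raise ValueError("drift detection needs an earlier production window")
        return production_windows[-2]
    raise ValueError(f"unknown objective: {objective!r}")


# Hypothetical values for a single numerical feature.
training = [21, 35, 42, 58]
production = [[22, 36, 41], [30, 44, 61], [55, 63, 70]]

print(select_baseline("skew", training, production))
print(select_baseline("drift", training, production))
```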

Presented here is an example of feature distribution monitoring.

To compare two statistical distributions, model monitoring uses the following statistical measures: the Jensen-Shannon divergence, which calculates the distance between two distributions of numerical features, and the L-infinity distance, which calculates the distance between two distributions of categorical features.

Visualizing data distribution as histograms lets you quickly focus on the changes that occurred in the data.

Afterward, you might decide to adjust your feature generation pipeline or retrain the model.
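For reference, the two distance measures named above can be written in a few lines of plain Python. This is a textbook sketch, not the service's implementation. It assumes distributions are already normalized to sum to one, and it uses base-2 logarithms so that the Jensen-Shannon divergence stays between zero and one.

```python
import math


def l_infinity_distance(p, q):
    """Largest absolute difference in probability over all categories.

    p, q: dicts mapping category -> probability; missing categories count as 0.
    """
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


def _kl_divergence(p, m):
    # Terms with p_i == 0 contribute nothing to the sum.
    return sum(pi * math.log2(pi / mi) for pi, mi in zip(p, m) if pi > 0)


def jensen_shannon_divergence(p, q):
    """Jensen-Shannon divergence of two histograms over the same bins.

    p, q: equal-length lists of bin probabilities.
    """
    if len(p) != len(q):
        raise ValueError("both histograms must use the same bins")
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * _kl_divergence(p, m) + 0.5 * _kl_divergence(q, m)


# Hypothetical categorical feature: baseline versus production.
baseline_cat = {"own": 0.5, "rent": 0.5}
production_cat = {"own": 0.2, "rent": 0.3, "other": 0.5}
print(l_infinity_distance(baseline_cat, production_cat))

# Hypothetical numerical feature, already binned into three intervals.
print(jensen_shannon_divergence([0.2, 0.5, 0.3], [0.1, 0.4, 0.5]))
```

Identical distributions score zero under both measures, and the score grows as the production distribution moves away from the baseline.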

For categorical features, the computed distribution is the number or percentage of instances of each possible value of the feature.

For numerical features, we divide the range of possible feature values into equal intervals, and compute the number or percentage of feature values that fall in each interval.
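The two ways of computing a distribution can be sketched as follows. The helper names, the sample values, and the choice of range and bin count are all hypothetical; the sketch only illustrates counting per category and counting per equal-width interval.

```python
from collections import Counter


def categorical_distribution(values):
    """Fraction of instances for each observed value of a categorical feature."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    return {category: count / len(values) for category, count in counts.items()}


def numerical_distribution(values, low, high, bins):
    """Fraction of values in each of `bins` equal-width intervals over [low, high]."""
    if not values:
        raise ValueError("values must not be empty")
    width = (high - low) / bins
    counts = [0] * bins
    for value in values:
        if not low <= value <= high:
            raise ValueError(f"{value} is outside [{low}, {high}]")
        # A value equal to `high` is placed in the last interval.
        index = min(int((value - low) / width), bins - 1)
        counts[index] += 1
    return [count / len(values) for count in counts]


housing = ["own", "rent", "rent", "own", "rent"]
cat_dist = categorical_distribution(housing)
print(cat_dist)

amounts = [1, 2, 2, 7, 9, 10]
num_dist = numerical_distribution(amounts, low=0, high=10, bins=5)
print(num_dist)
```

The resulting lists and dictionaries are the histograms that a distance measure then compares between the baseline and production data.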

As we’ve shown, prediction and model monitoring together let you request either batch or online predictions, from a BigQuery table, a CSV file, or a pre-built or custom container, and monitor training-serving skew.