Model management using Vertex AI

In this lesson, we describe Model monitoring.

Vertex AI model monitoring is a service that helps you manage the performance of your models.

Model monitoring lets you detect drift in data quality,

identify skew in training versus serving data,

monitor feature attribution,

and use the UI to visualize monitoring metrics.

Let’s walk through an example.

First, to create a model deployment monitoring job by using the Cloud Console,

you need to either create an endpoint or edit an existing endpoint to enable monitoring.

In the navigation pane, select Endpoints.

In this example, three existing endpoints have monitoring disabled.

To enable model monitoring for the existing endpoint credit_risk,

simply select the endpoint and then click settings.

On the edit endpoint page, select model monitoring

and toggle the button to enable monitoring for the endpoint.

Note that you can monitor the tabular and custom models deployed to this endpoint for feature drift, training-serving skew, and other objectives that help you understand how your model is performing compared to real-world data.

Also note that settings in this step apply to all models deployed to the endpoint.

After enabling monitoring and entering a display name for the monitoring job, you must define the size of the time window to monitor each time the job runs, and this window must be specified in hours.

Monitoring frequency determines how often a deployed model’s inputs are monitored for skew or drift.

At the specified frequency, a monitoring job runs and performs monitoring on the recently logged inputs.

Monitoring frequency also determines the time span, or monitoring window size, of logged data that is analyzed in each monitoring run.

In the Google Cloud Console, you can see the time when each monitoring job runs and also visualize the data analyzed in each job.

The minimum granularity is one hour.
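
As a rough illustration of the monitoring window, the sketch below picks out the logged requests that fall inside the most recent window of a run. The function name `requests_in_window` and the `(timestamp, payload)` log format are assumptions made for this example and are not part of the Vertex AI API.

```python
from datetime import datetime, timedelta

def requests_in_window(logged, run_time, window_hours):
    """Return payloads logged in the window (run_time - window_hours, run_time]."""
    if window_hours < 1:
        # The lesson states that the minimum granularity is one hour.
        raise ValueError("window_hours must be at least 1")
    start = run_time - timedelta(hours=window_hours)
    return [payload for ts, payload in logged if start < ts <= run_time]

# Hypothetical logged prediction inputs: (timestamp, payload) pairs.
run_time = datetime(2024, 1, 1, 12, 0)
logged = [
    (datetime(2024, 1, 1, 9, 30), {"age": 41}),
    (datetime(2024, 1, 1, 11, 15), {"age": 29}),
    (datetime(2024, 1, 1, 11, 45), {"age": 35}),
]
recent = requests_in_window(logged, run_time, window_hours=1)
```

With a one-hour window, only the two requests logged after 11:00 are analyzed; widening the window to three hours would include all three.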

Model monitoring requires you to provide an email address, which serves as the recipient for alerts.

Model monitoring sends an email alert each time an alerting threshold is crossed, each time skew or drift detection is set up, and each time an existing model monitoring job configuration is updated.

The email alerts contain pertinent information, including:

the time at which the monitoring job ran,

the name of the feature that has skew or drift,

and the alerting threshold together with the recorded statistical distance measure.
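
The fields listed above can be pictured as a small record. This is only a sketch: the `Alert` class, the `build_alerts` helper, and the feature names are hypothetical and do not reflect the actual format of the emails the service sends.

```python
from dataclasses import dataclass
from datetime import datetime

@dataclass
class Alert:
    job_run_time: datetime  # time at which the monitoring job ran
    feature: str            # name of the feature that has skew or drift
    threshold: float        # alerting threshold for that feature
    distance: float         # recorded statistical distance measure

def build_alerts(job_run_time, distances, thresholds):
    """Return one Alert per feature whose distance exceeds its threshold."""
    return [
        Alert(job_run_time, feature, thresholds[feature], distance)
        for feature, distance in distances.items()
        if distance > thresholds[feature]
    ]

# Hypothetical distances computed by one monitoring run.
alerts = build_alerts(
    datetime(2024, 1, 1, 12, 0),
    distances={"age": 0.05, "income": 0.42},
    thresholds={"age": 0.3, "income": 0.3},
)
```

Here only `income` produces an alert, because its distance is above its threshold while the distance for `age` is not.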

Sampling rate defines a percentage of the prediction input data that should be sampled when the monitoring job runs.

This parameter controls the fraction of incoming prediction requests that are logged and analyzed for monitoring purposes.

This is an optional parameter.

If you don’t configure this parameter, the model monitoring service logs all prediction requests.
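
To make the effect of the sampling rate concrete, here is a minimal sketch of per-request random sampling. The function `should_log` is hypothetical, and the service's actual sampling mechanism may differ; the sketch only mirrors the behavior described above, where an unset rate means every request is logged.

```python
import random

def should_log(sampling_rate=None, rng=random):
    """Decide whether one prediction request is logged.

    sampling_rate is a fraction in (0, 1]; None means log every request.
    """
    if sampling_rate is None:
        return True
    if not 0 < sampling_rate <= 1:
        raise ValueError("sampling_rate must be in (0, 1]")
    return rng.random() < sampling_rate

# Simulate 10,000 incoming requests at a 20% sampling rate.
rng = random.Random(0)
logged_count = sum(should_log(0.2, rng) for _ in range(10_000))
```

Over many requests, roughly 20% end up logged, so a lower rate reduces the volume of data analyzed in each run at the cost of a noisier estimate of the feature distributions.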

The input schema has two input fields,

one is for the prediction input schema,

and the other is for an analysis input schema.

The prediction input schema defines the format of a single instance used in prediction.

If the schema is not set, the monitoring job will generate the prediction schema from collected prediction requests.

The analysis input schema describes the format of a single instance which TensorFlow Data Validation analyzes.

If this schema is not set, the monitoring job will generate the analysis schema from collected prediction requests.

Next, you need to select

the monitoring objective,

in this case training-prediction skew detection,

and the target field.

In training-prediction skew detection, skew is calculated between feature distributions of prediction input data and training data.

In this example, we chose to monitor skew against a managed dataset called credit_risk_2.

The target column name, default, is entered into the target field.

Alert thresholds are optional.

You can specify an alert threshold value for each feature that will be used to trigger alerts.

Values range from zero to one.

If an alert threshold is not set, default thresholds will be used.
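
The fallback behavior for thresholds can be sketched as follows. The helper `resolve_thresholds`, the feature names, and the fallback value of 0.3 are assumptions for illustration; check the service documentation for the default it actually applies.

```python
def resolve_thresholds(features, custom=None, default=0.3):
    """Map each feature to its alert threshold, falling back to a default.

    ASSUMPTION: 0.3 is only an illustrative fallback value.
    """
    custom = custom or {}
    resolved = {}
    for feature in features:
        value = custom.get(feature, default)
        # Threshold values range from zero to one.
        if not 0 <= value <= 1:
            raise ValueError(f"threshold for {feature!r} must be in [0, 1]")
        resolved[feature] = value
    return resolved

# Hypothetical features; only "income" gets a custom threshold.
thresholds = resolve_thresholds(
    ["age", "income", "employment_type"],
    custom={"income": 0.1},
)
```

A lower threshold, such as the one set for `income` here, makes alerts fire on smaller distribution changes for that feature.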

After configuring your monitoring settings, you will receive an email message confirming that you are now using the Vertex AI model monitoring service.

The email informs you that your request to set up drift or skew detection for the prediction endpoint is noted, and that incoming prediction requests will be sampled and logged for analysis.

Here, monitoring is now enabled.

When a feature is monitored for training-serving skew or prediction drift, model monitoring computes the statistical distribution of the latest feature values seen in production.

This statistical distribution is then compared against another baseline distribution by computing a distance score to determine how similar the production feature values are to the baseline.

When the distance score between two statistical distributions exceeds a certain threshold, model monitoring identifies that as skew or drift.

Model monitoring uses different baselines for skew detection and drift detection.

For skew detection, the baseline is the statistical distribution of the feature’s values in the training data.

For drift detection, the baseline is the statistical distribution of the feature’s values seen in production in the recent past.
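
The difference between the two baselines can be sketched in a few lines. The function `select_baseline` is hypothetical, and treating "the recent past" as the window immediately before the latest one is an assumption made for this example.

```python
def select_baseline(objective, training_values, production_windows):
    """Pick the baseline feature values for one monitoring objective.

    production_windows is a list of value lists ordered oldest to newest;
    the last entry is the latest window being checked.
    """
    if objective == "skew":
        # Skew: compare production against the training data.
        return training_values
    if objective == "drift":
        # Drift: compare production against earlier production data.
        # ASSUMPTION: "recent past" means the previous window.
        if len(production_windows) < 2:
            raise ValueError("drift detection needs an earlier production window")
        return production_windows[-2]
    raise ValueError(f"unknown objective: {objective!r}")

# Hypothetical values of a single feature.
training = [1, 2, 3]
windows = [[4, 5], [6, 7], [8, 9]]
skew_baseline = select_baseline("skew", training, windows)
drift_baseline = select_baseline("drift", training, windows)
```

Skew detection therefore needs access to the training data, while drift detection can run on production logs alone.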

Presented here is an example of feature distribution monitoring.

To compare two statistical distributions, model monitoring uses the following statistical measures: the Jensen-Shannon divergence, which is used to calculate the distance between two distributions of numerical features, and the L-infinity distance, which is used to calculate the distance between two distributions of categorical features.
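
For reference, both measures can be written in a few lines using their textbook definitions. This is a sketch, not the service's implementation: the function names, the base-2 logarithm, and the input formats (a dict of category frequencies for L-infinity, a list of bin frequencies for Jensen-Shannon) are assumptions.

```python
import math

def l_infinity(p, q):
    """Largest absolute difference in frequency across all categories."""
    keys = set(p) | set(q)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)

def jensen_shannon(p, q):
    """Jensen-Shannon divergence between two histograms over the same bins.

    Uses log base 2, so the result lies between 0 and 1.
    """
    def kl(a, b):
        # Terms with a zero frequency in `a` contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0)

    m = [(x + y) / 2 for x, y in zip(p, q)]  # mixture distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# Hypothetical categorical feature: training versus production frequencies.
cat_distance = l_infinity({"rent": 0.5, "own": 0.5}, {"rent": 0.8, "own": 0.2})

# Hypothetical numerical feature, binned into three intervals.
num_distance = jensen_shannon([0.2, 0.5, 0.3], [0.1, 0.4, 0.5])
```

Both scores are 0 for identical distributions and grow as the distributions move apart, which is why a single threshold between zero and one can be applied to either.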

Visualizing data distribution as histograms lets you quickly focus on the changes that occurred in the data.

Afterward, you might decide to adjust your feature generation pipeline or retrain the model.

For categorical features, the computed distribution is the number or percentage of instances of each possible value of the feature.

For numerical features, we divide the range of possible feature values into equal intervals, and compute the number or percentage of feature values that fall in each interval.
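
The two ways of computing a distribution can be sketched as below. The function names, the sample values, and the choice of range and bin count are hypothetical; how the service chooses its intervals is not specified here.

```python
from collections import Counter

def categorical_distribution(values):
    """Fraction of instances taking each observed category value."""
    if not values:
        raise ValueError("values must not be empty")
    counts = Counter(values)
    return {category: n / len(values) for category, n in counts.items()}

def numerical_distribution(values, low, high, bins):
    """Fraction of values falling in each of `bins` equal-width intervals."""
    if not values:
        raise ValueError("values must not be empty")
    width = (high - low) / bins
    counts = [0] * bins
    for v in values:
        if not low <= v <= high:
            raise ValueError(f"{v} is outside [{low}, {high}]")
        # A value equal to `high` is placed in the last interval.
        index = min(int((v - low) / width), bins - 1)
        counts[index] += 1
    return [n / len(values) for n in counts]

# Hypothetical feature values.
housing = categorical_distribution(["own", "rent", "rent", "rent"])
ages = numerical_distribution([21, 35, 38, 64], low=20, high=70, bins=5)
```

The resulting frequencies are what a histogram view plots, and they are also the inputs that a distance measure compares between the baseline and production.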

As we’ve shown, Vertex AI prediction and model monitoring together let you request either batch or online predictions, using a BigQuery table, a CSV file, or a pre-built or custom container, and monitor training-serving skew.

Eduardo Avelar
