Challenge 6: Monitor your models

< Previous Challenge - Home - Next Challenge >

Introduction

Over time, training data can stop being representative of what the model sees in production because of changing demographics, trends, and so on. To catch skew or drift in the feature distributions, or in the predictions themselves, you need to monitor your model continuously.
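To make "skew" concrete: a monitoring service compares the distribution of each feature in the baseline data with the distribution it sees at serving time, and alerts when a distance score crosses a threshold. The Vertex AI Model Monitoring documentation names L-infinity distance for categorical features and Jensen-Shannon divergence for numerical ones. The sketch below is a minimal standard-library illustration of those two scores, not the service's implementation; the equal-width binning and the bin count are assumptions made for the example.

```python
import math
from collections import Counter


def linf_distance(baseline, serving):
    """L-infinity distance between two categorical samples:
    the largest absolute difference in category frequency."""
    base_counts, serve_counts = Counter(baseline), Counter(serving)
    categories = set(base_counts) | set(serve_counts)
    return max(
        abs(base_counts[c] / len(baseline) - serve_counts[c] / len(serving))
        for c in categories
    )


def histogram(values, lo, hi, bins):
    """Normalized equal-width histogram of values over [lo, hi]."""
    counts = [0] * bins
    width = (hi - lo) / bins or 1.0  # avoid dividing by zero if lo == hi
    for v in values:
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return [c / len(values) for c in counts]


def js_divergence(baseline, serving, bins=10):
    """Jensen-Shannon divergence (base 2, so in [0, 1]) between two
    numerical samples, computed on a shared histogram (bin count assumed)."""
    lo = min(min(baseline), min(serving))
    hi = max(max(baseline), max(serving))
    p = histogram(baseline, lo, hi, bins)
    q = histogram(serving, lo, hi, bins)
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x == 0 contribute nothing; y > 0 wherever x > 0.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)
```

A score of 0 means the two samples have the same binned distribution; values closer to 1 mean the serving data has moved away from the baseline.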

If you’ve chosen the online inferencing path, continue with Online Monitoring; otherwise, skip to the Batch Monitoring section.

Online Monitoring

Description

Vertex AI Endpoints provide Model Monitoring capabilities, which need to be turned on for this challenge. Turn on training-serving skew detection for your model and use an hourly monitoring granularity so that alerts are generated. Send at least 10K prediction requests to collect monitoring data.
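If you prefer code over the console, the following is one possible sketch using the `model_monitoring` helpers of the `google-cloud-aiplatform` Python SDK. It is a starting point under stated assumptions, not the reference solution: the display name, the 0.3 threshold, the sample rate of 1.0 and every function parameter (baseline URI, target field, feature names, alert email) are placeholders you would need to supply, and the call signatures should be checked against the SDK version you have installed.

```python
def uniform_thresholds(feature_names, threshold=0.3):
    """Map every monitored feature to the same alert threshold.
    The 0.3 value is an illustrative assumption."""
    return {name: threshold for name in feature_names}


def create_skew_monitoring_job(
    project, location, endpoint_name, baseline_uri,
    target_field, feature_names, alert_email,
):
    """Create an hourly training-serving skew monitoring job (sketch)."""
    # Imported lazily so the helper above works without the SDK installed.
    from google.cloud import aiplatform
    from google.cloud.aiplatform import model_monitoring

    skew_config = model_monitoring.SkewDetectionConfig(
        data_source=baseline_uri,  # assumed: gs:// URI of the baseline CSV
        data_format="csv",
        target_field=target_field,
        skew_thresholds=uniform_thresholds(feature_names),
    )
    objective_config = model_monitoring.ObjectiveConfig(
        skew_detection_config=skew_config
    )
    return aiplatform.ModelDeploymentMonitoringJob.create(
        display_name="skew-monitoring",  # hypothetical name
        project=project,
        location=location,
        endpoint=endpoint_name,
        objective_configs=objective_config,
        # Log every request so that 10K requests yield 10K logged rows.
        logging_sampling_strategy=model_monitoring.RandomSampleConfig(
            sample_rate=1.0
        ),
        # monitor_interval is expressed in hours.
        schedule_config=model_monitoring.ScheduleConfig(monitor_interval=1),
        alert_config=model_monitoring.EmailAlertConfig(
            user_emails=[alert_email], enable_logging=True
        ),
    )
```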

Success Criteria

Show that Model Monitoring is running successfully for the endpoint created in the previous challenge
By default, Model Monitoring keeps request/response data in a BigQuery dataset; find that data and show it
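One way to inspect the logged data from Python is sketched below. It assumes the logging table follows the `model_deployment_monitoring_<ENDPOINT_ID>.serving_predict` naming that the Model Monitoring documentation describes; confirm the actual dataset and table names in the BigQuery console before relying on it. The project and endpoint ID are parameters you supply, and the query needs the `google-cloud-bigquery` package.

```python
def serving_log_table(project, endpoint_id):
    """Fully qualified ID of the request/response logging table.
    The naming scheme is an assumption; verify it in the BigQuery console."""
    return f"{project}.model_deployment_monitoring_{endpoint_id}.serving_predict"


def preview_logged_requests(project, endpoint_id, limit=10):
    """Return the first few logged prediction rows as dictionaries."""
    # Imported lazily so the helper above works without the client installed.
    from google.cloud import bigquery

    client = bigquery.Client(project=project)
    table = serving_log_table(project, endpoint_id)
    query = f"SELECT * FROM `{table}` LIMIT {int(limit)}"
    return [dict(row.items()) for row in client.query(query).result()]
```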

Tips

You can use the sample.csv file from challenge 1 as the baseline data
To generate the requests, you can use the same tool you used for the previous challenge. Make sure to include some data whose distribution differs from that of the training data
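If your request tool accepts pre-built payloads, a small generator such as the one below can mix in deliberately shifted data. This is only a sketch: `numeric_feature` and `categorical_feature` are hypothetical names that stand in for your model's real input schema, and the means, categories and 30% shifted fraction are arbitrary choices. Each body uses the `{"instances": [...]}` shape of a Vertex AI prediction request.

```python
import json
import random


def make_instance(rng, shifted):
    """Build one instance; shifted instances come from a different
    distribution than the (assumed) training data."""
    return {
        # Hypothetical feature names; replace with your model's inputs.
        "numeric_feature": round(rng.gauss(30.0 if shifted else 10.0, 2.0), 3),
        "categorical_feature": rng.choice(["c", "d"] if shifted else ["a", "b"]),
    }


def request_bodies(n_requests, shifted_fraction=0.3, seed=0):
    """Yield JSON request bodies, one instance per request."""
    rng = random.Random(seed)  # seeded so runs are reproducible
    for _ in range(n_requests):
        shifted = rng.random() < shifted_fraction
        yield json.dumps({"instances": [make_instance(rng, shifted)]})
```

Calling `request_bodies(10_000)` produces enough payloads for the 10K requests this challenge asks for.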

Learning Resources

Introduction to Vertex AI Model Monitoring

Batch Monitoring

Description

Vertex AI Batch Prediction jobs provide Model Monitoring capabilities as well. Create a new Batch Prediction job with monitoring turned on and with BigQuery input and output tables. Use the default values for the alert thresholds.
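For those who want to script the job, here is a hedged sketch with the `google-cloud-aiplatform` Python SDK. It assumes your SDK version exposes the `model_monitoring_objective_config` and `model_monitoring_alert_config` parameters on `BatchPredictionJob.create`; check the reference for your installed version. The display name and all function parameters are placeholders, and the thresholds are passed in by the caller rather than taken from the console defaults.

```python
def bq_uri(*parts):
    """Build a bq:// URI from project, dataset and (optionally) table."""
    return "bq://" + ".".join(parts)


def create_monitored_batch_job(
    project, location, model_name, source_table_uri, destination_uri,
    baseline_uri, target_field, skew_thresholds, alert_email,
    machine_type=None,
):
    """Submit a batch prediction job with skew monitoring (sketch)."""
    # Imported lazily so bq_uri works without the SDK installed.
    from google.cloud import aiplatform
    from google.cloud.aiplatform import model_monitoring

    aiplatform.init(project=project, location=location)
    skew_config = model_monitoring.SkewDetectionConfig(
        data_source=baseline_uri,  # assumed: gs:// URI of the baseline CSV
        data_format="csv",
        target_field=target_field,
        skew_thresholds=skew_thresholds,  # caller-chosen, feature -> threshold
    )
    return aiplatform.BatchPredictionJob.create(
        job_display_name="monitored-batch-prediction",  # hypothetical name
        model_name=model_name,
        instances_format="bigquery",
        predictions_format="bigquery",
        bigquery_source=source_table_uri,
        bigquery_destination_prefix=destination_uri,
        machine_type=machine_type,
        model_monitoring_objective_config=model_monitoring.ObjectiveConfig(
            skew_detection_config=skew_config
        ),
        model_monitoring_alert_config=model_monitoring.EmailAlertConfig(
            user_emails=[alert_email]
        ),
        sync=False,  # return immediately instead of waiting for completion
    )
```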

Success Criteria

There’s a new Batch Prediction job with monitoring turned on
As batch inferencing will again take roughly 25 minutes, it’s sufficient to show that the job is properly configured

Tips

You can use the sample.csv file from challenge 1 as the baseline training data
To run the batch predictions, you can use the same data you used for the previous challenge. Make sure to include some data whose distribution differs from that of the training data

Learning Resources

Model monitoring for Batch Predictions
