Challenge 6: Monitor your models

< Previous Challenge - Home - Next Challenge >

Introduction

Over time, training data can stop being representative of what the model sees in production because of changing demographics, trends, and similar shifts. To catch skew or drift in feature distributions, or in the predictions themselves, you need to monitor your model continuously.
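
To make "skew" concrete: the Vertex AI Model Monitoring documentation describes L-infinity distance for categorical features and Jensen-Shannon divergence for numerical features as the scores compared against alert thresholds. The sketch below computes both over simple value-to-probability dictionaries. It is an illustration only, not the service's implementation; the function names and the "cash"/"card" values are made up for the example.

```python
import math
from collections import Counter


def distribution(values):
    """Turn a list of categorical values (or bin labels) into probabilities."""
    counts = Counter(values)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def l_infinity_distance(p, q):
    """Largest absolute difference in probability over all categories."""
    keys = set(p) | set(q)
    return max(abs(p.get(k, 0.0) - q.get(k, 0.0)) for k in keys)


def jensen_shannon_divergence(p, q):
    """Symmetric, bounded divergence: 0 for identical, 1 for disjoint (log base 2)."""
    keys = set(p) | set(q)
    mix = {k: 0.5 * (p.get(k, 0.0) + q.get(k, 0.0)) for k in keys}

    def kl(a):
        # Kullback-Leibler divergence of `a` from the mixture distribution.
        return sum(a[k] * math.log2(a[k] / mix[k]) for k in a if a[k] > 0)

    return 0.5 * kl(p) + 0.5 * kl(q)


# Hypothetical feature: half of the training rows are "cash", but 90% of serving rows are.
baseline = distribution(["cash"] * 50 + ["card"] * 50)
serving = distribution(["cash"] * 90 + ["card"] * 10)
skew = l_infinity_distance(baseline, serving)
print(f"L-infinity: {skew:.2f}, JSD: {jensen_shannon_divergence(baseline, serving):.3f}")
```

A score above the configured threshold for a feature is what would trigger an alert for that feature.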

If you chose the online inferencing path, continue with Online Monitoring. Otherwise, skip to the Batch Monitoring section.

Online Monitoring

Description

Vertex AI Endpoints provide Model Monitoring capabilities, which need to be turned on for this challenge. Turn on training-serving skew detection for your model and use an hourly granularity for alerts. Then send at least 10,000 prediction requests to collect monitoring data.
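
If you prefer code over the console, the following is a minimal sketch using the Vertex AI SDK for Python (`google-cloud-aiplatform`). It assumes the baseline CSV has been uploaded to Cloud Storage, that the SDK is installed and authenticated, and that your SDK version exposes the `model_monitoring` helpers as shown. The display name, the 0.3 threshold, and all function arguments are illustrative choices, not values prescribed by this challenge.

```python
def build_skew_thresholds(feature_names, threshold=0.3):
    """Map each monitored feature to the same alerting threshold (illustrative value)."""
    return {name: threshold for name in feature_names}


def enable_skew_monitoring(project, location, endpoint_name, baseline_uri,
                           target_field, feature_names, alert_email):
    """Create an hourly training-serving skew monitoring job for an endpoint."""
    # Imported lazily so the helper above can be used without the SDK installed.
    from google.cloud import aiplatform
    from google.cloud.aiplatform import model_monitoring

    skew_config = model_monitoring.SkewDetectionConfig(
        data_source=baseline_uri,  # assumed: gs:// path to the baseline CSV
        data_format="csv",
        target_field=target_field,
        skew_thresholds=build_skew_thresholds(feature_names),
    )
    return aiplatform.ModelDeploymentMonitoringJob.create(
        display_name="skew-monitoring",  # hypothetical name
        endpoint=endpoint_name,
        project=project,
        location=location,
        objective_configs=model_monitoring.ObjectiveConfig(
            skew_detection_config=skew_config
        ),
        # monitor_interval is expressed in hours; 1 gives hourly granularity.
        schedule_config=model_monitoring.ScheduleConfig(monitor_interval=1),
        # Log every request so the 10,000 predictions all count.
        logging_sampling_strategy=model_monitoring.RandomSampleConfig(sample_rate=1.0),
        alert_config=model_monitoring.EmailAlertConfig(user_emails=[alert_email]),
    )
```

Calling `enable_skew_monitoring(...)` with your own project, region, endpoint resource name, and feature list would create the job; check the result in the console, since parameter names can differ between SDK versions.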

Success Criteria

  1. Show that Model Monitoring is running successfully for the endpoint created in the previous challenge
  2. By default, Model Monitoring keeps request/response data in a BigQuery dataset. Find and show that data
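
For the second criterion, a small helper like the one below can build a query to inspect the logged requests. The dataset and table naming (`model_deployment_monitoring_<ENDPOINT_ID>.serving_predict`) is an assumption based on the documented default; confirm the actual names in the BigQuery console before relying on it. The project and endpoint IDs shown are placeholders.

```python
def serving_log_table(project_id, endpoint_id):
    """Assumed default location of the request/response log table."""
    return f"{project_id}.model_deployment_monitoring_{endpoint_id}.serving_predict"


def sample_query(project_id, endpoint_id, limit=10):
    """Build a query that shows a few logged prediction records."""
    table = serving_log_table(project_id, endpoint_id)
    return f"SELECT * FROM `{table}` LIMIT {int(limit)}"


# Placeholder IDs; substitute your own project and numeric endpoint ID.
print(sample_query("my-project", "1234567890"))
```

The resulting statement can be pasted into the BigQuery console or run with any BigQuery client.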

Tips

  • You can use the sample.csv file from challenge 1 as the baseline data
  • You can use the same tool you used for the previous challenge to generate the requests. Make sure to include some data that has a different distribution than the training data
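
One way to produce requests with a deliberately different distribution is to perturb a numeric column in a share of the baseline rows before sending them. The sketch below is only one possible approach; the column name `trip_distance`, the scaling factor, and the function name are hypothetical and should be adapted to your own features.

```python
import random


def make_skewed_instances(rows, column, factor=3.0, fraction=0.5, seed=42):
    """Return copies of `rows` where roughly `fraction` of them have `column` scaled.

    `rows` is a list of dicts, for example as produced by csv.DictReader.
    """
    rng = random.Random(seed)  # seeded so the generated load is reproducible
    instances = []
    for row in rows:
        instance = dict(row)  # copy so the baseline rows stay untouched
        if rng.random() < fraction:
            instance[column] = float(instance[column]) * factor
        instances.append(instance)
    return instances


# Hypothetical rows standing in for a few lines of sample.csv.
rows = [{"trip_distance": "2.0"}, {"trip_distance": "5.5"}, {"trip_distance": "1.2"}]
skewed = make_skewed_instances(rows, "trip_distance", factor=3.0, fraction=1.0)
print(skewed)
```

The returned dictionaries can then be fed to whatever request generator you used in the previous challenge.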

Learning Resources

Introduction to Vertex AI Model Monitoring

Batch Monitoring

Description

Vertex AI Batch Prediction jobs provide Model Monitoring capabilities as well. Create a new Batch Prediction job with monitoring turned on, using BigQuery input and output tables and the default values for the alert thresholds.

Success Criteria

  1. There’s a new Batch Prediction job with monitoring turned on
  2. As batch inferencing will again take roughly 25 minutes, it’s sufficient to show that the job is properly configured

Tips

  • You can use the sample.csv file from challenge 1 as the baseline training data
  • You can use the same data you used for the previous challenge to run the batch predictions. Make sure to include some data that has a different distribution than the training data

Learning Resources
