Production Machine Learning Systems


Introduction to Advanced Machine Learning on Google Cloud

[Video] - Advanced Machine Learning on Google Cloud - Dec 13, 2022

[Video] - Welcome - Dec 13, 2022

Architecting Production ML Systems

[Video] - Architecting ML systems - Dec 13, 2022

[Video] - Data extraction, analysis, and preparation - Dec 13, 2022

[Video] - Model training, evaluation, and validation - Dec 13, 2022

[Video] - Trained model, prediction service, and performance monitoring - Dec 13, 2022

[Video] - Training design decisions - Dec 13, 2022

[Video] - Serving design decisions - Dec 13, 2022

[Video] - Designing from scratch - Dec 13, 2022

[Video] - Using Vertex AI - Dec 13, 2022

[Video] - Lab introduction: Structured data prediction - Dec 13, 2022

[Lab] - Structured data prediction using Vertex AI Platform

  • train_deploy - lab

Designing Adaptable ML Systems

[Video] - Introduction - Dec 13, 2022

[Video] - Adapting to data - Dec 13, 2022

[Video] - Changing distributions - Dec 13, 2022

[Video] - Lab: Adapting to data - Dec 13, 2022

[Video] - Right and wrong decisions - Dec 13, 2022

[Video] - System failure - Dec 13, 2022

[Video] - Concept drift - Dec 13, 2022

[Video] - Actions to mitigate concept drift - Dec 13, 2022

[Video] - TensorFlow data validation - Dec 13, 2022

[Video] - Components of TensorFlow data validation - Dec 13, 2022

[Video] - Lab Introduction: Introduction to TensorFlow Data Validation - Dec 13, 2022

[Lab] - Introduction to TensorFlow Data Validation

  • tfdv_basic_spending - lab, sol

[Video] - Lab Introduction: Advanced Visualizations with TensorFlow Data Validation - Dec 13, 2022

[Lab] - Advanced Visualizations with TensorFlow Data Validation

  • tfdv_advanced_taxi - lab, sol

[Video] - Mitigating training-serving skew through design - Dec 13, 2022

[Video] - Lab Introduction: Serving ML Predictions in Batch and Real Time - Dec 13, 2022

[Lab] - Serving ML Predictions in Batch and Real Time

  • serving_ml_prediction - lab, sol

[Video] - Lab Debrief: Serving ML Predictions in Batch and Real Time - Dec 13, 2022

[Video] - Diagnosing a production model - Dec 13, 2022

Designing High-performance ML Systems

[Video] - Introduction - Dec 13, 2022

[Video] - Training - Dec 13, 2022

[Video] - Predictions - Dec 13, 2022

[Video] - Why distributed training is needed - Dec 13, 2022

[Video] - Distributed training architectures - Dec 13, 2022

[Video] - TensorFlow distributed training strategies - Dec 13, 2022

[Video] - Mirrored strategy - Dec 13, 2022

[Video] - Multi-worker mirrored strategy - Dec 13, 2022

[Video] - TPU strategy - Dec 13, 2022

[Video] - Parameter server strategy - Dec 13, 2022

[Video] - Lab Introduction: Distributed Training with Keras - Dec 13, 2022

[Lab] - Distributed Training with Keras

[Video] - Lab Introduction: Distributed Training using GPUs on Cloud AI Platform - Dec 13, 2022

[Lab] - Distributed Training using GPUs on Cloud AI Platform

  • distributed_training - lab, sol

[Video] - Training on large datasets with API - Dec 13, 2022

[Video] - Lab Introduction: TPU-speed Data Pipelines - Dec 13, 2022

[Lab] - TPU Speed Data Pipelines

  • tpu_speed_data_pipelines - lab, sol

[Video] - Inference - Dec 13, 2022

Building Hybrid ML Systems

[Video] - Introduction - Dec 13, 2022

[Video] - Machine Learning on Hybrid Cloud - Dec 13, 2022

[Video] - Kubeflow - Dec 13, 2022

[Video] - Lab Introduction: Kubeflow Pipelines with AI Platform - Dec 13, 2022

[Lab] - Running Pipelines on Vertex AI 2.5

[Video] - TensorFlow Lite - Dec 13, 2022

[Video] - Optimizing TensorFlow for mobile - Dec 13, 2022

[Video] - Summary - Dec 13, 2022


[Video] - Course summary - Dec 13, 2022

[Doc] - Production Machine Learning Systems - readings

[Doc] - All quiz questions and answers

Course Resources

[Doc] - Architecting Production ML Systems Course Resources