High-Throughput ML - Mastering Efficient Model Serving at Enterprise Scale

Databricks via YouTube

Overview

Discover how to architect and implement high-performance machine learning model serving systems capable of handling thousands of predictions per second in this 27-minute conference talk from Databricks. Learn the essential techniques for building inference pipelines that scale to massive request volumes while still meeting low-latency requirements. Explore how to use Databricks Feature Store for consistent, low-latency feature lookups, and how to implement auto-scaling strategies that balance performance against cost. Master the QPS × model execution time formula for estimating the compute capacity a serving endpoint needs, and understand how to configure Feature Store for high-throughput operations. Gain insights into managing cold starts and into scaling strategies designed for latency-sensitive applications, and learn to set up monitoring that provides visibility into inference performance. Apply these practical strategies to enterprise-grade ML serving systems, whether you're deploying recommender systems, real-time fraud detection models, or other high-volume prediction services.
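
The QPS × model execution time formula mentioned in the overview can be illustrated with a minimal sketch. It assumes that execution time is measured in seconds, so that the product estimates how many requests are in flight at once (Little's law). The function name, the optional headroom parameter, and the sample numbers are hypothetical and are not taken from the talk.

```python
import math


def required_concurrency(qps: float, exec_time_s: float, headroom: float = 0.0) -> int:
    """Estimate concurrent request slots as QPS x model execution time.

    qps         -- expected requests per second
    exec_time_s -- model execution time per request, in seconds (assumed unit)
    headroom    -- hypothetical safety margin, e.g. 0.2 for 20% extra capacity
    """
    if qps < 0 or exec_time_s < 0 or headroom < 0:
        raise ValueError("qps, exec_time_s and headroom must be non-negative")
    in_flight = qps * exec_time_s * (1.0 + headroom)
    # Round away floating-point noise before taking the ceiling,
    # so that a value like 100.00000000000001 does not become 101.
    return math.ceil(round(in_flight, 6))


if __name__ == "__main__":
    # Illustrative numbers only: 2,000 requests/s at 50 ms per prediction.
    print(required_concurrency(2000, 0.05))
    # Same load with a 20% safety margin.
    print(required_concurrency(2000, 0.05, headroom=0.2))
```

The result is a sizing estimate rather than a guarantee: real execution times vary between requests, so measuring latency under load and adding headroom for bursts and cold starts is usually advisable.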

Syllabus

High-Throughput ML: Mastering Efficient Model Serving at Enterprise Scale

Taught by

Databricks
