Explore a conference talk on Clipper, a low-latency online prediction serving system designed to address the challenges of deploying machine learning models in real-time applications. Learn how Clipper's modular architecture simplifies model deployment across machine learning frameworks and applications. Discover how the system improves prediction latency, throughput, accuracy, and robustness through techniques like caching, batching, and adaptive model selection. Examine Clipper's performance on four machine learning benchmark datasets and see how it compares with TensorFlow Serving. Gain insights into how Clipper uses model composition and online learning to improve prediction accuracy and robustness without modifying the underlying machine learning frameworks.
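
To make the caching and batching ideas mentioned above more concrete, here is a minimal Python sketch of a prediction cache placed in front of a batched model call. It is an illustration under stated assumptions, not Clipper's actual API: the class name `CachedBatchPredictor`, its parameters, and the `predict_batch` callable are all hypothetical. It uses a fixed maximum batch size, and it does not attempt the adaptive model selection the talk also covers.

```python
from collections import OrderedDict
from typing import Callable, Hashable, List, Sequence


class CachedBatchPredictor:
    """Hypothetical wrapper: an LRU prediction cache in front of a batched model call."""

    def __init__(
        self,
        predict_batch: Callable[[Sequence[Hashable]], Sequence],
        cache_size: int = 1024,
        max_batch_size: int = 32,
    ) -> None:
        self._predict_batch = predict_batch  # assumed: maps a list of inputs to a list of outputs
        self._cache: "OrderedDict[Hashable, object]" = OrderedDict()
        self._cache_size = cache_size
        self._max_batch_size = max_batch_size
        self.model_calls = 0  # counts calls to the underlying model, for inspection

    def predict(self, inputs: Sequence[Hashable]) -> List:
        results = {}
        misses = []
        # Deduplicate inputs (keeping first-seen order) and split hits from misses.
        for x in dict.fromkeys(inputs):
            if x in self._cache:
                self._cache.move_to_end(x)  # mark as recently used
                results[x] = self._cache[x]
            else:
                misses.append(x)

        # Send cache misses to the model in batches of at most max_batch_size.
        for start in range(0, len(misses), self._max_batch_size):
            batch = misses[start:start + self._max_batch_size]
            outputs = self._predict_batch(batch)
            self.model_calls += 1
            for x, y in zip(batch, outputs):
                results[x] = y
                self._cache[x] = y
                if len(self._cache) > self._cache_size:
                    self._cache.popitem(last=False)  # evict least recently used

        # Return predictions in the original input order.
        return [results[x] for x in inputs]


if __name__ == "__main__":
    # Stand-in "model" that doubles its inputs.
    predictor = CachedBatchPredictor(lambda xs: [x * 2 for x in xs], max_batch_size=2)
    print(predictor.predict([1, 2, 3, 1]))
    print(predictor.predict([1, 2]), predictor.model_calls)
```

In this sketch, repeated inputs are answered from the cache, and only unseen inputs reach the model, grouped into batches so that per-call overhead is shared across several queries.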