Overview
Explore Netflix's development of Mako, a next-generation machine learning training platform designed for large-scale AI workloads, in this 33-minute conference talk from Ray Summit 2025. Learn how Netflix engineers Avin Regmi and Matan Appelbaum evolved the company's legacy training infrastructure to handle increasingly complex models, larger datasets, and rapidly growing GPU requirements. Discover the architectural decisions behind Netflix's custom GPU scheduler, which improves utilization, reduces fragmentation, and ensures efficient execution of large multi-node training jobs. Examine the resource orchestration, distributed execution, and system resilience components that let Netflix scale training across diverse workloads. Understand how Ray's flexible distributed runtime integrates into high-performance training pipelines and supports critical platform components. Gain practical insights into designing modern ML training platforms, optimizing GPU usage at enterprise scale, and using distributed computing frameworks to build robust AI infrastructure.
Syllabus
Inside Netflix’s Mako: The Next-Gen ML Training Platform | Ray Summit 2025
Taught by
Anyscale