Overview
This webinar for ML and platform engineers shows how to scale large language model fine-tuning beyond single-node memory constraints using distributed GPU clusters with FSDP, DeepSpeed, and Ray. It covers the orchestration and memory-management strategies needed to train frontier-scale models efficiently across distributed systems: fine-tuning LLMs at scale with Ray and PyTorch, saving and resuming checkpoints with Ray Train, and configuring ZeRO optimization stages, mixed precision, and CPU offload for the best balance of memory usage and performance. Attendees gain hands-on experience launching distributed training jobs and build a working understanding of Ray's capabilities for accelerating LLM development. Walk away with practical knowledge, a reusable project foundation, and a clear picture of how Ray and Anyscale integrate to streamline large-scale machine learning workflows.
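The ZeRO options mentioned above (optimization stage, mixed precision, CPU offload) are typically expressed as a DeepSpeed JSON config. A minimal sketch of assembling one in Python follows; the helper name and default values are assumptions for illustration, not taken from the webinar materials, though the config keys themselves are standard DeepSpeed options:

```python
# Sketch: build a DeepSpeed config dict covering the knobs discussed in
# the webinar -- ZeRO stage, mixed precision, and optional CPU offload.
# The function name and defaults here are hypothetical.

def make_zero_config(stage: int = 3,
                     offload_to_cpu: bool = False,
                     bf16: bool = True,
                     micro_batch_size: int = 1,
                     grad_accum_steps: int = 8) -> dict:
    zero = {"stage": stage}
    if offload_to_cpu:
        # ZeRO-Offload: move optimizer state (and, at stage 3, the
        # parameters themselves) to host memory so larger models fit
        # on limited GPU memory, at some throughput cost.
        zero["offload_optimizer"] = {"device": "cpu", "pin_memory": True}
        if stage == 3:
            zero["offload_param"] = {"device": "cpu", "pin_memory": True}
    return {
        "train_micro_batch_size_per_gpu": micro_batch_size,
        "gradient_accumulation_steps": grad_accum_steps,
        "zero_optimization": zero,
        # Enable exactly one mixed-precision mode.
        "bf16": {"enabled": bf16},
        "fp16": {"enabled": not bf16},
    }

# Example: stage-3 ZeRO with full CPU offload, bf16 mixed precision.
cfg = make_zero_config(stage=3, offload_to_cpu=True)
```

In a Ray Train setup this dict would be passed to the DeepSpeed engine inside the per-worker training loop; raising the ZeRO stage and enabling offload trades step throughput for a lower peak GPU memory footprint.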
Syllabus
Webinar: Scaling LLM Fine-Tuning with FSDP, DeepSpeed, and Ray
Taught by
Anyscale