Scaling LLM Fine-Tuning with FSDP, DeepSpeed, and Ray
Overview
Learn to scale large language model fine-tuning beyond single-GPU memory constraints using distributed GPU clusters with FSDP, DeepSpeed, and Ray in this webinar for ML and platform engineers. Master the orchestration and memory management strategies essential for training frontier-scale models efficiently across distributed systems. Discover how to fine-tune LLMs at scale using Ray and PyTorch, implement checkpoint saving and resuming with Ray Train, and configure ZeRO optimization stages, mixed precision, and CPU offload to balance memory usage against performance. Gain hands-on experience launching distributed training jobs and develop a working understanding of Ray's capabilities for accelerating LLM development. Walk away with practical knowledge, a reusable project foundation, and clear insight into how Ray and Anyscale integrate to streamline large-scale machine learning workflows.
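The checkpoint saving and resuming mentioned above maps onto Ray Train's reporting API. Below is a minimal sketch of that pattern, assuming Ray 2.x's `ray.train` interface; the model, metrics, and hyperparameters are placeholders for illustration, not material from the webinar itself.

```python
import os
import tempfile

import torch
import ray.train
from ray.train import Checkpoint, RunConfig, ScalingConfig
from ray.train.torch import TorchTrainer


def train_loop_per_worker(config):
    # Placeholder model; a real job would load an LLM here. prepare_model
    # wraps the model for the configured distributed strategy.
    model = ray.train.torch.prepare_model(torch.nn.Linear(128, 128))
    optimizer = torch.optim.AdamW(model.parameters(), lr=config["lr"])

    # Resume: if the run was restarted, load the latest reported checkpoint.
    start_epoch = 0
    checkpoint = ray.train.get_checkpoint()
    if checkpoint:
        with checkpoint.as_directory() as ckpt_dir:
            state = torch.load(os.path.join(ckpt_dir, "state.pt"))
            model.load_state_dict(state["model"])
            optimizer.load_state_dict(state["optimizer"])
            start_epoch = state["epoch"] + 1

    for epoch in range(start_epoch, config["epochs"]):
        loss = torch.tensor(0.0)  # placeholder for a real forward/backward pass

        # Save: write worker state to a temp dir and report it as a checkpoint.
        with tempfile.TemporaryDirectory() as tmpdir:
            torch.save(
                {
                    "model": model.state_dict(),
                    "optimizer": optimizer.state_dict(),
                    "epoch": epoch,
                },
                os.path.join(tmpdir, "state.pt"),
            )
            ray.train.report(
                {"loss": loss.item(), "epoch": epoch},
                checkpoint=Checkpoint.from_directory(tmpdir),
            )


trainer = TorchTrainer(
    train_loop_per_worker,
    train_loop_config={"lr": 1e-5, "epochs": 3},
    scaling_config=ScalingConfig(num_workers=4, use_gpu=True),
    run_config=RunConfig(name="llm-finetune-demo"),
)
result = trainer.fit()
```

Because `ray.train.report` persists the checkpoint before returning, the temporary directory can be discarded immediately; on restart, `ray.train.get_checkpoint` hands each worker the latest saved state.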
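Likewise, the ZeRO stages, mixed precision, and CPU offload techniques are controlled through a DeepSpeed configuration. The dict below is an illustrative sketch; the stage, dtype, and batch settings are example values chosen for this sketch, not recommendations from the webinar.

```python
# Illustrative DeepSpeed ZeRO config (example values, not webinar guidance).
ds_config = {
    "train_micro_batch_size_per_gpu": 1,
    "gradient_accumulation_steps": 8,
    "bf16": {"enabled": True},  # mixed precision in bfloat16
    "zero_optimization": {
        "stage": 3,  # ZeRO-3: partition params, gradients, and optimizer state
        "offload_optimizer": {"device": "cpu"},  # move optimizer state to CPU RAM
        "offload_param": {"device": "cpu"},      # stage-3 only: offload params too
        "overlap_comm": True,  # overlap communication with the backward pass
    },
}
```

Raising the ZeRO stage or enabling offload trades GPU memory for communication and host-transfer overhead, which is the memory-versus-performance balance described above.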
Syllabus
Webinar: Scaling LLM Fine-Tuning with FSDP, DeepSpeed, and Ray
Taught by
Anyscale