Scaling Large Language Models - Getting Started with Large-Scale Parallel Training of LLMs
MLOps World: Machine Learning in Production via YouTube
Overview
Learn to implement large-scale parallel training strategies for billion-parameter language models through hands-on coding exercises and practical demonstrations. Master the fundamental parallelization dimensions of data parallelism, tensor parallelism, and pipeline parallelism, and discover how to compose these techniques for optimal training performance. Explore advanced LLM-specific methods such as context parallelism, and understand the strategic principles behind sharding data and parameters across distributed hardware. Gain practical experience with the collective communication operations that synchronize gradients and activations, and develop skills to optimize communication patterns and memory usage for maximum training throughput. Build each parallelization strategy from first principles through live coding sessions, analyze the trade-offs between approaches, and acquire the expertise needed to train large language models when a single GPU, or even a few, is not enough for the task at hand.
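To give a flavor of the first of these dimensions, here is a minimal, framework-free sketch of data parallelism: each simulated worker computes a gradient on its own shard of the batch, and an all-reduce averages the local gradients so every worker applies the same update. The model (a scalar linear fit), the shard layout, and all function names are illustrative assumptions, not material from the course.

```python
# Hypothetical sketch of data parallelism with a gradient all-reduce.
# Model: y_hat = w * x with MSE loss; 4 simulated workers, equal-size shards.

def local_grad(w, xs, ys):
    # Mean gradient of (w*x - y)^2 over this worker's shard.
    return sum(2 * x * (w * x - y) for x, y in zip(xs, ys)) / len(xs)

def all_reduce_mean(values):
    # Collective communication stand-in: every worker receives the
    # average of all workers' local gradients.
    return sum(values) / len(values)

w = 0.5
data = [(x, 3.0 * x) for x in range(1, 9)]   # targets generated with w* = 3
shards = [data[i::4] for i in range(4)]      # round-robin split across 4 workers

local_grads = [local_grad(w, *zip(*shard)) for shard in shards]
synced_grad = all_reduce_mean(local_grads)

# With equal shard sizes, the averaged gradient equals the full-batch gradient,
# which is why data-parallel SGD matches single-device SGD step for step.
full_grad = local_grad(w, *zip(*data))
```

In a real setup the `all_reduce_mean` call would be a library collective (for example, `torch.distributed.all_reduce` over NCCL) running concurrently across GPUs, but the averaging semantics are the same.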
Syllabus
Scaling Large Language Models: Getting Started with Large-Scale Parallel Training of LLMs
Taught by
MLOps World: Machine Learning in Production