

Pretraining on AMD MI300X using ScalarLM

MLOps World: Machine Learning in Production via YouTube

Overview

This conference talk features Greg Diamos, founder of MLCommons, sharing his experience building ScalarLM, a framework that unifies training and inference workloads for large language models on AMD MI300X GPUs. Discover how ScalarLM leverages the MI300X's high memory bandwidth and compute density to achieve strong performance. Learn about memory-management techniques, dynamic kernel fusion, and custom optimizations for the CDNA3 architecture that enable efficient scaling from single-GPU deployments to multi-node clusters. Explore the challenges encountered during development, including HIP programming-model adaptations and workload-specific tuning, and gain insight into quantitative performance comparisons against existing frameworks. The talk is valuable for researchers and engineers working to optimize LLM workloads across diverse hardware platforms. Greg brings extensive expertise as a founder of MLPerf™, the industry-standard benchmark for deep learning performance, and from his work at Baidu's Silicon Valley AI Lab, where he co-invented the framework for the first 1,000-GPU CUDA training cluster.

Syllabus

Pretraining on AMD MI300X using ScalarLM

Taught by

MLOps World: Machine Learning in Production

