Characterization of Large Language Model Development in Datacenters

USENIX via YouTube

Overview

This 17-minute conference talk from NSDI '24 presents an in-depth characterization study of Large Language Model (LLM) development in datacenters. It covers the challenges and opportunities of using large-scale cluster resources efficiently for LLM development, including hardware failures, parallelization strategies, and resource utilization. It also examines how LLM workloads differ from traditional task-specific Deep Learning workloads and identifies potential optimizations for LLM-tailored systems. Two approaches are introduced: fault-tolerant pretraining, which improves fault tolerance, and decoupled scheduling for evaluation, which delivers timely performance feedback.

Syllabus

NSDI '24 - Characterization of Large Language Model Development in the Datacenter

Taught by

USENIX
