Multi-GPU Training with Unsloth

Trelis Research via YouTube

Classroom Contents

  1. 0:00 Faster training with multiple GPUs
  2. 0:39 Video overview
  3. 1:24 Data Parallel vs. Pipeline Parallel vs. Fully Sharded Data Parallel
  4. 6:38 Downloading a Jupyter notebook as a Python script for multi-GPU, e.g. an Unsloth notebook
  5. 7:44 Unsloth vs. Transformers for multi-GPU
  6. 8:13 Modifying a fine-tuning script for distributed data parallel
  7. 9:03 Starting up a GPU in one click for fine-tuning
  8. 10:27 Converting a Jupyter notebook to a Python script (see the conversion note after this list)
  9. 11:30 Installation notes for Unsloth, TensorBoard, and uv
  10. 13:32 Script modifications required for DDP (see the sketch after this list)
  11. 18:50 Training script run-through, for LoRA
  12. 22:46 Setting gradient accumulation steps (see the sketch after this list)
  13. 24:07 Dataset loading
  14. 26:22 Setting up the run name and training parameters
  15. 29:30 Running without multi-GPU (single-GPU check)
  16. 35:47 Running with multiple GPUs using accelerate config (note: torchrun can result in run hangs; see the launch notes after this list)
  17. 41:02 Sanity check of running with accelerate and a single GPU
  18. 44:48 Open issues at time of recording: loss reporting, and using Unsloth with a batch size larger than one
  19. 53:11 Conclusion and shout-outs to spr1nter and rakshith
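
A note on chapters 4 and 8: besides downloading the script directly from the notebook interface, one common way to convert a notebook is jupyter nbconvert --to script notebook.ipynb, which writes notebook.py next to the notebook (the filename here is a placeholder, not the one used in the video).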
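
For the DDP script modifications in chapters 6 and 10, the video's exact edits aren't reproduced here, but a minimal sketch of the usual device-placement change in a transformers-style fine-tuning script looks like the following (the model name and all values are illustrative assumptions, not taken from the video):

    import os

    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Under DDP, accelerate (or torchrun) starts one process per GPU and
    # exports LOCAL_RANK; pin each process's model to its own GPU instead
    # of letting every rank default to cuda:0.
    local_rank = int(os.environ.get("LOCAL_RANK", "0"))

    model_name = "unsloth/Llama-3.2-1B-Instruct"  # illustrative model choice
    model = AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map={"": local_rank},  # whole model on this rank's GPU
    )
    tokenizer = AutoTokenizer.from_pretrained(model_name)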

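For chapters 12 and 14, gradient accumulation, the run name, and the other training parameters are typically set in one place; here is a minimal sketch using the transformers TrainingArguments API, with every value an illustrative assumption rather than the one chosen in the video:

    from transformers import TrainingArguments

    # Effective batch size = per_device_train_batch_size
    #                        * gradient_accumulation_steps
    #                        * number of DDP processes (GPUs),
    # so accumulation steps are often reduced when adding GPUs
    # to keep the effective batch size unchanged.
    training_args = TrainingArguments(
        output_dir="outputs",              # illustrative
        run_name="llama-lora-2xgpu",       # illustrative run name
        per_device_train_batch_size=2,
        gradient_accumulation_steps=4,     # effective batch: 2 * 4 * num_gpus
        learning_rate=2e-4,
        logging_steps=1,
        report_to="tensorboard",           # matches the TensorBoard install note
    )
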
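For chapters 16 and 17, the video launches with accelerate rather than torchrun (which, per the chapter title, can result in run hangs). The usual flow is to answer the hardware prompts from accelerate config once, then start training with something like accelerate launch --num_processes 2 train.py, and to rerun with --num_processes 1 as the single-GPU sanity check (the script name and process counts are illustrative).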