How to Serve Big LLM over Decentralized GPUs - Parallax and Dynamic Programming

Yacine Mahdid via YouTube

Classroom Contents

  1. Introduction
  2. Parallax Overview
  3. Phase 1: Allocating Model Layers
  4. Phase 1: Water Filling Method
  5. Phase 2: Pipeline Chain Selection
  6. "Phase 3": Dynamic Rebalancing
  7. Result Overview
  8. Latency and Throughput Comparison
  9. Scaling Study
  10. Conclusion
