YouTube videos curated by Class Central.
Classroom Contents
Serve a Custom LLM for Over 100 Customers - GPU Selection, Quantization, and API Setup
- 1 Serving a model for 100 customers
- 2 Video Overview
- 3 Choosing a server
- 4 Choosing software to serve an API
- 5 One-click templates
- 6 Tips on GPU selection
- 7 Using quantisation to fit in a cheaper GPU
- 8 Vast.ai setup
- 9 Serve Mistral with vLLM and AWQ, incl. concurrent requests
- 10 Serving a function calling model
- 11 API speed tests, including concurrent
- 12 Video Recap