
AI Guide to the Galaxy Episode 2 - Running Local LLMs with Docker Model Runner

Docker via YouTube

Overview

Learn to deploy and run large language models locally using Docker Model Runner in this 45-minute episode featuring Principal Engineer Jacob Howard and host Oleg. Discover how to install and configure Docker Model Runner on Docker CE and Docker Desktop, with both GPU and CPU support, and explore the underlying container-based architecture. Master running LLMs in CI environments such as GitHub Actions, and see performance benchmarks on lightweight setups with minimal hardware requirements. Explore model selection strategies, including choosing appropriate sizes and quantizations for your hardware, and learn to deploy Model Runner in production using Kubernetes and Google Cloud Run. Gain practical debugging skills using logs, Docker Desktop's request inspector, and the OpenAI-compatible API. Get insights into upcoming features, including vLLM backend support and multimodal inference, with coverage of essential tools and concepts such as llama.cpp, quantized LLMs, VRAM sizing, and OCI model artifacts for building agentic applications and production-scale AI deployments.
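The OpenAI API compatibility mentioned above means any standard HTTP client can talk to a locally running model. A minimal sketch of building such a request, assuming Model Runner's OpenAI-compatible endpoint is reachable on the host (the port 12434 and the model name `ai/smollm2` are illustrative assumptions, not confirmed by the episode):

```python
import json
from urllib import request

# Assumed local endpoint: Docker Model Runner exposes an
# OpenAI-compatible API when host-side TCP access is enabled;
# port 12434 and the path below are illustrative.
ENDPOINT = "http://localhost:12434/engines/v1/chat/completions"

def build_chat_request(model, prompt):
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("ai/smollm2", "Say hello in one word.")
body = json.dumps(payload).encode("utf-8")

# Sending is commented out so the sketch runs without a live server:
# req = request.Request(ENDPOINT, data=body,
#                       headers={"Content-Type": "application/json"})
# print(request.urlopen(req).read().decode())
print(payload["model"])
```

Because the request shape follows the OpenAI chat completions format, the same payload works against cloud APIs or a local model by changing only the endpoint.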

Syllabus

AI Guide to the Galaxy Episode 2: Running Local LLMs with Docker Model Runner

Taught by

Docker

