
Unlocking the Potential of Large Language Models in Production - Best Practices and Solutions

CNCF [Cloud Native Computing Foundation] via YouTube

Overview

Explore a conference talk that delves into the challenges and solutions of deploying large language models (LLMs) in production environments. Learn about the paradigm shift from traditional machine learning to GenAI and LLMs, focusing on the complex LLMOps challenges in deployment, scaling, and operations. Discover best practices for building scalable inference platforms using cloud native technologies such as Kubernetes, Kubeflow, KServe, and Knative. Gain insights into essential aspects of LLM operations, including benchmarking tools, storage solutions for efficient auto-scaling, model optimization for specialized accelerators, A/B testing under limited compute resources, and monitoring strategies. Follow a detailed case study of KServe that demonstrates practical solutions to these production challenges, presented by experts from Red Hat and NVIDIA.
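The KServe case study mentioned above centers on serving models declaratively on Kubernetes. As a rough, illustrative sketch (the resource name, model URI, and GPU count below are hypothetical, not taken from the talk), an LLM can be exposed through a KServe `InferenceService` manifest along these lines:

```yaml
# Minimal KServe InferenceService sketch for serving an LLM.
# Names and storageUri are placeholders; adjust to your cluster and model.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: llm-demo                # hypothetical service name
spec:
  predictor:
    model:
      modelFormat:
        name: huggingface       # serving runtime for Hugging Face models
      storageUri: "hf://org/model"  # placeholder model location
      resources:
        limits:
          nvidia.com/gpu: "1"   # request a GPU accelerator
```

Applying a manifest like this with `kubectl apply -f` lets KServe (backed by Knative) handle routing, revisioning, and scale-to-zero, which is the kind of auto-scaling and operations workflow the talk discusses.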

Syllabus

Unlocking Potential of Large Models in Production - Yuan Tang, Red Hat & Adam Tetelman, NVIDIA

Taught by

CNCF [Cloud Native Computing Foundation]

