Unlocking the Potential of Large Language Models in Production - Best Practices and Solutions
CNCF [Cloud Native Computing Foundation] via YouTube
Overview
Explore a conference talk that delves into the challenges and solutions of deploying large language models (LLMs) in production environments. Learn about the paradigm shift from traditional machine learning to GenAI and LLMs, with a focus on the complex LLMOps challenges of deployment, scaling, and operations. Discover best practices for building scalable inference platforms using cloud native technologies such as Kubernetes, Kubeflow, KServe, and Knative. Gain insights into essential aspects of LLM operations, including benchmarking tools, storage solutions for efficient auto-scaling, model optimization for specialized accelerators, A/B testing under limited compute resources, and monitoring strategies. Follow a detailed case study of KServe that demonstrates practical solutions to these production challenges, presented by experts from Red Hat and NVIDIA.
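As context for the KServe-based approach the talk describes, the manifest below is a minimal sketch of a KServe `InferenceService` that serves an LLM on a GPU-equipped Kubernetes cluster with Knative-backed auto-scaling. The model name, storage URI, replica counts, and resource sizes are illustrative assumptions, not values from the talk:

```yaml
# Hypothetical KServe InferenceService for LLM serving.
# Names, storageUri, and resource requests are placeholder assumptions.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: example-llm            # assumed name
spec:
  predictor:
    minReplicas: 1             # scale-to-zero is also possible with minReplicas: 0
    maxReplicas: 4             # upper bound for auto-scaling
    model:
      modelFormat:
        name: huggingface      # serving runtime selected by model format
      storageUri: s3://example-bucket/models/example-llm   # assumed model location
      resources:
        limits:
          nvidia.com/gpu: "1"  # request a GPU accelerator per replica
```

Applying this with `kubectl apply -f` would let KServe provision the serving runtime, pull the model from storage, and scale replicas with request load, which is the pattern of cloud native LLM inference the talk examines.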
Syllabus
Unlocking Potential of Large Models in Production - Yuan Tang, Red Hat & Adam Tetelman, NVIDIA
Taught by
CNCF [Cloud Native Computing Foundation]