AI Adoption - Drive Business Value and Organizational Impact
Overview
This conference talk by Praveen Kumar, Ramakrishna Yekulla, and Shardul Inamdar addresses the challenges of deploying large language models (LLMs) in production environments and offers a beginner-friendly approach to self-hosting on Kubernetes. Learn how organizations can achieve enhanced data privacy, greater flexibility in model training, and potential cost savings through self-hosted LLM solutions, and discover how the Podman Desktop AI Lab extension streamlines LLM workload development, deployment, and management on Kubernetes.

The 45-minute presentation covers essential topics including:
- Strategic selection and containerization of open-source LLM models
- Creation of Kubernetes deployment manifests for LLM workloads
- Resource provisioning to meet computational demands
- A detailed exploration of the Podman Desktop AI Lab extension's integration with Kubernetes

Gain insights into how this technology enables organizations to maintain control over their data while building trust in AI technologies, which is particularly important for enterprises prioritizing data governance and compliance.
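To give a sense of what a Kubernetes deployment manifest for an LLM workload looks like, here is a minimal sketch. The image name, port, and resource figures are illustrative placeholders, not values from the talk; actual settings depend on the chosen model and serving runtime.

```yaml
# Illustrative Deployment for a containerized open-source LLM server.
# All names and numbers below are placeholder assumptions.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: llm-server
spec:
  replicas: 1
  selector:
    matchLabels:
      app: llm-server
  template:
    metadata:
      labels:
        app: llm-server
    spec:
      containers:
        - name: llm
          image: quay.io/example/llm-server:latest  # placeholder image
          ports:
            - containerPort: 8080
          resources:
            requests:
              memory: "8Gi"   # LLM inference is memory-hungry
              cpu: "2"
            limits:
              memory: "16Gi"
```

Resource requests and limits are where the talk's point about provisioning for computational demands shows up concretely: undersized memory limits are a common cause of evicted or OOM-killed LLM pods.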
Syllabus
Demystifying Self Hosted LLMs: A Beginner's Guide to Self Hosting on Kubernetes with Podman Desktop
Taught by
DevConf