Overview
This 16-minute conference talk from Ray Summit 2025 covers how Coinbase builds production-grade LLM services for a security-sensitive financial environment, where trust, security, and reliability are firm requirements. It walks through the architecture of Coinbase's internal LLM services: Ray for distributed orchestration and scaling, vLLM for high-throughput, low-latency inference, and LiteLLM for routing, abstraction, and multi-provider reliability.

Topics include user authentication and authorization patterns for secure LLM access, service-to-service trust models that keep communication between internal systems safe and auditable, and LiteLLM distribution strategies that balance throughput, reliability, and fallback behavior. The talk also explains how vLLM and Ray work together to serve high-volume internal LLM traffic with consistent performance under load, and it closes with the end-to-end story of delivering these services at a global cryptocurrency exchange.
Syllabus
How Coinbase Uses Ray, vLLM & LiteLLM to Power Secure LLM Services | Ray Summit 2025
Taught by
Anyscale