Scaling GenAI Inference From Prototype to Production - Real-World Lessons in Speed and Cost

Databricks via YouTube

Overview

Explore real-world strategies for scaling GenAI inference systems from prototype to production in this lightning talk by Anish Kumar, Lead Engineer at Scribd, Inc. Discover how to overcome cost and time constraints when deploying AI systems at scale using Databricks' fully managed infrastructure. Learn to leverage four essential Databricks features—Workflows, Model Serving, Serverless Compute, and Notebooks—to build robust AI inference pipelines capable of processing millions of documents, including text and audiobooks.

Master the design of modular, parameterized notebooks that enable concurrent execution, effective dependency management, and accelerated AI-driven insights. Understand how to facilitate seamless collaboration between Data Scientists and Engineers through rapid experimentation capabilities, easy GenAI prompt tuning, flexible compute settings, efficient data iteration, and comprehensive quality testing frameworks. Gain actionable strategies for optimizing AI inference performance, automating complex data workflows, and architecting next-generation serverless AI systems while maintaining cost efficiency and maximizing operational performance.
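The talk's actual pipeline code is not reproduced here, but the modular, parameterized-task pattern it describes can be sketched roughly as follows: split a large document corpus into batches and fan the batches out to concurrent, parameterized tasks. In a Databricks Workflows setup each task would typically be a parameterized notebook (invoked via `dbutils.notebook.run`); the `process_batch` function below is a hypothetical local stand-in for such a task, and all names and parameters are illustrative assumptions, not the speaker's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def process_batch(params: dict) -> dict:
    # Hypothetical stand-in for one parameterized notebook task.
    # On Databricks this might instead be:
    #   dbutils.notebook.run("ProcessDocs", 3600, arguments=params)
    docs = params["doc_ids"]
    # ... real work would go here: chunk text, call a Model Serving
    # endpoint, write results out ...
    return {"batch_id": params["batch_id"], "processed": len(docs)}

def run_inference(doc_ids: list, batch_size: int = 1000, workers: int = 8) -> list:
    # Partition the corpus into independent, parameterized batches.
    batches = [
        {"batch_id": n, "doc_ids": doc_ids[i : i + batch_size]}
        for n, i in enumerate(range(0, len(doc_ids), batch_size))
    ]
    # Run the batches concurrently; each task depends only on its
    # own parameters, which is what makes the fan-out safe.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_batch, batches))
```

Because each task receives all of its inputs as parameters, the same pattern lets engineers swap compute settings, prompts, or batch sizes per run without touching the task code itself.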

Syllabus

Scaling GenAI Inference From Prototype to Production: Real-World Lessons in Speed & Cost

Taught by

Databricks

