Scaling GenAI Inference From Prototype to Production - Real-World Lessons in Speed and Cost
Databricks via YouTube
Overview
Explore real-world strategies for scaling GenAI inference systems from prototype to production in this lightning talk by Anish Kumar, Lead Engineer at Scribd, Inc. Discover how to overcome cost and time constraints when deploying AI systems at scale using Databricks' fully managed infrastructure. Learn to leverage four essential Databricks features—Workflows, Model Serving, Serverless Compute, and Notebooks—to build robust AI inference pipelines capable of processing millions of documents including text and audiobooks. Master the design of modular, parameterized notebooks that enable concurrent execution, effective dependency management, and accelerated AI-driven insights. Understand how to facilitate seamless collaboration between Data Scientists and Engineers through rapid experimentation capabilities, easy GenAI prompt tuning, flexible compute settings, efficient data iteration, and comprehensive quality testing frameworks. Gain actionable strategies for optimizing AI inference performance, automating complex data workflows, and architecting next-generation serverless AI systems while maintaining cost efficiency and maximizing operational performance.
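The modular, parameterized, concurrently executed pipeline stages described above can be sketched in plain Python. This is an illustrative assumption, not code from the talk: the `run_inference` stage and its parameters are hypothetical stand-ins for what, on Databricks, would be a parameterized notebook task (via `dbutils.widgets`) calling a Model Serving endpoint, with concurrency handled by Workflow tasks.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical inference stage: in a real Databricks Workflow this would be
# a parameterized notebook task invoking a Model Serving endpoint.
def run_inference(batch: dict) -> dict:
    doc_type = batch["doc_type"]   # parameter, e.g. "text" or "audiobook"
    doc_ids = batch["doc_ids"]     # documents assigned to this batch
    # Placeholder for the actual model call; here we simply tag each document
    # to show that every batch runs independently with its own parameters.
    return {"doc_type": doc_type,
            "processed": [f"{doc_type}:{i}" for i in doc_ids]}

# Each batch is a parameterized unit of work; running them concurrently
# mirrors independent tasks in a Databricks Workflow.
batches = [
    {"doc_type": "text", "doc_ids": [1, 2, 3]},
    {"doc_type": "audiobook", "doc_ids": [4, 5]},
]

with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(run_inference, batches))
```

Keeping each stage a pure function of its parameters is what makes this style of pipeline easy to parallelize, retry, and test in isolation.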
Syllabus
Scaling GenAI Inference From Prototype to Production: Real-World Lessons in Speed & Cost
Taught by
Databricks