Overview
Explore the practical challenges of deploying and scaling Generative AI systems in production environments using Scala, FS2, and Server-Sent Events. Learn how to architect real-time GenAI applications that serve thousands of users with low-latency responses while implementing Retrieval-Augmented Generation (RAG) to ground Large Language Models in dynamic business data. Discover techniques for streaming token-by-token outputs, orchestrating document retrieval pipelines on the fly, and managing critical production concerns including memory pressure, backpressure, observability, error handling, and model fallbacks. Gain architectural patterns, tooling recommendations, and battle-tested lessons for building production-ready GenAI services such as chatbots, AI assistants, and document question-answering systems that can scale reliably in real-world environments.
Syllabus
Muayad Sayed Ali: LLMs in the Wild - Streaming, RAG, and Real Time GenAI at Scale [Scala Days 2025]
Taught by
Scala Days Conferences