Overview
Explore how to achieve blazing-fast open-source AI inference on AWS through this 17-minute conference talk from AWS re:Invent 2025. Discover the stack optimizations, including custom CUDA kernels, speculative decoding, and disaggregated serving architectures, that enable companies like Notion, GitLab, and DoorDash to run high-performance AI inference while maintaining full model ownership. Learn practical AWS deployment patterns ranging from fully managed multi-region setups to secure in-VPC deployments using SageMaker, EKS, and ECS. Gain insights into building production-ready AI agents by integrating Fireworks with AWS AgentCore, and understand how to scale AI inference without compromising on performance or control over your models.
Syllabus
AWS re:Invent 2025 - Own Your AI – Blazing Fast OSS AI on AWS (STP104)
Taught by
AWS Events