Overview
Explore approaches to scaling and enhancing Llama capabilities in this 13-minute talk by Lin Qiao, co-founder and CEO of Fireworks. Discover open compound AI architectures, FireAttention (a distributed inference engine), and scalable serving designs that enable Llama developers to accelerate inference, integrate external tools, and achieve domain specialization. Gain insights into addressing generative AI challenges and moving from single models to compound AI systems.
Syllabus
Going from Single Model to Compound AI Systems
Taught by
Meta Developers