In this short, hands-on course, you’ll learn how to build fast, efficient AI training and inference pipelines by optimizing both data loading and computational graphs. You’ll start by creating parallel, high-throughput data pipelines that keep GPUs consistently busy and reduce training bottlenecks. Then you’ll analyze a model’s computational graph to identify and remove redundant operations that slow execution. Through focused lesson videos, practical labs, and guided coach activities, you’ll re-export a streamlined model and validate real latency improvements. By the end, you’ll be able to diagnose performance issues, streamline pipelines, and apply optimization techniques that make AI systems faster, more reliable, and more cost-efficient.