Diving Deep into NVIDIA Nsight Systems GPU Profiling Tools for PyTorch LLM and Computer Vision Workloads
Generative AI on AWS via YouTube
Overview
This webinar on AI systems performance optimization features two presentations. It opens with an introduction and updates from Chris Fregly, author of the bestselling O'Reilly book "AI Systems Performance Engineering," followed by the main presentation from Chaim Rand on NVIDIA Nsight Systems GPU profiling tools.

The main presentation shows how to use NVIDIA Nsight profiling tools alongside PyTorch Profiler to optimize Large Language Model (LLM) and computer vision workloads. It covers practical strategies for optimizing data transfer in AI/ML workloads through a detailed analysis of batched inference scenarios, and explains how to integrate these profiling tools into existing PyTorch workflows to identify performance bottlenecks and improve GPU utilization. Real-world examples and case studies demonstrate how the techniques apply to production AI systems.

Supplementary resources include GitHub repositories with practical examples, related blog posts on data transfer optimization, and links to the broader AI performance engineering concepts covered in the accompanying O'Reilly publication.
Syllabus
Mastering NVIDIA Nsight GPU Profiling
Taught by
Generative AI on AWS