Accelerating Applications with Parallel Algorithms - CUDA C++ Class Part 1

Nvidia via YouTube

Overview

Learn modern CUDA C++ techniques for accelerating applications with parallel algorithms on NVIDIA GPUs in this 2-hour video tutorial. Hands-on exercises cover the fundamentals of GPU programming: execution spaces, iterators, memory management, and parallel algorithm implementation. The tutorial introduces the Thrust library for high-performance parallel computing, contrasts serial and parallel processing, and works through stencil operations, mdspan for multidimensional arrays, and transform iterators. You will compute statistics such as the median and variance, implement segmented operations for data processing, and manage data across different memory spaces. The exercises, each followed by its solution, cover execution space configuration, median computation, variance calculation, mdspan operations, segmented sum and mean calculations, and efficient memory copying. Accompanying slides and Google Colab notebooks let you run the GPU exercises for free, which makes the tutorial a good fit for C++ developers who want to write clean, efficient, and idiomatic GPU code with NVIDIA's current tools.

Syllabus

00:00:00 Introduction
00:01:34 Introducing Thrust
00:27:36 Exercise Execution Space 1
00:28:57 Solution Execution Space 1
00:29:48 Exercise Execution Space 2
00:30:13 Solution Execution Space 2
00:30:57 Median
00:31:40 Exercise Compute Median
00:31:59 Solution Compute Median
00:33:28 Iterators
01:00:44 Exercise Computing Variance
01:02:22 Solution Computing Variance
01:04:22 Stencil and MdSpan
01:23:23 Exercise Mdspan
01:24:06 Solution Mdspan
01:26:06 Serial vs Parallel
01:37:52 Exercise Segmented Sum
01:39:30 Solution Segmented Sum
01:42:49 Transform Iterator
01:44:49 Exercise Segmented Mean
01:46:01 Solution Segmented Mean
01:47:17 Memory Spaces
01:56:37 Exercise Copy
01:57:34 Solution Copy
01:59:38 Takeaways

Taught by

NVIDIA Developer
