Accelerating Pandas with NVIDIA's cuDF: Basic Statistical Analysis and Data Cleaning - Episode 6

Python Tutorials for Digital Humanities via YouTube

Overview

Learn how to accelerate data analysis with NVIDIA's cuDF in this 15-minute tutorial from the Python Tutorials for Digital Humanities series. See the performance advantages of GPU acceleration on a large dataset of 4.3 million newspaper articles, with CPU and GPU performance compared for tasks like word counts and text length calculations on an NVIDIA RTX 5000 GPU. Follow along with essential data cleaning techniques to improve data quality. The tutorial covers an introduction to cuDF, hardware setup details, dataset loading and preparation, statistical analysis on both CPU and GPU, and methods for identifying and cleaning problematic data. Access the companion notebook on GitHub to practice these techniques yourself.
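
For a rough idea of the kind of analysis described above, here is a minimal sketch in plain pandas. It is not the tutorial's notebook: the tiny sample data and the column name "text" are assumptions made for illustration. With cuDF installed on a machine with a supported NVIDIA GPU, the cudf.pandas accelerator (enabled with `%load_ext cudf.pandas` in a notebook, or `python -m cudf.pandas script.py` on the command line) is intended to run unchanged pandas code like this on the GPU, falling back to the CPU where needed.

```python
import pandas as pd

# Hypothetical stand-in for the newspaper dataset; the "text" column name is an assumption.
df = pd.DataFrame({
    "text": [
        "The city council met on Tuesday.",
        "",
        None,
        "Rain expected across the region this weekend.",
        "   ",
    ]
})

# Data cleaning: drop missing rows, then rows that are empty or whitespace-only.
clean = df.dropna(subset=["text"])
clean = clean[clean["text"].str.strip().str.len() > 0].copy()

# Basic statistics: character length and whitespace-delimited word count per article.
clean["text_length"] = clean["text"].str.len()
clean["word_count"] = clean["text"].str.split().str.len()

# Summary statistics over both derived columns.
print(clean[["text_length", "word_count"]].describe())
```

On a handful of rows the CPU and GPU paths will not differ noticeably; any speedup would be expected only at the scale of millions of articles, as in the video.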

Syllabus

00:00 Introduction to cuDF and Video Overview
00:47 Exciting Hardware Setup for the Series
02:06 Loading and Preparing the Dataset
03:55 Performing Statistical Analysis on CPU
05:20 Accelerating Analysis with GPU
08:54 Identifying and Cleaning Bad Data
14:31 Conclusion and Next Steps

Taught by

Python Tutorials for Digital Humanities
