YouTube videos curated by Class Central.
Classroom Contents
Taming Data Skew in Production ML Pipelines
- 1 Intro: Handling Data Skew in Production ML Pipelines (Roku)
- 2 Roku Scale & the Mystery of Suddenly Slower Spark Jobs
- 3 How Skew Shows Up in Spark: Stragglers, Shuffle Spills, Idle Executors
- 4 What Data Skew Really Is and Why Parallelism Breaks
- 5 Real-World Example: Power Users, Hot Keys, and Power-Law Data
- 6 Why It Matters: Technical Bottlenecks + Business Cost Blowups
- 7 Where Skew Hits ML Pipelines: Recs, Classification, Computer Vision
- 8 Root Causes of Skew #1: Natural Imbalance from Real-World Events
- 9 Root Causes of Skew #2: Join-Key & Aggregation Skew in Feature Engineering
- 10 Root Causes of Skew #3: Computational Skew (NLP, Embeddings, Heavy Transforms)
- 11 Mitigation Step 1: Repartitioning—When It Works and Its Limits
- 12 Mitigation Step 2: Key Salting to Split Hot Keys (Big Runtime Wins)
- 13 Mitigation Step 3: Broadcast Joins to Avoid Massive Shuffles
- 14 Wrap-Up: Choosing the Right Fix + AI to Predict Skew Before It Happens
- 15 Closing & How to Connect