Fundamentals of Apache Spark and PySpark

via Zero To Mastery Path

Overview

Get hands-on with Apache Spark and PySpark by learning to build scalable, high-performance data pipelines with the DataFrame API, Spark jobs, joins, aggregations, and more.
  • Learn the skills and real-world tools that Data Engineers use, with the aim of reaching the top 10% of your field
  • Set up Apache Spark and configure your local or cloud environment for big data processing
  • Write efficient PySpark code to handle, transform, and analyze large-scale datasets
  • Use DataFrames to manipulate data in a distributed computing environment
  • Build scalable data pipelines that integrate multiple transformation and aggregation steps
  • Create a strong foundation for a career in Data Engineering, Data Science, and AI/ML

Syllabus

  • Introduction
    • Introduction
    • Exercise: Meet Your Classmates and Instructor
    • Course Resources
  • Setup and Useful Resources
    • [Optional] UNIX CLI Commands
    • [Optional] Using Windows
    • Installing Software for the Course
    • [Optional] What Is a Virtualenv?
  • Big Data Processing with Apache Spark
    • Apache Spark
    • How Spark Works
    • Spark Application
    • DataFrames
    • Installing Spark
    • Installing Spark on Linux
    • Inside Airbnb Data
    • Writing Your First Spark Job
    • Lazy Processing
    • [Exercise] Basic Functions
    • [Exercise] Basic Functions - Solution
    • Aggregating Data
    • Joining Data
    • Aggregations and Joins with Spark
    • Complex Data Types
    • [Exercise] Aggregate Functions
    • [Exercise] Aggregate Functions - Solution
    • User Defined Functions
    • Data Shuffle
    • Data Accumulators
    • Optimizing Spark Jobs
    • Submitting Spark Jobs
    • Other Spark APIs
    • Spark SQL
    • [Exercise] Advanced Spark
    • [Exercise] Advanced Spark - Solution
    • Summary
  • Where To Go From Here?
    • Let's Keep Learning Together!
    • Review This Byte!

Taught by

Ivan Mushketyk
