
Azure Databricks Using Python With PySpark

Bryan Cafferky via YouTube

Overview

Explore Python on Spark with PySpark in Azure Databricks in this 52-minute tutorial. It covers basic concepts and includes extensive demonstrations in a Databricks notebook. Topics include scale-out, the DataFrame API, RDDs vs. DataFrames, the PySpark API, and scaling out ML. The demonstrations walk through notebook setup, importing data, Python vs. SQL comparisons, creating persistent tables, renaming columns, using Pandas, exploring data, persistence, and visualizations, followed by case statements, Spark SQL, Matplotlib, and user-defined functions. The tutorial concludes with building and writing an ML model. The accompanying notebook is available on GitHub.

Syllabus

Introduction
Background
Scale-out
DataFrame API
RDD vs DataFrame
PySpark API
PySpark
Scaling out ML
Notebook setup
Importing data
Python vs SQL
Creating persistent tables
Renaming columns
Pandas
Display
DropN
Exploring the Data
Persistence
Visualizations
More Data
Case Statements
Spark SQL
Matplotlib
Adding a new column
Adding a new dataframe
User-defined functions
Local Python dataframe
ML Live
Building the Model
Writing the Model

Taught by

Bryan Cafferky
