What's New in PySpark - TVFs, Subqueries, Plots, and Profilers

Databricks via YouTube

Overview

This 39-minute conference talk covers recent enhancements to PySpark's DataFrame API aimed at more expressive and modular data workflows. It introduces table-valued functions (TVFs) written as Python User-Defined Table Functions (UDTFs), including polymorphism support for more flexible function definitions. It presents the new subquery API, which simplifies complex analytical logic, and shows how lateral joins tie TVFs and subqueries together. The talk also covers developer tooling: built-in plotting for data visualization and profilers for performance optimization, plus a preview of upcoming features such as UDF logging and a Python-native data source API. It spans foundational concepts and advanced techniques, and is aimed at developers who build production data pipelines, whether for analytics or AI applications, or who want to extend PySpark with custom implementations.
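As a rough illustration of how a table function and a lateral join fit together, here is a minimal plain-Python sketch. It does not import PySpark and is not taken from the talk; it only mimics the general shape of a Python UDTF (a class whose `eval` method yields output rows as tuples). The names `SplitWords` and `lateral_join` are hypothetical, chosen for this example. In PySpark itself, such a class would typically be registered with the `udtf` decorator from `pyspark.sql.functions` along with a declared return schema.

```python
from typing import Iterable, Iterator, Tuple


class SplitWords:
    """Hypothetical table function: one input value in, zero or more rows out."""

    def eval(self, text: str) -> Iterator[Tuple[str, int]]:
        # Emit one (word, length) row per whitespace-separated word.
        for word in text.split():
            yield (word, len(word))


def lateral_join(rows: Iterable[tuple], table_fn: SplitWords) -> Iterator[tuple]:
    """Emulate a lateral join in plain Python (an assumption, not Spark's engine).

    The table function runs once per left-hand row, using that row's last
    column as its argument, and each produced row is appended to the left row.
    """
    for row in rows:
        for out in table_fn.eval(row[-1]):
            yield row + out


docs = [(1, "hello spark"), (2, "udtf")]
result = list(lateral_join(docs, SplitWords()))
for r in result:
    print(r)
```

The point of the sketch is the per-row dependency: the right-hand side of a lateral join is evaluated with values from the current left-hand row, which is what lets a table function consume columns of the table it is joined to.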

Syllabus

What’s New in PySpark: TVFs, Subqueries, Plots, and Profilers

Taught by

Databricks
