Overview
This 39-minute conference talk covers recent enhancements to PySpark's DataFrame API that make data workflows more expressive and modular. It shows how to use table-valued functions (TVFs) through Python User-Defined Table Functions (UDTFs), including polymorphic UDTFs that allow flexible function definitions. It introduces the new subquery API for simplifying complex analytical logic and explains how lateral joins connect these features. The talk also covers practical developer tools, including built-in plotting for data visualization and profiling tools for performance optimization, and previews upcoming features such as UDF logging and a Python-native data source API. It is aimed at developers building production data pipelines, whether for analytics workflows or AI applications, and at those who want to extend PySpark with custom implementations.
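As a rough illustration of the UDTF and lateral join ideas mentioned above, here is a minimal sketch, not code from the talk. It assumes PySpark 3.5 or later, where `pyspark.sql.functions.udtf` and `spark.udtf.register` are available. The class `SplitWords`, the helper `register_with_spark`, the table `docs`, and its column `body` are hypothetical names chosen for this example.

```python
from typing import Iterator, Tuple


class SplitWords:
    """Handler class for a Python UDTF: one input string in, one row per word out."""

    def eval(self, text: str) -> Iterator[Tuple[str, int]]:
        # Each yielded tuple becomes one output row: (word, n_chars).
        for word in text.split():
            yield (word, len(word))


def register_with_spark(spark):
    """Register the handler as a SQL table function (assumes PySpark 3.5+)."""
    # Imported lazily so the handler class above can be exercised without Spark.
    from pyspark.sql.functions import udtf

    split_words = udtf(SplitWords, returnType="word: string, n_chars: int")
    spark.udtf.register("split_words", split_words)
    # Lateral join: the table function is evaluated once per row of `docs`.
    # `docs` is a hypothetical table or temp view with a string column `body`.
    return spark.sql(
        "SELECT d.body, word, n_chars "
        "FROM docs AS d, LATERAL split_words(d.body)"
    )


# Plain-Python check of the row-generating logic, no Spark session needed.
rows = list(SplitWords().eval("lateral joins in PySpark"))
```

The row-generating logic lives in an ordinary Python class, so it can be unit-tested on its own before being wrapped with `udtf` and called from SQL or the DataFrame API.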
Syllabus
What’s New in PySpark: TVFs, Subqueries, Plots, and Profilers
Taught by
Databricks