Overview
Learn how to speed up regular Python User-Defined Functions (UDFs) in Apache Spark with Apache Arrow, without modifying existing code, in this 15-minute conference talk. The talk introduces a feature that brings Arrow optimization to regular Python UDFs, so you can get significant performance gains by enabling a configuration setting or toggling a UDF-level parameter. It traces the evolution from traditional Python UDFs to Arrow-optimized APIs such as Pandas UDFs and the Pandas Functions API, and explains why many users still prefer regular Python UDFs for their simplicity. You will also gain practical guidance on using Arrow-optimized Python UDFs, including their strengths, limitations, and best practices. The enhancement keeps the familiar interface of regular Python UDFs while delivering the performance benefits usually associated with more complex Arrow-based APIs, making high-performance Python workloads accessible regardless of your Python expertise.
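The two switches mentioned above can be sketched roughly as follows. This is a minimal illustration, not material from the talk: it assumes PySpark 3.5 or later with `pyarrow` installed, and it uses the configuration key `spark.sql.execution.pythonUDF.arrow.enabled` and the `useArrow` parameter of `udf` as documented for that release (the summary itself does not name them). The function `name_length`, the helper `run_demo`, and the sample data are hypothetical.

```python
def name_length(name):
    """Plain Python logic; nothing in the function body is Arrow-specific."""
    return len(name) if name is not None else None


def run_demo():
    # Imports are local so name_length stays usable without Spark installed.
    # Assumption: PySpark >= 3.5 and pyarrow are available.
    from pyspark.sql import SparkSession
    from pyspark.sql.functions import udf

    spark = SparkSession.builder.appName("arrow-udf-sketch").getOrCreate()

    # Option 1: session-wide setting that applies Arrow optimization
    # to regular Python UDFs.
    spark.conf.set("spark.sql.execution.pythonUDF.arrow.enabled", "true")

    # Option 2: opt in for a single UDF; the Python function is unchanged.
    arrow_name_length = udf(name_length, returnType="int", useArrow=True)

    # Hypothetical sample data, including a null to exercise the None branch.
    df = spark.createDataFrame([("alice",), ("bob",), (None,)], "name string")
    df.select("name", arrow_name_length("name").alias("length")).show()

    spark.stop()
```

Calling `run_demo()` on a machine with Spark available runs the UDF with Arrow enabled. Either option alone should be enough; both are shown only to contrast the session-wide and per-UDF forms. The Python function itself is identical in both cases, which is the "no code change" point the talk emphasizes.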
Syllabus
No-Code Change in Your Python UDF for Arrow Optimization
Taught by
Databricks