Overview
Explore Data-Oriented Programming (DOP) and Java DataFrames as alternatives to traditional Java data processing approaches in this 49-minute InfoQ conference talk. Learn why DataFrames can be the missing tool in your data-oriented toolkit through practical demonstrations and performance comparisons. Discover the costs associated with Records and Streams in Java and the reasons to integrate DataFrames into the Java ecosystem. Examine real-world performance through the "One Billion Row Challenge" benchmark, comparing DataFrame-EC, Tablesaw, and Kotlin DataFrame implementations against Python/Pandas solutions. Analyze side-by-side code comparisons between different DataFrame libraries and evaluate the advantages and limitations of Kotlin DataFrames. Dive into the technical implementation details, including the primitive collections and object pooling strategies that enable high-performance data processing. Cover advanced DataFrame operations, including joins, pivots, and external domain-specific languages (DSLs) for complex data manipulation tasks. Gain insight into the architectural decisions behind choosing DataFrames over traditional database solutions and understand when each approach works best. Aim for Python-like expressiveness while maintaining Java-level performance in your data processing applications.
Syllabus
— Meet Vladimir Zakharov: Java Expert & JSR 335 Group
— What Is Data-Oriented Programming (DOP)?
— The Cost of Records and Streams in Java
— Why Use DataFrames in a Java Ecosystem?
— The One Billion Row Challenge: Results & Analysis
— Code Comparison: Pandas vs. DataFrame-EC vs. Tablesaw
— Kotlin DataFrames: Pros and Cons
— Under the Hood: Primitive Collections & Object Pooling
— Advanced Use Cases: Joins, Pivots, and External DSLs
— Final Takeaways: When to Choose DataFrames over Databases
Taught by
InfoQ