Advance your data skills by mastering Apache Spark. Using PySpark, the Python API for Spark, you will run parallel computations on large datasets and prepare for high-performance machine learning. From cleaning data to creating features and implementing machine learning models, you'll execute end-to-end workflows with Spark. The track ends with building a recommendation engine using the popular MovieLens dataset and the Million Song Dataset.