Scaling Data Engineering Pipelines - Preparing Credit Card Transactions Data for Machine Learning

Databricks via YouTube

Overview

This 34-minute conference talk from Databricks covers how to build scalable data engineering pipelines that prepare credit card transaction data for machine learning. It walks through two real-world use cases that show big data engineering techniques for building stable pipelines and managing petabyte-scale storage. The first shows how adopting Delta Lake optimized a data pipeline, with a reported 80% reduction in query execution time and a 70% reduction in storage space. The second applies the Databricks Workflows 'ForEach' operator to run compute-intensive pipelines across multiple clusters, cutting processing time from months to days. The talk also presents a reusable design pattern that isolates notebooks into discrete units of work, so data scientists can develop and test their solutions independently without destabilizing the pipeline.

The speakers are Brandon DeShon, Director of Data Science, and Luke Garzia, Lead Data Engineer, both of Mastercard, who share practical strategies for scaling data engineering in an enterprise setting focused on financial transaction processing and machine learning readiness.

Syllabus

Scaling Data Engineering Pipelines: Preparing Credit Card Transactions Data for Machine Learning

Taught by

Databricks
