Processing 1M Identity Graphs per Second with Spark Structured Streaming

StreamNative via YouTube

Overview

Learn how Adobe Experience Platform processes over 1 million identity graphs per second using Spark Structured Streaming and Delta Lake in this 31-minute conference talk. Discover the architecture, data patterns, and techniques that enabled Adobe to scale their ingestion pipeline by 10x over three years while maintaining system stability and regulatory compliance. Explore how micro-batching reduces data de-duplication by 70-80%, understand key metrics for tracking query performance, and see how Delta Lake enables rate limiting and anomalous identity filtering. Gain insights into managing schema evolution, using VACUUM for regulatory compliance, implementing multi-cloud pipeline abstraction, and optimizing async task processing for data ingestion into FoundationDB. Learn about Adobe's custom deployment mechanism that minimizes latency disruption while handling over 70 billion identities across 25 deployments in seven regions on Azure and AWS clouds, enabling personalization at scale while maintaining privacy and compliance standards.

Syllabus

Processing 1M Identity Graphs per Second with Spark Structured Streaming

Taught by

StreamNative
