Overview
Syllabus
Introduction
Steps to improve user experience
History of data processing at Google
What is MapReduce
The problem with MapReduce
FlumeJava
The problem
Artificial splitting
Unbounded data
Delays
How to deal with delays
MillWheel
Time-based windows
Session windows
Event vs processing time
Stream vs Batch
Billing Pipeline
User Experience
Abuse Detection
Historical Systems
Apache Beam
Dataflow Example
Four Questions
MapReduce
When to emit results
Create a window
Wait for results
When to trigger
Triggers in Beam
Demo
Refinements
How
What just happened
Cancel pipeline
Run on
Update pipeline
QR code
Assign color
Running the pipeline
Patch pipeline
BigQuery
Color Smash
Hit Ratio
Aggregate
Back to the slides
Taught by
Devoxx