Class Central is learner-supported. When you buy through links on our site, we may earn an affiliate commission.

Stream & Optimize Real-Time Data Flows

Coursera via Coursera

Overview

Master the design, implementation, and optimization of production-ready streaming data pipelines using Apache Kafka and Flink. This intermediate-level course teaches you to evaluate log configurations against governance requirements (PCI-DSS, GDPR, SOC 2) and cost constraints, design stream processing topologies that join and aggregate data in real time with exactly-once semantics, and optimize pipelines through partition tuning, compression, and cost modeling. You'll work through hands-on labs that mirror real-world scenarios at companies such as DoorDash, Netflix, and Robinhood: comparing retention policies against compliance rules, building a Kafka Streams application that joins orders and payments to calculate 5-minute revenue totals, and diagnosing performance bottlenecks to meet SLAs within budget.

The course is aimed at intermediate data engineers and platform engineers who build or operate real-time streaming systems and want to master Kafka/Flink governance, joins, windowing, and cost-optimized scaling. It assumes an understanding of distributed systems, basic Apache Kafka knowledge, familiarity with SQL and streaming concepts, and Python or Java programming experience. By the end, you'll design and optimize a multi-tenant streaming platform with governance controls, skills directly applicable to streaming data engineer, real-time platform engineer, and data infrastructure roles.

Syllabus

  • Evaluate Log Configurations for Governance and Cost
    • Learn to analyze logging architectures against regulatory requirements and budget constraints. You'll evaluate retention policies for audit logs versus operational events, map data classifications to storage tiers, and quantify the cost impact of different configuration choices. By working through cost modeling exercises and compliance gap analysis, you'll recommend concrete changes to log configurations that balance compliance mandates with infrastructure costs.
  • Design Stream Processing Topologies
    • Learn to architect stream processing pipelines that transform and enrich data in real time. You'll design topologies that join multiple event streams (orders with payments), implement windowing for time-based aggregations (5-minute revenue totals), and manage stateful operations with exactly-once semantics. By working through concrete patterns like stream-stream joins and fan-out architectures, you'll build production-ready data flows that power operational dashboards and decision systems.
  • Optimize Real-Time Data Flows
    • Learn to diagnose and resolve performance bottlenecks in streaming pipelines while controlling costs. You'll analyze partition strategies against throughput requirements, evaluate replication factors versus latency SLAs, and implement compression and batching optimizations. Through cost modeling exercises and performance benchmarking, you'll balance throughput targets with infrastructure budgets and use monitoring data to make evidence-based recommendations for scaling streaming applications.
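The first module's cost-modeling exercise can be sketched in a few lines of Python: given an ingest rate, a retention period, and a storage-tier price, estimate the monthly storage cost of each retention policy. All rates, prices, and the replication factor below are illustrative assumptions, not figures from the course.

```python
# Rough storage-cost model for log retention policies (illustrative numbers only).

def retention_cost_gb(ingest_gb_per_day: float, retention_days: int,
                      replication_factor: int = 3) -> float:
    """Steady-state storage footprint in GB for one topic/policy."""
    return ingest_gb_per_day * retention_days * replication_factor

def monthly_cost(footprint_gb: float, price_per_gb_month: float) -> float:
    """Monthly storage bill for a given footprint at a given tier price."""
    return footprint_gb * price_per_gb_month

# Hypothetical comparison: 7-day hot retention (replicated) vs. a
# 365-day single-copy archive tier for the same 100 GB/day stream.
hot = monthly_cost(retention_cost_gb(100, 7), price_per_gb_month=0.10)
cold = monthly_cost(retention_cost_gb(100, 365, replication_factor=1),
                    price_per_gb_month=0.01)
print(f"hot 7d: ${hot:.2f}/mo, cold 365d archive: ${cold:.2f}/mo")
```

A model like this makes the compliance trade-off concrete: a long retention mandate for audit logs need not live on the replicated hot tier if a cheaper archive tier satisfies the requirement.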
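The second module's orders-plus-payments example can be illustrated with the underlying windowing arithmetic. In the course this is a Kafka Streams stream-stream join with state stores and exactly-once semantics; the pure-Python sketch below only shows how events bucket into 5-minute tumbling windows and how joined revenue is summed per window. The record shapes (`order_id`, timestamps in milliseconds, amounts) are assumptions for illustration.

```python
from collections import defaultdict

WINDOW_MS = 5 * 60 * 1000  # 5-minute tumbling windows

def window_start(ts_ms: int) -> int:
    """Align an event timestamp to the start of its tumbling window."""
    return ts_ms - (ts_ms % WINDOW_MS)

def join_and_aggregate(orders, payments):
    """Inner-join payments to orders on order_id, sum revenue per window.

    orders:   iterable of (order_id, ts_ms)
    payments: iterable of (order_id, ts_ms, amount)
    """
    order_ts = {oid: ts for oid, ts in orders}
    revenue = defaultdict(float)
    for oid, ts, amount in payments:
        if oid in order_ts:  # inner join: only orders with a payment count
            revenue[window_start(ts)] += amount
    return dict(revenue)

orders = [("o1", 0), ("o2", 100_000), ("o3", 400_000)]
payments = [("o1", 60_000, 25.0), ("o2", 200_000, 10.0), ("o3", 410_000, 40.0)]
print(join_and_aggregate(orders, payments))
# first window (0-300s) collects o1 + o2; second window (300-600s) collects o3
```

The real topology additionally needs a join window (how far apart the two streams' timestamps may drift) and fault-tolerant state, which Kafka Streams and Flink provide out of the box.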
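The third module's partition-sizing analysis boils down to simple arithmetic: divide the target throughput by what one partition can sustain, and account for compression shrinking bytes on the wire. The per-partition throughput and compression ratio below are hypothetical numbers one would measure by benchmarking, not constants from the course.

```python
import math

def partitions_needed(target_mb_s: float, per_partition_mb_s: float) -> int:
    """Minimum partition count to hit a throughput target."""
    return max(math.ceil(target_mb_s / per_partition_mb_s), 1)

def effective_throughput(raw_mb_s: float, compression_ratio: float) -> float:
    """Bytes actually written per second after compression."""
    return raw_mb_s / compression_ratio

# Hypothetical cluster: each partition sustains ~25 MB/s, target is 250 MB/s.
p = partitions_needed(250, 25)
# Hypothetical 4:1 compression ratio cuts disk and network load accordingly.
wire = effective_throughput(250, 4.0)
print(f"{p} partitions, {wire} MB/s on the wire after compression")
```

Over-partitioning has its own costs (more open files, slower rebalances, more replication traffic), which is why the module pairs this sizing math with monitoring data before recommending a scale-out.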

Taught by

Starweaver and Ritesh Vajariya
