Building a High-Performance Real-Time Analytics Database with Apache Kafka and Druid
CodeWithYu via YouTube
Overview
Learn to build a high-performance real-time analytics database in this video tutorial, which walks through an end-to-end data engineering project using tools from the Apache ecosystem. Core topics include streaming data with Apache Kafka, distributed coordination with Apache ZooKeeper, ingesting and storing data in Apache Druid, and containerizing the whole environment with Docker and OrbStack. Hands-on demonstrations cover system architecture design, project initialization, container setup, data streaming, Apache Druid configuration, and running real-time queries and time-based aggregations. Along the way you connect Apache Druid to Kafka streams and run more advanced data operations while building a production-ready analytics infrastructure.
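To give a feel for what a time-based aggregation does, here is a minimal Python sketch that buckets events by minute and counts them. It is not code from the tutorial: the event fields, sample values, and function names are hypothetical, and in Druid itself this kind of rollup would typically be written as a SQL query that groups on something like `TIME_FLOOR(__time, 'PT1M')`.

```python
from collections import Counter
from datetime import datetime


def floor_to_minute(ts: datetime) -> datetime:
    """Truncate a timestamp to the start of its minute."""
    return ts.replace(second=0, microsecond=0)


def count_per_minute(events):
    """Count events per one-minute bucket, ordered by bucket start time."""
    counts = Counter()
    for event in events:
        # Assumes each event carries an ISO 8601 "timestamp" field (hypothetical schema).
        ts = datetime.fromisoformat(event["timestamp"])
        counts[floor_to_minute(ts)] += 1
    return dict(sorted(counts.items()))


# Hypothetical sample events, standing in for messages read from a Kafka topic.
events = [
    {"timestamp": "2024-01-01T10:00:05+00:00", "user": "a"},
    {"timestamp": "2024-01-01T10:00:40+00:00", "user": "b"},
    {"timestamp": "2024-01-01T10:01:10+00:00", "user": "a"},
]

per_minute = count_per_minute(events)
for bucket, count in per_minute.items():
    print(bucket.isoformat(), count)
```

The first two events fall into the 10:00 bucket and the third into the 10:01 bucket. A real-time analytics database performs the same grouping continuously over incoming stream data rather than over an in-memory list.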
Syllabus
Introduction
List of Apache Frameworks for Data Engineering
System Architecture
Starting up a project from scratch
Setting up the containers and services on Docker
Streaming data into Apache Kafka
Apache Druid Walkthrough
Connecting Apache Druid to Apache Kafka
Realtime Queries and Aggregations on Apache Druid
Time Aggregations on Apache Druid
Outro
Taught by
CodeWithYu