Melting Icebergs - Enabling Analytical Access to Kafka Data through Iceberg Projections

This technical conference talk shows how to bridge the operational and analytical data estates by exposing Apache Kafka data as a logical projection in an Iceberg-compatible format. Learn how StreamNative engineers built a solution that removes the traditional ETL step and keeps Kafka as the single source of truth, so analytical engines can query streaming data directly without degrading operational performance. The talk covers the implementation details of presenting Kafka data to Iceberg processors without upfront data movement or transformation, and of bringing mature Kafka ecosystem features such as Schema Registry and consumer groups into Iceberg workflows. It explains how the approach meets Iceberg's performance and cost-reduction expectations while sourcing data directly from Kafka, using indexing for efficient queries and generating metadata dynamically. It also walks through the protocols, formats, and services used to join the two systems, along with the challenges encountered and the solutions adopted. Finally, it examines the architectural principles behind the design: Kafka remains the single source of truth, analytical processors need no Kafka-specific adjustments, and established Kafka features such as ACLs and quotas are reused rather than reinvented.