Linux Foundation

One Pipeline to Rule Them All - Unifying Multimodal and AI Data Processing with Daft

Linux Foundation via YouTube

Overview

In this 26-minute conference talk from the Linux Foundation's Open Source Summit, learn how to unify multimodal and AI data processing with Daft, a Python-native data engine that removes the need to juggle separate tools such as ffmpeg, custom scripts, and Spark. See how Daft handles everything from structured tables to images and embeddings within a single framework, with native integrations for data catalogs like Iceberg and Delta Lake, plus VectorDBs including Turbopuffer and Lance. Watch live demonstrations of large-scale document processing, batch inference, and multimodal ETL, all executed within one unified data pipeline. Explore how purpose-built infrastructure can turn the chaos of processing millions of images, documents, and structured records into a streamlined competitive advantage, whether you are working with terabyte-scale datasets for foundation model training or building real-time inference systems. The talk makes the case for replacing fragile, multi-tool data pipelines that break under production load with a single robust engine that speeds up iteration and handles multimodal data processing at scale.

Syllabus

One Pipeline to Rule Them All: Unifying Multimodal and AI Data Processing... Sammy Sidhu & Colin Ho

Taught by

Linux Foundation
