Overview
Learn practical lessons in dataset construction for training and fine-tuning AI models through this 17-minute conference talk from the AI Engineer World's Fair. Discover how Character AI approaches dataset development as both art and science, focusing on building data platforms that support rapid iterative refinement of training data. Explore the unique challenges of working with large-scale LLM datasets and diverse multimodal workloads, and understand how LanceDB addresses critical pain points in storage, management, and querying of large-scale AI data in production environments. Gain insights from Chang She, CEO and cofounder of LanceDB and original pandas contributor, alongside Noah Shpak, Research Engineer leading Character's Data Platform team, as they share their expertise in accelerating foundation model research through internet-scale data mining and retrieval systems.
Syllabus
The Hierarchy of Needs for Training Dataset Development: Chang She and Noah Shpak
Taught by
AI Engineer