Overview
Explore GPU-accelerated data curation for large language models in this 30-minute talk by Ryan Wolf, a Deep Learning Algorithm Engineer at NVIDIA. Learn why well-curated datasets matter when scaling LLMs, and how to build high-quality datasets with NeMo Curator, an open-source library for GPU-accelerated data curation. The talk covers how to scale datasets to trillions of tokens efficiently, a crucial yet often overlooked part of machine learning. Ryan brings expertise in AI systems and currently focuses on developing NeMo Curator. This MLOps.community presentation is part of the DE4AI series and suits anyone who wants a better understanding of data curation for foundation models.
Syllabus
GPU Accelerated Data Curation for LLMs // Ryan Wolf // DE4AI
Taught by
MLOps.community