Overview
Explore supervised fine-tuning (SFT) data and reinforcement learning from human feedback (RLHF) in this lecture from the University of Utah Data Science program. Examine how SFT data is collected, processed, and used to improve model performance on specific tasks, and learn how RLHF incorporates human preferences to align AI systems with desired behaviors and outputs. The lecture covers technical implementation details, challenges, and best practices for both approaches, and shows how the two techniques work together to produce more capable and aligned language models, illustrated with practical examples and real-world applications. Accompanying slides detail the algorithms, data pipelines, and evaluation metrics used in state-of-the-art language model development.
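For context, the two training objectives the lecture covers can be sketched in a few lines. This is a minimal illustration, not code from the lecture: SFT minimizes the negative log-likelihood of target tokens, and RLHF reward models are commonly trained with a Bradley-Terry pairwise preference loss. The function names and scalar inputs are simplified assumptions (a real implementation operates on logits from a neural network).

```python
import math

def sft_token_loss(target_probs):
    """SFT objective: average negative log-likelihood of the target
    tokens under the model. Here the model's probabilities for the
    correct tokens are passed in directly for simplicity."""
    return -sum(math.log(p) for p in target_probs) / len(target_probs)

def rlhf_preference_loss(r_chosen, r_rejected):
    """Bradley-Terry pairwise loss used to train RLHF reward models:
    -log sigmoid(r_chosen - r_rejected). It is minimized when the
    reward for the human-preferred response exceeds the rejected one."""
    return -math.log(1.0 / (1.0 + math.exp(-(r_chosen - r_rejected))))

# A model that assigns probability 1 to every target token has zero SFT loss,
# and the preference loss shrinks as the chosen response's reward grows.
print(sft_token_loss([1.0, 1.0, 1.0]))
print(rlhf_preference_loss(2.0, 0.0) < rlhf_preference_loss(0.5, 0.0))
```

In practice these scalars come from a language model's output distribution and a reward head; frameworks compute the same quantities over batches of token sequences and preference pairs.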
Syllabus
SFT data & RLHF
Taught by
UofU Data Science