AI Engineer - Learn how to integrate AI into software applications
Overview
Explore supervised fine-tuning (SFT) data and reinforcement learning from human feedback (RLHF) in this comprehensive lecture from the University of Utah Data Science program. Delve into the critical components of modern language model training, examining how SFT data is collected, processed, and utilized to improve model performance on specific tasks. Learn about the RLHF methodology, which incorporates human preferences to align AI systems with desired behaviors and outputs. Understand the technical implementation details, challenges, and best practices for both SFT and RLHF approaches. Gain insights into how these techniques work together to create more capable and aligned language models, with practical examples and real-world applications. Access accompanying slides to follow along with detailed explanations of algorithms, data pipelines, and evaluation metrics used in state-of-the-art language model development.
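To make the RLHF idea concrete, the sketch below shows the Bradley-Terry style preference loss commonly used to train RLHF reward models: the model is penalized when it scores the human-rejected response higher than the human-preferred one. This is a minimal illustration of the general technique, not code from the lecture; the function name and values are illustrative.

```python
import math

def preference_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Bradley-Terry style preference loss used to train RLHF reward models:
    -log(sigmoid(r_chosen - r_rejected)). The loss is small when the model
    assigns a higher reward to the human-preferred (chosen) response."""
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Loss shrinks as the reward gap favors the chosen response.
print(round(preference_loss(2.0, 0.0), 4))  # 0.1269 — chosen scored higher
print(round(preference_loss(0.0, 2.0), 4))  # 2.1269 — rejected scored higher
```

Summing this loss over a dataset of human preference pairs and minimizing it yields a reward model, which then guides the policy-optimization stage of RLHF.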
Syllabus
SFT data & RLHF
Taught by
UofU Data Science