YouTube

SFT Data and RLHF - Supervised Fine-Tuning and Reinforcement Learning from Human Feedback

UofU Data Science via YouTube

Overview

Explore supervised fine-tuning (SFT) data and reinforcement learning from human feedback (RLHF) in this comprehensive lecture from the University of Utah Data Science program. Delve into the critical components of modern language model training, examining how SFT data is collected, processed, and utilized to improve model performance on specific tasks. Learn about the RLHF methodology, which incorporates human preferences to align AI systems with desired behaviors and outputs. Understand the technical implementation details, challenges, and best practices for both SFT and RLHF approaches. Gain insights into how these techniques work together to create more capable and aligned language models, with practical examples and real-world applications. Access accompanying slides to follow along with detailed explanations of algorithms, data pipelines, and evaluation metrics used in state-of-the-art language model development.
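To make the RLHF idea above concrete, here is a minimal sketch of the standard Bradley-Terry reward-model objective that underlies preference learning: a reward model scores two candidate responses, and the training loss pushes the score of the human-preferred ("chosen") response above the rejected one. The function names and scalar-reward simplification are illustrative assumptions, not the lecture's exact implementation.

```python
import math

def preference_probability(reward_a: float, reward_b: float) -> float:
    """Bradley-Terry model: probability a labeler prefers response A over B,
    given scalar reward-model scores for each response (a sigmoid of the gap)."""
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))

def reward_model_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood of the human preference under the
    Bradley-Terry model -- the usual RLHF reward-model training objective."""
    return -math.log(preference_probability(reward_chosen, reward_rejected))

# Equal rewards: the model is indifferent between the two responses.
print(preference_probability(1.0, 1.0))  # 0.5
# A larger margin in favor of the chosen response yields a smaller loss,
# so gradient descent widens the gap on human-preferred answers.
print(reward_model_loss(4.0, 0.0) < reward_model_loss(1.0, 0.0))  # True
```

In practice the scalar rewards come from a neural reward model fine-tuned on preference pairs, and the resulting reward signal is then optimized with a policy-gradient method such as PPO, as the lecture discusses.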

Syllabus

SFT data & RLHF

Taught by

UofU Data Science
