

Tricks to Fine Tuning - Advanced Techniques for Model Training Without Labels

MLOps.community via YouTube

Overview

This podcast episode features Prithviraj Ammanabrolu, Research Scientist at Databricks and Assistant Professor at UC San Diego, where he leads the PEARLS Lab, discussing fine-tuning techniques for language models. The conversation centers on Tao fine-tuning, an approach that removes the need for labeled data by using reinforcement learning and synthetic data so that models can evaluate and improve their own outputs. Ammanabrolu explains how the technique lets smaller models perform significantly better, which could make model deployment more efficient. He also covers reward model fine-tuning, the balance between training and inference compute, strategies for handling model drift, and the differences between prompt tuning and traditional fine-tuning. The 54-minute discussion closes with optimization strategies for small models and their untapped potential, how fine-tuning differs across model architectures, and the implications of open models such as Mistral.

Syllabus

[00:00] Raj's preferred coffee
[00:36] Takeaways
[01:02] Tao Naming Decision
[04:19] No Labels Machine Learning
[08:09] Tao and TAO breakdown
[13:20] Reward Model Fine-Tuning
[18:15] Training vs Inference Compute
[22:32] Retraining and Model Drift
[29:06] Prompt Tuning vs Fine-Tuning
[34:32] Small Model Optimization Strategies
[37:10] Small Model Potential
[43:08] Fine-tuning Model Differences
[46:02] Mistral Model Freedom
[53:46] Wrap up

Taught by

MLOps.community
