Training a Reasoning Model Using DeepSeek with 7GB VRAM - A Fast Fine-tuning Guide
Machine Learning With Hamza via YouTube
Overview
Learn to fine-tune Large Language Models (LLMs) for reasoning tasks in this 27-minute tutorial video that demonstrates using the GRPO reinforcement learning algorithm with minimal GPU requirements. Explore the complete process from environment setup to testing results, including detailed explanations of GRPO methodology, data preparation, model configuration, and reward function implementation. Master local LLM fine-tuning using the Unsloth fast fine-tuning Python library, requiring only 7GB of VRAM. Follow along with practical demonstrations of training procedures, analyze training outcomes, and understand how to test the fine-tuned model effectively. Access comprehensive resources including GitHub repositories, Hugging Face documentation, and Unsloth notebooks to support the implementation process.
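The reward-function implementation mentioned above typically combines several small scoring functions, each checking one property of a sampled completion. A minimal sketch in plain Python (the tag format and reward values here are illustrative assumptions, not the exact ones from the video):

```python
import re

# Hypothetical reward functions in the style commonly used for GRPO
# fine-tuning on reasoning tasks. The <answer> tag convention and the
# reward magnitudes are assumptions for illustration.

ANSWER_RE = re.compile(r"<answer>(.*?)</answer>", re.DOTALL)

def format_reward(completion: str) -> float:
    """Reward completions that wrap their final answer in <answer> tags."""
    return 1.0 if ANSWER_RE.search(completion) else 0.0

def correctness_reward(completion: str, target: str) -> float:
    """Reward completions whose extracted answer matches the reference."""
    match = ANSWER_RE.search(completion)
    if match is None:
        return 0.0
    return 2.0 if match.group(1).strip() == target.strip() else 0.0

def total_reward(completion: str, target: str) -> float:
    """Combine the individual signals into one scalar reward."""
    return format_reward(completion) + correctness_reward(completion, target)
```

Weighting correctness higher than formatting nudges the model to produce verifiably right answers rather than merely well-formatted ones.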
Syllabus
00:00 Intro
01:02 Explaining GRPO
08:03 Environment Setup guidelines
10:20 Data, Model & Reward functions
17:57 Training
21:24 Training results
23:47 Testing
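The core of the GRPO methodology explained at 01:02 is that each prompt gets a group of sampled completions, and every completion's reward is normalized against the group's mean and standard deviation, so no separate critic model is needed. A simplified sketch of that normalization step (an illustration of the idea, not the video's code):

```python
from statistics import mean, stdev

def grpo_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """Group-relative advantages: normalize each completion's reward by
    the mean and standard deviation of its sampling group. Completions
    better than the group average get positive advantage, worse ones
    negative; the policy is then updated to favor the former."""
    mu = mean(rewards)
    sigma = stdev(rewards) if len(rewards) > 1 else 0.0
    return [(r - mu) / (sigma + eps) for r in rewards]
```

Because advantages are relative within each group, the reward functions only need to rank completions sensibly; their absolute scale does not matter.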
Taught by
Machine Learning With Hamza