

Enhancing Reasoning of Large Language Models through Reward-Guided Search and Self-Training

Association for Computing Machinery (ACM) via YouTube

Overview

Explore cutting-edge techniques for enhancing the reasoning capabilities of Large Language Models (LLMs) in this keynote presentation from the Large Language Model Day at KDD2024. Delve into innovative approaches that leverage inference-time compute for self-improvement, including search in the space of thoughts and self-training. Discover how generalizable, fine-grained reward models are developed by using tree search to automatically collect per-step correctness values for reasoning traces.

Learn about ReST-MCTS, a process-reward-guided tree search algorithm that enables continuous training of policy and reward models without manual annotation. Examine applications of these techniques in strategic game playing, vision-language modeling, and 3D scene generation, and gain insight into how these advances contribute to state-of-the-art language models such as Grok-2. Finally, explore future directions for scaling up self-training and applying online reinforcement learning to unlock even greater intrinsic improvements in LLM capabilities.
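The core idea described above, using a process reward model to guide step-by-step search while collecting per-step values that can later serve as self-training signal, can be illustrated with a toy sketch. This is not the ReST-MCTS algorithm itself: it uses greedy candidate selection instead of Monte Carlo tree search, a hand-coded reward in place of a learned reward model, and an illustrative arithmetic "reasoning" task; all names are hypothetical.

```python
import random

TARGET = 15  # goal of the toy "reasoning" task: reach this running sum

def propose_steps(trace, k=3):
    """Toy policy: propose k candidate next steps (digits to append).
    A real system would sample candidate reasoning steps from an LLM."""
    return [trace + [random.randint(0, 9)] for _ in range(k)]

def process_reward(trace):
    """Toy process reward model: score a partial trace by closeness of
    its running sum to TARGET (higher is better, 0 is perfect).
    A real process reward model would be learned from data."""
    return -abs(TARGET - sum(trace))

def reward_guided_search(depth=4, k=3, seed=0):
    """Greedy reward-guided step search: at each depth, expand k
    candidate continuations, keep the one the reward model scores
    highest, and record its per-step value. The recorded values are
    the kind of per-step supervision that could feed self-training."""
    random.seed(seed)
    trace, step_values = [], []
    for _ in range(depth):
        candidates = propose_steps(trace, k)
        best = max(candidates, key=process_reward)
        step_values.append(process_reward(best))
        trace = best
    return trace, step_values

if __name__ == "__main__":
    trace, values = reward_guided_search()
    print("trace:", trace, "per-step values:", values)
```

The design choice to score every partial trace, rather than only the final answer, is what makes the collected values "fine-grained": each step of the search leaves behind a labeled (partial trace, value) pair without any manual annotation.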

Syllabus

KDD2024 - Enhancing Reasoning of Large Language Models through Reward-Guided Search and Self-Training

Taught by

Association for Computing Machinery (ACM)

