Overview
Syllabus
[00:00] - Introduction to building reliable agents with RL.
[00:49] - Case Study: ART-E, an AI email assistant.
[02:19] - The importance of starting with prompted models before moving to RL.
[03:17] - Performance improvements of RL over prompted models.
[05:18] - Cost and latency benefits of the RL approach.
[08:02] - The two hardest problems in modern RL: realistic environments and reward functions.
[13:13] - Optimizing agent behavior with "extra rewards."
[15:25] - The problem of "reward hacking" and how to address it.
[18:37] - The solution to reward hacking.
Taught by
AI Engineer