In this course, we integrate the grid-world environment with a Q-learning agent, focusing on agent-environment interaction and training over multiple episodes. We explore the exploration vs. exploitation tradeoff using an ε-greedy strategy and visualize performance through reward plots and policy displays.
Overview
Syllabus
- Unit 1: Integrating Agents with Environments in Reinforcement Learning
  - Agent-Environment Interaction Loop Integration
  - Evaluating Agent Performance Over Time
  - Fix the Q-Learning Bug
  - Visualize Your Agent in Action
  - Integrate Agent with Environment
- Unit 2: Balancing Exploration and Exploitation with Epsilon-Greedy Strategy
  - Epsilon Greedy Strategy in Action
  - Epsilon-Greedy Strategy Implementation
  - Fix the Epsilon-Greedy Bug
  - Epsilon Decay for Smarter Learning
  - Epsilon-Greedy Strategy Implementation
- Unit 3: Visualizing Training Statistics in Reinforcement Learning
  - Visualize Learning with Moving Averages
  - Visualize Learning with Reward Plots
  - Visualize Agent Learning Progress
  - Streamline Your Visualization Code
- Unit 4: Visualizing Policies and Value Functions in Reinforcement Learning
  - Visualizing Agent's Decision Strategy
  - Fix the State Value Bug
  - Visualize Agent's Value Function
  - Visualize Policy and Value Together
  - Integrate and Visualize Learning Agent
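
To give a feel for what the units build toward, here is a minimal sketch of a Q-learning agent training in a grid world with an ε-greedy strategy and epsilon decay. It is not the course's own code: the 4x4 grid, the reward scheme, the hyperparameter values, and the names `step`, `choose_action`, `train`, and `greedy_path_length` are all assumptions made for illustration.

```python
import random

# Hypothetical 4x4 grid world: start at (0, 0), goal at (3, 3).
SIZE = 4
GOAL = (SIZE - 1, SIZE - 1)
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right


def step(state, action):
    """Apply an action; moves off the grid leave the agent in place."""
    row = min(max(state[0] + ACTIONS[action][0], 0), SIZE - 1)
    col = min(max(state[1] + ACTIONS[action][1], 0), SIZE - 1)
    next_state = (row, col)
    done = next_state == GOAL
    reward = 1.0 if done else -0.01  # assumed reward scheme
    return next_state, reward, done


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(ACTIONS))
    values = q[state]
    best = max(values)
    # Break ties between equally valued actions at random.
    return rng.choice([a for a, v in enumerate(values) if v == best])


def train(episodes=500, alpha=0.1, gamma=0.95, epsilon=1.0,
          epsilon_min=0.05, decay=0.99, max_steps=100, seed=0):
    """Run the agent-environment loop and return the Q-table and rewards."""
    rng = random.Random(seed)
    q = {(r, c): [0.0] * len(ACTIONS)
         for r in range(SIZE) for c in range(SIZE)}
    episode_rewards = []
    for _ in range(episodes):
        state, total = (0, 0), 0.0
        for _ in range(max_steps):
            action = choose_action(q, state, epsilon, rng)
            next_state, reward, done = step(state, action)
            # Q-learning update: bootstrap from the best next-state value.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state, total = next_state, total + reward
            if done:
                break
        episode_rewards.append(total)
        # Decay epsilon after each episode, down to a floor.
        epsilon = max(epsilon_min, epsilon * decay)
    return q, episode_rewards


def greedy_path_length(q, max_steps=50):
    """Follow the greedy policy from the start; return steps to goal or None."""
    state = (0, 0)
    for n in range(1, max_steps + 1):
        action = max(range(len(ACTIONS)), key=lambda a: q[state][a])
        state, _, done = step(state, action)
        if done:
            return n
    return None


q_table, rewards = train()
```

The `rewards` list holds one total per episode, so it can be fed directly into a reward plot or a moving average, and `q_table` is the starting point for displaying a policy or value function. Decaying epsilon per episode is only one possible schedule; a fixed epsilon or a per-step decay would fit the same loop.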