Reinforcement Learning - From DQN to PPO with Practical Applications

Learn reinforcement learning through hands-on implementation of advanced algorithms and practical applications in this 7-hour, 59-minute course. Start with Deep Q Networks (DQN) by solving the CartPole problem, then move on to Double Deep Q Networks and Double Dueling Deep Q Networks for improved performance. Explore essential techniques, including epsilon-greedy exploration and prioritized experience replay, that improve learning efficiency. Build convolutional neural network-based DQN agents that learn to play Pong directly from pixel inputs. Progress beyond value-based methods to policy gradient approaches, implementing the Advantage Actor Critic (A2C) algorithm and its asynchronous variant (A3C) for parallel learning. Cover Proximal Policy Optimization (PPO) for both discrete and continuous action spaces through practical OpenAI Gym tutorials. Apply your knowledge to challenging control tasks using the RockRL framework: training AI agents to land rockets, learn bipedal locomotion, and navigate complex obstacle courses through trial-and-error learning.

Syllabus

Introduction to Reinforcement Learning - Cartpole DQN
Solving the CartPole with Double Deep Q Network
Introduction to Double Dueling Deep Q Network
Epsilon Greedy strategy in Deep Q Learning
Introduction to Prioritized Experience Replay in Deep Q Learning
Deep Q Network with Convolutional Neural Networks
A.I. learns to play Pong game from pixels with DQN
Reinforcement learning agents beyond DQN (Policy Gradient)
Advantage Actor Critic algorithm (A2C) with Pong
Introduction to Asynchronous Advantage Actor Critic algorithm (A3C)
Introduction to Proximal Policy Optimization algorithm (PPO)
Introduction to Proximal Policy Optimization Tutorial with OpenAI gym environment
Continuous Proximal Policy Optimization Tutorial with OpenAI gym environment
A.I. Learns to Land a Rocket (RockRL)
A.I. Learns to Walk (Reinforcement Learning - RockRL)
A.I. Learns to Walk Through Obstacles (Reinforcement Learning - RockRL)