Markov Decision Processes of the Third Kind - Learning Distributions by Policy Gradient Methods

Erwin Schrödinger International Institute for Mathematics and Physics (ESI) via YouTube

Overview

This 30-minute conference lecture introduces distributional Markov Decision Processes, a class of control problems in which the objective is no longer to optimize an expected value but to learn a policy that steers the distribution of the cumulative reward toward a prescribed target law. The talk sets out the mathematical framework of these "third kind" Markov Decision Processes and explains how they differ from traditional formulations by targeting the full reward distribution rather than an expectation or a risk functional. It then presents a model-free policy-gradient algorithm that parameterizes randomized Markov policies on an augmented state space with neural networks and evaluates a characteristic-function loss from samples. On the theoretical side, convergence to stationary points is proved with stochastic approximation techniques under mild regularity and growth assumptions. Numerical experiments show that the method can match complex target distributions. The lecture is based on collaborative research in distributional control theory, with applications in machine learning and stochastic optimization.
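
To give a rough sense of what a sample-based characteristic-function loss can look like, here is a minimal sketch. It is not taken from the lecture: the squared-distance form of the loss, the frequency grid, the Gaussian target, and all function names (`empirical_cf`, `gaussian_cf`, `cf_loss`) are assumptions chosen for illustration. The samples stand in for cumulative rewards that a policy would generate; no policy or gradient step is included.

```python
import cmath
import random


def empirical_cf(samples, u):
    """Empirical characteristic function: average of exp(i*u*x) over the samples."""
    return sum(cmath.exp(1j * u * x) for x in samples) / len(samples)


def gaussian_cf(u, mean=0.0, std=1.0):
    """Characteristic function of a normal law N(mean, std^2) (assumed target)."""
    return cmath.exp(1j * u * mean - 0.5 * (std * u) ** 2)


def cf_loss(samples, target_cf, freqs):
    """Mean squared distance between empirical and target CFs on a frequency grid."""
    total = sum(abs(empirical_cf(samples, u) - target_cf(u)) ** 2 for u in freqs)
    return total / len(freqs)


# Hypothetical stand-ins for cumulative rewards sampled under two policies.
rng = random.Random(0)
freqs = [0.25 * k for k in range(-8, 9)]  # assumed grid on [-2, 2]
matched = [rng.gauss(0.0, 1.0) for _ in range(2000)]  # already follows the target
shifted = [rng.gauss(2.0, 1.0) for _ in range(2000)]  # mean is off by 2

loss_matched = cf_loss(matched, gaussian_cf, freqs)
loss_shifted = cf_loss(shifted, gaussian_cf, freqs)
print(loss_matched < loss_shifted)
```

In this sketch the loss is close to zero when the sampled rewards already follow the target law and grows when they do not, which is the property a policy-gradient method would exploit when adjusting policy parameters.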

Syllabus

Nicole Bäuerle - Markov Decision Processes of the Third Kind: Learning Distributions by Policy...

Taught by

Erwin Schrödinger International Institute for Mathematics and Physics (ESI)

