Chapter 21: Reinforcement Learning

Markov Decision Processes and Partial Observability (Intermediate)

"Algorithm quality cannot rescue a poorly specified state/action/reward design." -- Chapter 21
