Chapter 21: Reinforcement Learning

Reward Shaping and Expected Utility Theory (intermediate)

"Reward hacking and misalignment are the dominant failure modes." -- Chapter 21
