Learn how to Design, Develop, Deploy

Showing articles tagged: reinforcement-learning

Why We Think

Special thanks to John Schulman for a lot of super valuable feedback and direct edits on this post. Test-time compute (Graves et al. 2016, Ling et al. 2017, Cobbe et al. 2021) and chain-of-thought (CoT) (W...

Date: May 1, 2025 | Estimated Reading Time: 40 min | Author: Lilian Weng

Reward Hacking in Reinforcement Learning

Reward hacking occurs when a reinforcement learning (RL) agent exploits flaws or ambiguities in the reward function to achieve high rewards, without genuinely learning or completing the intended task....

Date: November 28, 2024 | Estimated Reading Time: 37 min | Author: Lilian Weng
