Earlier this year, researchers tried teaching an AI to play the original Sonic the Hedgehog as part of the The OpenAI Retro Contest. The AI was told to prioritize increasing its score, which in Sonic means doing stuff like defeating enemies and collecting rings while also trying to beat a level as fast as possible. This dogged pursuit of one particular definition of success led to strange results: In one case, the AI began glitching through walls in the game’s water zones in order to finish more quickly.
It was a creative solution to the problem laid out in front of the AI, which ended up discovering accidental shortcuts while trying to move right. But it wasn’t quite what the researchers had intended. One of researchers’ goals with machine-learning AIs in gaming is to try and emulate player behavior by feeding them large amounts of player generated data. In effect, the AI watches humans conduct an activity, like playing through a Sonic level, and then tries to do the same, while being able to incorporate its own attempts into its learning. In a lot of instances, machine learning AIs end up taking their directions literally. Instead of completing a variety of objectives, a machine-learning AI might try to take shortcuts that completely upend human beings’ understanding of how a game should be played.
Victoria Krakovna, a researcher on Google’s DeepMind AI project, has spent the last several months collecting examples like the Sonic one. Her growing collection has recently drawn new attention after being shared on Twitter by Jim Crawford, developer of the puzzle series Frog Fractions, among other developers and journalists. Each example includes what she calls “reinforcement learning agents hacking the reward function,” which results in part from unclear directions on the part of the programmers.
“While ‘specification gaming’ is a somewhat vague category, it is particularly referring to behaviors that are clearly hacks, not just suboptimal solutions,” she wrote in her initial blog post on the subject. “A classic example is OpenAI’s demo of a reinforcement learning agent in a boat racing game going in circles and repeatedly hitting the same reward targets instead of actually playing the game.”