Humans effortlessly know that a tree is a tree and a dog is a dog, regardless of its size, its color or the angle from which it is viewed. In fact, identifying such visual elements is one of the earliest tasks children learn. But researchers have struggled to determine how the brain performs this seemingly simple evaluation. As deep-learning systems have come to master this ability, scientists have started to ask whether computers analyze data, and particularly images, in the same way the human brain does. “The way that the human mind, the human visual system, understands shape is a mystery that has baffled people for many generations, partly because it is so intuitive and yet it’s very difficult to program,” says Jacob Feldman, a psychology professor at Rutgers University.
A paper published in Scientific Reports in June, which compared several object-recognition models, concluded that people evaluate an object not as a computer processes pixels but on the basis of an imagined internal skeleton. In the study, researchers from Emory University, led by associate professor of psychology Stella Lourenco, tested whether people judge object similarity based on the objects’ skeletons: an invisible axis below the surface that runs through the middle of the object’s shape. The scientists generated 150 unique three-dimensional shapes built around 30 different skeletons and asked participants to determine whether two of the objects were the same. Sure enough, the more similar the skeletons were, the more likely participants were to label the objects as the same. The researchers also compared how well other models, such as neural networks (artificial intelligence–based systems) and pixel-based evaluations of the objects, predicted people’s decisions. Although the other models predicted participants’ performance on the task relatively well, the skeletal model consistently came out ahead.
“There’s a big emphasis on deep neural networks for solving these problems [of object recognition]. These are networks that require lots and lots of training to even learn a single object category, whereas the model that we investigated, a skeletal model, seems to be able to do this without this experience,” says Vladislav Ayzenberg, a doctoral student in Lourenco’s lab. “What our results show is that humans might be able to recognize objects by their internal skeletons, even when you compare skeletal models to these other well-established neural net models of object recognition.”
Next, the researchers pitted the skeletal model against other models of shape recognition, such as ones that focus on the outline. To do so, Ayzenberg and Lourenco altered the objects, for example by shifting the placement of an arm relative to the rest of the body or by changing how skinny, bulging or wavy the outlines were. Once again, people judged the objects’ similarity by their skeletons rather than by their surface qualities.