While serving a nine-month stint at Google, Sergey Levine watched as the company’s AlphaGo program defeated the world’s best human player of the ancient Chinese game of Go in March 2016. Levine, a robotics specialist at the University of California, Berkeley, admired the sophisticated feat of machine learning but couldn’t help focusing on a notable shortcoming of the powerful Go-playing algorithms. “They never picked up any of the pieces themselves,” he jokes.
One way that the creators of AlphaGo trained the program was by feeding 160,000 previous games of Go to a powerful algorithm called a neural network, much the way similar algorithms have been shown countless labeled pictures of cats and dogs until they learn to recognize the animals in unlabeled photos. But this technique isn’t easily applicable to training a robotic arm.
So roboticists have instead turned to a different technique: the scientist gives a robot a goal, such as screwing a cap onto a bottle, but relies on the machine to figure out the specifics itself. By attempting the task over and over, it eventually attains the goal. But the learning process requires lots of attempts, and it doesn’t work with difficult tasks.
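To make the trial-and-error idea concrete, here is a minimal sketch in Python. It is a toy, not the method used in any real lab: the `attempt` function, the target of 3.5 turns, and the step size are all hypothetical stand-ins for a real robot and its reward signal. The program is told only how close each attempt came to the goal and has to discover the right number of turns by repeated tries.

```python
import random


def attempt(turns: float, target: float = 3.5) -> float:
    """Hypothetical reward: higher (closer to 0) when the cap is turned
    nearly the right amount. A real robot would measure this with sensors."""
    return -abs(turns - target)


def learn_by_trial(n_attempts: int = 200, seed: int = 0) -> float:
    """Keep the best attempt so far and try small random variations of it."""
    rng = random.Random(seed)
    best_turns = 0.0
    best_reward = attempt(best_turns)
    for _ in range(n_attempts):
        # Perturb the current best guess (assumed step size of up to half a turn).
        candidate = best_turns + rng.uniform(-0.5, 0.5)
        reward = attempt(candidate)
        if reward > best_reward:  # keep only the improvements
            best_turns, best_reward = candidate, reward
    return best_turns


if __name__ == "__main__":
    print(f"learned number of turns: {learn_by_trial():.2f}")
```

Even in this one-number toy, most attempts are thrown away, which hints at why the approach needs so many tries when a real arm has many joints to coordinate.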
Levine’s breakthrough was to use the same kind of algorithm that has gotten so good at classifying images. After he gives the robot some easy-to-solve versions of the task at hand (instructing it to screw on the cap, for example), the robot retrospectively studies its own successes. It observes how the data from its vision system maps to the motor signals of the robotic hand doing the task correctly. The robot supervises its own learning. “It’s reverse-engineering its own behavior,” Levine says. It can then apply that learning to related tasks.
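The idea of a robot learning from its own successes can be sketched in a few lines of Python. This is an illustration under loud assumptions, not Levine’s actual system: the "vision" input is a single number (how far off-center the cap appears), the "motor signal" is a single number, `TRUE_GAIN` is an invented property of the toy world, and a one-parameter linear fit stands in for the neural network. The robot tries random commands on easy cases, keeps only the pairs that worked, and then fits a mapping from what it saw to what it did.

```python
import random

# Hypothetical toy world: the command that seats the cap is 2.0 * offset.
TRUE_GAIN = 2.0


def succeeded(offset: float, command: float, tol: float = 0.1) -> bool:
    """Did this motor command work for this camera observation?"""
    return abs(command - TRUE_GAIN * offset) < tol


def collect_successes(n_trials: int = 5000, seed: int = 0):
    """Try random commands and keep only the (observation, command) pairs that worked."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_trials):
        offset = rng.uniform(-1.0, 1.0)   # what the camera sees
        command = rng.uniform(-2.0, 2.0)  # exploratory motor signal
        if succeeded(offset, command):
            data.append((offset, command))
    return data


def fit_policy(data) -> float:
    """Least-squares fit of command = w * offset, using the robot's own successes as labels."""
    numerator = sum(o * c for o, c in data)
    denominator = sum(o * o for o, _ in data)
    return numerator / denominator


if __name__ == "__main__":
    successes = collect_successes()
    w = fit_policy(successes)
    new_offset = 0.37  # a situation the robot has not seen before
    print(f"learned gain: {w:.3f}")
    print("works on new case:", succeeded(new_offset, w * new_offset))
```

No human labels any of the data here. The successful attempts play the role that labeled photos play in image classification, which is the sense in which the robot supervises its own learning.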
With the AI technique, previously insoluble robotics tasks have become approachable, thanks to the massive increase in training efficiency. Suddenly, robots are getting a lot more clever.
—Andrew Rosenblum