Deep learning techniques have transformed how robots master new tasks. As head of the Robotics and Embodied AI Lab at Stanford University, Shuran Song is at the forefront of that shift, finding creative ways to make robots more useful.
Recently, Song, 33, and her team designed a low-cost way of giving robots a new sense: hearing. Most robots perceive the world primarily through cameras, a problem in low-visibility environments. Song's lab built a system for capturing audio, which made robots better at tasks like erasing a whiteboard or emptying a cup.
The new system builds on one of Song's most significant contributions to the field: a handheld gripper equipped with microphones that anyone can use to do everything from washing dishes to repairing a bike. As you complete a task with the gripper, the device tracks your movements and records audio and video. That data can then be used to train robots, much the way large language models are built.
Song is making all of the training data she collects open source. She is working on a number of collaborative datasets, including DROID, which can be used by academic researchers who have far less access to training data than venture-backed startups.
Safe, useful robots that help with our daily at-home tasks are still some time away, but we're getting closer, thanks in part to Song's work.