Artificial intelligence & robotics

Jiajun WU

Providing AI tools with perception, reasoning, and interaction capabilities.

Year Honored
2020

Organization
Stanford University

Region
Asia Pacific

Hails From
Asia Pacific

Jiajun Wu focuses on computer vision, machine learning, and computational cognitive science.

As an AI researcher, he aims to develop machines with human-level scene understanding: from a single image, humans can interpret what they see, reconstruct the 3D scene, predict what will happen next, and plan actions accordingly.

Such “physical” understanding remains far beyond what current AI tools can achieve, despite impressive progress on large “foundation” models such as ChatGPT and DALL-E.

To achieve this goal, Jiajun draws on broad expertise in physics, graphics, cognitive science, and artificial intelligence to embed physical scene understanding into machines.

Such machines can learn to see, reason about, and interact with the physical world as humans do. His key insight is to identify the causal structure of the physical world as the "core knowledge" that the learning system needs.

Jiajun has developed creative methods that use simulation to learn physical scene understanding without explicit human-labeled training data. These methods integrate top-down differentiable or neural simulation engines for computer graphics, physics, language, and human cognition with bottom-up recognition models and perception and sensing systems.

For multimodal understanding, Jiajun's work draws on senses beyond sight, including sound and touch, to perceive and interact with a scene. He created a dataset of impact sounds of real-world objects using a sophisticated robotic capture system, as well as the ObjectFolder dataset for benchmarking multi-sensory perception and interaction systems.