1st Workshop on Multimodal Spatial Intelligence

October 20, 2025 (1:00 pm - 5:30 pm)

Room 306 A | Zoom Link (code: farmcans)

About the Workshop

This workshop aims to bring together researchers from computer vision, robotics, graphics, and NLP for a half-day of talks and discussion at the intersection of visual understanding, multimodal learning, and embodied AI. Our focus is on placing multimodal large language models (MLLMs) at the core of spatial intelligence, exploring how they can learn, interpret, and act on spatial information from images, videos, and 3D data.

Keywords:

Spatial Reasoning, Multimodal Large Language Model, World Models, Embodied AI, 3D Understanding

Invited Speakers

Saining Xie

NYU / Google DeepMind

Manling Li

Northwestern University

Ranjay Krishna

UW / AI2

Yue Wang

USC / NVIDIA

Qianqian Wang

UC Berkeley

Program

13:00 – 13:10 Welcome & Introduction
13:10 – 13:50 Generate Robotic Data with Spatial Intelligence | Yue Wang (USC / NVIDIA)
13:50 – 14:30 Towards Spatial Supersensing | Saining Xie (NYU / Google DeepMind)
14:30 – 15:10 Why is Spatial Understanding Hard for VLMs? | Manling Li (Northwestern University)
15:10 – 15:30 ☕ Coffee Break & Social
15:30 – 16:10 Visual Reasoning Will Be Bigger Than Language Reasoning | Ranjay Krishna (UW / AI2)
16:10 – 16:50 On Latent Abilities Underlying Spatial Intelligence | Qianqian Wang (UC Berkeley)
16:50 – 17:20 Panel Discussion: Future of Multimodal Spatial Intelligence | Moderator: Songyou Peng
17:20 – 17:30 Concluding Remarks

Organizers

Songyou Peng

Google DeepMind

Jihan Yang

NYU

Kyle Genova

Google DeepMind

Tom Funkhouser

Google DeepMind

Fei-Fei Li

Stanford / World Labs

Leo Guibas

Stanford / Google DeepMind

Saining Xie

NYU / Google DeepMind

Contact

For any questions, please reach out to the primary contact:

Contact Songyou Peng