Date: June 3, 2026 (morning)
Place: Denver, CO, USA
Our multi-modal spatial intelligence (MUSI) workshop addresses how multimodal large language models (MLLMs) understand, reason about, and interact with spatial information from the physical world. The multimodal nature of spatial intelligence—requiring integration of images, videos, and 3D data—necessitates bringing together researchers from diverse domains: computer vision, robotics, graphics, and NLP. While recent MLLMs show promising visual-spatial capabilities, fundamental questions remain about spatial relationships, 3D environment modeling, and real-world spatial reasoning. This workshop explores how MLLMs learn spatial representations across modalities, advance world modeling and embodied AI, and address ethical considerations. We aim to establish benchmarks and foster cross-disciplinary collaboration to advance spatial reasoning in multimodal AI.
We invite submissions on topics including, but not limited to:
| Submission Deadline | March 13, 2026 (23:59 AoE) |
| Author Notification | April 3, 2026 (23:59 AoE) |
| Camera Ready | April 24, 2026 (23:59 AoE) |
| Workshop Date | June 3, 2026 (morning) |
*All deadlines are Anywhere on Earth (AoE). Timelines are subject to change.
Submissions should be made through the OpenReview submission system.
The workshop will be non-archival. Authors of accepted papers retain the full copyright of their work and are free to submit extended versions to conferences or journals.
| 08:10 – 08:20 | Welcome & Introduction |
| 08:20 – 08:50 | Keynote Talk 1: Angel X. Chang (Simon Fraser University) |
| 08:50 – 09:20 | Keynote Talk 2: Katerina Fragkiadaki (Carnegie Mellon University) |
| 09:20 – 09:50 | Keynote Talk 3: Chuang Gan (UMass Amherst / MIT-IBM Watson AI Lab) |
| 09:50 – 10:05 | ☕ Coffee Break & Social |
| 10:05 – 10:35 | Keynote Talk 4: Roozbeh Mottaghi (FAIR / University of Washington) |
| 10:35 – 11:05 | Keynote Talk 5: Saining Xie (New York University / Google DeepMind) |
| 11:05 – 11:35 | Keynote Talk 6: Ranjay Krishna (University of Washington / Allen Institute for AI) |
| 11:35 – 12:05 | Keynote Talk 7: Kristen Grauman (University of Texas at Austin) |
| 12:05 – 12:35 | Poster Session and Closing Remarks |
For any inquiries about the workshop, please reach out via email: