SysNav: Multi-Level Systematic Cooperation Enables Real-World, Cross-Embodiment Object Navigation

Haokun Zhu¹, Zongtai Li¹, Zihan Liu^1,2, Kevin Guo¹, Zhengzhi Lin¹, Yuxin Cai^1,3, Guofei Chen¹, Chen Lv³, Wenshan Wang¹, Jean Oh¹, Ji Zhang¹

¹Carnegie Mellon University, ²New York University, ³Nanyang Technological University


Example image

Contributions

We propose SysNav, a three-level object navigation system that decouples semantic reasoning, navigation planning, and motion control. This decoupling lets each level focus on its own strengths and enables cross-embodiment generalization across wheeled robots, quadrupeds, and humanoids.
We design a hierarchical navigation strategy that treats rooms as the minimal decision-making units. The VLM performs high-level semantic reasoning over a structured scene representation to make room-level decisions, while efficient classical exploration methods handle in-room navigation. This leverages both the VLM's semantic strengths and the spatial structure of indoor environments.
Beyond standard object navigation, our system supports conditional object navigation, such as navigation conditioned on object attributes or spatial relations, through its structured scene representation.
We conduct extensive evaluations, including 190 real-world experiments across three robot embodiments, where SysNav achieves a 4-5x improvement in navigation efficiency over existing baselines. We also evaluate on four simulation benchmarks (HM3D-v1, HM3D-v2, MP3D, and HM3D-OVON) and achieve state-of-the-art performance. To the best of our knowledge, this is the first system capable of reliably and efficiently completing object navigation at building scale.
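
The room-level decision loop and the conditional queries described above can be illustrated with a small sketch. This is not the SysNav implementation: the class names (`ObjectNode`, `RoomNode`, `Query`), the `near_radius` threshold, and the `toy_score` function are all hypothetical. In particular, `toy_score` is only a stand-in for the VLM's room-level reasoning, and the scene representation here is reduced to a flat list of rooms holding detected objects.

```python
from dataclasses import dataclass, field
from math import dist
from typing import Callable, Optional


@dataclass
class ObjectNode:
    label: str                                          # e.g. "trash can"
    position: tuple[float, float]                       # map-frame metres (assumed)
    attributes: set[str] = field(default_factory=set)   # e.g. {"blue"}


@dataclass
class RoomNode:
    name: str                                           # e.g. "lounge"
    explored: bool = False
    objects: list[ObjectNode] = field(default_factory=list)


@dataclass
class Query:
    label: str
    room: Optional[str] = None        # room condition, e.g. "classroom"
    attribute: Optional[str] = None   # self-attribute condition, e.g. "blue"
    near: Optional[str] = None        # spatial condition, e.g. "refrigerator"
    near_radius: float = 1.5          # assumed distance threshold in metres


def matches(query: Query, room: RoomNode, obj: ObjectNode) -> bool:
    """Check one detected object against every condition in the query."""
    if obj.label != query.label:
        return False
    if query.room is not None and room.name != query.room:
        return False
    if query.attribute is not None and query.attribute not in obj.attributes:
        return False
    if query.near is not None:
        # Spatial condition: some other object with the anchor label must be close by.
        return any(
            other.label == query.near
            and dist(other.position, obj.position) <= query.near_radius
            for other in room.objects
            if other is not obj
        )
    return True


def find_target(query: Query, rooms: list[RoomNode]):
    """Return the first (room, object) pair satisfying the query, or None."""
    for room in rooms:
        for obj in room.objects:
            if matches(query, room, obj):
                return room, obj
    return None


def next_room(
    query: Query,
    rooms: list[RoomNode],
    score_room: Callable[[Query, RoomNode], float],
) -> Optional[RoomNode]:
    """Pick the unexplored room with the highest score; None if all are explored."""
    candidates = [r for r in rooms if not r.explored]
    if not candidates:
        return None
    return max(candidates, key=lambda r: score_room(query, r))


def toy_score(query: Query, room: RoomNode) -> float:
    """Placeholder for the VLM: prefer a room whose name matches the room condition."""
    return 1.0 if room.name == query.room else 0.0
```

With a query such as `Query("trash can", room="classroom", attribute="blue")`, a blue trash can seen in a hallway does not satisfy `matches`, so `find_target` returns `None` and `next_room` selects the unexplored classroom. This mirrors the behaviour shown in the key moments below, where the robot passes over a matching object in the wrong room and continues.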

Wheeled Robot: Long-range Object Navigation

Key Moments

0:07

Robot finds a refrigerator, but not in a lounge, continues.

0:21

Robot transits to the next room selected by the VLM.

0:40

Robot transits to the next room selected by the VLM.

1:18

Robot finds the refrigerator in the lounge, completes.

Key Moments

0:16

Robot transits to the next room selected by the VLM.

0:48

Robot finds blue and grey trash cans, but not in a classroom, continues.

1:08

Robot aborts exploring the current room and transits to the classroom.

1:11

Robot finds the blue trash can in a classroom, completes.

Key Moments

0:07

Robot finds a microwave oven, but not near the fridge, continues.

0:29

Robot finds a microwave oven, but not near the fridge, continues.

0:49

Robot transits to the next room selected by the VLM.

1:03

Robot finds a microwave oven near the fridge, completes.

Wheeled Robot: Object Navigation

23 demos

Wheeled Robot: Object Navigation with Self-attribute Condition

10 demos

Wheeled Robot: Object Navigation with Spatial Condition

6 demos

Quadruped: Object Navigation

7 demos

Quadruped: Object Navigation with Self-attribute Condition

6 demos

Quadruped: Object Navigation with Spatial Condition

5 demos

Humanoid: Object Navigation

7 demos

Humanoid: Object Navigation with Self-attribute Condition

5 demos

Humanoid: Object Navigation with Spatial Condition

2 demos

Wheeled Robot: Semi-known Environment Object Navigation

Key Moments

0:10

Target object is not in the environment during the mapping run.

0:21

Robot completes the mapping run.

0:36

A laptop is put in the meeting room after the mapping run.

0:57

Robot finds a laptop near the desk in the meeting room, completes.

Quantitative Results in the Real World

Real-world Quantitative Results

Quantitative Results in Simulation Benchmarks

Simulation Quantitative Results