SysNav: Multi-Level Systematic Cooperation Enables Real-World, Cross-Embodiment Object Navigation

¹Carnegie Mellon University, ²New York University, ³Nanyang Technological University

Contributions

  • We propose SysNav, a three-level object navigation system that decouples semantic reasoning, navigation planning, and motion control. This decoupling lets each component focus on its own strengths and enables cross-embodiment generalization across wheeled robots, quadrupeds, and humanoids.
  • We design a hierarchical navigation strategy that treats rooms as the minimal decision-making units. The VLM performs high-level semantic reasoning over a structured scene representation to make room-level decisions, while efficient classical exploration methods handle in-room navigation. This combines the VLM's semantic strengths with the spatial structure of indoor environments.
  • Beyond standard object navigation, our system supports conditional object navigation, such as navigation conditioned on object attributes or spatial relations, through its structured scene representation.
  • We conduct extensive evaluations, including 190 real-world experiments across three robot embodiments, achieving a 4-5x improvement in navigation efficiency over existing baselines. We also evaluate on four simulation benchmarks (HM3D-v1, HM3D-v2, MP3D, and HM3D-OVON), achieving state-of-the-art performance. To the best of our knowledge, this is the first system capable of reliably and efficiently completing object navigation at building scale.
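
The room-level strategy and the conditional queries described above can be illustrated with a small toy sketch. This is not the SysNav implementation: every name here (`SceneObject`, `Room`, `Query`, `keyword_room_scorer`, `explore_room`, `navigate`) is hypothetical, the VLM is replaced by a hard-coded lookup table, and classical in-room exploration is replaced by a stub that simply reveals every object in the chosen room. It only shows the control flow: pick the most promising unexplored room, explore it, and check the revealed objects against an optional attribute or spatial condition.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple


@dataclass
class SceneObject:
    label: str
    attributes: Set[str] = field(default_factory=set)  # e.g. {"silver"}
    near: Set[str] = field(default_factory=set)        # labels of nearby objects


@dataclass
class Room:
    name: str
    objects: List[SceneObject] = field(default_factory=list)
    explored: bool = False


@dataclass
class Query:
    label: str
    attribute: Optional[str] = None  # self-attribute condition
    near: Optional[str] = None       # spatial condition


def matches(obj: SceneObject, q: Query) -> bool:
    """Check an object against the target label and any optional conditions."""
    if obj.label != q.label:
        return False
    if q.attribute is not None and q.attribute not in obj.attributes:
        return False
    if q.near is not None and q.near not in obj.near:
        return False
    return True


def keyword_room_scorer(room: Room, q: Query) -> float:
    """Hypothetical stand-in for VLM room-level reasoning: a fixed prior table."""
    priors = {("laptop", "meeting room"): 0.9, ("laptop", "kitchen"): 0.1}
    return priors.get((q.label, room.name), 0.5)


def explore_room(room: Room, q: Query) -> Optional[SceneObject]:
    """Hypothetical stand-in for in-room exploration: reveals all objects at once."""
    room.explored = True
    return next((o for o in room.objects if matches(o, q)), None)


def navigate(
    rooms: List[Room],
    q: Query,
    scorer: Callable[[Room, Query], float] = keyword_room_scorer,
) -> Tuple[Optional[SceneObject], List[str]]:
    """Visit unexplored rooms in order of score until a matching object is found."""
    visited: List[str] = []
    while any(not r.explored for r in rooms):
        best = max((r for r in rooms if not r.explored), key=lambda r: scorer(r, q))
        visited.append(best.name)
        found = explore_room(best, q)
        if found is not None:
            return found, visited
    return None, visited


rooms = [
    Room("kitchen", [SceneObject("laptop", {"black"}, {"counter"})]),
    Room("meeting room", [SceneObject("laptop", {"silver"}, {"desk"})]),
]
# Spatial condition: "the laptop near the desk".
target, order = navigate(rooms, Query("laptop", near="desk"))
```

In this toy setup the room decision and the in-room search are separate functions, so either could be swapped out (for example, a real VLM call in place of `keyword_room_scorer`) without touching the other. That separation is the point of the sketch, not the specific scoring values.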

Wheeled Robot: Long-range Object Navigation

Wheeled Robot: Object Navigation

23 demos

Wheeled Robot: Object Navigation with Self-attribute Condition

10 demos

Wheeled Robot: Object Navigation with Spatial Condition

6 demos

Quadruped: Object Navigation

7 demos

Quadruped: Object Navigation with Self-attribute Condition

6 demos

Quadruped: Object Navigation with Spatial Condition

5 demos

Humanoid: Object Navigation

7 demos

Humanoid: Object Navigation with Self-attribute Condition

5 demos

Humanoid: Object Navigation with Spatial Condition

2 demos

Wheeled Robot: Semi-known Environment Object Navigation

Key Moments
Frame 1 (0:10): The target object is not in the environment during the mapping run.
Frame 2 (0:21): The robot completes the mapping run.
Frame 3 (0:36): A laptop is placed in the meeting room after the mapping run.
Frame 4 (0:57): The robot finds the laptop near the desk in the meeting room and completes the task.

Quantitative Results in the Real World

[Figure: Real-world quantitative results]

Quantitative Results in Simulation Benchmarks

[Figure: Simulation quantitative results]