Object Navigation

1Carnegie Mellon University, 2New York University, 3Nanyang Technological University
Example image

Contributions

  • We propose a cross-embodiment system (supporting wheeled robots, quadrupeds, and humanoids) that incrementally builds a multi-layer representation of the environment, including room, viewpoint, and object layers, enabling the VLM to make more informed decisions during object navigation (see the first sketch after this list).
  • We design an efficient two-stage navigation policy on top of this representation, combining high-level planning guided by the VLM's reasoning with VLM-assisted low-level exploration (see the second sketch after this list).
  • Our system not only supports standard object navigation but also enables conditional object navigation—such as navigation conditioned on object attributes or spatial relations—through its multi-layer representation.
  • We conduct extensive real-world evaluations, including three long-range tests spanning an entire building floor and 75 unit tests conducted within multi-room environments (51 on wheeled robots, 18 on quadrupeds, and 6 on humanoids).
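
The multi-layer representation can be pictured roughly as follows. This is a minimal Python sketch, assuming dataclass-style room, viewpoint, and object nodes plus a text summary handed to the VLM; all class names and fields here are illustrative assumptions, not the system's actual data structures.

```python
from dataclasses import dataclass, field

# Hypothetical illustration of the three-layer scene representation
# (room -> viewpoint -> object); names and fields are assumptions.

@dataclass
class ObjectNode:
    label: str                                      # e.g. "red backpack"
    position: tuple                                 # (x, y) in the map frame
    attributes: list = field(default_factory=list)  # e.g. ["red", "on the sofa"]

@dataclass
class ViewpointNode:
    pose: tuple                                     # (x, y, yaw) where the image was taken
    objects: list = field(default_factory=list)     # ObjectNode instances seen from here

@dataclass
class RoomNode:
    name: str                                       # e.g. "kitchen", proposed by the VLM
    viewpoints: list = field(default_factory=list)

class SceneGraph:
    """Incrementally built, three-layer representation of the environment."""

    def __init__(self):
        self.rooms = []  # list of RoomNode

    def add_observation(self, room_name, pose, detections):
        """Insert a new viewpoint (and its detected objects) under the matching room."""
        room = next((r for r in self.rooms if r.name == room_name), None)
        if room is None:
            room = RoomNode(name=room_name)
            self.rooms.append(room)
        viewpoint = ViewpointNode(
            pose=pose,
            objects=[ObjectNode(label=label, position=pos) for label, pos in detections],
        )
        room.viewpoints.append(viewpoint)
        return viewpoint

    def summarize(self):
        """Compact text summary passed to the VLM for high-level planning."""
        lines = []
        for room in self.rooms:
            labels = {o.label for vp in room.viewpoints for o in vp.objects}
            lines.append(f"{room.name}: {', '.join(sorted(labels)) or 'nothing seen yet'}")
        return "\n".join(lines)


if __name__ == "__main__":
    graph = SceneGraph()
    graph.add_observation("kitchen", (1.0, 2.0, 0.0), [("refrigerator", (1.5, 2.5))])
    graph.add_observation("living room", (4.0, 1.0, 1.57),
                          [("sofa", (4.5, 1.2)), ("tv", (5.0, 0.8))])
    print(graph.summarize())
```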
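
The two-stage policy can likewise be compressed into a small loop: high-level room selection driven by the VLM's reasoning over the graph summary, followed by VLM-assisted local exploration. The `query_vlm`, `move_to`, and `explore_and_detect` callables below are hypothetical placeholders, not the interfaces of the real system.

```python
def navigate_to_object(target, graph, query_vlm, move_to, explore_and_detect):
    """Sketch of a two-stage loop: VLM-guided planning, then local exploration."""
    while True:
        # Stage 1: high-level planning. The VLM reasons over the room/object
        # summary and names the most promising room to search next.
        prompt = (f"Target object: {target}\n"
                  f"Known rooms and objects:\n{graph.summarize()}\n"
                  f"Which room should the robot search next?")
        chosen_room = query_vlm(prompt).strip()

        # Stage 2: low-level exploration. Drive toward the chosen room and
        # scan viewpoints (updating the graph) until the target is detected.
        move_to(chosen_room)
        found_pose = explore_and_detect(chosen_room, target)
        if found_pose is not None:
            return found_pose
```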

Wheeled Robot: Long-range Object Navigation

Wheeled Robot: Object Navigation

35 demos - Click any video to view in detail

Wheeled Robot: Object Navigation with Self-attribute Condition

10 demos - Click any video to view in detail

Wheeled Robot: Object Navigation with Spatial Condition

6 demos - Click any video to view in detail

Quadruped: Object Navigation

7 demos - Click any video to view in detail

Quadruped: Object Navigation with Self-attribute Condition

6 demos - Click any video to view in detail

Quadruped: Object Navigation with Spatial Condition

5 demos - Click any video to view in detail

Humanoid: Object Navigation

Demos coming soon!
