
Enhancing Vision-Language Model Reasoning for
Object Navigation via Active 3D Gaussian Splatting

Wancai Zheng1, Hao Chen2, Xianlong Lu1, Linlin Ou1, Xinyi Yu1*
1Zhejiang University of Technology.
2Zhejiang University.
*Corresponding author

Abstract

Object navigation is a core capability of embodied intelligence, enabling an agent to locate target objects in unknown environments. Recent advances in vision–language models (VLMs) have facilitated zero-shot object navigation (ZSON). However, existing methods often rely on scene abstractions that convert environments into semantic maps or textual representations, causing high-level decision making to be constrained by the accuracy of low-level perception. In this work, we present 3DGSNav, a novel ZSON framework that embeds 3D Gaussian Splatting (3DGS) as persistent memory for VLMs to enhance spatial reasoning. Through active perception, 3DGSNav incrementally constructs a 3DGS representation of the environment, enabling trajectory-guided free-viewpoint rendering of frontier-aware first-person views. Moreover, we design structured visual prompts and integrate them with Chain-of-Thought (CoT) prompting to further improve VLM reasoning. During navigation, a real-time object detector filters potential targets, while VLM-driven active viewpoint switching performs target re-verification, ensuring efficient and reliable recognition. Extensive evaluations across multiple benchmarks and real-world experiments on a quadruped robot demonstrate that our method achieves robust and competitive performance against state-of-the-art approaches.
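
The two-stage recognition described in the abstract (a fast detector filters candidates, then a VLM re-verifies them) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the Detection type, the confirm_target function, the score threshold, and the verify callback (a stand-in for VLM-driven viewpoint switching and re-verification) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Detection:
    """Hypothetical detector output: a class label and a confidence score."""
    label: str
    score: float


def confirm_target(
    detections: Iterable[Detection],
    target: str,
    verify: Callable[[Detection], bool],
    score_threshold: float = 0.5,  # assumed value, not taken from the paper
) -> Optional[Detection]:
    """Return the first detection that passes both stages, or None."""
    for det in detections:
        # Stage 1: cheap filter on the detector output.
        if det.label != target or det.score < score_threshold:
            continue
        # Stage 2: costly re-verification, run only on surviving candidates.
        if verify(det):
            return det
    return None


# Usage with a stub verifier that records how often it is called.
calls = []


def stub_verify(det: Detection) -> bool:
    calls.append(det)
    return True


candidates = [Detection("chair", 0.9), Detection("sofa", 0.3), Detection("sofa", 0.8)]
result = confirm_target(candidates, "sofa", stub_verify)
```

In this sketch the expensive verifier runs only on the one candidate that survives the detector filter, which is the efficiency argument the abstract makes for combining the two stages.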

BibTeX


@misc{zheng20263dgsnavenhancingvisionlanguagemodel,
  title={3DGSNav: Enhancing Vision-Language Model Reasoning for Object Navigation via Active 3D Gaussian Splatting},
  author={Wancai Zheng and Hao Chen and Xianlong Lu and Linlin Ou and Xinyi Yu},
  year={2026},
  eprint={2602.12159},
  archivePrefix={arXiv},
  primaryClass={cs.RO},
  url={https://arxiv.org/abs/2602.12159},
}