Human–AI Co-Navigation for Indoor Object Search under Uncertainty
Abstract
Assistive technologies for people with visual impairments increasingly use artificial intelligence to support object finding and navigation in indoor environments. Yet fully autonomous perception remains unreliable in such settings, because indoor spaces are visually complex, only partially observable from the user’s current viewpoint, and subject to continuous change. We take the position that effective assistive navigation is inherently collaborative: the system performs continuous perceptual processing, while the user provides occasional natural-language guidance when the search becomes uncertain or inefficient. To this end, we propose a human–AI collaboration framework that uses a Vision-Language Model (VLM) as the perceptual and semantic backbone of a navigation agent. A human user, modeled by a simulated intervention controller, provides sparse, structured guidance, which is integrated with the VLM to steer its semantic search hypotheses toward the likely location of the target object. We evaluate the framework in the Habitat simulator on photorealistic scenes from the Habitat-Matterport3D dataset. The experiments analyze how human guidance affects task success and navigation efficiency. They show that guidance is most effective when it corrects the VLM's misaligned semantic search hypotheses, and they offer insight into the role of minimal human input in VLM-based assistive navigation systems.
Keywords: Assistive Technology, Human-AI Collaboration, Vision-language Models, Indoor Navigation
DOI: 10.54941/ahfe1007370
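The collaboration loop described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the semantic search hypotheses can be represented as a probability distribution over named regions, that the simulated intervention controller fires when that distribution is too uncertain (high entropy) or the search has stalled, and that guidance simply reweights the hinted region. All function names, thresholds, and the boost factor are hypothetical.

```python
import math


def normalize(weights):
    """Scale non-negative weights so they sum to 1."""
    total = sum(weights.values())
    return {region: w / total for region, w in weights.items()}


def entropy(belief):
    """Shannon entropy (in nats) of a belief over regions."""
    return -sum(p * math.log(p) for p in belief.values() if p > 0)


def needs_guidance(belief, steps_without_progress,
                   entropy_threshold=1.0, patience=20):
    """Hypothetical trigger: ask the user when the search is uncertain
    (high entropy) or inefficient (no progress for `patience` steps)."""
    return (entropy(belief) > entropy_threshold
            or steps_without_progress >= patience)


def apply_guidance(belief, hinted_region, boost=4.0):
    """Hypothetical update: upweight the region named by the user,
    then renormalize. Unknown regions leave the belief unchanged."""
    reweighted = {
        region: p * (boost if region == hinted_region else 1.0)
        for region, p in belief.items()
    }
    return normalize(reweighted)


# Assumed starting point: no preference among four regions.
belief = normalize({
    "kitchen": 1.0, "bedroom": 1.0, "living room": 1.0, "bathroom": 1.0,
})

# Sparse intervention: guidance is requested only when the trigger fires.
if needs_guidance(belief, steps_without_progress=0):
    belief = apply_guidance(belief, "kitchen")  # e.g. "try the kitchen"

best_region = max(belief, key=belief.get)
```

In this toy setup the uniform starting belief exceeds the entropy threshold, so a single hint is requested and the hinted region becomes the top-ranked hypothesis. A multiplicative reweighting is only one possible design; it keeps the other hypotheses alive so that a mistaken hint can still be recovered from.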


AHFE Open Access