A bio-inspired quadruped locomotion framework that separates proprioceptive kinesthetics from visuospatial terrain reasoning, enabling dynamic traversal and graceful fallback under corrupted or unavailable vision.
KiVi enables robust locomotion and obstacle avoidance on a DeepRobotics Lite3 quadruped across diverse terrains and under severe visual disturbances.
Vision-based locomotion has shown great promise in enabling legged robots to perceive and adapt to complex environments. However, visual information is inherently fragile: occlusions, reflections, and lighting changes can corrupt it and often destabilize locomotion. Inspired by animal sensorimotor integration, we propose KiVi, a Kinesthetic-Visuospatial integration framework in which kinesthetics encodes proprioceptive sensing of body motion and visuospatial reasoning captures visual perception of the surrounding terrain. KiVi separates these pathways, leveraging proprioception as a stable backbone while selectively incorporating vision for terrain awareness and obstacle avoidance. Combined with memory-enhanced attention, this design allows robust interpretation of visual cues while maintaining fallback stability through proprioception. Experiments show that KiVi enables quadruped robots to traverse diverse terrains and operate reliably in unstructured outdoor environments, while remaining robust to visual noise and occlusion not seen during training.
KiVi uses a dual-branch estimator with a Kinesthetic Module for proprioceptive body-motion sensing and a Visuospatial Module for visual terrain reasoning.
The kinesthetic branch provides a stable locomotion backbone, while the visuospatial branch uses memory-enhanced attention to reconstruct terrain structure and anticipate obstacles. Their latent representations are integrated by the downstream actor for dynamic, terrain-aware control.
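The data flow described above can be illustrated with a minimal sketch. It is not the authors' implementation: the class name `DualBranchPolicy`, all dimensions, and the single-layer encoders are assumptions chosen for brevity, and the memory-enhanced attention of the visuospatial branch is omitted entirely. The sketch only shows the separation itself: each branch encodes its own input into a latent, and the actor consumes both latents together with the proprioceptive observation.

```python
import math
import random


def init_weights(rows, cols, rng):
    """Small random weight matrix stored as a list of rows."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def linear(weights, x):
    """Dense layer without bias: one output per weight row."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class DualBranchPolicy:
    """Hypothetical two-branch estimator followed by an actor.

    All sizes below are illustrative assumptions, not values from the paper.
    """

    def __init__(self, proprio_dim=45, depth_dim=64, latent_dim=16,
                 action_dim=12, seed=0):
        rng = random.Random(seed)
        self.w_kin = init_weights(latent_dim, proprio_dim, rng)  # kinesthetic branch
        self.w_vis = init_weights(latent_dim, depth_dim, rng)    # visuospatial branch
        self.w_act = init_weights(action_dim, proprio_dim + 2 * latent_dim, rng)

    def encode(self, proprio, depth):
        """Encode each modality separately; neither latent sees the other input."""
        z_kin = [math.tanh(v) for v in linear(self.w_kin, proprio)]
        z_vis = [math.tanh(v) for v in linear(self.w_vis, depth)]
        return z_kin, z_vis

    def act(self, proprio, depth):
        """The actor integrates both latents with the proprioceptive observation."""
        z_kin, z_vis = self.encode(proprio, depth)
        return linear(self.w_act, proprio + z_kin + z_vis)
```

Because the kinesthetic latent is computed from proprioception alone, a corrupted depth input cannot change it in this sketch. How the actual system weights or suppresses the visuospatial latent under corruption is not specified here.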
KiVi is evaluated in simulation and on DeepRobotics Lite3 hardware across visual corruption, terrain traversability, and outdoor disturbance tests.
Training spans stairs, platforms, random rough terrain, slopes, gaps, and high walls with increasing procedural difficulty.
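As an illustration of what "increasing procedural difficulty" can mean in practice, the sketch below uses a promotion and demotion rule of the kind common in legged-locomotion training: a robot that covers most of its terrain tile moves to a harder level, and one that covers little moves to an easier one. Whether KiVi uses this rule is not stated here; the function name `update_level`, the thresholds, and the number of levels are all assumptions.

```python
# Terrain types named in the text; the ordering carries no meaning.
TERRAINS = ("stairs", "platforms", "random rough", "slopes", "gaps", "high walls")


def update_level(level, distance_travelled, terrain_length, max_level=9):
    """Return the next difficulty level for one robot after an episode.

    The thresholds (0.5 and 0.25 of the tile length) are hypothetical.
    """
    if distance_travelled > 0.5 * terrain_length:
        return min(level + 1, max_level)   # promote, capped at the hardest level
    if distance_travelled < 0.25 * terrain_length:
        return max(level - 1, 0)           # demote, floored at the easiest level
    return level                           # otherwise stay at the same level
```

A training loop would call `update_level` once per robot at episode reset and use the returned level to pick the next terrain tile.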
With a constant forward command, the robot traverses tree roots, stairs, elevated platforms, and dynamic pedestrian scenarios.
Under tall grass and complete camera occlusion, KiVi maintains stable locomotion by falling back to proprioceptive control.
Compared with a fused visual-proprioceptive baseline, KiVi keeps joint power and variance closer to the blind locomotion baseline under severe visual disturbances, indicating stable and energy-efficient control.
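The comparison above relies on joint power and its variability. The exact definition used in the evaluation is not given here, so the following sketch assumes one common choice: absolute mechanical power summed over joints at each time step, with mean and variance taken over time. The function names and the log layout (one list of per-joint values per time step) are assumptions.

```python
from statistics import mean, pvariance


def joint_power(torques, velocities):
    """Absolute mechanical power summed over joints: sum of |tau_j * qdot_j|."""
    return sum(abs(t * v) for t, v in zip(torques, velocities))


def power_stats(torque_log, velocity_log):
    """Mean and population variance of total joint power over a trajectory."""
    powers = [joint_power(t, v) for t, v in zip(torque_log, velocity_log)]
    return mean(powers), pvariance(powers)
```

Under this definition, a policy that reacts erratically to corrupted depth input would be expected to show both higher mean power and higher variance than a blind baseline on the same terrain.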
Reflective surfaces create structured depth artifacts, yet KiVi maintains stable locomotion.
@inproceedings{li2026kivi,
  title={KiVi: Kinesthetic-Visuospatial Integration for Dynamic and Safe Egocentric Legged Locomotion},
  author={Li, Peizhuo and Li, Hongyi and Ma, Yuxuan and Chang, Linnan and Yang, Xinrong and Yu, Ruiqi and Liao, Shuhao and Zhang, Yifeng and Cao, Yuhong and Zhu, Qiuguo and Sartoretti, Guillaume},
  booktitle={IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)},
  year={2026},
  eprint={2509.23650},
  archivePrefix={arXiv},
  primaryClass={cs.RO}
}