REFERENCES
1. Li, J.; Yang, S. X. Digital twins to embodied artificial intelligence: review and perspective. Intell. Robot. 2025, 5, 202-27.
2. Chen, W.; Chi, W.; Ji, S.; et al. A survey of autonomous robots and multi-robot navigation: perception, planning and collaboration. Biomim. Intell. Robot. 2025, 5, 100203.
3. Bin, T.; Yan, H.; Wang, N.; Nikolić, M. N.; Yao, J.; Zhang, T. A survey on the visual perception of humanoid robot. Biomim. Intell. Robot. 2025, 5, 100197.
4. Fan, R.; Guo, S.; Bocus, M. J. Autonomous driving perception. Cham, Switzerland: Springer, 2023.
5. Huang, Y.; Fan, D.; Duan, H.; et al. Human-like dexterous manipulation for anthropomorphic five-fingered hands: a review. Biomim. Intell. Robot. 2025, 5, 100212.
6. Mon-Williams, R.; Li, G.; Long, R.; Du, W.; Lucas, C. G. Embodied large language models enable robots to complete complex tasks in unpredictable environments. Nat. Mach. Intell. 2025, 7, 592-601.
7. Liu, Z.; Zhao, W.; Jia, N.; Liu, X.; Yang, J. SANet: scale-adaptive network for lightweight salient object detection. Intell. Robot. 2024, 4, 503-23.
8. He, H.; Liao, R.; Li, Y. MSAFNet: a novel approach to facial expression recognition in embodied AI systems. Intell. Robot. 2025, 5, 313-32.
9. Zhuang, T.; Liang, X.; Xue, B.; Tang, X. An in-vehicle real-time infrared object detection system based on deep learning with resource-constrained hardware. Intell. Robot. 2024, 4, 276-92.
10. Zhang, C.; Chen, J.; Li, J.; Peng, Y.; Mao, Z. Large language models for human-robot interaction: a review. Biomim. Intell. Robot. 2023, 3, 100131.
11. Liu, H.; Dong, Y.; Hou, C.; et al. Parallel implementation for real-time visual SLAM systems based on heterogeneous computing. Intell. Robot. 2024, 4, 256-75.
12. Chen, Q.; Wang, C.; Wang, D.; Zhang, T.; Li, W.; He, X. Lifelong knowledge editing for vision language models with low-rank mixture-of-experts. arXiv 2024, arXiv:2411.15432.
13. Huang, J.; Li, J.; Jia, N.; et al. RoadFormer+: delivering RGB-X scene parsing through scale-aware information decoupling and advanced heterogeneous feature fusion. IEEE Trans. Intell. Veh. 2025, 10, 3156-65.
14. Liu, Y.; Chen, Q.; Albanie, S. Adaptive cross-modal prototypes for cross-domain visual-language retrieval. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR); 2021. pp 14954-64. https://openaccess.thecvf.com/content/CVPR2021/papers/Liu_Adaptive_Cross (accessed 2025-11-24).
15. Guo, S.; Long, Z.; Wu, Z.; Chen, Q.; Pitas, I.; Fan, R. LIX: implicitly infusing spatial geometric prior knowledge into visual semantic segmentation for autonomous driving. IEEE Trans. Image Process. 2025, 34, 7250-63.
16. Liu, C.; Chen, Q.; Fan, R. Playing to vision foundation model’s strengths in stereo matching. IEEE, 2024; pp 1-12.