Journal of Astronautics (宇航学报), 2022, Vol. 43, Issue 9: 1176-1185. doi: 10.3873/j.issn.1000-1328.2022.09.005

• Flight Vehicle Design and Mechanics •

基于深度强化学习的复杂地形适应机器人设计与实验

杨顿,杨帅,于洋,王琪   

  1. 北京航空航天大学航空科学与工程学院,北京 100191
  • 收稿日期:2022-02-12 修回日期:2022-06-12 出版日期:2022-09-15 发布日期:2022-09-15
  • Funding:
    National Science Fund for Excellent Young Scholars (12022212)

Design and Experiment of Complex Terrain Adaptive Robot Based on Deep Reinforcement Learning

YANG Dun, YANG Shuai, YU Yang, WANG Qi   

  1. School of Aeronautic Science and Engineering, Beihang University, Beijing 100191, China
  • Received: 2022-02-12 Revised: 2022-06-12 Online: 2022-09-15 Published: 2022-09-15

摘要: 针对行星表面轻量化自主探测任务,基于仿生思想设计了一种仿海胆结构的十二足球形机器人,其具备自主改变构型以贴合复杂地形的能力,可实现无倾覆、高容错的全向运动;基于数据驱动方法,对该机器人设计了一种数据高效的无模型强化学习运动策略,可实现无先验知识的从0到1步态训练以及步态的实物样机快速部署。通过在平面地形和非结构化地形中对其进行仿真实验,验证了经过训练的机器人具备自主运动、适应非结构地形等能力;通过与常用基准策略进行对比,证实了本文提出的运动策略具有训练高效、鲁棒性好的优势;最后通过开发原理样机,开展实物实验验证了仿真环境中所生成的步态在真实物理环境中的动力学可行性。

关键词: 仿生机器人, 强化学习, 复杂地形, 自主运动策略, 行星探测

Abstract: For lightweight autonomous exploration of planetary surfaces, a bio-inspired twelve-legged spherical robot modeled on the sea urchin is designed. The robot can autonomously change its configuration to conform to complex terrain, enabling omnidirectional motion that is free of overturning and highly fault tolerant. Using a data-driven approach, a data-efficient, model-free reinforcement learning locomotion strategy is designed for the robot; it supports gait training from scratch without prior knowledge and rapid deployment of the learned gaits on a physical prototype. Simulation experiments on flat and unstructured terrain verify that the trained robot can move autonomously and adapt to unstructured terrain. Comparison with commonly used baseline strategies shows that the proposed strategy trains efficiently and is robust. Finally, a proof-of-concept prototype is developed, and physical experiments verify that the gaits generated in simulation are dynamically feasible in the real physical environment.

Key words: Bionic robots, Reinforcement learning, Complex terrain, Autonomous movement strategies, Planetary exploration
