Journal of Astronautics ›› 2022, Vol. 43 ›› Issue (6): 802-810. doi: 10.3873/j.issn.1000-1328.2022.06.011

• Guidance, Navigation, Control and Electronics •

A Method for Autonomous Obstacle Avoidance and Target Tracking of Unmanned Aerial Vehicle

JIANG Weilai, XU Guoqiang, WANG Yaonan

  1. College of Electrical and Information Engineering, Hunan University, Changsha 410082, China; 2. National Engineering Research Center for Robot Visual Perception and Control Technology, Hunan University, Changsha 410082, China
  • Received: 2021-07-20 Revised: 2022-01-03 Online: 2022-06-15 Published: 2022-06-15
  • Funding:
    National Natural Science Foundation of China (61903133, 61733004); Key Research and Development Program of Jiangsu Province (BE20200821); National Key Research and Development Program of China (2021YFC1910400)

A Method for Autonomous Obstacle Avoidance and Target Tracking of Unmanned Aerial Vehicle

JIANG Weilai, XU Guoqiang, WANG Yaonan   

  1. College of Electrical and Information Engineering, Hunan University, Changsha 410082, China; 2. National Engineering Research Center for Robot Visual Perception and Control Technology, Hunan University, Changsha 410082, China
  • Received: 2021-07-20 Revised: 2022-01-03 Online: 2022-06-15 Published: 2022-06-15

Abstract: To address the problem of autonomous obstacle avoidance and target tracking for unmanned aerial vehicles (UAVs), a multiple-experience-pool deep Q-network (MP DQN) algorithm is proposed on the basis of the deep Q-network (DQN) algorithm, improving the success rate of UAV obstacle avoidance and tracking as well as the convergence of the algorithm. Furthermore, the UAV is endowed with environmental perception capability, and a directional reward-penalty function is designed within the reward function, which improves the UAV's generalization to the environment and the overall performance of the algorithm. Simulation results show that, compared with the DQN and double DQN (DDQN) algorithms, the MP DQN algorithm achieves faster convergence, shorter tracking paths, and stronger environmental adaptability.

Keywords: Unmanned aerial vehicle (UAV), Deep reinforcement learning, Autonomous obstacle avoidance, Target tracking, Environmental perception

Abstract: To address the problem of autonomous obstacle avoidance and target tracking for unmanned aerial vehicles (UAVs), a multiple-experience-pool deep Q-network (MP DQN) algorithm is proposed on the basis of the deep Q-network (DQN) algorithm, which optimizes the success rate of UAV obstacle avoidance and target tracking and the convergence of the algorithm. Furthermore, the UAV is endowed with environmental perception capability, and a directional reward-penalty function is designed within the reward function, which improves the generalization ability of the UAV to the environment and the overall performance of the algorithm. The simulation results show that, compared with the DQN and double DQN (DDQN) algorithms, the MP DQN algorithm has faster convergence speed, shorter tracking paths, and stronger environmental adaptability.
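As a rough illustration of the two ideas named in the abstract, multiple experience pools and a directional term in the reward, here is a minimal Python sketch. The abstract does not specify how MP DQN partitions its experience pools or how the directional reward-penalty term is shaped, so the outcome-based routing, the pool names, and the cosine-of-heading reward below are assumptions for illustration only, not the authors' method.

```python
import math
import random
from collections import deque

class MultiPoolReplayBuffer:
    """Replay memory split into several pools sampled jointly.

    Hypothetical sketch: transitions are routed by episode outcome so
    that rare but informative experiences (e.g. collisions) are not
    crowded out of the training batch by ordinary flight steps.
    """

    def __init__(self, capacity_per_pool=10000,
                 pools=("success", "collision", "ordinary")):
        self.pools = {name: deque(maxlen=capacity_per_pool) for name in pools}

    def store(self, transition, outcome="ordinary"):
        # transition = (state, action, reward, next_state, done)
        self.pools[outcome].append(transition)

    def sample(self, batch_size):
        # Split the batch roughly evenly across non-empty pools
        # (sampling with replacement within each pool).
        nonempty = [p for p in self.pools.values() if p]
        batch = []
        for i, pool in enumerate(nonempty):
            share = batch_size // len(nonempty)
            if i < batch_size % len(nonempty):
                share += 1
            batch.extend(random.choices(pool, k=share))
        random.shuffle(batch)
        return batch

def directional_reward(pos, velocity, target, k=0.1):
    """Hypothetical directional reward-penalty term: positive when the
    UAV's velocity points toward the target, negative when it points away,
    scaled by the cosine of the angle between heading and line-of-sight."""
    dx, dy = target[0] - pos[0], target[1] - pos[1]
    dist = math.hypot(dx, dy)
    speed = math.hypot(velocity[0], velocity[1])
    if dist == 0.0 or speed == 0.0:
        return 0.0
    cos_angle = (velocity[0] * dx + velocity[1] * dy) / (dist * speed)
    return k * cos_angle
```

In this sketch the per-pool batch shares are fixed; a real implementation would likely tune the sampling ratios, and the directional term would be added to the usual distance- and collision-based rewards.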

Key words: Unmanned aerial vehicle (UAV), Deep reinforcement learning, Autonomous obstacle avoidance, Target tracking, Environmental perception

CLC Number: