Journal of Astronautics ›› 2021, Vol. 42 ›› Issue (6): 757-765. doi: 10.3873/j.issn.1000-1328.2021.06.009


Multi-UAV Cooperative Autonomous Navigation Based on Multi-agent Deep Deterministic Policy Gradient

LI Bo, YUE Kai-qiang, GAN Zhi-gang, GAO Pei-xin

  1. School of Electronics and Information, Northwestern Polytechnical University, Xi’an 710114, China
  • Received: 2020-07-17 Revised: 2020-11-04 Online: 2021-06-15 Published: 2021-07-23

Abstract: Traditional optimization algorithms struggle to produce the desired results within a short time when applied to multi-UAV (unmanned aerial vehicle) task decision-making. To address this problem, this paper proposes a multi-agent deep deterministic policy gradient (MADDPG) algorithm based on deep reinforcement learning. The algorithm allows UAVs to use global information during learning while relying only on local information when making decisions in application. The model structure of the MADDPG algorithm is designed. Finally, simulation experiments and a comparison with the deep deterministic policy gradient (DDPG) algorithm verify that the proposed MADDPG algorithm greatly improves learning speed while maintaining accuracy, and compensates for the shortcomings of traditional reinforcement learning algorithms in multi-agent settings.
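The split between global information in learning and local information in decision making can be illustrated with a minimal sketch. It is not the model structure designed in the paper: the number of UAVs, the observation and action sizes, the single linear layers, and all class and function names below are hypothetical choices made only for illustration, and no gradient updates are shown.

```python
import math
import random

N_AGENTS = 3   # hypothetical number of UAVs
OBS_DIM = 4    # hypothetical size of each UAV's local observation
ACT_DIM = 2    # hypothetical size of each UAV's action


def make_linear(n_in, n_out, rng):
    """Return a random weight matrix with n_out rows and n_in columns."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]


def linear(weights, x):
    """Multiply a weight matrix by a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


class Actor:
    """Decentralized policy: sees only its own local observation."""

    def __init__(self, rng):
        self.w = make_linear(OBS_DIM, ACT_DIM, rng)

    def act(self, local_obs):
        # tanh keeps every action component inside [-1, 1]
        return [math.tanh(a) for a in linear(self.w, local_obs)]


class CentralCritic:
    """Centralized critic: sees every agent's observation and action."""

    def __init__(self, rng):
        self.w = make_linear(N_AGENTS * (OBS_DIM + ACT_DIM), 1, rng)

    def q_value(self, all_obs, all_actions):
        # Concatenate all observations, then all actions, into one vector
        joint = [v for obs in all_obs for v in obs]
        joint += [v for act in all_actions for v in act]
        return linear(self.w, joint)[0]


rng = random.Random(0)
actors = [Actor(rng) for _ in range(N_AGENTS)]
critics = [CentralCritic(rng) for _ in range(N_AGENTS)]

# One hypothetical joint observation (random placeholder values)
all_obs = [[rng.uniform(-1, 1) for _ in range(OBS_DIM)] for _ in range(N_AGENTS)]

# Application: each UAV acts from its own local observation only
all_actions = [actor.act(obs) for actor, obs in zip(actors, all_obs)]

# Learning: each critic scores the joint observation-action vector
q_values = [critic.q_value(all_obs, all_actions) for critic in critics]
```

In this sketch the critics exist only on the learning side, so once training is finished each UAV would need nothing beyond its own `Actor` and its own observation to choose an action.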


Key words: UAV, Task decision-making, Deep reinforcement learning, Policy gradient, Multi-agent
