Path planning method of snake-like robot based on improved DDPG algorithm
    Abstract:

    When a snake-like robot performs path planning in a complex, multi-obstacle environment, traditional reinforcement learning algorithms train slowly and tend to fall into dead zones, which slows convergence. To address these problems, an improved deep deterministic policy gradient (DDPG) algorithm is proposed. First, a multi-layer long short-term memory (LSTM) neural network model is introduced into the actor-critic network to control how much of the information in the experience pool is remembered or forgotten. Second, a central pattern generator (CPG) network is integrated into the reinforcement learning model by optimizing its feature parameters, and a new network state space and reward function are designed. Finally, the improved algorithm and the traditional algorithm are each deployed in the Webots environment for simulation experiments. The results show that, compared with the traditional algorithm, the improved algorithm reduces overall training time by 15% on average and the number of iterations needed to reach the target point by 22% on average; it also falls into dead zones less often while moving and converges noticeably faster. The proposed algorithm can therefore effectively guide the snake-like robot around obstacles, offering a new approach to path planning in complex environments.
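
    To make the idea of a CPG driven by a learned policy concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not give the CPG equations or the action space, so everything here is an assumption for illustration: a serpenoid-style sinusoidal gait stands in for the CPG network, and the names `GaitParams`, `action_to_params`, and `joint_angles`, along with all parameter ranges, are hypothetical.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class GaitParams:
    """Hypothetical CPG feature parameters that a policy could output."""
    amplitude: float  # joint swing amplitude, in radians
    frequency: float  # oscillation frequency, in Hz
    phase_lag: float  # phase difference between adjacent joints, in radians
    bias: float       # constant joint offset used for steering, in radians


def action_to_params(action: List[float]) -> GaitParams:
    """Map a 3-element action to bounded gait parameters.

    Each action component is clipped to [-1, 1] first. The numeric
    ranges below are assumed for illustration only.
    """
    a_amp, a_lag, a_bias = (max(-1.0, min(1.0, a)) for a in action)
    return GaitParams(
        amplitude=0.3 + 0.3 * (a_amp + 1.0) / 2.0,  # 0.3 to 0.6 rad
        frequency=1.0,                              # held fixed in this sketch
        phase_lag=0.4 + 0.4 * (a_lag + 1.0) / 2.0,  # 0.4 to 0.8 rad
        bias=0.2 * a_bias,                          # -0.2 to 0.2 rad
    )


def joint_angles(params: GaitParams, t: float, n_joints: int) -> List[float]:
    """Return serpenoid-style target angles for every joint at time t."""
    omega = 2.0 * math.pi * params.frequency
    return [
        params.amplitude * math.sin(omega * t + i * params.phase_lag) + params.bias
        for i in range(n_joints)
    ]


if __name__ == "__main__":
    # A neutral action gives mid-range parameters and a straight-ahead gait.
    params = action_to_params([0.0, 0.0, 0.0])
    for step in range(3):
        t = step * 0.1
        print([round(a, 3) for a in joint_angles(params, t, n_joints=4)])
```

    In a setup like this, the policy only chooses a few gait parameters per step instead of commanding every joint directly, which keeps the action space small. Whether the paper parameterizes its CPG this way is not stated in the abstract.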

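Cite this article

郝崇清, 任博恒, 赵庆鹏, 侯宝帅, 白彤, 武晓晶, 樊劲辉. Path planning method of snake-like robot based on improved DDPG algorithm[J]. Journal of Hebei University of Science and Technology, 2023, 44(2): 165-176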

History
  • Received: 2022-12-21
  • Last revised: 2023-02-25
  • Published online: 2023-05-11