To reduce learning time, the reward values in the Q-learning method are not held constant; MFQLA tunes them according to the current state.
为了改善学习的时间,Q学习方法中的奖励值并不是固定的,而是根据状态而变化。
For reinforcement learning control in continuous spaces, a Q-learning method based on a self-organizing fuzzy RBF (radial basis function) network is proposed.
针对连续空间下的强化学习控制问题,提出了一种基于自组织模糊rbf网络的Q学习方法。
To reduce the delay of vehicles passing through intersections, control strategies are established using a cloud model, and some parameters of the control model are improved with the Q-learning method.
为了减少车辆通过路口的延误,采用云模型建立控制策略,运用Q-学习改进控制模型的参数。
The Q-learning method is used for intelligent planning of magnetic-nail paths, achieving shortest-path search while also handling task scheduling and obstacle avoidance.
采用Q学习方法进行磁钉路径的智能规划,实现最短路径寻找,同时解决了任务调度及避障等问题。
Simulation results show that the signal control method based on Q-learning outperforms fixed-time control, actuated control, and signal control based on genetic algorithms.
仿真实验的结果表明,基于Q-学习的信号控制方法优于定时控制、感应式控制和基于遗传算法的信号控制方法。
Then the four main algorithms, namely dynamic programming, the Monte Carlo method, temporal-difference learning, and Q-learning, are presented in turn, and their differences and relationships are pointed out.
然后分别给出了四种主要算法:动态规划、蒙特卡罗算法、时序差分算法、Q-学习,并指出了它们之间的区别和联系。
Q-learning is a typical reinforcement learning (RL) method whose convergence is slow, especially as the state space and action space grow.
Q学习是一种典型的强化学习,其学习效率较低,尤其是当状态空间和决策空间较大时。