Q-learning method (Q-learning algorithm)
To shorten the learning time, the reward values in the Q-learning method are not held constant; MFQLA tunes them according to the current state.
For reinforcement learning control in continuous spaces, a Q-learning method based on a self-organizing fuzzy RBF (radial basis function) network is proposed.
To reduce the delay of vehicles passing through intersections, a control strategy is built with a cloud model, and some parameters of the control model are improved by the Q-learning method.