Q-learning gives the agent a high degree of intelligence: by improving its own learning ability, the agent can adapt to a constantly changing dynamic environment.
A method is proposed that uses Q-learning to solve the adaptive dispatching rule selection problem in a dynamic single-machine scheduling environment.
The four main algorithms (dynamic programming, the Monte Carlo method, temporal-difference learning, and Q-learning) are then presented in turn, and their differences and relationships are pointed out.