动态规划、蒙特卡罗算法、时序差分算法、Q-学习,并指出了它们之间的区别和联系。
Then the four main algorithms including dynamic programming, monte carlo method, temporal-difference and Q-learning are given respectively, and their difference and relation are pointed out.
动态规划、蒙特卡罗算法、时序差分算法、Q-学习,并指出了它们之间的区别和联系。
Then the four main algorithms including dynamic programming, monte carlo method, temporal-difference and Q-learning are given respectively, and their difference and relation are pointed out.
应用推荐