学习是一种重要的强化学习算法。
讨论平均准则控制马氏链的强化学习算法。
Reinforcement learning algorithms for controlled Markov chains under the average-reward criterion are discussed.
论文主要研究了基于平均型强化学习算法的动态调度方法。
The thesis mainly studies dynamic scheduling methods based on average-reward reinforcement learning algorithms.
传统的强化学习算法只能解决离散状态空间和动作空间的学习问题。
Conventional reinforcement learning algorithms can only solve learning problems with discrete state spaces and discrete action spaces.
说明:模拟智能机器小车,通过强化学习算法,学习最优导航策略。
Description: a simulated intelligent robot car learns an optimal navigation policy through a reinforcement learning algorithm.
目前主流的强化学习算法是Q学习算法,但Q学习本身存在一些问题。
Q-learning is currently the mainstream reinforcement learning algorithm, but Q-learning itself has some problems.
多代理体技术实现了教学的个性化,强化学习算法使得教学策略具有智能化。
Multi-agent technology personalizes teaching in the ITS, and the reinforcement learning algorithm makes the teaching strategy intelligent.
主要研究了强化学习算法及其在机器人足球比赛技术动作学习问题中的应用。
This paper mainly studies reinforcement learning (RL) algorithms and their application to learning technical actions in robot soccer.
在理论分析的基础上,提出了协同博弈的强化学习算法,并证明了算法的收敛性。
On the basis of theoretical analysis, a cooperative-game reinforcement learning algorithm is proposed and its convergence is proved.
本文提出了基于过程奖赏和优先扫除的强化学习算法作为多机器人系统的冲突消解策略。
This paper presents a reinforcement learning algorithm based on process reward and prioritized sweeping as a conflict resolution strategy for multi-robot systems.
将Q强化学习算法应用于移动机器人局部路径规划,解决了移动机器人在复杂环境中的局部路径规划问题。
The Q reinforcement learning algorithm is applied to local path planning for mobile robots, solving the local path planning problem of mobile robots in complex environments.
论文提出一种模糊强化学习算法,通过模糊推理系统将连续的状态空间映射到连续的动作空间,然后通过学习得到一个完整的规则库。
This paper proposes a fuzzy reinforcement learning algorithm that maps a continuous state space to a continuous action space through a fuzzy inference system and then obtains a complete rule base through learning.
这种方法可以削减学习单元的冗余状态信息,降低学习空间的组合强度,加快群体强化学习算法的学习速度。
This method can cut down the redundant state information of the learning units, reduce the combinatorial intensity of the learning space, and speed up learning in the group reinforcement learning algorithm.
为了提高智能体系统中的典型的强化学习——Q -学习的学习速度和收敛速度,使学习过程充分利用环境信息,本文提出了一种基于经验知识的Q -学习算法。
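To improve the learning speed and convergence rate of Q-learning, a typical reinforcement learning method in agent systems, and to let the learning process make full use of environmental information, this paper proposes a Q-learning algorithm based on experience knowledge.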
该算法采用强化学习中值迭代策略,在运行中能够从环境中获取相应知识,提高其搜索能力。
By adopting the value iteration strategy of reinforcement learning, the algorithm can acquire relevant knowledge from its environment while running and thereby improve its search ability.
讨论了学习社会行为的可行性和必要性,并采用强化学习方法,给出了多机器人传接合作搬运的详细算法实现。
The feasibility and necessity of learning social behavior are discussed, and, using reinforcement learning, a detailed algorithm implementation is given for cooperative relay transport by multiple robots.
提出了基于强化学习的网络爬虫算法,并应用于餐饮类站点的发现中。
A web crawler algorithm based on reinforcement learning is proposed and applied to the discovery of restaurant and catering websites.
主要采用强化学习的方法对AUV进行控制和决策,综合Q学习算法、BP神经网络和人工势场法对AUV进行避碰规划。
Reinforcement learning is mainly used for AUV control and decision-making, and Q-learning, a BP neural network, and the artificial potential field method are combined for AUV collision-avoidance planning.
在单独的行为对象中包含了基于强化学习中的Q学习及人工神经网络的优化学习算法。
A single behavior object contains an optimized learning algorithm based on Q-learning, from reinforcement learning, and an artificial neural network.
本文提出了一种基于蚁群算法的强化学习模型,即蚁群算法与Q学习相结合的思想。
This paper proposes a reinforcement learning model based on the ant colony algorithm, that is, the idea of combining the ant colony algorithm with Q-learning.