This paper studies several important properties, and the structure, of optimal policies in m (c) for the continuous-time discounted Markov decision model.
For the infinite-horizon, continuously discounted case, the existence of an optimal repair and replacement policy that maximizes the expected discounted net income of the equipment is proved.