Later on, you will need to learn how to turn operant conditioning into a reinforcement schedule, so that you know when it is appropriate to give rewards and when to give punishments.
I think one characteristic of reinforcement learning is that it tends to require exploration, so using it on physical systems is somewhat hard.