Temporal-Difference Learning with Linear Function Approximation

The next couple of meetings will be dedicated to Reinforcement Learning (RL).
This week we will cover the paper
Fast Gradient-Descent Methods for Temporal-Difference Learning with Linear Function Approximation by Sutton et al.
This paper presented a solution to a long-standing problem: how to learn an
approximate value function in an *online* (computationally efficient) but *off-policy* setup.
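To give a feel for what the paper proposes, here is a minimal sketch of its TDC (gradient-TD) update with linear value approximation. The function name, step sizes, and the toy loop are my own illustration, and the importance-sampling correction needed for genuinely off-policy data is omitted for brevity:

```python
import numpy as np

def tdc_update(theta, w, phi, phi_next, reward, gamma=0.99,
               alpha=0.01, beta=0.05):
    """One TDC step with linear value approximation V(s) = theta . phi(s).

    theta: main weight vector; w: auxiliary weights that track the
    expected TD error projected onto the features (the paper's
    two-timescale scheme).
    """
    # TD error for the transition (s, r, s')
    delta = reward + gamma * theta @ phi_next - theta @ phi
    # Main update: the usual TD term minus a gradient-correction term
    theta = theta + alpha * (delta * phi - gamma * phi_next * (phi @ w))
    # Auxiliary update on a faster timescale
    w = w + beta * (delta - phi @ w) * phi
    return theta, w

# Toy usage with random features (illustration only, not a real MDP)
rng = np.random.default_rng(0)
d = 8
theta, w = np.zeros(d), np.zeros(d)
for _ in range(1000):
    phi, phi_next = rng.normal(size=d), rng.normal(size=d)
    theta, w = tdc_update(theta, w, phi, phi_next, rng.normal())
```

Note that each step costs O(d) in the number of features, which is what makes the method practical online.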

If you want to refresh your RL knowledge, I recommend Nahum Shimkin’s
lecture notes. In particular, the last lecture, Value and Policy Approximations, is very relevant to this week's topic.
