Sutton and Barto Part 2

I am currently reading Sutton and Barto (reference below). Along the way I decided to recreate some of the experiments from each chapter. This post recreates the experiment shown in Figure 6.2.

I have added a notebook of the experiments:

Temporal Difference Learning with Batch Updates

Plots are here.
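
For readers who want the idea without opening the notebook, here is a minimal sketch of batch TD(0). It is not the notebook's code. It assumes the five-state random walk from the book (states A to E, start in the centre, reward +1 only when exiting on the right) and follows the book's description of batch updating: increments are summed over every stored transition and applied once per sweep, until they become negligible. The function names, the step size alpha, the tolerance and the sweep cap are my own illustrative choices.

```python
import random

N_STATES = 5  # non-terminal states A..E are indices 1..5; 0 and 6 are terminal
TRUE_VALUES = [i / 6 for i in range(1, N_STATES + 1)]  # true values of A..E


def generate_episode(rng):
    """Return one episode as a list of (state, reward, next_state) tuples."""
    state = 3  # every episode starts in the centre state C
    transitions = []
    while state not in (0, N_STATES + 1):
        next_state = state + rng.choice((-1, 1))  # step left or right
        reward = 1.0 if next_state == N_STATES + 1 else 0.0
        transitions.append((state, reward, next_state))
        state = next_state
    return transitions


def batch_td0(episodes, values, alpha=0.001, tol=1e-6, max_sweeps=20000):
    """Batch TD(0): sum the increments over all stored transitions,
    apply them once per sweep, and stop when the largest one is below tol.
    alpha, tol and max_sweeps are assumed values, not taken from the book."""
    for _ in range(max_sweeps):
        increments = [0.0] * (N_STATES + 2)
        for episode in episodes:
            for s, r, s_next in episode:
                # Undiscounted TD error; terminal values stay at 0.
                increments[s] += alpha * (r + values[s_next] - values[s])
        for s in range(1, N_STATES + 1):
            values[s] += increments[s]
        if max(abs(d) for d in increments) < tol:
            break
    return values


def rms_error(values):
    """Root-mean-square error against the true values of states A..E."""
    squared = [(values[i + 1] - TRUE_VALUES[i]) ** 2 for i in range(N_STATES)]
    return (sum(squared) / N_STATES) ** 0.5
```

To trace a learning curve, one would generate episodes one at a time, call batch_td0 on all episodes seen so far (starting from the previous values, with the non-terminal states initialised to 0.5), and record rms_error after each call, averaging over many independent runs.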

  1. Richard S. Sutton and Andrew G. Barto. 2018. Reinforcement Learning: An Introduction. A Bradford Book, Cambridge, MA, USA.