May 18, 2021

Optimism and Pessimism in Optimised Replay

<p>The replay of task-relevant trajectories is known to contribute to memory consolidation and improved task performance. A wide variety of experimental data show that the content of replayed sequences is highly specific and can be modulated by reward as well as other prominent task variables. However, the rules governing the choice of sequences to be replayed remain poorly understood. One recent theoretical suggestion is that, in decision-making problems, replay experiences are prioritized by their effect on the choice of action. We exploit this suggestion to address recent experimental data showing that, in a particular task, human subjects tended to replay sub-optimal outcomes that they later chose to avoid. We show that pessimistic replay benefits forgetful agents experiencing large amounts of uncertainty in their models of the world. Further, we fit our model parameters to the individual subjects' choices and confirm that their replay choices were appropriate according to the proposed scheme.</p>
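<p>As a rough illustration of what prioritizing replay by its effect on the choice of action could mean, the sketch below scores a candidate replay by how much the value update it would cause changes the expected return of a softmax policy. This is a minimal sketch under stated assumptions, not the model fitted in the paper: the function names <code>softmax</code> and <code>replay_gain</code>, the inverse temperature <code>beta</code>, and all numerical values are hypothetical and chosen only for illustration.</p>

```python
import math


def softmax(q_values, beta=1.0):
    """Softmax policy over action values (beta is an assumed inverse temperature)."""
    m = max(q_values)  # subtract the maximum for numerical stability
    exps = [math.exp(beta * (q - m)) for q in q_values]
    total = sum(exps)
    return [e / total for e in exps]


def replay_gain(q_old, q_new, beta=1.0):
    """Change in expected value, measured with the updated values q_new,
    when the policy shifts from softmax(q_old) to softmax(q_new)."""
    pi_old = softmax(q_old, beta)
    pi_new = softmax(q_new, beta)
    return sum((pn - po) * q for pn, po, q in zip(pi_new, pi_old, q_new))


# Hypothetical two-action state: the agent currently prefers action 0.
q_current = [1.0, 0.5]

# Candidate replay A (pessimistic): reveals that action 0 leads to a bad outcome.
gain_pessimistic = replay_gain(q_current, [-1.0, 0.5])

# Candidate replay B (confirming): slightly raises the value of the preferred action.
gain_confirming = replay_gain(q_current, [1.2, 0.5])

print(f"pessimistic replay gain: {gain_pessimistic:.3f}")
print(f"confirming replay gain:  {gain_confirming:.3f}")
```

<p>In this toy setting the replay of the bad outcome scores higher, because it moves the policy away from an action the agent would otherwise have favoured, whereas the confirming replay barely changes the choice. A replay that leaves the values unchanged scores zero.</p>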
