Much animal learning is slow, with cumulative changes in behavior driven by reward prediction errors. When the abstract structure of a problem is known, however, both animals and formal learning models can rapidly attach new items to their roles within this structure, sometimes in a single trial. Frontal cortex is likely to play a key role in this process. To examine information seeking and use in a known problem structure, we trained monkeys in a novel explore/exploit task, requiring the animal first to test objects for their association with reward, then, once rewarded objects were found, to re-select them on subsequent trials for further rewards. Many cells in the frontal cortex showed an explore/exploit preference, changing activity in a single trial to align with one-shot learning in the monkeys’ behavior. In contrast to this binary switch, these cells showed little evidence of continuous changes linked to expectancy or prediction error. Explore/exploit preferences were independent for two stages of the trial, object selection and receipt of feedback. Within an established task structure, frontal activity may control the separate operations of explore and exploit, switching in one trial between the two.
bioRxiv Subject Collection: Neuroscience