Every silver lining has a touch of grey: Developing a dual-valence reinforcement learning model to address depression
PI: Kim Stachenfeld
Co-PI: Ken Miller, Columbia
Abstract
As we contemplate what actions to take, every possible outcome is a mix of good and bad. A job in a faraway city might offer salary and career benefits, but strain your family and distance you from friends. Competing goals, which may play out differently across timescales or contexts, must often be weighed against one another to make a decision. Standard reinforcement learning (RL) models used in psychology and AI ignore this, treating reward as a scalar quantity delivered by the environment rather than a multifaceted signal constructed by a brain simultaneously optimizing for many different goals. We also know that planning for the future involves replay of the past, that imagining the future and recalling memories are intimately related, and that the brain simultaneously represents predictions about immediate and distant outcomes.

People with depression often exhibit a related trait: when they look forward in time, they see only negative possible outcomes; when they look backward, they see that essentially everything they have done had negative outcomes. They are aware that they used to believe there were positive outcomes in the past, but now they see those as illusions: the reality was dominated by the negative aspects, or what seemed positive was thin and hollow, while the negative aspects loom large. Similarly, they can rationally identify what others might consider attainable positive future outcomes, but they are convinced that things will come out badly, or that the positive aspects will prove hollow and empty while the negative aspects dominate. Whatever they do, they will embarrass or hurt themselves and the people they love, and the same was true of the past. For example, they decided to move the family for a job, which may have worked out well in many ways but also caused pain and difficulty as the family adjusted to being uprooted; now they see only that they damaged their family and accomplished nothing real.

Because of the changes in motivation and reward processing that accompany depression, researchers have often drawn on models from RL and optimal decision theory to capture altered learning and decision-making under depression. However, these models have been limited in their ability to capture systematic changes in depressive behavior, because the effect of depression on reward learning appears to be indirect: growing evidence suggests that the problems arise from negatively biased beliefs about, and memory for, events. The goal of this project is to (1) build an elaborated RL agent that organizes its experience into positive and negative outcomes, which play out on different timescales, and (2) use this agent to make predictions about how depression affects behavior in complex, real-world environments.
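To make the modeling idea concrete, below is a minimal sketch, not the project's actual model, of a "dual-valence" temporal-difference learner: rewards are split into positive and negative components, each tracked by its own value function with its own discount factor, so good and bad outcomes can play out on different timescales. All names and parameters here (DualValenceTD, gamma_pos, gamma_neg, the negativity weight w_neg) are hypothetical illustrations, not terms from the proposal.

```python
import numpy as np

class DualValenceTD:
    """Tabular TD(0) learner that tracks positive and negative outcomes
    separately, each with its own discount factor (timescale).

    Hypothetical sketch for illustration only; not the proposal's model.
    """

    def __init__(self, n_states, alpha=0.1, gamma_pos=0.9, gamma_neg=0.99,
                 w_neg=1.0):
        self.V_pos = np.zeros(n_states)  # value of the positive outcome stream
        self.V_neg = np.zeros(n_states)  # magnitude of the negative stream
        self.alpha = alpha               # learning rate
        self.gamma_pos = gamma_pos       # positive outcomes: shorter horizon
        self.gamma_neg = gamma_neg       # negative outcomes: longer horizon
        self.w_neg = w_neg               # weight on negatives; >1 is one way to
                                         # model a depressive negativity bias

    def update(self, s, r, s_next):
        # Split the scalar reward into positive and negative parts.
        r_pos, r_neg = max(r, 0.0), max(-r, 0.0)
        # Run a separate TD(0) update for each valence channel.
        delta_pos = r_pos + self.gamma_pos * self.V_pos[s_next] - self.V_pos[s]
        delta_neg = r_neg + self.gamma_neg * self.V_neg[s_next] - self.V_neg[s]
        self.V_pos[s] += self.alpha * delta_pos
        self.V_neg[s] += self.alpha * delta_neg

    def value(self, s):
        # Net value used for choice: positives minus weighted negatives.
        return self.V_pos[s] - self.w_neg * self.V_neg[s]

# Usage: a mixed experience evaluated by an agent that overweights negatives.
agent = DualValenceTD(n_states=5, w_neg=2.0)
agent.update(s=0, r=+1.0, s_next=1)   # a good outcome...
agent.update(s=0, r=-0.5, s_next=1)   # ...mixed with a bad one
print(agent.value(0))                 # net value reflects the negativity bias
```

In a sketch like this, raising w_neg above 1, or lengthening gamma_neg relative to gamma_pos, makes negative outcomes dominate both retrospective evaluation and prospective planning, one simple way to caricature the systematic bias described above.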
Publications
In progress
Resources
In progress
