A study of stable and plastic structure in representations from artificial and natural continual learning systems

PI: Richard Zemel 
Co-PI: Stefano Fusi, Columbia

Abstract

Artificial intelligence (AI) systems tasked with sequentially learning to complete new tasks, a regime known as continual learning, have struggled to maintain performance on previously learned tasks, a phenomenon known as catastrophic forgetting. In the case of large language models, pretraining on enormous corpora of unstructured data before fine-tuning on a specific task has been shown to yield models that learn the new task efficiently but fail to retain this efficiency when learning further tasks beyond the first [1–3]. This is in stark contrast to natural learning systems like the human brain, which seem not only to retain much of their performance on previously learned tasks but even to learn new tasks with improved efficiency by bootstrapping previously learned information. How can a system have both the plasticity to learn and excel at new tasks and the stability to maintain previously learned capabilities? How can information be structured within the system so that old knowledge can be used to accelerate learning?
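To make the forgetting phenomenon concrete, the sketch below trains a small network on one task, fine-tunes it on a second, and reports how much accuracy on the first task is lost. This is illustrative only: the synthetic tasks, architecture, and hyperparameters are our own assumptions, not the proposal's experimental setup.

```python
# Minimal sketch of catastrophic forgetting under sequential training.
# All data is synthetic; the model and training schedule are illustrative assumptions.
import torch
import torch.nn as nn

torch.manual_seed(0)
d = 20  # input dimensionality

def make_task(w, n=2000):
    """Synthetic binary classification task defined by a linear rule with weights w."""
    x = torch.randn(n, d)
    y = (x @ w > 0).long()
    return x, y

def train(model, x, y, epochs=200, lr=1e-2):
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    loss_fn = nn.CrossEntropyLoss()
    for _ in range(epochs):
        opt.zero_grad()
        loss_fn(model(x), y).backward()
        opt.step()

def accuracy(model, x, y):
    with torch.no_grad():
        return (model(x).argmax(dim=1) == y).float().mean().item()

# Two tasks with different (near-orthogonal) decision rules; the degree of
# forgetting depends on how much the tasks overlap.
xa, ya = make_task(torch.randn(d))
xb, yb = make_task(torch.randn(d))

model = nn.Sequential(nn.Linear(d, 64), nn.ReLU(), nn.Linear(64, 2))

train(model, xa, ya)
acc_a_before = accuracy(model, xa, ya)   # task-A accuracy right after learning task A

train(model, xb, yb)                     # sequential fine-tuning on task B
acc_a_after = accuracy(model, xa, ya)    # task-A accuracy after learning task B

print(f"Task A accuracy before/after task B: {acc_a_before:.2f} / {acc_a_after:.2f}")
print(f"Forgetting: {acc_a_before - acc_a_after:.2f}")
```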

We aim to help decipher this asymmetry by investigating learned representations in both artificial and natural intelligence systems. We will focus on results from a recent monkey neuroscience experiment, which stands out because, to our knowledge, few such experiments have been conducted in a continual learning setting. We hypothesize that the facility of natural systems for continual learning arises from useful pre-existing properties of the neural representations, in conjunction with mechanisms that promote stability and plasticity in appropriate regions of the brain. We further hypothesize that the absence of these plasticity-stability moderating mechanisms in artificial learning systems leads to their difficulty with continual learning. Stefano Fusi's group has developed several techniques for analyzing the geometry of representations coming from different regions of the brain [4–7]. Specifically, they have investigated the geometric properties these representations must maintain in order to generalize to new situations in a learned task. We will extend this work by analyzing structures of artificial and natural learning systems' representations that are stable across time and across the learning of related but distinct tasks, as well as those structures that are plastic, shifting and developing in response to novel stimuli and tasks; a sketch of one such geometric analysis appears below. This new form of analysis will allow us to compare and contrast the properties of artificial and natural intelligence systems, deepening our understanding of the brain and motivating new machine learning (ML) techniques for continual learning.
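The following sketch illustrates one analysis in the spirit of the cross-condition generalization measures used in prior work from the Fusi group [4]: a linear decoder for an abstract variable is trained on some task conditions and tested on held-out conditions, probing whether the representational geometry supports generalization. The synthetic population responses, condition structure, and noise level are stand-in assumptions; in the project such vectors would come from recorded neural populations or network layers.

```python
# Sketch of a cross-condition generalization analysis on synthetic population responses.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n_units, n_trials = 100, 200

# Four conditions defined by two binary variables (value, context); each condition
# is a cluster of noisy population vectors around a condition mean.
value_axis = rng.standard_normal(n_units)
context_axis = rng.standard_normal(n_units)
condition_means = {(v, c): v * value_axis + c * context_axis
                   for v in (-1, 1) for c in (-1, 1)}

def sample(cond, n):
    """Draw n noisy population vectors for a given (value, context) condition."""
    return condition_means[cond] + 0.5 * rng.standard_normal((n, n_units))

# Decode "value" in one context, then test the same decoder in the other context.
x_train = np.vstack([sample((v, -1), n_trials) for v in (-1, 1)])
y_train = np.repeat([0, 1], n_trials)
x_test = np.vstack([sample((v, +1), n_trials) for v in (-1, 1)])
y_test = np.repeat([0, 1], n_trials)

decoder = LogisticRegression(max_iter=1000).fit(x_train, y_train)
print("Cross-condition generalization accuracy for 'value':",
      decoder.score(x_test, y_test))
```

High cross-condition accuracy indicates that the variable is encoded along a direction shared across contexts; tracking how such directions persist or shift during sequential learning is one way to separate stable from plastic structure.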

Publications

In progress

Resources

In progress