Continual Learning Working Group Talk
July 30 @ 3:00 pm - 5:00 pm
Title: Continual learning, machine self-reference, and the problem of problem-awareness
Abstract: Continual learning (CL) without forgetting has been a long-standing problem in machine learning with neural networks. Here I will offer a new perspective by viewing learning algorithms (LAs) as memory mechanisms with their own decision-making problem. I will present a natural solution to CL under this view: instead of handcrafting such LAs, we metalearn continual in-context LAs using self-referential weight matrices. Experiments confirm that this method effectively achieves CL without forgetting, outperforming handcrafted algorithms on classic benchmarks. While this is a promising result on its own, in this talk I will go beyond this narrow scope of CL. I will use this CL setting as an example to introduce a broader perspective of “problem awareness” in machine learning. I will argue that many prior CL methods fail because the systems do not know what it means to continually learn without forgetting. I will show that the same argument can explain earlier failures of neural networks on other classic challenges historically raised by cognitive scientists in comparisons with human intelligence, such as systematic generalization and few-shot learning. I will highlight how similar metalearning methods provide a promising solution to these challenges too.
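For readers unfamiliar with the term, the following is a minimal NumPy sketch of what a "weight matrix that modifies itself" can look like: a delta-rule self-referential update, loosely in the spirit of self-referential weight matrices. The dimensions, the sigmoid-gated learning rate, and the function name `srwm_step` are illustrative assumptions here, not the speaker's actual implementation.

```python
import numpy as np

def srwm_step(W, x):
    """One toy step of a self-referential weight matrix (SRWM).

    Illustrative sketch only (assumed parameterization): W maps the
    input to an output, a key, a query, and a learning rate, then uses
    them to rewrite itself with a delta-rule update.
    """
    d = x.shape[0]
    out = W @ x                                # shape (3*d + 1,)
    y, k, q = out[:d], out[d:2*d], out[2*d:3*d]
    beta = 1.0 / (1.0 + np.exp(-out[3*d]))     # sigmoid-gated step size
    v_new, v_old = W @ q, W @ k                # target and current responses
    W = W + beta * np.outer(v_new - v_old, k)  # delta rule: self-modification
    return W, y

# Usage: W itself is the memory, updated in-context over a sequence.
rng = np.random.default_rng(0)
d = 4
W = rng.normal(scale=0.1, size=(3 * d + 1, d))
for x in rng.normal(size=(5, d)):
    W, y = srwm_step(W, x)
```

In a metalearned variant along the lines the abstract describes, the initial W would be what the outer training loop optimizes, while the delta-rule updates above would constitute the in-context learning algorithm.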
Bio: The speaker is a post-postdoc at Harvard University, Center for Brain Science.
Previously, he was a postdoc and lecturer at the Swiss AI Lab IDSIA, University of Lugano (Switzerland) from 2020 to 2023, where he taught a popular course on practical deep learning. He received his PhD in Computer Science from RWTH Aachen University (Germany) in 2020, and undergraduate and Master’s degrees in Applied Mathematics from École Centrale Paris and ENS Cachan (France). He was also a research intern at Google in NYC and Mountain View in 2017 and 2018. He is broadly interested in the computational principles of learning, memory, perception, self-reference, and decision making, as ingredients for building and understanding general-purpose intelligence. The scope of his research interests has expanded from language modeling (PhD) to general sequence and program learning (postdoc), and currently to neuroscience and cognitive science (post-postdoc).