Internal Working Group Speakers
Frontier Models for Neuroscience and Behavior

Date: May 11, 2026
Time: 3:00pm
Virtual Link: Upon request at [email protected]
Speaker: Hubert Banville (Meta), presenting Meta's latest TRIBE fMRI foundation model
Title: A foundation model of vision, audition, and language for in-silico neuroscience
Abstract: Cognitive neuroscience is fragmented into specialized models, each tailored to a specific experimental paradigm, which prevents a unified model of cognition in the human brain. Here, we introduce TRIBE v2, a tri-modal (video, audio, and language) foundation model capable of predicting human brain activity in a variety of naturalistic and experimental conditions. Leveraging a unified dataset of over 1,000 hours of fMRI across 720 subjects, we demonstrate that our model accurately predicts high-resolution brain responses to novel stimuli, tasks, and subjects, surpassing traditional linear encoding models with several-fold improvements in accuracy. Critically, TRIBE v2 enables in-silico experimentation: tested on seminal visual and neurolinguistic paradigms, it recovers a variety of results established by decades of empirical research. Finally, by extracting interpretable latent features, TRIBE v2 reveals the fine-grained topography of multisensory integration. These results establish artificial intelligence as a unifying framework for exploring the functional organization of the human brain.
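For context on the baseline being compared against: "traditional linear encoding models" in this literature are typically regularized linear regressions from stimulus features to voxel responses. A minimal sketch of such a baseline (shapes and variable names are illustrative, not from the paper):

    # Minimal linear encoding baseline: ridge regression from stimulus
    # features to fMRI responses, the model family TRIBE is compared to.
    import numpy as np
    from sklearn.linear_model import RidgeCV

    rng = np.random.default_rng(0)
    X_train = rng.standard_normal((1000, 512))   # (TRs, stimulus features)
    Y_train = rng.standard_normal((1000, 2000))  # (TRs, voxels)
    X_test = rng.standard_normal((200, 512))
    Y_test = rng.standard_normal((200, 2000))

    # One regularized linear map per voxel, alpha chosen by cross-validation.
    model = RidgeCV(alphas=np.logspace(-2, 4, 7)).fit(X_train, Y_train)
    Y_hat = model.predict(X_test)

    # Encoding accuracy is usually reported as per-voxel Pearson r.
    r = [np.corrcoef(Y_test[:, v], Y_hat[:, v])[0, 1] for v in range(Y_test.shape[1])]
    print(f"median voxel r = {np.median(r):.3f}")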

Date: March 2, 2026
Time: 3:00pm
Virtual Link: Upon request at [email protected]
Abstract: Spatiotemporal and multimodal datasets contain structured variability distributed across space, time, and measurement modality, motivating modeling approaches that learn representations directly from large-scale data. Inspired by video foundation models, we study how the masked autoencoder training objective can learn shared structure across heterogeneous observations while preserving modality-specific information, and how training these models at scale requires a range of engineering techniques. Furthermore, we show that self-attention supports the emergence of interpretable structure, which can be recovered by decomposing the learned representations according to their variability across samples. These results suggest that large-scale self-supervised learning provides a unified approach for modeling high-dimensional dynamical systems while enabling interpretation of the learned representations.
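The masked-autoencoder objective at the center of this talk is simple to state: hide a random subset of tokens and train the model to reconstruct them from the visible remainder. A toy sketch of the loss, with a linear map standing in for the transformer:

    # Toy masked-autoencoder objective: reconstruct randomly masked tokens
    # from the visible ones. A linear map stands in for the transformer.
    import numpy as np

    rng = np.random.default_rng(0)
    tokens = rng.standard_normal((64, 16))      # (tokens, feature dim)
    mask = rng.random(64) < 0.75                # mask 75% of tokens

    visible = tokens[~mask]
    W = rng.standard_normal((16, 16)) * 0.1     # stand-in "encoder-decoder"

    # Predict every masked token from a summary of the visible tokens
    # (a real model would use self-attention here).
    context = visible.mean(axis=0) @ W
    recon = np.tile(context, (mask.sum(), 1))

    # The loss is computed on masked positions only.
    loss = np.mean((recon - tokens[mask]) ** 2)
    print(f"masked reconstruction MSE: {loss:.3f}")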

Date: February 2, 2026
Time: 3:00pm
Virtual Link: Upon request at [email protected]
Title: A multimodal sleep foundation model for disease prediction
Abstract:
Sleep is a fundamental biological process with broad implications for physical and mental health, yet its complex relationship with disease remains poorly understood. Polysomnography (PSG)—the gold standard for sleep analysis—captures rich physiological signals but is underutilized due to challenges in standardization, generalizability and multimodal integration. To address these challenges, we developed SleepFM, a multimodal sleep foundation model trained with a new contrastive learning approach that accommodates multiple PSG configurations. Trained on a curated dataset of over 585,000 hours of PSG recordings from approximately 65,000 participants across several cohorts, SleepFM produces latent sleep representations that capture the physiological and temporal structure of sleep and enable accurate prediction of future disease risk. From one night of sleep, SleepFM accurately predicts 130 conditions with a C-Index of at least 0.75 (Bonferroni-corrected P < 0.01), including all-cause mortality (C-Index, 0.84), dementia (0.85), myocardial infarction (0.81), heart failure (0.80), chronic kidney disease (0.79), stroke (0.78) and atrial fibrillation (0.78). Moreover, the model demonstrates strong transfer learning performance on a dataset from the Sleep Heart Health Study—a dataset that was excluded from pretraining—and performs competitively with specialized sleep-staging models such as U-Sleep and YASA on common sleep analysis tasks, achieving mean F1 scores of 0.70–0.78 for sleep staging and accuracies of 0.69 and 0.87 for classifying sleep apnea severity and presence. This work shows that foundation models can learn the language of sleep from multimodal sleep recordings, enabling scalable, label-efficient analysis and disease prediction.
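For readers unfamiliar with the family of objectives involved: contrastive pretraining across modalities typically uses an InfoNCE loss that pulls matched recordings together and pushes mismatched ones apart. The sketch below shows a generic two-modality version; SleepFM's actual objective, which accommodates multiple PSG configurations, differs in its details:

    # Generic two-modality InfoNCE (CLIP-style) contrastive loss, the family
    # of objectives SleepFM's approach belongs to; details are illustrative.
    import numpy as np

    def info_nce(za, zb, temperature=0.1):
        """za, zb: (batch, dim) embeddings of the same epochs in two modalities."""
        za = za / np.linalg.norm(za, axis=1, keepdims=True)
        zb = zb / np.linalg.norm(zb, axis=1, keepdims=True)
        logits = za @ zb.T / temperature          # (batch, batch) similarities
        # Matched pairs (the diagonal) should score highest.
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))

    rng = np.random.default_rng(0)
    eeg = rng.standard_normal((32, 128))   # e.g. EEG-epoch embeddings
    ecg = rng.standard_normal((32, 128))   # e.g. ECG-epoch embeddings
    print(f"loss: {info_nce(eeg, ecg):.3f}")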

Date: November 5, 2025
Bio
Bryan Li is completing his PhD in NeuroAI at the University of Edinburgh, under the supervision of Arno Onken and Nathalie Rochefort. His main PhD project focuses on building deep learning-based encoding models of the visual cortex that accurately predict neural activity in response to arbitrary visual stimuli. Recently, he joined Dario Farina’s lab at Imperial College London as an Encode Fellow, working on neuromotor interfacing and decoding.
Title:
Movie-trained transformer reveals novel response properties to dynamic stimuli in mouse visual cortex (https://www.biorxiv.org/content/10.1101/2025.09.16.676524v2)
Abstract:
Understanding how the brain encodes complex, dynamic visual stimuli remains a fundamental challenge in neuroscience. Here, we introduce ViV1T, a transformer-based model trained on natural movies to predict neuronal responses in mouse primary visual cortex (V1). ViV1T outperformed state-of-the-art models in predicting responses to both natural and artificial dynamic stimuli, while requiring fewer parameters and less runtime. Although trained exclusively on natural movies, ViV1T accurately captured core V1 properties, including orientation and direction selectivity as well as contextual modulation, despite lacking explicit feedback mechanisms. ViV1T also revealed novel functional features. The model predicted a wider range of contextual responses when using natural and model-generated surround stimuli compared to traditional gratings, with novel model-generated dynamic stimuli eliciting maximal V1 responses. ViV1T also predicted that dynamic surrounds elicit stronger contextual modulation than static surrounds. Finally, the model identified a subpopulation of neurons that exhibit contrast-dependent surround modulation, switching their response to surround stimuli from inhibition to excitation when contrast decreases. These predictions were validated through semi-closed-loop in vivo recordings. Overall, ViV1T establishes a powerful, data-driven framework for understanding how brain sensory areas process dynamic visual information across space and time.
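One of the in-silico analyses above (orientation selectivity) is easy to make concrete: probe the trained model with drifting gratings at several orientations and compute a selectivity index per neuron. A hedged sketch, where `model` and `make_grating` are hypothetical stand-ins rather than the ViV1T API:

    # In-silico orientation-tuning probe of a trained encoding model.
    # `model` and `make_grating` are hypothetical stand-ins.
    import numpy as np

    def orientation_selectivity(model, make_grating, orientations_deg):
        """orientations_deg must evenly tile 0-180 degrees."""
        responses = []
        for theta in orientations_deg:
            video = make_grating(theta)        # (frames, H, W) drifting grating
            responses.append(model(video))     # (neurons,) mean predicted response
        R = np.stack(responses)                # (n_orientations, n_neurons)
        n = len(orientations_deg)
        pref = R.argmax(axis=0)                # preferred orientation per neuron
        orth = (pref + n // 2) % n             # orientation 90 degrees away
        cols = np.arange(R.shape[1])
        r_pref, r_orth = R[pref, cols], R[orth, cols]
        # Standard OSI: (R_pref - R_orth) / (R_pref + R_orth)
        return (r_pref - r_orth) / (r_pref + r_orth + 1e-9)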

Date: October 8, 2025
Time: 2:00pm
Zoom: Upon request @ [email protected]
Title:
OmniMouse: Scaling properties of multi-modal, multi-task brain models on 150B neural tokens
Abstract:
Scaling data and artificial neural networks has transformed AI, driving breakthroughs in language and vision. Whether similar principles apply to modeling brain activity remains unclear. Here we leverage a dataset of 3.3 million neurons from the visual cortex of 78 mice across 323 sessions, totaling more than 150 billion neural tokens recorded during the presentation of natural movies, images, and parametric stimuli, together with behavior. We train multi-modal, multi-task transformer models (1M–300M parameters) that flexibly support three regimes at test time: neural prediction (predicting neuronal responses from sensory input and behavior), behavioral decoding (predicting behavior from neural activity), neural forecasting (predicting future activity from current neural dynamics), or any combination of the three. We find that performance scales reliably with more data, but gains from increasing model size saturate, suggesting that current brain models are limited by data rather than compute. This inverts the standard AI scaling story: in language and computer vision, massive datasets make parameter scaling the primary driver of progress, whereas in brain modeling, even in the mouse visual cortex, a relatively simple and low-resolution system, models remain data-limited despite vast recordings. These findings highlight the need for richer stimuli and tasks and larger-scale recordings to build brain foundation models. The observation of systematic scaling raises the possibility of phase transitions in neural modeling, where larger and richer datasets might unlock qualitatively new capabilities, paralleling the emergent properties seen in large language models.
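The scaling analysis described here is typically summarized by fitting a saturating power law to performance-versus-data points, one curve per model size. A sketch of such a fit on synthetic numbers (not the paper's):

    # Fit a saturating power law, loss(D) = a * D**(-b) + c, to synthetic
    # performance-vs-data points; numbers are illustrative, not the paper's.
    import numpy as np
    from scipy.optimize import curve_fit

    def power_law(d, a, b, c):
        return a * d ** (-b) + c

    tokens = np.array([1e9, 1e10, 5e10, 1e11, 1.5e11])
    loss = power_law(tokens, 5.0, 0.2, 0.8) \
        + np.random.default_rng(0).normal(0, 0.005, 5)

    (a, b, c), _ = curve_fit(power_law, tokens, loss, p0=(1.0, 0.1, 0.5))
    print(f"loss ~ {a:.2f} * D^-{b:.2f} + {c:.2f} (irreducible term c = {c:.2f})")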

Vinam Arora and Ji Xia
Date: August 27, 2025
Location: JLGSC-L3-079
Time: 2:00pm
Zoom: Upon request @ [email protected]
Title and Abstracts:
1st Speaker: Vinam Arora
Title: Know Thyself by Knowing Others: Learning Neuron Identity from Population Context
Abstract: Identifying the functional identity of individual neurons is essential for interpreting circuit dynamics, yet remains a major challenge in large-scale in vivo recordings where anatomical and molecular labels are often unavailable. Here we introduce NuCLR, a self-supervised framework that learns context-aware representations of neuron identity by modeling each neuron's role within the broader population. NuCLR employs a spatiotemporal transformer that captures both within-neuron dynamics and across-neuron interactions, and is trained with a sample-wise contrastive objective that encourages stable, discriminative embeddings across time. Across multiple open-access datasets, NuCLR outperforms prior methods in both cell type and brain region classification. It enables zero-shot generalization to entirely new populations—without retraining or access to stimulus labels—offering a scalable approach for real-time, functional decoding of neuron identity across diverse experimental settings.
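The sample-wise contrastive objective can be sketched as follows: embeddings of the same neuron taken from two different time windows form a positive pair, and all other neurons in the batch serve as negatives. An illustrative toy version (not NuCLR's implementation):

    # Sample-wise contrastive objective for neuron identity: embeddings of
    # the same neuron from two time windows are positives; other neurons in
    # the batch are negatives. Illustrative, not NuCLR's implementation.
    import numpy as np

    def neuron_contrastive_loss(z1, z2, temperature=0.1):
        """z1, z2: (neurons, dim), same neurons embedded from two windows."""
        z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
        z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
        logits = z1 @ z2.T / temperature
        log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
        return -np.mean(np.diag(log_probs))   # pull same-neuron pairs together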
2nd Speaker: Ji Xia
Title: Inpainting the neural picture: Inferring Unrecorded Brain Area Dynamics from Multi-Animal Datasets
Abstract: Understanding how the brain drives memory-guided movements requires recording neural activity from the motor cortex and interconnected subcortical areas. Neuropixels probes now allow simultaneous recordings from subsets of these areas, but no single session captures all areas of interest, and different neurons are sampled from each area across sessions. This poses a key challenge: how to integrate neural data across sessions to reconstruct the complete multi-area picture. We address this with a transformer-based autoencoder that aligns neural activity into a shared latent space across sessions and animals, separately for each brain area, including those not recorded in a given session. This approach enables single-trial analysis of multi-area neural dynamics from all areas of interest. I am now working on improving this method, and will discuss both its present challenges and promising directions for future work.

Date: July 30th, 2025
Location: JLGSC-L3-079
Time: 2:00pm
Zoom: Upon request @ [email protected]
Title: Meta-dynamical state space modeling for integrative neural data analysis
Abstract:
Uncovering the organizing principles of neural systems requires integrating information across diverse datasets, each of which alone offers a limited view and a limited signal-to-noise ratio but which together reveal coherent dynamical structures. We present a meta-dynamical state-space modeling framework that learns a shared solution space of neural dynamics from heterogeneous recordings across sessions, animals, and tasks. By capturing cross-dataset similarity and variability on a low-dimensional manifold that spans a space of dynamical systems, our approach enables few-shot inference, rapid adaptation to new recordings, and discovery of latent dynamical motifs that underlie behavior. We demonstrate its utility in modeling motor cortex activity, revealing dynamics that generalize across individuals and track changes in dynamics during learning. We argue that our approach is well suited as a foundation model for integrative neuroscience, both for understanding neural computation and for real-time neuroscience applications.
Title: POSSM: Generalizable, real-time neural decoding with hybrid state-space models
Abstract:
Real-time decoding of neural spiking data is a core aspect of neurotechnology applications such as brain-computer interfaces, where models are subject to strict latency constraints. Traditional methods, including simple recurrent neural networks, are fast and lightweight but are less equipped to generalize to unseen data. In contrast, recent Transformer-based approaches leverage large-scale neural datasets to attain strong generalization performance. However, these models typically have much larger computational requirements and are not suitable for settings requiring low latency or limited memory. To address these shortcomings, we present POSSM, a novel architecture that combines individual spike tokenization and an input cross-attention module with a recurrent state-space model (SSM) backbone, thereby enabling (1) fast, causal online prediction from neural activity and (2) efficient generalization to new sessions, individuals, and tasks through multi-dataset pre-training. We evaluate our model’s performance in terms of decoding accuracy and inference speed on monkey reaching datasets, and show that it extends to clinical applications, namely handwriting and speech decoding. Notably, we demonstrate that pre-training on monkey motor-cortical recordings improves decoding performance on the human handwriting task, highlighting the exciting potential for cross-species transfer. In all of these tasks, we find that POSSM achieves decoding accuracy comparable to state-of-the-art Transformers at a fraction of the inference cost. These results suggest that hybrid SSMs may be the key to bridging the gap between accuracy, inference speed, and generalization when training neural decoders for real-time, closed-loop applications.
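The latency argument is the central design point: a recurrent SSM carries a fixed-size state, so each incoming spike bin costs constant time to process, whereas a Transformer must re-attend over a growing or sliding context. A toy sketch of the constant-time online update (random weights standing in for a trained decoder, not POSSM's architecture):

    # Toy O(1)-per-step online decoder: a fixed-size recurrent state is
    # updated once per spike bin, as in SSM/RNN decoders.
    import numpy as np

    rng = np.random.default_rng(0)
    n_units, d_state, d_out = 96, 64, 2          # spike channels, state, cursor dims
    A = rng.standard_normal((d_state, d_state)) * 0.05
    B = rng.standard_normal((d_state, n_units)) * 0.1
    C = rng.standard_normal((d_out, d_state)) * 0.1

    state = np.zeros(d_state)
    for t in range(1000):                        # streaming spike bins
        spikes = rng.poisson(0.1, n_units)       # one 10 ms bin of spike counts
        state = np.tanh(A @ state + B @ spikes)  # constant-time state update
        velocity = C @ state                     # decoded output for this bin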
Frontier Models for Neuroscience and Behavior Working Group (formerly the Animal Behavior Video Analysis Working Group)
Title: Whole-body simulation of realistic fruit fly locomotion with deep reinforcement learning
Abstract:
The body of an animal determines how the nervous system produces behavior. Therefore, detailed modeling of the neural control of sensorimotor behavior requires a detailed model of the body. Here we contribute an anatomically detailed biomechanical whole-body model of the fruit fly Drosophila melanogaster in the MuJoCo physics engine. Our model is general-purpose, enabling the simulation of diverse fly behaviors, both on land and in the air. We demonstrate the generality of our model by simulating realistic locomotion, both flight and walking. To support these behaviors, we have extended MuJoCo with phenomenological models of fluid forces and adhesion forces. Through data-driven end-to-end reinforcement learning, we demonstrate that these advances enable the training of neural network controllers capable of realistic locomotion along complex trajectories based on high-level steering control signals. With a visually guided flight task, we demonstrate a neural controller that can use the vision sensors of the body model to control and steer flight. Our project is an open-source platform for modeling neural control of sensorimotor behavior in an embodied context.
Title: Mapping the landscape of social behavior using high-resolution 3D tracking of freely interacting animals
Abstract:
Social interaction is a fundamental component of animal behavior. However, we lack tools to describe it with quantitative rigor, limiting our understanding of its principles and of the neuropsychiatric disorders, like autism, that perturb it. To address these limitations, my collaborators and I have developed a technique for high-resolution 3D tracking of freely interacting animals and their body-wide social touch patterns, solving the challenging subject-occlusion and part-assignment problems using 3D geometric reasoning, graph neural networks, and semi-supervised learning. Using this technology, I have collected and annotated over 34 million 3D postures in interacting rats, including five new monogenic autism models for which social behavioral phenotypes have not been reported. I will introduce a novel multi-scale approach that I have used to identify a rich landscape of stereotyped interactions, synchrony, and body contact across strains. This deep phenotyping approach revealed a spectrum of changes in rat autism models and in response to amphetamine, and this framework has the potential to facilitate quantitative studies of social behaviors and their neurobiological underpinnings.
Title: Multimodal Learning from Pixels to People
Abstract:
People experience the world through modalities of sight, sound, words, touch, and more. By leveraging their natural relationships and developing multimodal learning methods, my research creates artificial perception systems with diverse skills, including spatial, physical, logical, and cognitive abilities, for flexibly analyzing visual data. This multimodal approach provides versatile representations for tasks like 3D reconstruction, visual question answering, and object recognition, while offering inherent explainability and excellent zero-shot generalization across tasks. By closely integrating diverse modalities, we can overcome key challenges in machine learning and enable new capabilities for computer vision, especially for the many upcoming applications where trust is required.
Multi-resource-cost Optimization of Neural Network Models

Date: April 7, 2026
Location: ZI L3-079
Time: 1:00pm
Title: Metabolic cost of information processing in Poisson variational autoencoders
Abstract: Computation in biological systems is fundamentally energy-constrained, yet standard theories of computation treat energy as freely available. Here, we argue that variational free energy minimization under a Poisson assumption offers a principled path toward an energy-aware theory of computation. Our key observation is that the Kullback-Leibler (KL) divergence term in the Poisson free energy objective becomes proportional to the prior firing rates of model neurons, yielding an emergent metabolic cost term that penalizes high baseline activity. This structure couples an abstract information-theoretic quantity — the coding rate — to a concrete biophysical variable — the firing rate — which enables a trade-off between coding fidelity and energy expenditure. Such a coupling arises naturally in the Poisson variational autoencoder (P-VAE; a brain-inspired generative model that encodes inputs as discrete spike counts and recovers a spiking form of sparse coding as a special case) but is absent from standard Gaussian VAEs. To demonstrate that this metabolic cost structure is unique to the Poisson formulation, we compare the P-VAE against GReLU-VAE, a Gaussian VAE with ReLU rectification applied to latent samples, which controls for the non-negativity constraint. Across a systematic sweep of the KL term weighting coefficient β and latent dimensionality, we find that increasing β monotonically increases sparsity and reduces average spiking activity in the P-VAE. In contrast, GReLU-VAE representations remain unchanged, confirming that the effect is specific to Poisson statistics rather than a byproduct of non-negative representations. These results establish Poisson variational inference as a promising foundation for a resource-constrained theory of computation.
Zoom Link: Upon request @ [email protected]
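The closed form behind the "emergent metabolic cost" claim is worth writing out. For two Poisson distributions with posterior rate \lambda_q and prior rate \lambda_p (notation mine, not necessarily the paper's):

    \mathrm{KL}\big(\mathrm{Pois}(\lambda_q)\,\|\,\mathrm{Pois}(\lambda_p)\big) = \lambda_q \log\frac{\lambda_q}{\lambda_p} - \lambda_q + \lambda_p

For weakly driven units (\lambda_q \to 0) the divergence tends to \lambda_p, so the prior firing rate itself becomes the penalty, which is one way to read the proportionality described in the abstract.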

Date: April 22, 2026
Location: ZI L5-116
Time: 10am to 12pm
Title and Abstract: TBD
Zoom Link: Upon request @ [email protected]

Date: March 24, 2026
Location: ZI L3-079
Title and Abstract: TBD
Zoom Link: Upon request @ [email protected]

Date: March 17, 2026
Location: ZI L5-084
Title: Constraints of efficient neural computation
Abstract: Neural systems adapt to the statistical structure of the environment to support behavior. While it is generally recognized that such adaptation is subject to various biological constraints (such as noise, metabolism, and wiring cost), how these constraints determine the optimal neural computation remains unclear. In the first part of this talk, I will discuss theories of efficient coding based on considerations of metabolic cost and neural noise. In the second part, I will present ongoing work on how the geometry of the stimulus manifold shapes the structure of the neural code. In particular, using the processing of heading direction as an example, I will show that the asymmetry of the stimulus manifold naturally accounts for key properties of heading-direction encoding in macaque MST.
Zoom Link: Upon request @ [email protected]
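One common formalization of the efficient-coding setup in the first part of the talk (a textbook formulation, not necessarily the speaker's) is infomax under a metabolic budget:

    \max_{p(r \mid s)} \; I(s; r) - \beta\, \mathbb{E}[r]

where s is the stimulus, r the (noisy) neural response, and \beta the multiplier that prices firing against information; the noise and metabolic constraints enter through the channel p(r \mid s) and the rate penalty, respectively.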

Date: January 20, 2026
Title: Frugal Inference for Control
Zoom Link: Upon request @ [email protected]

Date: December 10, 2025
Time: 11:00am
Location: Zuckerman Institute L5-116
Title: Economics of temporal evidence integration
Abstract: The temporal integration of sensory information is an important aspect of many human decision tasks. I will present results of ongoing research in my laboratory aimed at understanding the dynamic processes underlying evidence integration. In particular, I will discuss a novel resource-rational model that treats the representation as well as the integration and maintenance of sensory evidence as actively controlled, performance-effort trade-off mechanisms. Validated against data from various behavioral experiments, the model provides not only a normative explanation for observed non-linear dynamics in evidence integration but also a parsimonious explanation for individual tendencies toward recency or primacy behavior. As the work is ongoing and unpublished, I am looking forward to an engaged discussion with the audience.
Zoom Link: Upon request @ [email protected]
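One concrete instance of the recency/primacy point: in a linear accumulator x_{t+1} = gamma * x_t + s_t, the leak gamma determines how evidence samples are weighted, with gamma < 1 discounting early samples (recency) and gamma > 1 amplifying them (primacy). A toy illustration (a textbook sketch, not the speaker's model):

    # Toy evidence accumulator illustrating recency vs. primacy via the
    # leak parameter; a textbook sketch, not the speaker's model.
    import numpy as np

    def weights(gamma, n=10):
        """Effective weight of each of n evidence samples on the final state
        of x_{t+1} = gamma * x_t + s_t."""
        return np.array([gamma ** (n - 1 - t) for t in range(n)])

    print("leaky   (gamma=0.8):", np.round(weights(0.8), 2))  # late samples dominate
    print("unstable(gamma=1.2):", np.round(weights(1.2), 2))  # early samples dominate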

Date: October 21, 2025
Location: JLGSC-L03-079
Time: 1:00pm
Zoom: Upon request @ [email protected]
Title:
Building the brain’s efficient system-level architecture: optimisations across space, time, and multiple regions
Abstract:
The computations a brain can perform are fundamentally constrained by physical realities: energetic resources are limited, and time is precious. To understand why the brain works the way it does, we must understand its function in the context of these constraints. Prior modeling work has successfully demonstrated how spatial energetic constraints drive structure-function co-optimization, giving rise to many of the architectural features we observe across areas of neuroscience. By incorporating such physical constraints, we can build complex systems-level models that are meaningfully constrained by physically measurable factors rather than arbitrary design choices.
In this talk, I will expand on these spatial frameworks by introducing new work on temporal processing and signal precision constraints in neural networks. I will demonstrate how different optimization strategies within individual regions can be combined in heterogeneous multi-region models, revealing how the brain trades off resource use across tasks and situations. Finally, I will show how space and time interact in surprising ways to achieve efficient computation — principles that apply not only to the brain but to any large-scale distributed computing system. Together, these advances bring us closer to understanding the general principles that enable sophisticated intelligence to emerge from physically and energetically constrained computing systems.

Date: August 7, 2025
Location: JLGSC-L05-84
Time: 2:30pm
Zoom: Upon request @ [email protected]
Title: From 2D Chips to 3D Brains
Abstract:
Artificial intelligence (AI) realizes a synaptocentric conception of the learning brain with dot-products and advances by performing twice as many multiplications every two months. But the semiconductor industry tiles twice as many multipliers on a chip only every two years. Moreover, the returns from tiling these multipliers ever more densely now diminish, because signals must travel relatively farther and farther, expending energy and exhausting heat that scales quadratically. As a result, communication is now much more expensive than computation, far more so than in biological brains, where energy use scales linearly rather than quadratically with neuron count. That allows an 86-billion-neuron human brain to use as little power as a single lightbulb (25 W) rather than as much as the entire US (3 TW). Hence, rescaling a chip's energy use from quadratic to linear is critical to scale AI sustainably from a trillion (10^12) parameters (mouse scale) today to a quadrillion (10^15) parameters (human scale) in the next five years. But this would require communication cost to be reduced radically. Towards that end, I will present a recent re-conception of the brain's fundamental unit of computation that sparsifies signals by moving away from synaptocentric learning with dot-products to dendrocentric learning with sequence detectors.
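The quadratic-versus-linear claim can be made concrete with back-of-envelope arithmetic (the constants below are illustrative placeholders, not measured values):

    # Back-of-envelope comparison of quadratic vs. linear communication-energy
    # scaling with network size; constants are illustrative placeholders.
    n_today, n_target = 1e12, 1e15        # parameters: ~mouse scale -> ~human scale
    growth = n_target / n_today           # 1000x more parameters

    print(f"linear energy scaling:    {growth:,.0f}x more energy")
    print(f"quadratic energy scaling: {growth**2:,.0f}x more energy")
    # Under quadratic scaling, a 1000x larger model costs 1,000,000x the
    # energy; linear scaling keeps it at 1000x, the gap argued in the talk.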
Title: Can resource optimization explain neuronal morphology and placement?
Abstract:
Title: Control when confidence is costly
Abstract:
We develop a version of stochastic control that accounts for the computational costs of inference. Past studies identified efficient coding without control, or efficient control that neglects the cost of synthesizing information. Here we combine these concepts into a framework where agents rationally approximate inference for efficient control. Specifically, we study Linear Quadratic Gaussian (LQG) control with an added internal cost on the relative precision of the posterior probability over the world state. This creates a trade-off: an agent can obtain more utility overall by sacrificing some task performance, if doing so saves enough bits during inference. We discover that the rational strategy that solves the joint inference and control problem goes through phase transitions depending on the task demands, switching from a costly but optimal inference to a family of suboptimal inferences related by rotation transformations, each of which misestimates the stability of the world. In all cases, the agent moves more to think less. This work provides a foundation for a new type of rational computation that could be used by both brains and machines for efficient but computationally constrained control.
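In symbols (notation mine, schematic rather than the paper's exact formulation), the setup augments the standard LQG objective with an internal cost on posterior precision:

    \min_{\pi}\; \mathbb{E}\Big[\sum_t x_t^\top Q x_t + u_t^\top R u_t\Big] + \beta \sum_t C(\Sigma_t)

where x_t is the world state, u_t the control, \Sigma_t the posterior covariance maintained by the agent's (possibly approximate) filter, and C penalizes the posterior's relative precision; \beta sets the exchange rate between task performance and bits spent on inference, which is what produces the phase transitions described above.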
Title: Neuronal energy consumption: basic measures and trade-offs, and their effects on efficiency
Title: Bounded optimality: A cognitive perspective on neural computation with resource limitations
Abstract:
Language and Vision

Date: April 27, 2026
Location: Virtual
Time: 3pm
Zoom Link: Upon request @ [email protected]

Date: March 23, 2026
Location: Virtual
Zoom Link: Upon request @ [email protected]
