CTN: Scott Linderman

Zuckerman Institute - L5-084, 3227 Broadway, New York

Title: When and How to Parallelize Seemingly Sequential Models
Abstract: Transformers have become the de facto model for sequential data in large part because they are well adapted to modern hardware: At training time, the loss can be evaluated in parallel over the sequence length on GPUs and TPUs. By contrast, evaluating nonlinear recurrent neural networks…

Language and Vision Working Group

Initial Meeting! About: The ARNI Language & Vision Working Group aims to bring together researchers across neuroscience, cognitive science, computer science, and AI to collaboratively advance our understanding of how humans and machines construct multimodal experiences. Its goal is to create a space for discussing ongoing language- and vision-focused projects, identifying natural points of overlap,…