USI email 2025

Information-theoretic Approaches to Deep Learning with Attention

06.05 11:00 - 12:30 USI East Campus, Room C1.03

Abstract: Characterising deep learning models in terms of probabilistic inference gives us a theoretical framework for understanding the information content of their internal representations. This Bayesian perspective is made computationally efficient with variational approximations, including the Variational Information Bottleneck (VIB) regulariser for vector-space representations. In attention-based models such as Transformers, attention functions use internal representations which are arbitrarily long sequences of vectors. We can still define variational Bayesian methods for these variable-sized representations using Bayesian nonparametrics, giving us the Nonparametric Variational Information Bottleneck (NVIB) regulariser. We will explain these methods and outline some of the main empirical results for this theoretically motivated approach to deep learning architectures.

Talk in the framework of the course Advanced Topics in NLP

Host: Prof. Lonneke van der Plas

Wednesday, 06.05

James Henderson is a Senior Researcher at the Idiap Research Institute, where he heads the Natural Language Understanding group. He previously held positions at the University of Geneva, Xerox Research Centre Europe, University of Edinburgh, and University of Exeter. In 2025, he was awarded an ERC Advanced Grant to work on "Interpretable Beliefs and Programmable Knowledge with Bayesian Attention in Large Language Models" (BALM). His research focusses on representation learning for NLP, including graph-to-graph and variational-Bayesian extensions of the attention functions in Transformers.
