Our paper “Dissecting Multimodal In-Context Learning: Modality Asymmetries and Circuit Dynamics in Modern Transformers” has been accepted to ICML 2026 as a Spotlight.

This is joint work with Karsten Roth, Quentin Bouniot, Wei Xu, and Zeynep Akata.

The paper provides the first mechanistic account of multimodal in-context learning (ICL) in modern transformers. We build a controlled testbed for analyzing how architectural choices and data statistics shape ICL across modalities. Using this testbed, we identify a learning asymmetry between modalities and characterize the underlying circuit dynamics (induction-like behavior across modalities). We then validate the findings on production-scale multimodal large language models (MLLMs) such as Qwen2.5-VL.

More details and links to the paper and code are coming soon.