A complete list of my peer-reviewed publications. Bold indicates first / co-first authorship. For up-to-date metrics, see my Google Scholar.

2026

Dissecting Multimodal In-Context Learning: Modality Asymmetries and Circuit Dynamics in Modern Transformers

Yiran Huang, Karsten Roth, Quentin Bouniot, Wei Xu, Zeynep Akata

International Conference on Machine Learning (ICML), 2026 — Spotlight

Provides the first mechanistic account of multimodal in-context learning in modern transformers, and establishes a controlled testbed for analyzing how architectural choices and data statistics shape ICL across modalities. Reveals a learning asymmetry between modalities, characterizes the underlying circuit dynamics, and validates the findings on production-scale MLLMs (Qwen2.5-VL).

Structural Pruning of Large Vision-Language Models: Pruning Dynamics, Recovery, and Data Efficiency

Yiran Huang, Lukas Thede, Massimiliano Mancini, Wei Xu, Zeynep Akata

International Journal of Computer Vision (IJCV), 2026

Studies layerwise and widthwise structural pruning in open-source vision-language models. Shows that supervised fine-tuning combined with hidden-state distillation can retain more than 95% of original performance using only 5% of the recovery data.

2025

Towards Understanding Multimodal In-Context Learning Through Classification Tasks

Yiran Huang, Karsten Roth, Quentin Bouniot, Wei Xu, Zeynep Akata

NeurIPS 2025 Workshop “What Can(’t) Transformers Do?”

Investigating Structural Pruning and Recovery Techniques for Compressing MLLMs

Yiran Huang, Lukas Thede, Massimiliano Mancini, Wei Xu, Zeynep Akata

German Conference on Pattern Recognition (GCPR), 2025 — Oral

Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs)

Leander Girrbach, Stephan Alaniz, Yiran Huang, Trevor Darrell, Zeynep Akata

International Conference on Learning Representations (ICLR), 2025

Evaluates gender bias in 22 MLLMs across skills and occupations. Finds that fine-tuning-based debiasing methods achieve the best trade-off between reducing bias and retaining task performance.

2024

Gender Bias in Vision-Language Assistants

Leander Girrbach, Stephan Alaniz, Yiran Huang, Trevor Darrell, Zeynep Akata

ECCV 2024 Workshop “The Dark Side of Generative AIs and Beyond”