Yiran Huang

I am a PhD researcher in Computer Science at the Technical University of Munich, advised by Prof. Zeynep Akata and Prof. Wenjia Xu, and a member of IMPRS-IS and MCML. My research is on multimodal large language models, at the intersection of mechanistic interpretability and efficient post-training.

I am open to a Summer/Fall 2026 research internship on LLMs / MLLMs, especially multimodal in-context learning and post-training (distillation, RL). If your team is hiring and there’s a fit, please reach out.

News

  • May 2026. ✨ One paper accepted at ICML 2026 (Spotlight).
  • Feb 2026. 🎉 One paper accepted to IJCV 2026, extending our GCPR 2025 oral.
  • Oct 2025. 🎉 One paper accepted at the NeurIPS 2025 Workshop “What Can(’t) Transformers Do?”.
  • Aug 2025. 🎤 One paper accepted at GCPR 2025 (Oral).
  • Jan 2025. 🎉 One paper accepted at ICLR 2025.
  • Oct 2024. 🎉 One paper accepted at the ECCV 2024 Workshop “The Dark Side of Generative AIs and Beyond”.
  • Aug 2023. 🌱 Joined Prof. Zeynep Akata’s lab at TUM as a PhD researcher; member of IMPRS-IS and MCML.

Selected Publications

Multimodal In-Context Learning & Interpretability

Dissecting Multimodal In-Context Learning: Modality Asymmetries and Circuit Dynamics in Modern Transformers

Yiran Huang, Karsten Roth, Quentin Bouniot, Wei Xu, Zeynep Akata

ICML 2026 (Spotlight)

First mechanistic account of multimodal in-context learning in modern transformers. We isolate a learning asymmetry between the text and image modalities, characterize the circuit dynamics that drive it, and validate the findings on production-scale MLLMs (Qwen2.5-VL).

Efficient Post-Training of MLLMs

Structural Pruning of Large Vision-Language Models: Pruning Dynamics, Recovery, and Data Efficiency

Yiran Huang, Lukas Thede, Massimiliano Mancini, Wei Xu, Zeynep Akata

IJCV 2026

Studies layerwise and widthwise structural pruning across LLaVA, Bunny, and InternVL. SFT combined with hidden-state distillation retains >95% of original performance while using only 5% of the recovery data, making aggressive compression practical for open-source VLMs.

Investigating Structural Pruning and Recovery Techniques for Compressing MLLMs

Yiran Huang, Lukas Thede, Massimiliano Mancini, Wei Xu, Zeynep Akata

GCPR 2025 (Oral)

The pruning + recovery pipeline behind the IJCV journal version. Studies how recovery-data scale and distillation targets interact with pruning aggressiveness across multiple open-source MLLMs.

Bias & Fairness in MLLMs

Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs)

Leander Girrbach, Stephan Alaniz, Yiran Huang, Trevor Darrell, Zeynep Akata

ICLR 2025

Evaluates gender bias across 22 MLLMs over skills and occupations. Among debiasing strategies, fine-tuning-based methods achieve the best trade-off between bias reduction and downstream task performance.