[{"content":"Our paper \u0026ldquo;Dissecting Multimodal In-Context Learning: Modality Asymmetries and Circuit Dynamics in Modern Transformers\u0026rdquo; has been accepted to ICML 2026 as a Spotlight.\nThis is joint work with Karsten Roth, Quentin Bouniot, Wei Xu, and Zeynep Akata.\nThe paper provides the first mechanistic account of multimodal in-context learning in modern transformers. We build a controlled testbed for analyzing how architectural choices and data statistics shape ICL across modalities, identify a learning asymmetry between modalities, characterize the underlying circuit dynamics (induction-like behavior across modalities), and validate the findings on production-scale MLLMs such as Qwen2.5-VL.\nMore details and a link to the paper / code coming soon.\n","permalink":"https://yiranhuangirene.github.io/posts/2026-05-icml-spotlight/","summary":"\u0026ldquo;Dissecting Multimodal In-Context Learning: Modality Asymmetries and Circuit Dynamics in Modern Transformers\u0026rdquo; was accepted to ICML 2026 as a Spotlight.","title":"Paper accepted to ICML 2026 as a Spotlight"},{"content":"Our paper \u0026ldquo;Structural Pruning of Large Vision-Language Models: Pruning Dynamics, Recovery, and Data Efficiency\u0026rdquo; has been accepted to the International Journal of Computer Vision (IJCV) 2026.\nThis is an extended journal version of our GCPR 2025 oral, joint with Lukas Thede, Massimiliano Mancini, Wei Xu, and Zeynep Akata.\nThe paper studies layerwise and widthwise structural pruning in open-source vision-language models. 
Our key finding: supervised fine-tuning combined with hidden-state distillation can retain more than 95% of the original performance using only 5% of the recovery data, which makes post-pruning recovery realistic on academic compute.\n","permalink":"https://yiranhuangirene.github.io/posts/2026-ijcv-pruning/","summary":"\u0026ldquo;Structural Pruning of Large Vision-Language Models: Pruning Dynamics, Recovery, and Data Efficiency\u0026rdquo; was accepted to IJCV 2026.","title":"Paper accepted to IJCV 2026"},{"content":"Our paper \u0026ldquo;Investigating Structural Pruning and Recovery Techniques for Compressing MLLMs\u0026rdquo; was accepted to the German Conference on Pattern Recognition (GCPR) 2025 as an Oral presentation.\nThis is joint work with Lukas Thede, Massimiliano Mancini, Wei Xu, and Zeynep Akata.\nWe develop pruning and recovery pipelines for LLaVA, Bunny, and InternVL, and study how much recovery data is actually needed to restore performance after pruning. The journal version was later accepted to IJCV 2026.\n","permalink":"https://yiranhuangirene.github.io/posts/2025-gcpr-oral/","summary":"\u0026ldquo;Investigating Structural Pruning and Recovery Techniques for Compressing MLLMs\u0026rdquo; was accepted to GCPR 2025 as an Oral.","title":"GCPR 2025 Oral: Structural Pruning and Recovery for MLLMs"},{"content":"The paper \u0026ldquo;Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs)\u0026rdquo; was accepted to ICLR 2025.\nThis is joint work led by Leander Girrbach with Stephan Alaniz, myself, Trevor Darrell, and Zeynep Akata.\nWe evaluate gender bias in 22 multimodal LLMs across skills and occupations, and find that fine-tuning-based debiasing methods achieve the best trade-off between debiasing strength and retained task performance.\n","permalink":"https://yiranhuangirene.github.io/posts/2025-iclr-vla-bias/","summary":"\u0026ldquo;Revealing and Reducing Gender Biases in Vision and Language Assistants (VLAs)\u0026rdquo; was 
accepted to ICLR 2025.","title":"ICLR 2025: Revealing and Reducing Gender Biases in Vision and Language Assistants"}]