University roles and responsibilities

  • Personal Tutor

    Publications

    Highlights

† Corresponding author. * Equal contribution.

Mingyu Cao, Alvaro H.C. Correia, Christos Louizos, Shiwei Liu†, Lu Yin†. Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models. The Forty-third International Conference on Machine Learning (ICML), 2026.

    Di He, Songjun Tu, Keyu Wang, Lu Yin†, Shiwei Liu†. One LR Doesn’t Fit All: Heavy-Tail Guided Layerwise Learning Rates for LLMs. The Forty-third International Conference on Machine Learning (ICML), 2026. 

Pengxiang Li, Yefan Zhou, Dilxat Muhtar, Lu Yin, Shilin Yan, Li Shen, Soroush Vosoughi, Shiwei Liu. Diffusion Language Models Know the Answer Before Decoding. The Fourteenth International Conference on Learning Representations (ICLR), 2026. [Oral]

Xinchen Han, Hossam Afifi, Michel Marot, Xilu Wang, Lu Yin†. Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2026.

Adarsh Kappiyath, Abhra Chaudhuri, Ajay Kumar Jaiswal, Ziquan Liu, Yunpeng Li, Xiatian Zhu, Lu Yin†. SEBRA: Debiasing through Self-Guided Bias Ranking. The Thirteenth International Conference on Learning Representations (ICLR), 2025.

Pengxiang Li*, Lu Yin*, Shiwei Liu. Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN. The Thirteenth International Conference on Learning Representations (ICLR), 2025.

Pengxiang Li*, Lu Yin*, Shiwei Liu. Outlier-weighed Layerwise Sampling for LLM Fine-tuning. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL Findings), 2025.

Di He, Ajay Jaiswal, Songjun Tu, Li Shen, Ganzhao Yuan, Shiwei Liu, Lu Yin†. AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs. Conference on Neural Information Processing Systems (NeurIPS), 2025.

Tianhao Chen, Xin Xu, Zijing Liu, Pengxiang Li, Xinyuan Song, Ajay Kumar Jaiswal, Fan Zhang, Jishan Hu, Yang Wang, Hao Chen, Shizhe Diao, Shiwei Liu, Yu Li, Lu Yin†, Can Yang. GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling. Conference on Neural Information Processing Systems (NeurIPS), 2025.

Lu Yin, You Wu, Zhenyu Zhang, Cheng-Yu Hsieh, Yaqing Wang, Yiling Jia, Gen Li, Ajay Jaiswal, Mykola Pechenizkiy, Yi Liang, Michael Bendersky, Zhangyang Wang, Shiwei Liu. Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity. The Forty-first International Conference on Machine Learning (ICML), 2024.

Lu Yin, Ajay Jaiswal, Shiwei Liu, Souvik Kundu, Zhangyang Wang. Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs “Difficult” Downstream Tasks in LLMs. The Forty-first International Conference on Machine Learning (ICML), 2024.

Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu. Dynamic Sparse Training Is also A Structure Sparsity Learner. Conference on Neural Information Processing Systems (NeurIPS), 2023.

Lu Yin, Shiwei Liu, Meng Fang, Tianjin Huang, Vlado Menkovski, Mykola Pechenizkiy. Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost. Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023.