Dr Lu Yin
Academic and research departments
Computer Science Research Centre, Nature Inspired Computing and Engineering Research Group, Surrey Institute for People-Centred Artificial Intelligence (PAI).
About
Biography
Greetings! I’m Lu, a Lecturer (Assistant Professor equivalent) in the School of Computer Science and Electronic Engineering at the University of Surrey, where I lead the Lightweight & Universal Machine Intelligence (LUMI) Lab. I am also affiliated with the People-Centred AI Institute.
I am honoured to be a long-term visitor and collaborator with the Visual Informatics Group (VITA) at UT Austin, and a visiting scholar at Eindhoven University of Technology (TU/e) and MPI-ELLIS.
Previously, I served as a Postdoctoral Fellow at TU/e and worked as a Research Scientist Intern at Google’s New York City office.
I was selected as a CPAL 2026 Rising Star. My research interests include:
- Efficient and Scalable Foundation Models
- Understanding and Enhancing LLMs
- Interdisciplinary AI Applications
I work closely with academic and industry collaborators, including researchers from Meta London, Google NYC, Intel Research, and JD.com.
I am always happy to discuss research ideas, collaborations, and potential visits, and I welcome enquiries from motivated students, PhD applicants, and visiting researchers. Feel free to reach out.
University roles and responsibilities
- Personal Tutor
Research
Research interests
My research aims to build more capable, efficient, and accessible AI systems. I work on large foundation models from several complementary perspectives: improving their capabilities through pre-training and post-training, making them more efficient through compression and scalable inference, understanding their internal behaviours, and exploring new model architectures and data-centric learning strategies. A recurring theme in my work is to identify where intelligence, efficiency, and robustness emerge in modern AI systems, and how these insights can be used to design better models for real-world applications.
Keywords: Foundation Models, LLMs, Model Compression, Model Understanding
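For a concrete flavour of the compression side of this work, the sketch below shows plain unstructured magnitude pruning in PyTorch. It is an illustrative toy only: the layer size and the single global sparsity ratio are assumptions for the demo, and it is not an implementation of any specific method from my papers, which typically allocate sparsity non-uniformly across layers rather than using one global ratio.

```python
# Minimal sketch: unstructured magnitude pruning of one linear layer.
# Illustrative only; real LLM pruning methods usually assign a different
# sparsity budget to each layer instead of a single global ratio.
import torch
import torch.nn as nn

def magnitude_prune_(layer: nn.Linear, sparsity: float = 0.5) -> None:
    """Zero out the fraction `sparsity` of weights with smallest magnitude."""
    with torch.no_grad():
        w = layer.weight
        k = int(sparsity * w.numel())
        if k == 0:
            return
        # The k-th smallest absolute weight acts as the pruning threshold.
        threshold = w.abs().flatten().kthvalue(k).values
        # Keep only weights strictly above the threshold (ties are pruned).
        w.mul_((w.abs() > threshold).to(w.dtype))

layer = nn.Linear(512, 512)  # hypothetical toy layer, not from a real model
magnitude_prune_(layer, sparsity=0.7)
print(f"non-zero fraction: {(layer.weight != 0).float().mean().item():.2f}")
```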
Supervision
Postgraduate research supervision
Ongoing PhD supervision
- Robustness of Large Foundation Models at Scale - Kappiyath Adarsh.
- Efficient Diffusion Language Models - Mingyu Cao.
- Test Time Adaptation for Diffusion Language Models - Handa Li.
- Weight Space Learning in LLMs with Symmetry - Xiaolong Han (with Prof. Ferrante Neri).
- Efficient 3D Scene Understanding - Vishal Thengane (with Dr. Xiatian Zhu).
Teaching
My teaching within the School of Computer Science and Electronic Engineering focuses on artificial intelligence, deep learning, and business analytics. I aim to help students understand both the theoretical foundations and practical applications of modern AI methods, especially how machine learning and data-driven techniques can be used to solve real-world problems.
I currently teach Deep Learning and Advanced AI, which introduces students to modern deep learning methods, neural network architectures, and advanced AI techniques. The module supports students in developing both conceptual understanding and practical implementation skills, preparing them for further study, research, and industry roles in artificial intelligence.
I also teach business analytics and data visualisation modules, where students learn how to use data, analytical thinking, and visual communication to support business decision-making. These modules are designed to bridge technical methods with practical business contexts, helping students develop skills that are valuable across both technical and non-technical career paths.
See below for a full list of my teaching experience.
2025/26
COM3025 Deep Learning and Advanced AI, University of Surrey
COMM074 Business Analytics with Data Visualisation, University of Surrey
2024/25
COM3025 Deep Learning and Advanced AI, University of Surrey
COM3018 Practical Business Analytics, University of Surrey
Publications
Highlights
† Corresponding author. * Equal contribution.
Mingyu Cao, Alvaro H.C. Correia, Christos Louizos, Shiwei Liu†, Lu Yin†. Search or Accelerate: Confidence-Switched Position Beam Search for Diffusion Language Models. The Forty-third International Conference on Machine Learning (ICML), 2026. [Paper]
Di He, Songjun Tu, Keyu Wang, Lu Yin†, Shiwei Liu†. One LR Doesn’t Fit All: Heavy-Tail Guided Layerwise Learning Rates for LLMs. The Forty-third International Conference on Machine Learning (ICML), 2026.
Pengxiang Li, Yefan Zhou, Dilxat Muhtar, Lu Yin, Shilin Yan, Li Shen, Soroush Vosoughi, Shiwei Liu. Diffusion Language Models Know the Answer Before Decoding. The Fourteenth International Conference on Learning Representations (ICLR), 2026. [Oral] [Paper]
Xinchen Han, Hossam Afifi, Michel Marot, Xilu Wang, Lu Yin†. Long Chain-of-Thought Compression via Fine-Grained Group Policy Optimization. IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2026. [Paper]
Adarsh Kappiyath, Abhra Chaudhuri, Ajay Kumar Jaiswal, Ziquan Liu, Yunpeng Li, Xiatian Zhu, Lu Yin†. SEBRA: Debiasing through Self-Guided Bias Ranking. The Thirteenth International Conference on Learning Representations (ICLR), 2025. [Paper]
Pengxiang Li*, Lu Yin*, Shiwei Liu. Mix-LN: Unleashing the Power of Deeper Layers by Combining Pre-LN and Post-LN. The Thirteenth International Conference on Learning Representations (ICLR), 2025. [Paper]
Pengxiang Li*, Lu Yin*, Shiwei Liu. Outlier-weighed Layerwise Sampling for LLM Fine-tuning. The 63rd Annual Meeting of the Association for Computational Linguistics (ACL Findings), 2025. [Paper]
Di He, Ajay Jaiswal, Songjun Tu, Li Shen, Ganzhao Yuan, Shiwei Liu, Lu Yin†. AlphaDecay: Module-wise Weight Decay for Heavy-Tailed Balancing in LLMs. Conference on Neural Information Processing Systems (NeurIPS), 2025. [Paper]
Tianhao Chen, Xin Xu, Zijing Liu, Pengxiang Li, Xinyuan Song, Ajay Kumar Jaiswal, Fan Zhang, Jishan Hu, Yang Wang, Hao Chen, Shizhe Diao, Shiwei Liu, Yu Li, Lu Yin†, Can Yang. GPAS: Accelerating Convergence of LLM Pretraining via Gradient-Preserving Activation Scaling. Conference on Neural Information Processing Systems (NeurIPS), 2025. [Paper]
Lu Yin, You Wu, Zhenyu Zhang, Cheng-Yu Hsieh, Yaqing Wang, Yiling Jia, Gen Li, Ajay Jaiswal, Mykola Pechenizkiy, Yi Liang, Michael Bendersky, Zhangyang Wang, Shiwei Liu. Outlier Weighed Layerwise Sparsity (OWL): A Missing Secret Sauce for Pruning LLMs to High Sparsity. The Forty-first International Conference on Machine Learning (ICML), 2024. [Paper]
Lu Yin, Ajay Jaiswal, Shiwei Liu, Souvik Kundu, Zhangyang Wang. Pruning Small Pre-Trained Weights Irreversibly and Monotonically Impairs “Difficult” Downstream Tasks in LLMs. The Forty-first International Conference on Machine Learning (ICML), 2024. [Paper]
Lu Yin, Gen Li, Meng Fang, Li Shen, Tianjin Huang, Zhangyang Wang, Vlado Menkovski, Xiaolong Ma, Mykola Pechenizkiy, Shiwei Liu. Dynamic Sparse Training Is also A Structure Sparsity Learner. Conference on Neural Information Processing Systems (NeurIPS), 2023. [Paper]
Lu Yin, Shiwei Liu, Meng Fang, Tianjin Huang, Vlado Menkovski, Mykola Pechenizkiy. Lottery Pools: Winning More by Interpolating Tickets without Increasing Training or Inference Cost. Thirty-Seventh AAAI Conference on Artificial Intelligence (AAAI), 2023. [Paper]