Research


We are organizing a Deep Learning Seminar focused on deep neural networks, large language models, and generative models. For more information, please visit: https://zuoooooooo.github.io/dlseminar.html


An Efficient Pruner for Large Language Model with Theoretical Guarantee (with Canhong Wen and Wenliang Pan)
International Conference on Machine Learning (ICML), 2025. [icml] [openreview]

Abstract: Large Language Models (LLMs) have showcased remarkable performance across a range of tasks but are hindered by their massive parameter sizes, which impose significant computational and storage demands. Pruning has emerged as an effective solution to reduce model size, but traditional methods often involve inefficient retraining or rely on heuristic-based one-shot approaches that lack theoretical guarantees. In this paper, we reformulate the pruning problem as an $\ell_0$-penalized optimization problem and propose a monotone accelerated Iterative Hard Thresholding (mAIHT) method. Our approach combines solid theoretical foundations with practical effectiveness, offering a detailed theoretical analysis that covers convergence, convergence rates, and risk upper bounds. Through extensive experiments, we demonstrate that mAIHT outperforms state-of-the-art pruning techniques by effectively pruning the LLaMA-7B model across various evaluation metrics.
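
To illustrate the kind of $\ell_0$-type optimization the paper builds on, below is a minimal sketch of plain iterative hard thresholding (IHT) for a sparsity-constrained least-squares problem. This is a generic textbook variant for illustration only: the function names (`iht`, `hard_threshold`) and the toy setup are my own, and the paper's mAIHT method adds monotone acceleration and LLM-pruning-specific structure that are not reproduced here.

```python
# Minimal sketch of plain iterative hard thresholding (IHT) for
# min_w ||X w - y||^2  subject to  ||w||_0 <= k.
# Generic illustration only; not the paper's mAIHT algorithm.
import numpy as np

def hard_threshold(w, k):
    """Keep the k largest-magnitude entries of w, zero out the rest."""
    out = np.zeros_like(w)
    idx = np.argpartition(np.abs(w), -k)[-k:]
    out[idx] = w[idx]
    return out

def iht(X, y, k, step=None, n_iter=200):
    """Projected gradient descent onto the set of k-sparse vectors."""
    n, d = X.shape
    if step is None:
        # 1 / Lipschitz constant of the gradient of 0.5 * ||X w - y||^2
        step = 1.0 / np.linalg.norm(X, 2) ** 2
    w = np.zeros(d)
    for _ in range(n_iter):
        grad = X.T @ (X @ w - y)          # gradient step on the quadratic loss
        w = hard_threshold(w - step * grad, k)  # project back to k-sparse vectors
    return w

# Toy usage: recover a 5-sparse vector from noisy linear measurements.
rng = np.random.default_rng(0)
X = rng.standard_normal((100, 50))
w_true = np.zeros(50)
w_true[:5] = rng.standard_normal(5)
y = X @ w_true + 0.01 * rng.standard_normal(100)
w_hat = iht(X, y, k=5)
print(np.nonzero(w_hat)[0])  # indices of the recovered support
```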