Poster in Workshop: Workshop on Sparsity in LLMs (SLLM): Deep Dive into Mixture of Experts, Quantization, Hardware, and Inference
The Surprising Effectiveness of Randomness in LLM Pruning
Shuyao Xu · Liu Jiayao · Zhenfeng He · Cheng Peng · Weidi Xu
Abstract:
This paper investigates the structured pruning of large language models (LLMs). We find that random pruning, despite its simplicity, is a surprisingly effective baseline, particularly at lower pruning ratios. We further propose a simple and efficient method that combines randomness with existing pruning heuristics: random neuron clustering paired with activation-magnitude pruning, which achieves performance comparable to gradient-based methods while being significantly more efficient (up to 50x faster). Our code is available at https://anonymous.4open.science/r/random-prune-8F1C.
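The abstract only sketches the approach, so the following is a minimal PyTorch illustration of one way to combine random neuron clustering with activation-magnitude scoring for structured pruning of an MLP block. The function name `random_cluster_prune`, the fixed cluster size, and the `act_magnitude` calibration statistic are assumptions made for illustration; for the authors' actual implementation, see the linked repository.

```python
import torch
import torch.nn as nn

def random_cluster_prune(up_proj: nn.Linear,
                         down_proj: nn.Linear,
                         act_magnitude: torch.Tensor,
                         prune_ratio: float,
                         cluster_size: int = 4,
                         seed: int = 0):
    """Sketch of structured pruning for an MLP pair (up_proj -> down_proj):
    randomly cluster hidden neurons, score each cluster by the mean
    activation magnitude of its members, and drop the lowest-scoring
    clusters. `act_magnitude` is a per-neuron score, e.g. mean |activation|
    over a small calibration set (an assumed preprocessing step)."""
    hidden = up_proj.out_features
    assert hidden % cluster_size == 0, "hidden size must be divisible by cluster_size"

    # 1) Random clustering: a random permutation chopped into fixed-size groups.
    gen = torch.Generator().manual_seed(seed)
    perm = torch.randperm(hidden, generator=gen)
    clusters = perm.view(-1, cluster_size)            # (n_clusters, cluster_size)

    # 2) Score each cluster by the activation magnitude of its member neurons.
    scores = act_magnitude[clusters].mean(dim=1)      # (n_clusters,)

    # 3) Keep the highest-scoring clusters up to the target width.
    n_keep = int(round(clusters.size(0) * (1.0 - prune_ratio)))
    keep_clusters = scores.topk(n_keep).indices
    keep_neurons = clusters[keep_clusters].reshape(-1).sort().values

    # 4) Slice the weights: rows of up_proj and matching columns of down_proj.
    new_up = nn.Linear(up_proj.in_features, keep_neurons.numel(),
                       bias=up_proj.bias is not None)
    new_down = nn.Linear(keep_neurons.numel(), down_proj.out_features,
                         bias=down_proj.bias is not None)
    with torch.no_grad():
        new_up.weight.copy_(up_proj.weight[keep_neurons])
        if up_proj.bias is not None:
            new_up.bias.copy_(up_proj.bias[keep_neurons])
        new_down.weight.copy_(down_proj.weight[:, keep_neurons])
        if down_proj.bias is not None:
            new_down.bias.copy_(down_proj.bias)
    return new_up, new_down
```

Because the clustering is random and the scoring needs only forward-pass activation statistics, this kind of procedure avoids any backward passes, which is consistent with the claimed speed advantage over gradient-based pruning.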