stereoplegic's Collections

Pruning
Deja Vu: Contextual Sparsity for Efficient LLMs at Inference Time • arXiv:2310.17157 • 14 upvotes
Dynamic Context Pruning for Efficient and Interpretable Autoregressive Transformers • arXiv:2305.15805 • 1 upvote
Compress, Then Prompt: Improving Accuracy-Efficiency Trade-off of LLM Inference with Transferable Prompt • arXiv:2305.11186 • 1 upvote
Composable Sparse Fine-Tuning for Cross-Lingual Transfer • arXiv:2110.07560 • 2 upvotes
Parameter-Efficient Neural Reranking for Cross-Lingual and Multilingual Retrieval • arXiv:2204.02292 • 1 upvote
Pruning Adversarially Robust Neural Networks without Adversarial Examples • arXiv:2210.04311 • 1 upvote
LoRAPrune: Pruning Meets Low-Rank Parameter-Efficient Fine-Tuning • arXiv:2305.18403 • 2 upvotes
Parameter-Efficient Fine-Tuning with Layer Pruning on Free-Text Sequence-to-Sequence Modeling • arXiv:2305.08285 • 1 upvote
Multi-Head Adapter Routing for Cross-Task Generalization • arXiv:2211.03831 • 2 upvotes
Improving Visual Prompt Tuning for Self-supervised Vision Transformers • arXiv:2306.05067 • 2 upvotes
Dynamic Token Pruning in Plain Vision Transformers for Semantic Segmentation • arXiv:2308.01045 • 1 upvote
The Information Pathways Hypothesis: Transformers are Dynamic Self-Ensembles • arXiv:2306.01705 • 1 upvote
Sparse Iso-FLOP Transformations for Maximizing Training Efficiency • arXiv:2303.11525 • 1 upvote
How do neurons operate on sparse distributed representations? A mathematical theory of sparsity, neurons and active dendrites • arXiv:1601.00720 • 1 upvote
Scalable Training of Artificial Neural Networks with Adaptive Sparse Connectivity inspired by Network Science • arXiv:1707.04780 • 1 upvote
Quick and Robust Feature Selection: the Strength of Energy-efficient Sparse Training for Autoencoders • arXiv:2012.00560 • 1 upvote
Sparse Finetuning for Inference Acceleration of Large Language Models • arXiv:2310.06927 • 15 upvotes
How Well Do Sparse Imagenet Models Transfer? • arXiv:2111.13445 • 1 upvote
The Optimal BERT Surgeon: Scalable and Accurate Second-Order Pruning for Large Language Models • arXiv:2203.07259 • 4 upvotes
SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot • arXiv:2301.00774 • 3 upvotes
PockEngine: Sparse and Efficient Fine-tuning in a Pocket • arXiv:2310.17752 • 14 upvotes
LoRAShear: Efficient Large Language Model Structured Pruning and Knowledge Recovery • arXiv:2310.18356 • 24 upvotes
Continual Learning via Neural Pruning • arXiv:1903.04476 • 1 upvote
A Survey on Model Compression for Large Language Models • arXiv:2308.07633 • 3 upvotes
A Simple and Effective Pruning Approach for Large Language Models • arXiv:2306.11695 • 3 upvotes
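The entry above, better known as Wanda, scores each weight by its magnitude times the norm of the corresponding input feature, so pruning needs neither gradients nor retraining. A minimal PyTorch sketch of that scoring rule, assuming a batch of captured layer inputs (the paper's per-output-row comparison groups are kept; everything else is simplified):

import torch

def wanda_mask(weight, acts, sparsity):
    # weight: (out_features, in_features); acts: (n_samples, in_features)
    # Score S_ij = |W_ij| * ||X_j||_2, then drop the lowest-scoring
    # `sparsity` fraction of weights within each output row.
    scores = weight.abs() * acts.norm(dim=0)
    n_drop = int(weight.shape[1] * sparsity)
    drop = scores.topk(n_drop, dim=1, largest=False).indices
    mask = torch.ones_like(weight, dtype=torch.bool)
    mask.scatter_(1, drop, False)
    return mask

# Usage: layer.weight.data *= wanda_mask(layer.weight, captured_inputs, 0.5)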
			
 
	
	 
	
	
	
			
Finding Neurons in a Haystack: Case Studies with Sparse Probing • arXiv:2305.01610 • 2 upvotes
XPrompt: Exploring the Extreme of Prompt Tuning • arXiv:2210.04457 • 1 upvote
SPDF: Sparse Pre-training and Dense Fine-tuning for Large Language Models • arXiv:2303.10464 • 1 upvote
DSEE: Dually Sparsity-embedded Efficient Tuning of Pre-trained Language Models • arXiv:2111.00160 • 1 upvote
Only 5% Attention Is All You Need: Efficient Long-range Document-level Neural Machine Translation • arXiv:2309.14174 • 1 upvote
Beyond Attentive Tokens: Incorporating Token Importance and Diversity for Efficient Vision Transformers • arXiv:2211.11315 • 1 upvote
Sheared LLaMA: Accelerating Language Model Pre-training via Structured Pruning • arXiv:2310.06694 • 3 upvotes
Compresso: Structured Pruning with Collaborative Prompting Learns Compact Large Language Models • arXiv:2310.05015 • 1 upvote
Can pruning make Large Language Models more efficient? • arXiv:2310.04573 • 1 upvote
Compressing LLMs: The Truth is Rarely Pure and Never Simple • arXiv:2310.01382 • 1 upvote
DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models • arXiv:2305.17651 • 1 upvote
PADA: Pruning Assisted Domain Adaptation for Self-Supervised Speech Representations • arXiv:2203.16965 • 1 upvote
Task-Agnostic Structured Pruning of Speech Representation Models • arXiv:2306.01385 • 1 upvote
Recycle-and-Distill: Universal Compression Strategy for Transformer-based Speech SSL Models with Attention Map Reusing and Masking Distillation • arXiv:2305.11685 • 2 upvotes
The Emergence of Essential Sparsity in Large Pre-trained Models: The Weights that Matter • arXiv:2306.03805 • 1 upvote
Parameter-Efficient Sparsity for Large Language Models Fine-Tuning • arXiv:2205.11005 • 1 upvote
Beyond Size: How Gradients Shape Pruning Decisions in Large Language Models • arXiv:2311.04902 • 1 upvote
Leveraging Structured Pruning of Convolutional Neural Networks • arXiv:2206.06247 • 1 upvote
You are caught stealing my winning lottery ticket! Making a lottery ticket claim its ownership • arXiv:2111.00162 • 1 upvote
Sparse then Prune: Toward Efficient Vision Transformers • arXiv:2307.11988 • 1 upvote
SHARP: Sparsity and Hidden Activation RePlay for Neuro-Inspired Continual Learning • arXiv:2305.18563 • 1 upvote
Incremental Task Learning with Incremental Rank Updates • arXiv:2207.09074 • 1 upvote
On the Soft-Subnetwork for Few-shot Class Incremental Learning • arXiv:2209.07529 • 1 upvote
Forget-free Continual Learning with Soft-Winning SubNetworks • arXiv:2303.14962 • 2 upvotes
Exclusive Supermask Subnetwork Training for Continual Learning • arXiv:2210.10209 • 1 upvote
Continual Task Allocation in Meta-Policy Network via Sparse Prompting • arXiv:2305.18444 • 1 upvote
SparCL: Sparse Continual Learning on the Edge • arXiv:2209.09476 • 1 upvote
Continual Learning with Dynamic Sparse Training: Exploring Algorithms for Effective Model Updates • arXiv:2308.14831 • 1 upvote
Dynamic Sparse Training with Structured Sparsity • arXiv:2305.02299 • 1 upvote
Accurate Neural Network Pruning Requires Rethinking Sparse Optimization • arXiv:2308.02060 • 1 upvote
Dynamic Sparse Training via Balancing the Exploration-Exploitation Trade-off • arXiv:2211.16667 • 1 upvote
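Several entries above belong to the dynamic sparse training family, which starts from a sparse network and periodically rewires it: drop the weakest active weights, grow the same number of new connections elsewhere. A generic drop-and-grow step in the spirit of that family (magnitude-based drop, gradient-based grow; a sketch, not any single paper's exact recipe):

import torch

def drop_and_grow(weight, mask, grad, frac=0.3):
    # Drop the `frac` smallest-magnitude active weights, then regrow the same
    # number of inactive connections where the dense gradient is largest.
    n = int(frac * int(mask.sum()))
    mag = weight.abs().masked_fill(~mask, float("inf"))
    mask.view(-1)[mag.view(-1).topk(n, largest=False).indices] = False
    g = grad.abs().masked_fill(mask, float("-inf"))   # only inactive slots
    grow = g.view(-1).topk(n).indices
    mask.view(-1)[grow] = True
    weight.data.view(-1)[grow] = 0.0   # regrown weights start at zero
    weight.data.mul_(mask)             # keep the dense tensor consistent
    return mask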
			
 
	
	 
	
	
	
			
HyperSparse Neural Networks: Shifting Exploration to Exploitation through Adaptive Regularization • arXiv:2308.07163 • 1 upvote
Is Complexity Required for Neural Network Pruning? A Case Study on Global Magnitude Pruning • arXiv:2209.14624 • 1 upvote
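The entry above studies global magnitude pruning, the baseline that ranks all weights across all layers by absolute value and removes the smallest. PyTorch ships this directly in torch.nn.utils.prune:

import torch.nn as nn
import torch.nn.utils.prune as prune

model = nn.Sequential(nn.Linear(784, 256), nn.ReLU(), nn.Linear(256, 10))
params = [(m, "weight") for m in model.modules() if isinstance(m, nn.Linear)]

# Rank every weight across all layers by |w| and zero out the bottom 80%.
prune.global_unstructured(params, pruning_method=prune.L1Unstructured, amount=0.8)

# Pruning adds weight_orig plus a weight_mask buffer; bake the zeros in when done:
for module, name in params:
    prune.remove(module, name)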
			
 
	
	 
	
	
	
			
End-to-End Neural Network Compression via ℓ1/ℓ2 Regularized Latency Surrogates • arXiv:2306.05785 • 1 upvote
Fire Together Wire Together: A Dynamic Pruning Approach with Self-Supervised Mask Prediction • arXiv:2110.08232 • 1 upvote
LAPP: Layer Adaptive Progressive Pruning for Compressing CNNs from Scratch • arXiv:2309.14157 • 1 upvote
Weight-dependent Gates for Network Pruning • arXiv:2007.02066 • 1 upvote
Rewarded meta-pruning: Meta Learning with Rewards for Channel Pruning • arXiv:2301.11063 • 1 upvote
Soft Masking for Cost-Constrained Channel Pruning • arXiv:2211.02206 • 1 upvote
Group channel pruning and spatial attention distilling for object detection • arXiv:2306.01526 • 1 upvote
Structured Pruning Learns Compact and Accurate Models • arXiv:2204.00408 • 1 upvote
COST-EFF: Collaborative Optimization of Spatial and Temporal Efficiency with Slenderized Multi-exit Language Models • arXiv:2210.15523 • 1 upvote
Latency Adjustable Transformer Encoder for Language Understanding • arXiv:2201.03327 • 1 upvote
Learned Token Pruning for Transformers • arXiv:2107.00910 • 1 upvote
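Token pruning methods like the entry above shrink the sequence itself rather than the weights. A minimal sketch of the usual importance proxy, the attention a token receives, with a fixed keep ratio standing in for the learned per-layer thresholds of the paper:

import torch

def prune_tokens(hidden, attn_probs, keep_ratio=0.5):
    # hidden: (B, T, D); attn_probs: (B, H, T, T), rows softmax-normalized.
    # Importance of token j = attention it receives, averaged over heads
    # and query positions.
    imp = attn_probs.mean(dim=1).mean(dim=1)                  # (B, T)
    k = max(1, int(hidden.size(1) * keep_ratio))
    keep = imp.topk(k, dim=1).indices.sort(dim=1).values      # preserve order
    return hidden.gather(1, keep.unsqueeze(-1).expand(-1, -1, hidden.size(-1)))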
			
 
	
	 
	
	
	
			
AxFormer: Accuracy-driven Approximation of Transformers for Faster, Smaller and more Accurate NLP Models • arXiv:2010.03688 • 1 upvote
Zero-TPrune: Zero-Shot Token Pruning through Leveraging of the Attention Graph in Pre-Trained Transformers • arXiv:2305.17328 • 2 upvotes
Pruning Pre-trained Language Models Without Fine-Tuning • arXiv:2210.06210 • 1 upvote
Frustratingly Simple Memory Efficiency for Pre-trained Language Models via Dynamic Embedding Pruning • arXiv:2309.08708 • 3 upvotes
Are Sixteen Heads Really Better than One? • arXiv:1905.10650 • 2 upvotes
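The entry above asks how many attention heads actually survive ablation. A brute-force sketch of ablation-based head scoring (the paper also derives a cheaper gradient-based proxy, not shown); eval_loss is a hypothetical helper you supply, and for instance Hugging Face transformers models accept a head_mask forward argument that could back it:

import torch

@torch.no_grad()
def head_importance(eval_loss, n_layers, n_heads):
    # eval_loss(head_mask) is assumed to return validation loss for a run
    # with the given (n_layers, n_heads) 0/1 head mask.
    base = eval_loss(torch.ones(n_layers, n_heads))
    scores = torch.zeros(n_layers, n_heads)
    for l in range(n_layers):
        for h in range(n_heads):
            m = torch.ones(n_layers, n_heads)
            m[l, h] = 0.0
            scores[l, h] = eval_loss(m) - base  # loss increase = importance
    return scores  # prune the heads with the smallest scores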
			
 
	
	 
	
	
	
			
SuperTickets: Drawing Task-Agnostic Lottery Tickets from Supernets via Jointly Architecture Searching and Parameter Pruning • arXiv:2207.03677 • 1 upvote
Generative Model for Models: Rapid DNN Customization for Diverse Tasks and Resource Constraints • arXiv:2308.15003 • 1 upvote
Growing Efficient Deep Networks by Structured Continuous Sparsification • arXiv:2007.15353 • 1 upvote
Task-Specific Expert Pruning for Sparse Mixture-of-Experts • arXiv:2206.00277 • 1 upvote
SiRA: Sparse Mixture of Low Rank Adaptation • arXiv:2311.09179 • 9 upvotes
ComPEFT: Compression for Communicating Parameter Efficient Updates via Sparsification and Quantization • arXiv:2311.13171 • 1 upvote
Masking as an Efficient Alternative to Finetuning for Pretrained Language Models • arXiv:2004.12406 • 1 upvote
Less is More: Selective Layer Finetuning with SubTuning • arXiv:2302.06354 • 1 upvote
Prune Once for All: Sparse Pre-Trained Language Models • arXiv:2111.05754 • 1 upvote
To prune, or not to prune: exploring the efficacy of pruning for model compression • arXiv:1710.01878 • 1 upvote
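The entry above introduced the widely copied gradual magnitude pruning schedule: sparsity follows a cubic ramp from s_init to s_final over a pruning window, rising quickly at first and leveling off at the end. Its schedule in a few lines:

def gmp_sparsity(step, s_init=0.0, s_final=0.9, start=0, ramp_steps=10000):
    # Cubic schedule from "To prune, or not to prune" (Zhu & Gupta):
    # s(t) = s_final + (s_init - s_final) * (1 - t / ramp_steps) ** 3
    t = min(max(step - start, 0), ramp_steps)
    return s_final + (s_init - s_final) * (1.0 - t / ramp_steps) ** 3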
			
 
	
	 
	
	
	
			
Learning a Consensus Sub-Network with Polarization Regularization and One Pass Training • arXiv:2302.10798 • 1 upvote
SortedNet, a Place for Every Network and Every Network in its Place: Towards a Generalized Solution for Training Many-in-One Neural Networks • arXiv:2309.00255 • 1 upvote
Sorted LLaMA: Unlocking the Potential of Intermediate Layers of Large Language Models for Dynamic Inference Using Sorted Fine-Tuning (SoFT) • arXiv:2309.08968 • 23 upvotes
LLM-Pruner: On the Structural Pruning of Large Language Models • arXiv:2305.11627 • 3 upvotes
Towards Green AI in Fine-tuning Large Language Models via Adaptive Backpropagation • arXiv:2309.13192 • 1 upvote
Feature Flow Regularization: Improving Structured Sparsity in Deep Neural Networks • arXiv:2106.02914 • 1 upvote
Automatic Neural Network Pruning that Efficiently Preserves the Model Accuracy • arXiv:2111.09635 • 1 upvote
I3D: Transformer architectures with input-dependent dynamic depth for speech recognition • arXiv:2303.07624 • 1 upvote
An EMO Joint Pruning with Multiple Sub-networks: Fast and Effect • arXiv:2303.16212 • 1 upvote
Distributed Pruning Towards Tiny Neural Networks in Federated Learning • arXiv:2212.01977 • 1 upvote
Neural Network Pruning as Spectrum Preserving Process • arXiv:2307.08982 • 1 upvote
Pruning a neural network using Bayesian inference • arXiv:2308.02451 • 1 upvote
Class-dependent Compression of Deep Neural Networks • arXiv:1909.10364 • 1 upvote
Structured Bayesian Compression for Deep Neural Networks Based on The Turbo-VBI Approach • arXiv:2302.10483 • 1 upvote
Global Sparse Momentum SGD for Pruning Very Deep Neural Networks • arXiv:1909.12778 • 1 upvote
Emergence of Segmentation with Minimalistic White-Box Transformers • arXiv:2308.16271 • 16 upvotes
White-Box Transformers via Sparse Rate Reduction: Compression Is All There Is? • arXiv:2311.13110 • 2 upvotes
Sparse Probabilistic Circuits via Pruning and Growing • arXiv:2211.12551 • 2 upvotes
Learning to Prune Deep Neural Networks via Reinforcement Learning • arXiv:2007.04756 • 1 upvote
Pruning Very Deep Neural Network Channels for Efficient Inference • arXiv:2211.08339 • 1 upvote
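For channel pruning entries like the one above: that paper selects channels with LASSO regression plus least-squares reconstruction, but a much simpler and common heuristic just ranks filters by L1 norm. A sketch of the norm heuristic only (not the paper's method):

import torch.nn as nn

def weakest_filters(conv: nn.Conv2d, n_prune: int):
    # One L1 norm per output filter; physically removing a filter also means
    # shrinking this layer's bias and the next layer's input channels.
    norms = conv.weight.detach().abs().sum(dim=(1, 2, 3))
    return norms.topk(n_prune, largest=False).indices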
			
 
	
	 
	
	
	
			
Fast Convex Pruning of Deep Neural Networks • arXiv:1806.06457 • 1 upvote
Pruning by Explaining: A Novel Criterion for Deep Neural Network Pruning • arXiv:1912.08881 • 1 upvote
Efficient Joint Optimization of Layer-Adaptive Weight Pruning in Deep Neural Networks • arXiv:2308.10438 • 1 upvote
Advancing Model Pruning via Bi-level Optimization • arXiv:2210.04092 • 1 upvote
COLT: Cyclic Overlapping Lottery Tickets for Faster Pruning of Convolutional Neural Networks • arXiv:2212.12770 • 2 upvotes
When Layers Play the Lottery, all Tickets Win at Initialization • arXiv:2301.10835 • 1 upvote
Lottery Tickets in Evolutionary Optimization: On Sparse Backpropagation-Free Trainability • arXiv:2306.00045 • 1 upvote
Pruning at Initialization -- A Sketching Perspective • arXiv:2305.17559 • 1 upvote
The Unreasonable Effectiveness of Random Pruning: Return of the Most Naive Baseline for Sparse Training • arXiv:2202.02643 • 1 upvote
Why Random Pruning Is All We Need to Start Sparse • arXiv:2210.02412 • 1 upvote
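The two entries above argue that random masks are a strong way to start sparse training. A sketch of random mask initialization with the Erdős–Rényi layer densities popular in the sparse training literature (the exact scaling any given paper uses may differ):

import torch
import torch.nn as nn

def er_random_masks(model, density=0.1):
    # Erdős–Rényi allocation: layer density scales with
    # (fan_in + fan_out) / (fan_in * fan_out), so small layers stay denser.
    layers = [m for m in model.modules() if isinstance(m, nn.Linear)]
    raw = torch.tensor([(m.in_features + m.out_features) /
                        (m.in_features * m.out_features) for m in layers])
    counts = torch.tensor([float(m.weight.numel()) for m in layers])
    scale = density * counts.sum() / (raw * counts).sum()  # hit global density
    return {m: torch.rand_like(m.weight) < min(float(scale * r), 1.0)
            for m, r in zip(layers, raw)}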
			
 
	
	 
	
	
	
			
Scatterbrain: Unifying Sparse and Low-rank Attention Approximation • arXiv:2110.15343 • 2 upvotes
Adaptive Activation-based Structured Pruning • arXiv:2201.10520 • 1 upvote
Neuron-based Pruning of Deep Neural Networks with Better Generalization using Kronecker Factored Curvature Approximation • arXiv:2111.08577 • 1 upvote
AUTOSPARSE: Towards Automated Sparse Training of Deep Neural Networks • arXiv:2304.06941 • 1 upvote
A Survey on Deep Neural Network Pruning-Taxonomy, Comparison, Analysis, and Recommendations • arXiv:2308.06767 • 1 upvote
Pruning Deep Neural Networks from a Sparsity Perspective • arXiv:2302.05601 • 1 upvote
White-Box Transformers via Sparse Rate Reduction • arXiv:2306.01129 • 1 upvote
SeReNe: Sensitivity based Regularization of Neurons for Structured Sparsity in Neural Networks • arXiv:2102.03773 • 1 upvote
Pruning artificial neural networks: a way to find well-generalizing, high-entropy sharp minima • arXiv:2004.14765 • 1 upvote
Regularization-based Pruning of Irrelevant Weights in Deep Neural Architectures • arXiv:2204.04977 • 1 upvote
FedDIP: Federated Learning with Extreme Dynamic Pruning and Incremental Regularization • arXiv:2309.06805 • 1 upvote
Learning Activation Functions for Sparse Neural Networks • arXiv:2305.10964 • 1 upvote
LOss-Based SensiTivity rEgulaRization: towards deep sparse neural networks • arXiv:2011.09905 • 1 upvote
Adaptive Sparse and Monotonic Attention for Transformer-based Automatic Speech Recognition • arXiv:2209.15176 • 1 upvote
Memory-efficient NLLB-200: Language-specific Expert Pruning of a Massively Multilingual Machine Translation Model • arXiv:2212.09811 • 1 upvote
Sparse Low-rank Adaptation of Pre-trained Language Models • arXiv:2311.11696 • 2 upvotes
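The sparse low-rank adaptation entry above places a sparsity-inducing gate between the two LoRA matrices, so whole ranks can be pruned away during fine-tuning. A rough sketch of that idea only; the class name is mine, and the paper's initialization, schedule, and exact proximal update may differ:

import torch
import torch.nn as nn

class GatedLoRA(nn.Module):
    # LoRA update B diag(g) A with a gate vector g pushed toward zero by a
    # soft-threshold (proximal) step; ranks whose gate reaches zero can be dropped.
    def __init__(self, base: nn.Linear, rank=16):
        super().__init__()
        self.base = base
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))
        self.gate = nn.Parameter(torch.ones(rank))

    def forward(self, x):
        return self.base(x) + ((x @ self.A.t()) * self.gate) @ self.B.t()

    @torch.no_grad()
    def prox_step(self, lam=1e-3):  # call after each optimizer step
        self.gate.copy_(self.gate.sign() * (self.gate.abs() - lam).clamp(min=0))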
			
 
	
	 
	
	
	
			
Learning Pruned Structure and Weights Simultaneously from Scratch: an Attention based Approach • arXiv:2111.02399 • 1 upvote
Pruning On-the-Fly: A Recoverable Pruning Method without Fine-tuning • arXiv:2212.12651 • 1 upvote
UPSCALE: Unconstrained Channel Pruning • arXiv:2307.08771 • 1 upvote
PruMUX: Augmenting Data Multiplexing with Model Compression • arXiv:2305.14706 • 1 upvote
SlimFit: Memory-Efficient Fine-Tuning of Transformer-based Models Using Training Dynamics • arXiv:2305.18513 • 2 upvotes
Network Pruning via Transformable Architecture Search • arXiv:1905.09717 • 1 upvote
Flash-LLM: Enabling Cost-Effective and Highly-Efficient Large Generative Model Inference with Unstructured Sparsity • arXiv:2309.10285 • 1 upvote
To Asymmetry and Beyond: Structured Pruning of Sequence to Sequence Models for Improved Inference Efficiency • arXiv:2304.02721 • 3 upvotes
arXiv:2312.17244 • 9 upvotes
GMP*: Well-Tuned Gradual Magnitude Pruning Can Outperform Most BERT-Pruning Methods • arXiv:2210.06384 • 1 upvote
The Truth is in There: Improving Reasoning in Language Models with Layer-Selective Rank Reduction • arXiv:2312.13558 • 5 upvotes
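The entry above (LASER) improves reasoning benchmarks by replacing selected weight matrices with low-rank approximations. The core operation is a truncated SVD; choosing which layers and ranks to reduce is the paper's actual contribution and is not shown here:

import torch

@torch.no_grad()
def low_rank_replace(weight, keep_rank):
    # Best rank-k approximation of one weight matrix via truncated SVD.
    U, S, Vh = torch.linalg.svd(weight, full_matrices=False)
    return (U[:, :keep_rank] * S[:keep_rank]) @ Vh[:keep_rank]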
			
 
	
	 
	
	
	
			
Sparsified Model Zoo Twins: Investigating Populations of Sparsified Neural Network Models • arXiv:2304.13718 • 1 upvote
Low-rank lottery tickets: finding efficient low-rank neural networks via matrix differential equations • arXiv:2205.13571 • 1 upvote
Trained Rank Pruning for Efficient Deep Neural Networks • arXiv:1812.02402 • 1 upvote
TRP: Trained Rank Pruning for Efficient Deep Neural Networks • arXiv:2004.14566 • 1 upvote
Plug-in, Trainable Gate for Streamlining Arbitrary Neural Networks • arXiv:1904.10921 • 1 upvote
Conditional Adapters: Parameter-efficient Transfer Learning with Fast Inference • arXiv:2304.04947 • 1 upvote
Training Neural Networks with Fixed Sparse Masks • arXiv:2111.09839 • 1 upvote
A Neural Scaling Law from Lottery Ticket Ensembling • arXiv:2310.02258 • 1 upvote
Methods for Pruning Deep Neural Networks • arXiv:2011.00241 • 1 upvote
On the Existence of Universal Lottery Tickets • arXiv:2111.11146 • 1 upvote
Quantifying lottery tickets under label noise: accuracy, calibration, and complexity • arXiv:2306.12190 • 1 upvote
Generalization Bounds for Magnitude-Based Pruning via Sparse Matrix Sketching • arXiv:2305.18789 • 1 upvote
Sparse Training via Boosting Pruning Plasticity with Neuroregeneration • arXiv:2106.10404 • 1 upvote
Lottery Jackpots Exist in Pre-trained Models • arXiv:2104.08700 • 1 upvote
Grokking Tickets: Lottery Tickets Accelerate Grokking • arXiv:2310.19470 • 1 upvote
SWAMP: Sparse Weight Averaging with Multiple Particles for Iterative Magnitude Pruning • arXiv:2305.14852 • 1 upvote
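Many of the lottery ticket entries above build on iterative magnitude pruning: train, prune the smallest weights, rewind the survivors to their initial values, and repeat. A sketch using torch.nn.utils.prune, rewinding to step-0 weights as in the original lottery ticket setup; train_fn(model) is assumed to train to convergence:

import torch
import torch.nn as nn
import torch.nn.utils.prune as prune

def iterative_magnitude_pruning(model, train_fn, rounds=5, frac=0.2):
    layers = [(m, "weight") for m in model.modules()
              if isinstance(m, (nn.Linear, nn.Conv2d))]
    init = [m.weight.detach().clone() for m, _ in layers]
    for r in range(rounds):
        train_fn(model)
        # Already-pruned weights are exact zeros and rank lowest, so passing
        # the *cumulative* sparsity extends the mask by ~frac of the survivors.
        target = 1.0 - (1.0 - frac) ** (r + 1)
        prune.global_unstructured(layers, pruning_method=prune.L1Unstructured,
                                  amount=target)
        for (m, _), w0 in zip(layers, init):
            m.weight_orig.data.copy_(w0)  # rewind; weight_mask stays in place
    return model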
			
 
	
	 
	
	
	
			
Sparse Model Soups: A Recipe for Improved Pruning via Model Averaging • arXiv:2306.16788 • 1 upvote
"Understanding Robustness Lottery": A Geometric Visual Comparative Analysis of Neural Network Pruning Approaches • arXiv:2206.07918 • 1 upvote
Randomly Initialized Subnetworks with Iterative Weight Recycling • arXiv:2303.15953 • 1 upvote
DASS: Differentiable Architecture Search for Sparse neural networks • arXiv:2207.06968 • 1 upvote
Ada-QPacknet -- adaptive pruning with bit width reduction as an efficient continual learning method without forgetting • arXiv:2308.07939 • 1 upvote
Robust Tickets Can Transfer Better: Drawing More Transferable Subnetworks in Transfer Learning • arXiv:2304.11834 • 1 upvote
AP: Selective Activation for De-sparsifying Pruned Neural Networks • arXiv:2212.06145 • 1 upvote
HideNseek: Federated Lottery Ticket via Server-side Pruning and Sign Supermask • arXiv:2206.04385
Dimensionality Reduced Training by Pruning and Freezing Parts of a Deep Neural Network, a Survey • arXiv:2205.08099 • 1 upvote
Structured Pruning is All You Need for Pruning CNNs at Initialization • arXiv:2203.02549
In deep reinforcement learning, a pruned network is a good network • arXiv:2402.12479 • 19 upvotes
BESA: Pruning Large Language Models with Blockwise Parameter-Efficient Sparsity Allocation • arXiv:2402.16880 • 2 upvotes
Dependency-Aware Semi-Structured Sparsity of GLU Variants in Large Language Models • arXiv:2405.01943
Pruning as a Domain-specific LLM Extractor • arXiv:2405.06275 • 1 upvote
Structural Pruning of Pre-trained Language Models via Neural Architecture Search • arXiv:2405.02267
FoldGPT: Simple and Effective Large Language Model Compression Scheme • arXiv:2407.00928
Pruning via Merging: Compressing LLMs via Manifold Alignment Based Layer Merging • arXiv:2406.16330
BlockPruner: Fine-grained Pruning for Large Language Models • arXiv:2406.10594 • 1 upvote
SPP: Sparsity-Preserved Parameter-Efficient Fine-Tuning for Large Language Models • arXiv:2405.16057
Pruning Large Language Models with Semi-Structural Adaptive Sparse Training • arXiv:2407.20584
Greedy Output Approximation: Towards Efficient Structured Pruning for LLMs Without Retraining • arXiv:2407.19126
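Several of the newest entries above (e.g. arXiv:2405.01943 and arXiv:2407.20584) target semi-structured N:M sparsity, where at most N weights survive in every group of M consecutive weights; the 2:4 case is the pattern NVIDIA sparse tensor cores accelerate. A minimal sketch of a 2:4 mask:

import torch

def two_four_mask(weight):
    # Keep the 2 largest-magnitude weights in every group of 4 along the
    # input dimension (requires in_features % 4 == 0).
    out_f, in_f = weight.shape
    groups = weight.abs().reshape(out_f, in_f // 4, 4)
    keep = groups.topk(2, dim=-1).indices
    mask = torch.zeros_like(groups, dtype=torch.bool)
    mask.scatter_(-1, keep, True)
    return mask.reshape(out_f, in_f)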