- Rethinking Interpretability in the Era of Large Language Models
  Paper • 2402.01761 • Published • 23
- OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models
  Paper • 2402.01739 • Published • 28
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 105
- Stealing Part of a Production Language Model
  Paper • 2403.06634 • Published • 91

Collections
Discover the best community collections!
Collections including paper arxiv:2310.11453
- YAYI 2: Multilingual Open-Source Large Language Models
  Paper • 2312.14862 • Published • 15
- SOLAR 10.7B: Scaling Large Language Models with Simple yet Effective Depth Up-Scaling
  Paper • 2312.15166 • Published • 60
- TrustLLM: Trustworthiness in Large Language Models
  Paper • 2401.05561 • Published • 69
- DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
  Paper • 2401.06066 • Published • 56

- Self-Rewarding Language Models
  Paper • 2401.10020 • Published • 151
- BitNet: Scaling 1-bit Transformers for Large Language Models
  Paper • 2310.11453 • Published • 105
- ReFT: Representation Finetuning for Language Models
  Paper • 2404.03592 • Published • 101
- LLM in a flash: Efficient Large Language Model Inference with Limited Memory
  Paper • 2312.11514 • Published • 260

- Multilingual Instruction Tuning With Just a Pinch of Multilinguality
  Paper • 2401.01854 • Published • 11
- LLaMA Beyond English: An Empirical Study on Language Capability Transfer
  Paper • 2401.01055 • Published • 55
- LLM Maybe LongLM: Self-Extend LLM Context Window Without Tuning
  Paper • 2401.01325 • Published • 27
- Improving Text Embeddings with Large Language Models
  Paper • 2401.00368 • Published • 82

- QuIP: 2-Bit Quantization of Large Language Models With Guarantees
  Paper • 2307.13304 • Published • 2
- SpQR: A Sparse-Quantized Representation for Near-Lossless LLM Weight Compression
  Paper • 2306.03078 • Published • 3
- OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models
  Paper • 2308.13137 • Published • 18
- AWQ: Activation-aware Weight Quantization for LLM Compression and Acceleration
  Paper • 2306.00978 • Published • 11

- Distil-Whisper: Robust Knowledge Distillation via Large-Scale Pseudo Labelling
  Paper • 2311.00430 • Published • 57
- SDXL: Improving Latent Diffusion Models for High-Resolution Image Synthesis
  Paper • 2307.01952 • Published • 90
- Language Modeling Is Compression
  Paper • 2309.10668 • Published • 83
- Pretraining Data Mixtures Enable Narrow Model Selection Capabilities in Transformer Models
  Paper • 2311.00871 • Published • 3