Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2312.09299

Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks

Paper • 2310.19909 • Published Oct 30, 2023 • 21
Memory Augmented Language Models through Mixture of Word Experts

Paper • 2311.10768 • Published Nov 15, 2023 • 19
FlashDecoding++: Faster Large Language Model Inference on GPUs

Paper • 2311.01282 • Published Nov 2, 2023 • 37
Prompt Cache: Modular Attention Reuse for Low-Latency Inference

Paper • 2311.04934 • Published Nov 7, 2023 • 34

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Paper • 2402.08714 • Published Feb 13, 2024 • 15
Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15, 2024 • 25
RLVF: Learning from Verbal Feedback without Overgeneralization

Paper • 2402.10893 • Published Feb 16, 2024 • 12
Coercing LLMs to do and reveal (almost) anything

Paper • 2402.14020 • Published Feb 21, 2024 • 13

FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning

Paper • 2309.04663 • Published Sep 9, 2023 • 6
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 88
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation

Paper • 2310.08541 • Published Oct 12, 2023 • 18
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models

Paper • 2310.13671 • Published Oct 20, 2023 • 19

lshort-transformers

Papers useful when writing the paper: "The Not So Short Transfromers"

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6, 2024 • 66
SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26, 2024 • 74
Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19, 2024 • 159
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7, 2024 • 65

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 31
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Paper • 2401.01885 • Published Jan 3, 2024 • 28
SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity

Paper • 2401.00604 • Published Dec 31, 2023 • 6
LARP: Language-Agent Role Play for Open-World Games

Paper • 2312.17653 • Published Dec 24, 2023 • 33

Battle of the Backbones: A Large-Scale Comparison of Pretrained Models across Computer Vision Tasks

Paper • 2310.19909 • Published Oct 30, 2023 • 21
Memory Augmented Language Models through Mixture of Word Experts

Paper • 2311.10768 • Published Nov 15, 2023 • 19
FlashDecoding++: Faster Large Language Model Inference on GPUs

Paper • 2311.01282 • Published Nov 2, 2023 • 37
Prompt Cache: Modular Attention Reuse for Low-Latency Inference

Paper • 2311.04934 • Published Nov 7, 2023 • 34

lshort-transformers

Papers useful when writing the paper: "The Not So Short Transfromers"

ShortGPT: Layers in Large Language Models are More Redundant Than You Expect

Paper • 2403.03853 • Published Mar 6, 2024 • 66
SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26, 2024 • 74
Your Transformer is Secretly Linear

Paper • 2405.12250 • Published May 19, 2024 • 159
Yi: Open Foundation Models by 01.AI

Paper • 2403.04652 • Published Mar 7, 2024 • 65

PRDP: Proximal Reward Difference Prediction for Large-Scale Reward Finetuning of Diffusion Models

Paper • 2402.08714 • Published Feb 13, 2024 • 15
Data Engineering for Scaling Language Models to 128K Context

Paper • 2402.10171 • Published Feb 15, 2024 • 25
RLVF: Learning from Verbal Feedback without Overgeneralization

Paper • 2402.10893 • Published Feb 16, 2024 • 12
Coercing LLMs to do and reveal (almost) anything

Paper • 2402.14020 • Published Feb 21, 2024 • 13

aMUSEd: An Open MUSE Reproduction

Paper • 2401.01808 • Published Jan 3, 2024 • 31
From Audio to Photoreal Embodiment: Synthesizing Humans in Conversations

Paper • 2401.01885 • Published Jan 3, 2024 • 28
SteinDreamer: Variance Reduction for Text-to-3D Score Distillation via Stein Identity

Paper • 2401.00604 • Published Dec 31, 2023 • 6
LARP: Language-Agent Role Play for Open-World Games

Paper • 2312.17653 • Published Dec 24, 2023 • 33

FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning

Paper • 2309.04663 • Published Sep 9, 2023 • 6
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 88
Idea2Img: Iterative Self-Refinement with GPT-4V(ision) for Automatic Image Design and Generation

Paper • 2310.08541 • Published Oct 12, 2023 • 18
Let's Synthesize Step by Step: Iterative Dataset Synthesis with Large Language Models by Extrapolating Errors from Small Models

Paper • 2310.13671 • Published Oct 20, 2023 • 19

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs