Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.02385

Memory Augmented Language Models through Mixture of Word Experts

Paper • 2311.10768 • Published Nov 15, 2023 • 19
System 2 Attention (is something you might need too)

Paper • 2311.11829 • Published Nov 20, 2023 • 44
Fine-tuning Language Models for Factuality

Paper • 2311.08401 • Published Nov 14, 2023 • 30
Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 77

small language models

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95
google/gemma-2-2b-it

Text Generation • 3B • Updated Aug 27, 2024 • 415k • • 1.22k
stabilityai/stable-fast-3d

Image-to-3D • 1B • Updated Apr 8 • 4.02k • 708
mistralai/Mixtral-8x7B-Instruct-v0.1

47B • Updated Jul 24 • 429k • 4.6k

about 1 month ago

meta-llama/Meta-Llama-3-8B

Text Generation • 8B • Updated Sep 27, 2024 • 2.12M • • 6.38k
TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95
aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored

Text Generation • 8B • Updated Jul 29, 2024 • 1.06k • • 211
Runtime error

413

BigCode - Playground

🪐

413

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5, 2024 • 48
TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

SLM e Moe structure PHD tesis: SOTA e valutazione parametri

collezione di paper utili per redazione tesi 1-2-3- capitolo da valutare cambio di rotta e gestione PHD

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 107
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2 • 52
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2 • 43
REDUCIO! Generating 1024times1024 Video within 16 Seconds using Extremely Compressed Motion Latents

Paper • 2411.13552 • Published Nov 20, 2024

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

meta-llama/Llama-2-7b-chat

Text Generation • Updated Apr 17, 2024 • 105 • 613
TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

Efficient Training

Rethinking Optimization and Architecture for Tiny Language Models

Paper • 2402.02791 • Published Feb 5, 2024 • 13
Specialized Language Models with Cheap Inference from Limited Domain Data

Paper • 2402.01093 • Published Feb 2, 2024 • 47
Scavenging Hyena: Distilling Transformers into Long Convolution Models

Paper • 2401.17574 • Published Jan 31, 2024 • 17
Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4, 2024 • 65

Memory Augmented Language Models through Mixture of Word Experts

Paper • 2311.10768 • Published Nov 15, 2023 • 19
System 2 Attention (is something you might need too)

Paper • 2311.11829 • Published Nov 20, 2023 • 44
Fine-tuning Language Models for Factuality

Paper • 2311.08401 • Published Nov 14, 2023 • 30
Orca 2: Teaching Small Language Models How to Reason

Paper • 2311.11045 • Published Nov 18, 2023 • 77

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

Paper • 2401.02954 • Published Jan 5, 2024 • 48
TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

small language models

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

SLM e Moe structure PHD tesis: SOTA e valutazione parametri

collezione di paper utili per redazione tesi 1-2-3- capitolo da valutare cambio di rotta e gestione PHD

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published Jan 1 • 107
CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published Jan 2 • 52
Reconstruction vs. Generation: Taming Optimization Dilemma in Latent Diffusion Models

Paper • 2501.01423 • Published Jan 2 • 43
REDUCIO! Generating 1024times1024 Video within 16 Seconds using Extremely Compressed Motion Latents

Paper • 2411.13552 • Published Nov 20, 2024

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95
google/gemma-2-2b-it

Text Generation • 3B • Updated Aug 27, 2024 • 415k • • 1.22k
stabilityai/stable-fast-3d

Image-to-3D • 1B • Updated Apr 8 • 4.02k • 708
mistralai/Mixtral-8x7B-Instruct-v0.1

47B • Updated Jul 24 • 429k • 4.6k

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

about 1 month ago

meta-llama/Meta-Llama-3-8B

Text Generation • 8B • Updated Sep 27, 2024 • 2.12M • • 6.38k
TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95
aifeifei798/DarkIdol-Llama-3.1-8B-Instruct-1.2-Uncensored

Text Generation • 8B • Updated Jul 29, 2024 • 1.06k • • 211
Runtime error

413

BigCode - Playground

🪐

413

meta-llama/Llama-2-7b-chat

Text Generation • Updated Apr 17, 2024 • 105 • 613
TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95

Efficient Training

Rethinking Optimization and Architecture for Tiny Language Models

Paper • 2402.02791 • Published Feb 5, 2024 • 13
Specialized Language Models with Cheap Inference from Limited Domain Data

Paper • 2402.01093 • Published Feb 2, 2024 • 47
Scavenging Hyena: Distilling Transformers into Long Convolution Models

Paper • 2401.17574 • Published Jan 31, 2024 • 17
Understanding LLMs: A Comprehensive Overview from Training to Inference

Paper • 2401.02038 • Published Jan 4, 2024 • 65

Previous
1
2
3
4
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs