Collections
Collections including paper arxiv:2306.11644
- Attention Is All You Need
  Paper • 1706.03762 • Published • 98
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 23
- RoBERTa: A Robustly Optimized BERT Pretraining Approach
  Paper • 1907.11692 • Published • 9
- DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter
  Paper • 1910.01108 • Published • 21

- Will we run out of data? An analysis of the limits of scaling datasets in Machine Learning
  Paper • 2211.04325 • Published • 1
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
  Paper • 1810.04805 • Published • 23
- On the Opportunities and Risks of Foundation Models
  Paper • 2108.07258 • Published • 1
- Super-NaturalInstructions: Generalization via Declarative Instructions on 1600+ NLP Tasks
  Paper • 2204.07705 • Published • 2

- AgentInstruct: Toward Generative Teaching with Agentic Flows
  Paper • 2407.03502 • Published • 51
- Magpie: Alignment Data Synthesis from Scratch by Prompting Aligned LLMs with Nothing
  Paper • 2406.08464 • Published • 71
- Phi-3 Technical Report: A Highly Capable Language Model Locally on Your Phone
  Paper • 2404.14219 • Published • 259
- DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows
  Paper • 2402.10379 • Published • 31

- SciLitLLM: How to Adapt LLMs for Scientific Literature Understanding
  Paper • 2408.15545 • Published • 38
- Controllable Text Generation for Large Language Models: A Survey
  Paper • 2408.12599 • Published • 65
- To Code, or Not To Code? Exploring Impact of Code in Pre-training
  Paper • 2408.10914 • Published • 44
- Automated Design of Agentic Systems
  Paper • 2408.08435 • Published • 40