Collections

Collections including paper arxiv:2306.11644

Phi-1 family of small language models.

microsoft/phi-1

Text Generation • 1B • Updated 9 days ago • 6.87k • 216
microsoft/phi-1_5

Text Generation • 1B • Updated 9 days ago • 88.1k • 1.35k
Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 149
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 88

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 149
Llama 2 7B Chat

Running on Zero • Featured • 480 • Generate chat responses using Llama-2 7B model

ByteDance/AnimateDiff-Lightning

Text-to-Video • Updated Jan 6 • 51.1k • 971
Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 149
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning

Paper • 2501.12948 • Published Jan 22, 2025 • 428

Code LMs Evaluation

Unifying the Perspectives of NLP and Software Engineering: A Survey on Language Models for Code

Paper • 2311.07989 • Published Nov 14, 2023 • 26
SWE-bench: Can Language Models Resolve Real-World GitHub Issues?

Paper • 2310.06770 • Published Oct 10, 2023 • 9
CRUXEval: A Benchmark for Code Reasoning, Understanding and Execution

Paper • 2401.03065 • Published Jan 5, 2024 • 11
Copilot Evaluation Harness: Evaluating LLM-Guided Software Programming

Paper • 2402.14261 • Published Feb 22, 2024 • 11

LLM_architectures

Nemotron-4 15B Technical Report

Paper • 2402.16819 • Published Feb 26, 2024 • 46
Griffin: Mixing Gated Linear Recurrences with Local Attention for Efficient Language Models

Paper • 2402.19427 • Published Feb 29, 2024 • 56
RWKV: Reinventing RNNs for the Transformer Era

Paper • 2305.13048 • Published May 22, 2023 • 20
Reformer: The Efficient Transformer

Paper • 2001.04451 • Published Jan 13, 2020

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 149

LLM foundations

Mixture-of-Depths: Dynamically allocating compute in transformer-based language models

Paper • 2404.02258 • Published Apr 2, 2024 • 107
Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 149
Jamba: A Hybrid Transformer-Mamba Language Model

Paper • 2403.19887 • Published Mar 28, 2024 • 111
Large Language Models Struggle to Learn Long-Tail Knowledge

Paper • 2211.08411 • Published Nov 15, 2022 • 3

Dataset generation

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 149
Textbooks Are All You Need II: phi-1.5 technical report

Paper • 2309.05463 • Published Sep 11, 2023 • 88

Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 149

Visual In-Context Prompting

Paper • 2311.13601 • Published Nov 22, 2023 • 19
Textbooks Are All You Need

Paper • 2306.11644 • Published Jun 20, 2023 • 149
AutoGen: Enabling Next-Gen LLM Applications via Multi-Agent Conversation Framework

Paper • 2308.08155 • Published Aug 16, 2023 • 10
LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models

Paper • 2303.02927 • Published Mar 6, 2023 • 3
