Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.00908

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Paper • 2310.17796 • Published Oct 26, 2023 • 18
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 78
upstage/SOLAR-10.7B-Instruct-v1.0

Text Generation • 11B • Updated Sep 10, 2024 • 31.1k • 641
openchat/openchat-3.5-1210

Text Generation • 7B • Updated May 18, 2024 • 607 • 279

ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Paper • 2312.10003 • Published Dec 15, 2023 • 44
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 190

Interesting Papers

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Paper • 2312.04474 • Published Dec 7, 2023 • 33
Training Chain-of-Thought via Latent-Variable Inference

Paper • 2312.02179 • Published Nov 28, 2023 • 11
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning

Paper • 2312.01552 • Published Dec 4, 2023 • 32
AppAgent: Multimodal Agents as Smartphone Users

Paper • 2312.13771 • Published Dec 21, 2023 • 54

interesting paper

FMViT: A multiple-frequency mixing Vision Transformer

Paper • 2311.05707 • Published Nov 9, 2023 • 9
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 190
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published Apr 29, 2024 • 121
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 90

Training & Architectures

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 100
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

Paper • 2307.08691 • Published Jul 17, 2023 • 9
Mixtral of Experts

Paper • 2401.04088 • Published Jan 8, 2024 • 161
Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 55

Interesting Papers

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

Paper • 2312.09390 • Published Dec 14, 2023 • 33
OneLLM: One Framework to Align All Modalities with Language

Paper • 2312.03700 • Published Dec 6, 2023 • 24
Generative Multimodal Models are In-Context Learners

Paper • 2312.13286 • Published Dec 20, 2023 • 37
The LLM Surgeon

Paper • 2312.17244 • Published Dec 28, 2023 • 9

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 20
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Paper • 2312.08583 • Published Dec 14, 2023 • 11
Vision-Language Models as a Source of Rewards

Paper • 2312.09187 • Published Dec 14, 2023 • 14
StemGen: A music generation model that listens

Paper • 2312.08723 • Published Dec 14, 2023 • 49

Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure

Paper • 2311.07590 • Published Nov 9, 2023 • 17
Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 119
AppAgent: Multimodal Agents as Smartphone Users

Paper • 2312.13771 • Published Dec 21, 2023 • 54
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 190

meta-llama/Llama-2-7b-hf

Text Generation • 7B • Updated Apr 17, 2024 • 542k • 2.21k
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 190
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

Paper • 2401.04398 • Published Jan 9, 2024 • 25
PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models

Paper • 2402.01118 • Published Feb 2, 2024 • 32

UI Layout Generation with LLMs Guided by UI Grammar

Paper • 2310.15455 • Published Oct 24, 2023 • 3
You Only Look at Screens: Multimodal Chain-of-Action Agents

Paper • 2309.11436 • Published Sep 20, 2023 • 1
Never-ending Learning of User Interfaces

Paper • 2308.08726 • Published Aug 17, 2023 • 2
LMDX: Language Model-based Document Information Extraction and Localization

Paper • 2309.10952 • Published Sep 19, 2023 • 66

ControlLLM: Augment Language Models with Tools by Searching on Graphs

Paper • 2310.17796 • Published Oct 26, 2023 • 18
Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection

Paper • 2310.11511 • Published Oct 17, 2023 • 78
upstage/SOLAR-10.7B-Instruct-v1.0

Text Generation • 11B • Updated Sep 10, 2024 • 31.1k • 641
openchat/openchat-3.5-1210

Text Generation • 7B • Updated May 18, 2024 • 607 • 279

Interesting Papers

Weak-to-Strong Generalization: Eliciting Strong Capabilities With Weak Supervision

Paper • 2312.09390 • Published Dec 14, 2023 • 33
OneLLM: One Framework to Align All Modalities with Language

Paper • 2312.03700 • Published Dec 6, 2023 • 24
Generative Multimodal Models are In-Context Learners

Paper • 2312.13286 • Published Dec 20, 2023 • 37
The LLM Surgeon

Paper • 2312.17244 • Published Dec 28, 2023 • 9

ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent

Paper • 2312.10003 • Published Dec 15, 2023 • 44
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 190

A Picture is Worth More Than 77 Text Tokens: Evaluating CLIP-Style Models on Dense Captions

Paper • 2312.08578 • Published Dec 14, 2023 • 20
ZeroQuant(4+2): Redefining LLMs Quantization with a New FP6-Centric Strategy for Diverse Generative Tasks

Paper • 2312.08583 • Published Dec 14, 2023 • 11
Vision-Language Models as a Source of Rewards

Paper • 2312.09187 • Published Dec 14, 2023 • 14
StemGen: A music generation model that listens

Paper • 2312.08723 • Published Dec 14, 2023 • 49

Interesting Papers

Chain of Code: Reasoning with a Language Model-Augmented Code Emulator

Paper • 2312.04474 • Published Dec 7, 2023 • 33
Training Chain-of-Thought via Latent-Variable Inference

Paper • 2312.02179 • Published Nov 28, 2023 • 11
The Unlocking Spell on Base LLMs: Rethinking Alignment via In-Context Learning

Paper • 2312.01552 • Published Dec 4, 2023 • 32
AppAgent: Multimodal Agents as Smartphone Users

Paper • 2312.13771 • Published Dec 21, 2023 • 54

Technical Report: Large Language Models can Strategically Deceive their Users when Put Under Pressure

Paper • 2311.07590 • Published Nov 9, 2023 • 17
Exponentially Faster Language Modelling

Paper • 2311.10770 • Published Nov 15, 2023 • 119
AppAgent: Multimodal Agents as Smartphone Users

Paper • 2312.13771 • Published Dec 21, 2023 • 54
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 190

interesting paper

FMViT: A multiple-frequency mixing Vision Transformer

Paper • 2311.05707 • Published Nov 9, 2023 • 9
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 190
LoRA Land: 310 Fine-tuned LLMs that Rival GPT-4, A Technical Report

Paper • 2405.00732 • Published Apr 29, 2024 • 121
An Introduction to Vision-Language Modeling

Paper • 2405.17247 • Published May 27, 2024 • 90

meta-llama/Llama-2-7b-hf

Text Generation • 7B • Updated Apr 17, 2024 • 542k • 2.21k
DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 190
Chain-of-Table: Evolving Tables in the Reasoning Chain for Table Understanding

Paper • 2401.04398 • Published Jan 9, 2024 • 25
PokéLLMon: A Human-Parity Agent for Pokémon Battles with Large Language Models

Paper • 2402.01118 • Published Feb 2, 2024 • 32

Training & Architectures

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 100
FlashAttention-2: Faster Attention with Better Parallelism and Work Partitioning

Paper • 2307.08691 • Published Jul 17, 2023 • 9
Mixtral of Experts

Paper • 2401.04088 • Published Jan 8, 2024 • 161
Mistral 7B

Paper • 2310.06825 • Published Oct 10, 2023 • 55

UI Layout Generation with LLMs Guided by UI Grammar

Paper • 2310.15455 • Published Oct 24, 2023 • 3
You Only Look at Screens: Multimodal Chain-of-Action Agents

Paper • 2309.11436 • Published Sep 20, 2023 • 1
Never-ending Learning of User Interfaces

Paper • 2308.08726 • Published Aug 17, 2023 • 2
LMDX: Language Model-based Document Information Extraction and Localization

Paper • 2309.10952 • Published Sep 19, 2023 • 66

Previous
1
...
8
9
10
11
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs