Collections

Collections including paper arxiv:2510.14528

- OCR - Optical Character Recognition

PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16 • 95
deepseek-ai/DeepSeek-OCR

Image-Text-to-Text • 3B • Updated 21 days ago • 5.19M • 2.83k
PaddlePaddle/PaddleOCR-VL

Image-Text-to-Text • 1.0B • Updated 12 days ago • 32.2k • 1.36k
nanonets/Nanonets-OCR2-3B

Image-Text-to-Text • 4B • Updated Oct 16 • 103k • 449

Read Later Stack

Demystifying Reinforcement Learning in Agentic Reasoning

Paper • 2510.11701 • Published Oct 13 • 31
Self-Improving LLM Agents at Test-Time

Paper • 2510.07841 • Published Oct 9 • 9
Making Mathematical Reasoning Adaptive

Paper • 2510.04617 • Published Oct 6 • 22
DocReward: A Document Reward Model for Structuring and Stylizing

Paper • 2510.11391 • Published Oct 13 • 27

MinerU2.5: A Decoupled Vision-Language Model for Efficient High-Resolution Document Parsing

Paper • 2509.22186 • Published Sep 26 • 134
CommonForms: A Large, Diverse Dataset for Form Field Detection

Paper • 2509.16506 • Published Sep 20 • 19
Automated Structured Radiology Report Generation with Rich Clinical Context

Paper • 2510.00428 • Published Oct 1 • 7
Extract-0: A Specialized Language Model for Document Information Extraction

Paper • 2509.22906 • Published Sep 26

rStar-Math: Small LLMs Can Master Math Reasoning with Self-Evolved Deep Thinking

Paper • 2501.04519 • Published Jan 8 • 286
Transformer^2: Self-adaptive LLMs

Paper • 2501.06252 • Published Jan 9 • 54
Multimodal LLMs Can Reason about Aesthetics in Zero-Shot

Paper • 2501.09012 • Published Jan 15 • 10
FAST: Efficient Action Tokenization for Vision-Language-Action Models

Paper • 2501.09747 • Published Jan 16 • 27

Chan-Y/Florence-2-LaTex

Image-Text-to-Text • 0.3B • Updated Jul 16, 2024 • 4 • 2
meta-llama/CodeLlama-7b-Instruct-hf

Text Generation • 7B • Updated Mar 14, 2024 • 4.08k • 58
hamzab/roberta-fake-news-classification

Text Classification • Updated Jul 4, 2023 • 952 • 8
Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents

Paper • 2509.06917 • Published Sep 8 • 41

DLER: Doing Length pEnalty Right - Incentivizing More Intelligence per Token via Reinforcement Learning

Paper • 2510.15110 • Published Oct 16 • 15
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16 • 95
Bee: A High-Quality Corpus and Full-Stack Suite to Unlock Advanced Fully Open MLLMs

Paper • 2510.13795 • Published Oct 15 • 56
UniME-V2: MLLM-as-a-Judge for Universal Multimodal Embedding Learning

Paper • 2510.13515 • Published Oct 15 • 11

Visual Multi Modal LLM

NaViL: Rethinking Scaling Properties of Native Multimodal Large Language Models under Data Constraints

Paper • 2510.08565 • Published Oct 9 • 19
Detect Anything via Next Point Prediction

Paper • 2510.12798 • Published Oct 14 • 46
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16 • 95
DeepEyesV2: Toward Agentic Multimodal Model

Paper • 2511.05271 • Published 17 days ago • 41

Qwen3 Coder WebDev 🌍

Space • Running • Featured • 907 • Generate web application code from descriptions
openai/whisper-large-v3

Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 4.63M • 5.14k
PaddlePaddle/PaddleOCR-VL

Image-Text-to-Text • 1.0B • Updated 12 days ago • 32.2k • 1.36k
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

Paper • 2510.14528 • Published Oct 16 • 95

minlik/docllm-yi-34b

Text Generation • 38B • Updated Mar 20, 2024 • 8 • 1
JinghuiLuAstronaut/DocLLM_baichuan2_7b

Text Generation • 9B • Updated Feb 29, 2024 • 13 • 5
docling-project/docling-models

Updated Jul 23 • 460k • 184
DocLayout YOLO 🚀

Space • Running • Featured • 177 • Demo for DocLayout-YOLO

Large Language Model (LLM) and NLP related papers.

LoRA+: Efficient Low Rank Adaptation of Large Models

Paper • 2402.12354 • Published Feb 19, 2024 • 6
The FinBen: An Holistic Financial Benchmark for Large Language Models

Paper • 2402.12659 • Published Feb 20, 2024 • 23
TofuEval: Evaluating Hallucinations of LLMs on Topic-Focused Dialogue Summarization

Paper • 2402.13249 • Published Feb 20, 2024 • 13
TrustLLM: Trustworthiness in Large Language Models

Paper • 2401.05561 • Published Jan 10, 2024 • 69
