Collections

Collections including paper arxiv:2510.26692

Kimi-Linear-A3B

Moonshot's experimental MoE model with Kimi Delta Attention

moonshotai/Kimi-Linear-48B-A3B-Instruct

Text Generation • 49B • Updated 4 days ago • 19.4k • 362
moonshotai/Kimi-Linear-48B-A3B-Base

Text Generation • 49B • Updated 4 days ago • 216 • 46
Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 6 days ago • 93

about 15 hours ago

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 6 days ago • 93

LLM Architectures

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 6 days ago • 93

Agentic AI Training and Tuning

Tongyi DeepResearch Technical Report

Paper • 2510.24701 • Published 8 days ago • 89
Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 6 days ago • 93

Less is More: Recursive Reasoning with Tiny Networks

Paper • 2510.04871 • Published about 1 month ago • 464
Cache-to-Cache: Direct Semantic Communication Between Large Language Models

Paper • 2510.03215 • Published Oct 3 • 95
When Thoughts Meet Facts: Reusable Reasoning for Long-Context LMs

Paper • 2510.07499 • Published 28 days ago • 48
StreamingVLM: Real-Time Understanding for Infinite Video Streams

Paper • 2510.09608 • Published 26 days ago • 49

EVA-CLIP-18B: Scaling CLIP to 18 Billion Parameters

Paper • 2402.04252 • Published Feb 6, 2024 • 28
Vision Superalignment: Weak-to-Strong Generalization for Vision Foundation Models

Paper • 2402.03749 • Published Feb 6, 2024 • 14
ScreenAI: A Vision-Language Model for UI and Infographics Understanding

Paper • 2402.04615 • Published Feb 7, 2024 • 44
EfficientViT-SAM: Accelerated Segment Anything Model Without Performance Loss

Paper • 2402.05008 • Published Feb 7, 2024 • 23

Defeating the Training-Inference Mismatch via FP16

Paper • 2510.26788 • Published 6 days ago • 24
Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 6 days ago • 93

Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 6 days ago • 93

about 19 hours ago

WebWatcher: Breaking New Frontier of Vision-Language Deep Research Agent

Paper • 2508.05748 • Published Aug 7 • 137
Kimi Linear: An Expressive, Efficient Attention Architecture

Paper • 2510.26692 • Published 6 days ago • 93

FastVLM: Efficient Vision Encoding for Vision Language Models

Paper • 2412.13303 • Published Dec 17, 2024 • 70
rStar2-Agent: Agentic Reasoning Technical Report

Paper • 2508.20722 • Published Aug 28 • 114
AgentScope 1.0: A Developer-Centric Framework for Building Agentic Applications

Paper • 2508.16279 • Published Aug 22 • 52
OmniWorld: A Multi-Domain and Multi-Modal Dataset for 4D World Modeling

Paper • 2509.12201 • Published Sep 15 • 103
