- Kimi Linear: An Expressive, Efficient Attention Architecture • Paper • arXiv 2510.26692 • Published 6 days ago • 93 upvotes
- Parallel Loop Transformer for Efficient Test-Time Computation Scaling • Paper • arXiv 2510.24824 • Published 8 days ago • 13 upvotes
- Seed3D 1.0: From Images to High-Fidelity Simulation-Ready 3D Assets • Paper • arXiv 2510.19944 • Published 14 days ago • 19 upvotes
- Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation • Paper • arXiv 2509.25849 • Published Sep 30 • 47 upvotes
- OctoThinker: Mid-training Incentivizes Reinforcement Learning Scaling • Paper • arXiv 2506.20512 • Published Jun 25 • 47 upvotes
- Seed-Prover: Deep and Broad Reasoning for Automated Theorem Proving • Paper • arXiv 2507.23726 • Published Jul 31 • 113 upvotes
- CriticLean: Critic-Guided Reinforcement Learning for Mathematical Formalization • Paper • arXiv 2507.06181 • Published Jul 8 • 43 upvotes
- Hybrid Linear Attention Research Collection • All 1.3B & 340M hybrid linear-attention experiments • 62 items • Updated Sep 11 • 12 upvotes
- A Survey on Vision-Language-Action Models: An Action Tokenization Perspective • Paper • arXiv 2507.01925 • Published Jul 2 • 38 upvotes
- ERNIE 4.5 Collection • A collection of ERNIE 4.5 models; "-Paddle" models use PaddlePaddle weights, while "-PT" models use Transformer-style PyTorch weights • 26 items • Updated Sep 24 • 174 upvotes