Qian Liu's picture

Qian Liu

SivilTaram

·

http://siviltaram.github.io/

AI & ML interests

Cooking cool things

Recent Activity

upvoted a paper 6 days ago

Scaling Latent Reasoning via Looped Language Models

published a dataset 19 days ago

SVRL/general-sharding-output-fineweb-1014

published a dataset 19 days ago

SVRL/general-sharding-output-megamath-1014

View all activity

Organizations

upvoted a paper 6 days ago

Scaling Latent Reasoning via Looped Language Models

Paper • 2510.25741 • Published 6 days ago • 191

published 2 datasets 19 days ago

SVRL/general-sharding-output-fineweb-1014

Updated 19 days ago • 7

SVRL/general-sharding-output-megamath-1014

Updated 19 days ago • 10

upvoted a paper 22 days ago

BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution

Paper • 2510.08697 • Published 26 days ago • 34

upvoted a paper about 1 month ago

Knapsack RL: Unlocking Exploration of LLMs via Optimizing Budget Allocation

Paper • 2509.25849 • Published Sep 30 • 47

liked a dataset about 1 month ago

zai-org/CC-Bench-trajectories

Viewer • Updated Sep 30 • 260 • 3.03k • 76

upvoted 2 papers about 2 months ago

Mini-o3: Scaling Up Reasoning Patterns and Interaction Turns for Visual Search

Paper • 2509.07969 • Published Sep 9 • 59

Reverse-Engineered Reasoning for Open-Ended Generation

Paper • 2509.06160 • Published Sep 7 • 147

liked a Space 2 months ago

BigCodeArena

Compare two AI models by sending them code and seeing their responses

authored a paper 2 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

upvoted 3 papers 2 months ago

UI-TARS-2 Technical Report: Advancing GUI Agent with Multi-Turn Reinforcement Learning

Paper • 2509.02544 • Published Sep 2 • 123

VerlTool: Towards Holistic Agentic Reinforcement Learning with Tool Use

Paper • 2509.01055 • Published Sep 1 • 72

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83

commented a paper 2 months ago

SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

Paper • 2509.02479 • Published Sep 2 • 83 •

upvoted a paper 2 months ago

TreePO: Bridging the Gap of Policy Optimization and Efficacy and Inference Efficiency with Heuristic Tree-based Modeling

Paper • 2508.17445 • Published Aug 24 • 80

upvoted 5 papers 3 months ago

Chain-of-Agents: End-to-End Agent Foundation Models via Multi-Agent Distillation and Agentic RL

Paper • 2508.13167 • Published Aug 6 • 127

FutureX: An Advanced Live Benchmark for LLM Agents in Future Prediction

Paper • 2508.11987 • Published Aug 16 • 70

Agent Lightning: Train ANY AI Agents with Reinforcement Learning

Paper • 2508.03680 • Published Aug 5 • 96

Efficient Agents: Building Effective Agents While Reducing Cost

Paper • 2508.02694 • Published Jul 24 • 85

R-Zero: Self-Evolving Reasoning LLM from Zero Data

Paper • 2508.05004 • Published Aug 7 • 126