Attention as a Compass: Efficient Exploration for Process-Supervised RL in Reasoning Models Paper • 2509.26628 • Published Sep 30 • 14
The Markovian Thinker Collection Reformulating the RL of reasoning LLMs through the Markovian Thinking paradigm. • 7 items • Updated Oct 9 • 10
In-the-Flow Agentic System Optimization for Effective Planning and Tool Use Paper • 2510.05592 • Published Oct 7 • 101
EditReward Collection EditReward: A Human-Aligned Reward Model for Instruction-Guided Image Editing • 11 items • Updated 30 days ago • 4
⚛️ Liquid Nanos Collection Library of task-specific models: https://www.liquid.ai/blog/introducing-liquid-nanos-frontier-grade-performance-on-everyday-devices • 21 items • Updated 13 days ago • 92
InternVL3.5 Collection This collection includes all released checkpoints of InternVL3.5, covering different training stages (e.g., Pretraining, SFT, MPO, Cascade RL). • 54 items • Updated Sep 28 • 102