Kuo-Hsin Tu PRO

dapumptu

AI & ML interests

None yet

Recent Activity

liked a dataset 1 day ago

nvidia/Nemotron-Competitive-Programming-v1

liked a dataset 1 day ago

nvidia/Nemotron-Agentic-v1

liked a dataset 2 days ago

google/deepsearchqa

upvoted a paper 10 days ago

Are We on the Right Way to Assessing LLM-as-a-Judge?

Paper • 2512.16041 • Published 14 days ago • 32

upvoted 9 papers 12 days ago

OpenDataArena: A Fair and Open Arena for Benchmarking Post-Training Dataset Value

Paper • 2512.14051 • Published 16 days ago • 40

Rethinking Expert Trajectory Utilization in LLM Post-training

Paper • 2512.11470 • Published 20 days ago • 7

Towards a Science of Scaling Agent Systems

Paper • 2512.08296 • Published 23 days ago • 13

Confucius Code Agent: An Open-sourced AI Software Engineer at Industrial Scale

Paper • 2512.10398 • Published 21 days ago • 6

BEAVER: An Efficient Deterministic LLM Verifier

Paper • 2512.05439 • Published 27 days ago • 35

Long-horizon Reasoning Agent for Olympiad-Level Mathematical Problem Solving

Paper • 2512.10739 • Published 21 days ago • 45

VOYAGER: A Training Free Approach for Generating Diverse Datasets using LLMs

Paper • 2512.12072 • Published 20 days ago • 17

Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning

Paper • 2512.15687 • Published 15 days ago • 17

Nemotron-Math: Efficient Long-Context Distillation of Mathematical Reasoning from Multi-Mode Supervision

Paper • 2512.15489 • Published 15 days ago • 6

upvoted 10 papers 3 months ago

Every Sample Matters: Leveraging Mixture-of-Experts and High-Quality Data for Efficient and Accurate Code LLM

Paper • 2503.17793 • Published Mar 22, 2025 • 23

LMEnt: A Suite for Analyzing Knowledge in Language Models from Pretraining Data to Representations

Paper • 2509.03405 • Published Sep 3, 2025 • 23

Reasoning Vectors: Transferring Chain-of-Thought Capabilities via Task Arithmetic

Paper • 2509.01363 • Published Sep 1, 2025 • 58

Open Data Synthesis For Deep Research

Paper • 2509.00375 • Published Aug 30, 2025 • 70

The Landscape of Agentic Reinforcement Learning for LLMs: A Survey

Paper • 2509.02547 • Published Sep 2, 2025 • 227

SimpleQA Verified: A Reliable Factuality Benchmark to Measure Parametric Knowledge

Paper • 2509.07968 • Published Sep 9, 2025 • 14

SFR-DeepResearch: Towards Effective Reinforcement Learning for Autonomously Reasoning Single Agents

Paper • 2509.06283 • Published Sep 8, 2025 • 17

Paper2Agent: Reimagining Research Papers As Interactive and Reliable AI Agents

Paper • 2509.06917 • Published Sep 8, 2025 • 41

Language Self-Play For Data-Free Training

Paper • 2509.07414 • Published Sep 9, 2025 • 29

Reverse-Engineered Reasoning for Open-Ended Generation

Paper • 2509.06160 • Published Sep 7, 2025 • 149