MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28 • 171
The Devil behind the mask: An emergent safety vulnerability of Diffusion LLMs Paper • 2507.11097 • Published Jul 15 • 64
PC-Agent: A Hierarchical Multi-Agent Collaboration Framework for Complex Task Automation on PC Paper • 2502.14282 • Published Feb 20 • 29
Logic-RL: Unleashing LLM Reasoning with Rule-Based Reinforcement Learning Paper • 2502.14768 • Published Feb 20 • 47
DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning Paper • 2501.12948 • Published Jan 22 • 426
OpenScholar: Synthesizing Scientific Literature with Retrieval-augmented LMs Paper • 2411.14199 • Published Nov 21, 2024 • 31
Agent S: An Open Agentic Framework that Uses Computers Like a Human Paper • 2410.08164 • Published Oct 10, 2024 • 26
From MOOC to MAIC: Reshaping Online Teaching and Learning through LLM-driven Agents Paper • 2409.03512 • Published Sep 5, 2024 • 28
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training Paper • 2407.09121 • Published Jul 12, 2024 • 6
LLaMAX: Scaling Linguistic Horizons of LLM by Enhancing Translation Capabilities Beyond 100 Languages Paper • 2407.05975 • Published Jul 8, 2024 • 37