DAComp: Benchmarking Data Agents across the Full Data Intelligence Lifecycle Paper • 2512.04324 • Published 3 days ago • 126
Reasoning-Table: Exploring Reinforcement Learning for Table Reasoning Paper • 2506.01710 • Published Jun 2, 2025 • 2
S3Eval: A Synthetic, Scalable, Systematic Evaluation Suite for Large Language Models Paper • 2310.15147 • Published Oct 23, 2023 • 2
MoELoRA: Contrastive Learning Guided Mixture of Experts on Parameter-Efficient Fine-Tuning for Large Language Models Paper • 2402.12851 • Published Feb 20, 2024 • 2
Neeko: Leveraging Dynamic LoRA for Efficient Multi-Character Role-Playing Agent Paper • 2402.13717 • Published Feb 21, 2024 • 3
OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments Paper • 2404.07972 • Published Apr 11, 2024 • 50