InstructCoder: Empowering Language Models for Code Editing Paper • 2310.20329 • Published Oct 31, 2023
MMCode: Evaluating Multi-Modal Code Large Language Models with Visually Rich Programming Problems Paper • 2404.09486 • Published Apr 15, 2024
Towards Better Text-to-Image Generation Alignment via Attention Modulation Paper • 2404.13899 • Published Apr 22, 2024
MCTS-Judge: Test-Time Scaling in LLM-as-a-Judge for Code Correctness Evaluation Paper • 2502.12468 • Published Feb 18, 2025
ScreenSpot-Pro: GUI Grounding for Professional High-Resolution Computer Use Paper • 2504.07981 • Published Apr 4, 2025
Enhancing Visual Grounding for GUI Agents via Self-Evolutionary Reinforcement Learning Paper • 2505.12370 • Published May 18, 2025
AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness Paper • 2507.01702 • Published Jul 2, 2025
Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning Paper • 2505.12212 • Published May 18, 2025
Beyond Policy Optimization: A Data Curation Flywheel for Sparse-Reward Long-Horizon Planning Paper • 2508.03018 • Published Aug 5, 2025
AmbiGraph-Eval: Can LLMs Effectively Handle Ambiguous Graph Queries? Paper • 2508.09631 • Published Aug 13, 2025
BigCodeArena: Unveiling More Reliable Human Preferences in Code Generation via Execution Paper • 2510.08697 • Published Oct 2025