Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation Paper • 2510.21583 • Published 12 days ago • 30
UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation Paper • 2510.18701 • Published 15 days ago • 66
Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning Paper • 2508.20751 • Published Aug 28 • 89
Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR Paper • 2504.11101 • Published Apr 15
IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards Paper • 2508.04632 • Published Aug 6 • 2
Intern-S1: A Scientific Multimodal Foundation Model Paper • 2508.15763 • Published Aug 21 • 255
SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines Paper • 2502.14739 • Published Feb 20 • 104
IFDecorator Collection Dataset and Models for "IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards" • 6 items • Updated Aug 7