guox18

AI & ML interests

Alignment

Organizations

None yet

Recent Activity

upvoted a paper 9 days ago

Sample By Step, Optimize By Chunk: Chunk-Level GRPO For Text-to-Image Generation

Paper • 2510.21583 • Published 12 days ago • 30

upvoted a paper 15 days ago

UniGenBench++: A Unified Semantic Evaluation Benchmark for Text-to-Image Generation

Paper • 2510.18701 • Published 15 days ago • 66

upvoted a paper 2 months ago

Pref-GRPO: Pairwise Preference Reward-based GRPO for Stable Text-to-Image Reinforcement Learning

Paper • 2508.20751 • Published Aug 28 • 89

authored 3 papers 3 months ago

Consensus Entropy: Harnessing Multi-VLM Agreement for Self-Verifying and Self-Improving OCR

Paper • 2504.11101 • Published Apr 15

IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards

Paper • 2508.04632 • Published Aug 6 • 2

Intern-S1: A Scientific Multimodal Foundation Model

Paper • 2508.15763 • Published Aug 21 • 255

upvoted a paper 3 months ago

SuperGPQA: Scaling LLM Evaluation across 285 Graduate Disciplines

Paper • 2502.14739 • Published Feb 20 • 104

updated 2 models 3 months ago

guox18/Llama3.1-8B-Instruct-IFDecorator

8B • Updated Aug 10 • 4

guox18/Qwen2.5-7B-Instruct-IFDecorator

Text Generation • 8B • Updated Aug 8 • 8

New activity in guox18/IFDecorator 3 months ago

Add paper, project page, and code links to dataset card

#1 opened 3 months ago by nielsr

upvoted a paper 3 months ago

IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards

Paper • 2508.04632 • Published Aug 6 • 2

updated a collection 3 months ago

IFDecorator

Collection

Dataset and Models for "IFDECORATOR: Wrapping Instruction Following Reinforcement Learning with Verifiable Rewards" • 6 items • Updated Aug 7

published 3 models 3 months ago

updated a dataset 3 months ago

guox18/IFDecorator

Preview • Updated Aug 8 • 236 • 1

updated a model 3 months ago

guox18/Qwen3-8B-IFDecorator

8B • Updated Aug 7 • 7