Is Chain-of-Thought Reasoning of LLMs a Mirage? A Data Distribution Lens — arXiv:2508.01191, published Aug 2
A Survey of Context Engineering for Large Language Models — arXiv:2507.13334, published Jul 17
Benchmarking Multimodal Mathematical Reasoning with Explicit Visual Dependency — arXiv:2504.18589, published Apr 24
Time Blindness: Why Video-Language Models Can't See What Humans Can? — arXiv:2505.24867, published May 30
Wait, We Don't Need to "Wait"! Removing Thinking Tokens Improves Reasoning Efficiency — arXiv:2506.08343, published Jun 10
MiniMax-M1: Scaling Test-Time Compute Efficiently with Lightning Attention — arXiv:2506.13585, published Jun 16
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models — arXiv:2506.06395, published Jun 5
Will It Still Be True Tomorrow? Multilingual Evergreen Question Classification to Improve Trustworthy QA — arXiv:2505.21115, published May 27
ProRL: Prolonged Reinforcement Learning Expands Reasoning Boundaries in Large Language Models — arXiv:2505.24864, published May 30
Beyond the 80/20 Rule: High-Entropy Minority Tokens Drive Effective Reinforcement Learning for LLM Reasoning — arXiv:2506.01939, published Jun 2
Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning — arXiv:2505.24726, published May 30
Mutarjim: Advancing Bidirectional Arabic-English Translation with a Small Language Model — arXiv:2505.17894, published May 23
Shifting AI Efficiency From Model-Centric to Data-Centric Compression — arXiv:2505.19147, published May 25
Vision-R1: Incentivizing Reasoning Capability in Multimodal Large Language Models — arXiv:2503.06749, published Mar 9
Perception, Reason, Think, and Plan: A Survey on Large Multimodal Reasoning Models — arXiv:2505.04921, published May 8