Collections
Collections including paper arxiv:2507.14241
-
microsoft/Phi-4-mini-flash-reasoning
Text Generation • 4B • Updated • 3.13k • 252 -
Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models
Paper • 2507.14241 • Published • 17 -
Kosmos-2.5: A Multimodal Literate Model
Paper • 2309.11419 • Published • 55 -
Self-Adapting Language Models
Paper • 2506.10943 • Published • 6
-
lusxvr/nanoVLM-222M
Image-Text-to-Text • 0.2B • Updated • 250 • 98 -
Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
Paper • 2503.09516 • Published • 36 -
AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
Paper • 2505.24863 • Published • 97 -
QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
Paper • 2505.17667 • Published • 88
-
Inverse Reinforcement Learning Meets Large Language Model Post-Training: Basics, Advances, and Opportunities
Paper • 2507.13158 • Published • 23 -
DrafterBench: Benchmarking Large Language Models for Tasks Automation in Civil Engineering
Paper • 2507.11527 • Published • 32 -
Promptomatix: An Automatic Prompt Optimization Framework for Large Language Models
Paper • 2507.14241 • Published • 17 -
The Prompt Report: A Systematic Survey of Prompting Techniques
Paper • 2406.06608 • Published • 68
-
Confidence Is All You Need: Few-Shot RL Fine-Tuning of Language Models
Paper • 2506.06395 • Published • 132 -
Magistral
Paper • 2506.10910 • Published • 65 -
Overclocking LLM Reasoning: Monitoring and Controlling Thinking Path Lengths in LLMs
Paper • 2506.07240 • Published • 7 -
Multiverse: Your Language Models Secretly Decide How to Parallelize and Merge Generation
Paper • 2506.09991 • Published • 55
-
MARS: A Multi-Agent Framework Incorporating Socratic Guidance for Automated Prompt Optimization
Paper • 2503.16874 • Published • 44 -
System Prompt Optimization with Meta-Learning
Paper • 2505.09666 • Published • 71 -
UniRL: Self-Improving Unified Multimodal Models via Supervised and Reinforcement Learning
Paper • 2505.23380 • Published • 22 -
DeepTheorem: Advancing LLM Reasoning for Theorem Proving Through Natural Language and Reinforcement Learning
Paper • 2505.23754 • Published • 15