- A Survey on Personalized Alignment -- The Missing Piece for Large Language Models in Real-World Applications
  Paper • 2503.17003 • Published
- AMOR: A Recipe for Building Adaptable Modular Knowledge Agents Through Process Feedback
  Paper • 2402.01469 • Published • 1
- Depicting Beyond Scores: Advancing Image Quality Assessment through Multi-modal Language Models
  Paper • 2312.08962 • Published
- Persona-judge: Personalized Alignment of Large Language Models via Token-level Self-judgment
  Paper • 2504.12663 • Published
Collections
Collections including paper arxiv:2510.04618
- The Leaderboard Illusion
  Paper • 2504.20879 • Published • 72
- SmolVLM: Redefining small and efficient multimodal models
  Paper • 2504.05299 • Published • 200
- Seedance 1.0: Exploring the Boundaries of Video Generation Models
  Paper • 2506.09113 • Published • 102
- Small Language Models are the Future of Agentic AI
  Paper • 2506.02153 • Published • 21

- RuCCoD: Towards Automated ICD Coding in Russian
  Paper • 2502.21263 • Published • 133
- Unified Reward Model for Multimodal Understanding and Generation
  Paper • 2503.05236 • Published • 123
- Sketch-of-Thought: Efficient LLM Reasoning with Adaptive Cognitive-Inspired Sketching
  Paper • 2503.05179 • Published • 46
- R1-Searcher: Incentivizing the Search Capability in LLMs via Reinforcement Learning
  Paper • 2503.05592 • Published • 27

- Beyond Human Data: Scaling Self-Training for Problem-Solving with Language Models
  Paper • 2312.06585 • Published • 29
- Enable Language Models to Implicitly Learn Self-Improvement From Data
  Paper • 2310.00898 • Published • 23
- ReST meets ReAct: Self-Improvement for Multi-Step Reasoning LLM Agent
  Paper • 2312.10003 • Published • 44
- Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models
  Paper • 2401.01335 • Published • 68

- lusxvr/nanoVLM-222M
  Image-Text-to-Text • 0.2B • Updated • 255 • 98
- Search-R1: Training LLMs to Reason and Leverage Search Engines with Reinforcement Learning
  Paper • 2503.09516 • Published • 36
- AlphaOne: Reasoning Models Thinking Slow and Fast at Test Time
  Paper • 2505.24863 • Published • 97
- QwenLong-L1: Towards Long-Context Large Reasoning Models with Reinforcement Learning
  Paper • 2505.17667 • Published • 88

- nvidia/Llama-3_1-Nemotron-Ultra-253B-v1
  Text Generation • 253B • Updated • 9.08k • 339
- WebThinker: Empowering Large Reasoning Models with Deep Research Capability
  Paper • 2504.21776 • Published • 59
- PrimeIntellect/INTELLECT-2
  33B • Updated • 133 • 204
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
  Paper • 2510.04618 • Published • 120

- Mobile-Agent-E: Self-Evolving Mobile Assistant for Complex Tasks
  Paper • 2501.11733 • Published • 28
- Learn-by-interact: A Data-Centric Framework for Self-Adaptive Agents in Realistic Environments
  Paper • 2501.10893 • Published • 26
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
  Paper • 2510.04618 • Published • 120

- GLaMM: Pixel Grounding Large Multimodal Model
  Paper • 2311.03356 • Published • 37
- Random Field Augmentations for Self-Supervised Representation Learning
  Paper • 2311.03629 • Published • 10
- Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models
  Paper • 2510.04618 • Published • 120