-
Nuclear Norm Regularization for Deep Learning
Paper • 2405.14544 • Published • 1 -
Token embeddings violate the manifold hypothesis
Paper • 2504.01002 • Published • 1 -
Approximate Nullspace Augmented Finetuning for Robust Vision Transformers
Paper • 2403.10476 • Published • 1 -
ElaLoRA: Elastic & Learnable Low-Rank Adaptation for Efficient Model Fine-Tuning
Paper • 2504.00254 • Published • 1
Collections
Collections including paper arxiv:2510.25992
-
Reinforcement Learning for Reasoning in Large Language Models with One Training Example
Paper • 2504.20571 • Published • 98 -
One RL to See Them All: Visual Triple Unified Reinforcement Learning
Paper • 2505.18129 • Published • 60 -
Reinforcement Learning for Reasoning in Small LLMs: What Works and What Doesn't
Paper • 2503.16219 • Published • 52 -
Performance Trade-offs of Optimizing Small Language Models for E-Commerce
Paper • 2510.21970 • Published • 2
-
MADD: Multi-Agent Drug Discovery Orchestra
Paper • 2511.08217 • Published • 54 -
The Station: An Open-World Environment for AI-Driven Discovery
Paper • 2511.06309 • Published • 34 -
An AI system to help scientists write expert-level empirical software
Paper • 2509.06503 • Published • 6 -
The Era of Agentic Organization: Learning to Organize with Language Models
Paper • 2510.26658 • Published • 26
-
The Era of Agentic Organization: Learning to Organize with Language Models
Paper • 2510.26658 • Published • 26 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 43 -
The End of Manual Decoding: Towards Truly End-to-End Language Models
Paper • 2510.26697 • Published • 113
-
Demystifying Reinforcement Learning in Agentic Reasoning
Paper • 2510.11701 • Published • 31 -
LoongRL: Reinforcement Learning for Advanced Reasoning over Long Contexts
Paper • 2510.19363 • Published • 60 -
Supervised Reinforcement Learning: From Expert Trajectories to Step-wise Reasoning
Paper • 2510.25992 • Published • 43 -
Teaching Pretrained Language Models to Think Deeper with Retrofitted Recurrence
Paper • 2511.07384 • Published • 15