SFT or RL? An Early Investigation into Training R1-Like Reasoning Large Vision-Language Models Paper • 2504.11468 • Published Apr 10, 2025 • 30
SFT Memorizes, RL Generalizes: A Comparative Study of Foundation Model Post-training Paper • 2501.17161 • Published Jan 28, 2025 • 123
Cosmos-Predict2 Collection World Foundation Model for Future Prediction • 13 items • Updated 10 days ago • 33
MCP-Bench: Benchmarking Tool-Using LLM Agents with Complex Real-World Tasks via MCP Servers Paper • 2508.20453 • Published Aug 28, 2025 • 63
Wan: Open and Advanced Large-Scale Video Generative Models Paper • 2503.20314 • Published Mar 26, 2025 • 56
Physical AI Collection Collection of open, commercial-grade datasets for physical AI developers • 23 items • Updated 10 days ago • 103
AceReason Collection Math and Code reasoning model trained through reinforcement learning (RL) • 7 items • Updated 10 days ago • 20
Reward Models 06-2025 Collection Nemotron reward models. For use in RLHF pipelines and LLM-as-a-Judge • 8 items • Updated 10 days ago • 23
Psychoacoustic Challenges Of Speech Enhancement On VoIP Platforms Paper • 2310.07161 • Published Oct 11, 2023 • 1
Qwen2.5-1M Collection The long-context version of Qwen2.5, supporting 1M-token context lengths • 3 items • Updated 3 days ago • 126
OpenReasoning-Nemotron Collection Collection of models for OpenReasoning-Nemotron which are trained on 5M reasoning traces for Math, Code and Science. • 6 items • Updated 10 days ago • 46
Cosmos Collection ⚠️ This collection is archived. See https://huggingface.co/collections/nvidia/cosmos-predict25 • 31 items • Updated 10 days ago • 299