Collections
Collections including paper arxiv:2506.15677
-
benjamin-paine/steamboat-willie-14b
Text-to-Video • Updated • 98 • 41 -
Embodied Web Agents: Bridging Physical-Digital Realms for Integrated Agent Intelligence
Paper • 2506.15677 • Published • 23 -
Self Forcing Wan 2.1
🎥 320 • Real-time video generation
-
nasa-ibm-ai4science/Surya-1.0
Updated • 135 • 94
-
StdGEN: Semantic-Decomposed 3D Character Generation from Single Images
Paper • 2411.05738 • Published • 15 -
A Pointer Network-based Approach for Joint Extraction and Detection of Multi-Label Multi-Class Intents
Paper • 2410.22476 • Published • 27 -
OS-ATLAS: A Foundation Action Model for Generalist GUI Agents
Paper • 2410.23218 • Published • 49 -
Training-free Regional Prompting for Diffusion Transformers
Paper • 2411.02395 • Published • 25
-
microsoft/bitnet-b1.58-2B-4T
Text Generation • 0.8B • Updated • 5.71k • 1.22k -
M1: Towards Scalable Test-Time Compute with Mamba Reasoning Models
Paper • 2504.10449 • Published • 15 -
nvidia/Llama-3.1-Nemotron-8B-UltraLong-2M-Instruct
Text Generation • 8B • Updated • 111 • 15 -
ReTool: Reinforcement Learning for Strategic Tool Use in LLMs
Paper • 2504.11536 • Published • 63
-
ReLearn: Unlearning via Learning for Large Language Models
Paper • 2502.11190 • Published • 30 -
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention
Paper • 2502.11089 • Published • 165 -
Explorer: Scaling Exploration-driven Web Trajectory Synthesis for Multimodal Web Agents
Paper • 2502.11357 • Published • 11 -
DeepPerception: Advancing R1-like Cognitive Visual Perception in MLLMs for Knowledge-Intensive Visual Grounding
Paper • 2503.12797 • Published • 32