medelharchaoui's Collections
Interesting papers

updated Sep 12

  • PVPO: Pre-Estimated Value-Based Policy Optimization for Agentic Reasoning

    Paper • 2508.21104 • Published Aug 28 • 35

  • FNet: Mixing Tokens with Fourier Transforms

    Paper • 2105.03824 • Published May 9, 2021 • 1

  • SimpleTIR: End-to-End Reinforcement Learning for Multi-Turn Tool-Integrated Reasoning

    Paper • 2509.02479 • Published Sep 2 • 83

  • RL + Transformer = A General-Purpose Problem Solver

    Paper • 2501.14176 • Published Jan 24 • 28

  • A Survey of Reinforcement Learning for Large Reasoning Models

    Paper • 2509.08827 • Published Sep 10 • 188

  • AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning

    Paper • 2509.08755 • Published Sep 10 • 56