<think> So let's replace this phrase with insult... </think> Lessons learned from generation of toxic texts with LLMs Paper • 2509.08358 • Published Sep 10 • 13
AgentGym-RL: Training LLM Agents for Long-Horizon Decision Making through Multi-Turn Reinforcement Learning Paper • 2509.08755 • Published Sep 10 • 56
view article Article PP-OCRv5 on Hugging Face: A Specialized Approach to OCR By baidu and 5 others • Sep 10 • 108