AgentReview: Exploring Peer Review Dynamics with LLM Agents Paper • 2406.12708 • Published Jun 18, 2024 • 8
Your Agent May Misevolve: Emergent Risks in Self-evolving LLM Agents Paper • 2509.26354 • Published Sep 30 • 17
Large Reasoning Models Learn Better Alignment from Flawed Thinking Paper • 2510.00938 • Published Oct 1 • 57
Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation Paper • 2501.17433 • Published Jan 29 • 10