Machine Bullshit: Characterizing the Emergent Disregard for Truth in Large Language Models Paper • 2507.07484 • Published Jul 10 • 17 • 2
The Landscape of Memorization in LLMs: Mechanisms, Measurement, and Mitigation Paper • 2507.05578 • Published Jul 8 • 5 • 1
AgentSynth: Scalable Task Generation for Generalist Computer-Use Agents Paper • 2506.14205 • Published Jun 17 • 7 • 3
Are You Getting What You Pay For? Auditing Model Substitution in LLM APIs Paper • 2504.04715 • Published Apr 7 • 13 • 2