When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA Paper • 2510.04849 • Published Oct 6 • 111
Kaleidoscope: In-language Exams for Massively Multilingual Vision Evaluation Paper • 2504.07072 • Published Apr 9 • 9
Accelerating Nash Learning from Human Feedback via Mirror Prox Paper • 2505.19731 • Published May 26 • 6