Logical Reasoning over Natural Language as Knowledge Representation: A Survey Paper • 2303.12023 • Published Mar 21, 2023 • 2
Recent Advances in Deep Learning Based Dialogue Systems: A Systematic Survey Paper • 2105.04387 • Published May 10, 2021
Long-Context Inference with Retrieval-Augmented Speculative Decoding Paper • 2502.20330 • Published Feb 27
SynthRL: Scaling Visual Reasoning with Verifiable Data Synthesis Paper • 2506.02096 • Published Jun 2 • 52
MCPMark: A Benchmark for Stress-Testing Realistic and Comprehensive MCP Use Paper • 2509.24002 • Published Sep 28 • 171
NoisyRollout: Reinforcing Visual Reasoning with Data Augmentation Paper • 2504.13055 • Published Apr 17 • 19
ResearchBench: Benchmarking LLMs in Scientific Discovery via Inspiration-Based Task Decomposition Paper • 2503.21248 • Published Mar 27 • 21
MixEval-X: Any-to-Any Evaluations from Real-World Data Mixtures Paper • 2410.13754 • Published Oct 17, 2024 • 75
OpenMoE: An Early Effort on Open Mixture-of-Experts Language Models Paper • 2402.01739 • Published Jan 29, 2024 • 28
MixEval: Deriving Wisdom of the Crowd from LLM Benchmark Mixtures Paper • 2406.06565 • Published Jun 3, 2024 • 10