
Sherlock

eyuansu71
https://scholar.google.com/citations?user=75pkx3YAAAAJ&hl=en

AI & ML interests

None yet

Recent Activity

upvoted a paper 15 days ago
Do Vision-Language Models Measure Up? Benchmarking Visual Measurement Reading with MeasureBench
upvoted a paper about 2 months ago
SWE-Bench Pro: Can AI Agents Solve Long-Horizon Software Engineering Tasks?
upvoted a paper about 2 months ago
FlagEval Findings Report: A Preliminary Evaluation of Large Reasoning Models on Automatically Verifiable Textual and Visual Questions

Organizations

Beijing Academy of Artificial Intelligence, FlagEval, The BIRD Team, LiveSQLBench

commented on a paper 4 months ago

One Token to Fool LLM-as-a-Judge

Paper • 2507.08794 • Published Jul 11 • 31 • 3
commented on 2 papers almost 2 years ago

WARM: On the Benefits of Weight Averaged Reward Models

Paper • 2401.12187 • Published Jan 22, 2024 • 19 • 7
