Archived Open LLM Leaderboard (2024-2025)

updated Oct 7

This leaderboard evaluated LLMs from June 2024 onward on IFEval, MuSR, GPQA, MATH, BBH, and MMLU-Pro.


  • Running
    Featured
    125

    Open-LLM performances are plateauing, let’s make the leaderboard steep again


    Explore and compare advanced language models on a new leaderboard

    Note: Blog post on why we made a new version of the Open LLM Leaderboard


  • Running on CPU Upgrade
    13.7k

    Open LLM Leaderboard


    Track, rank and evaluate open LLMs and chatbots

    Note: The actual leaderboard! With a stylish new UX :)


  • open-llm-leaderboard/contents

    Viewer • Updated Mar 20 • 4.58k • 9.77k • 20

    Note: If you want to download the main leaderboard table, you'll find the dataset here!


  • open-llm-leaderboard/results

    Preview • Updated Mar 15 • 48.6k • 15

    Note: To extract more detailed aggregated results for each model, look here!


  • open-llm-leaderboard/requests

    Preview • Updated Mar 17 • 130k • 12

    Note: All models ever submitted to the leaderboard


  • Running on CPU Upgrade
    107

    Open LLM Leaderboard Model Comparator


    Compare Open LLM Leaderboard results
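
The `open-llm-leaderboard/contents` entry above is described as the place to download the main leaderboard table. One way to do that is sketched below with the Hugging Face `datasets` library. The helper name `load_leaderboard_table` is hypothetical, and the `"train"` split is an assumption that the collection page does not confirm; check the dataset viewer for the actual split and column names before relying on it.

```python
# Minimal sketch, not an official recipe.
# Assumes `pip install datasets pandas` and network access to the Hub.

REPO_ID = "open-llm-leaderboard/contents"  # dataset id taken from the collection entry


def load_leaderboard_table(repo_id: str = REPO_ID, split: str = "train"):
    """Download the leaderboard table and return it as a pandas DataFrame.

    The default split name "train" is an assumption; adjust it if the
    dataset exposes a different split.
    """
    # Imported lazily so this module can be imported without `datasets` installed.
    from datasets import load_dataset

    dataset = load_dataset(repo_id, split=split)
    return dataset.to_pandas()
```

Calling `load_leaderboard_table()` downloads the data on first use, after which the returned DataFrame can be filtered or sorted like any other pandas table.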
