metadata
title: languagebench
emoji: π
colorFrom: purple
colorTo: pink
sdk: docker
app_port: 8000
license: cc-by-sa-4.0
short_description: AI model evaluations for every language in the world.
datasets:
- openlanguagedata/flores_plus
- google/fleurs
- mozilla-foundation/common_voice_1_0
- CohereForAI/Global-MMLU
models:
- meta-llama/Llama-3.3-70B-Instruct
- mistralai/Mistral-Small-24B-Instruct-2501
- deepseek-ai/DeepSeek-V3
- microsoft/phi-4
- openai/whisper-large-v3
- google/gemma-3-27b-it
tags:
- leaderboard
- submission:manual
- test:public
- judge:auto
- modality:text
- modality:artefacts
- eval:generation
- language:English
- language:German
languagebench π
AI model evaluations for every language in the world
Evaluate
Local Development
uv run --extra dev evals/main.py
Explore
uv run evals/backend.py
cd frontend && npm i && npm start
System Architecture
See notes/system-architecture-diagram.md for the complete system architecture diagram and component descriptions.