State-of-the-art Danish Models - a danish-foundation-models Collection

updated 14 days ago

These models are state of the art for Danish within their respective domains (noted below each model).

mistralai/Mistral-Small-3.1-24B-Instruct-2503

24B • Updated Jul 28 • 104k • 1.33k

Note Among the best-performing open-weight, instruction-tuned generative models in the ~10-100B parameter range, as determined by EuroEval Danish NLG (2025/11/04).
google/gemma-3-27b-it

Image-Text-to-Text • 27B • Updated Mar 21 • 894k • 1.69k

Note Among the best-performing open-weight, instruction-tuned generative models in the ~10-100B parameter range, as determined by EuroEval Danish NLG (2025/11/04).
google/gemma-3n-E4B-it

Image-Text-to-Text • 8B • Updated Jul 14 • 39.8k • 817

Note Among the best-performing open-weight, instruction-tuned generative models in the ~7-9B parameter range, as determined by EuroEval Danish NLG (2025/11/04).
google/gemma-2-9b-it

Text Generation • 9B • Updated Aug 27, 2024 • 109k • 745

Note Among the best-performing open-weight, instruction-tuned generative models in the ~7-9B parameter range, as determined by EuroEval Danish NLG (2025/11/04).
google/gemma-2-9b

Text Generation • 9B • Updated Aug 7, 2024 • 31.7k • 678

Note Among the best-performing open-weight generative models in the ~7-9B parameter range that have not been instruction-tuned, as determined by EuroEval Danish NLG (2025/11/04).
KennethEnevoldsen/dfm-sentence-encoder-large

Feature Extraction • 0.4B • Updated Nov 27, 2024 • 196 • 2

Note Among the best large encoders for Danish, as determined by EuroEval Danish NLU (2025/11/04).
AI-Sweden-Models/roberta-large-1160k

Fill-Mask • 0.4B • Updated May 22 • 93 • 11

Note Among the best large encoders for Danish, as determined by EuroEval Danish NLU (2025/11/04).
KennethEnevoldsen/dfm-sentence-encoder-medium

Sentence Similarity • Updated Jul 9, 2023 • 1

Note Among the best medium-sized encoders for Danish, as determined by EuroEval Danish NLU (2025/11/04).
ltg/norbert3-small

Fill-Mask • Updated May 27 • 12k • 2

Note Among the best small encoders for Danish, as determined by EuroEval Danish NLU (2025/11/04).
syvai/hviske-v3-conversation

Automatic Speech Recognition • 2B • Updated Aug 22 • 186 • 5

Note Automatic speech recognition (ASR). Based on Whisper 3 and fine-tuned on CoRal. Obtains the lowest word error rate on CoRal conversations (2025/11/04), though it may be slightly overfit.
openai/whisper-large-v3

Automatic Speech Recognition • 2B • Updated Aug 12, 2024 • 4.28M • 5.11k

Note Automatic speech recognition (ASR). The best multilingual ASR model for Danish (2025/11/04).
CoRal-project/roest-wav2vec2-315m-v2

Automatic Speech Recognition • 0.3B • Updated Jun 26 • 1.86k • 4

Note Speech encoder (Wav2Vec 2.0). The encoder that obtains the lowest word error rate on CoRal (2025/11/04). Also exists in a 1B version.
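
The ASR notes above rank models by word error rate (WER). As a rough illustration of what that number measures, here is a minimal sketch that computes WER as the word-level edit distance (substitutions, deletions, insertions) divided by the number of reference words. It assumes plain whitespace tokenization and no text normalization; the function name `word_error_rate` is hypothetical, and the CoRal evaluation itself may normalize text differently before scoring.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the number of reference words.

    Assumption: words are split on whitespace and compared verbatim
    (no lowercasing or punctuation stripping).
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] = edit distance between the reference prefix seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,         # deletion: reference word missing from hypothesis
                curr[j - 1] + 1,     # insertion: extra word in hypothesis
                prev[j - 1] + cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One of four reference words is missing from the transcript.
    print(word_error_rate("jeg bor i københavn", "jeg bor københavn"))
```

Because insertions count as errors, WER can exceed 1.0 when the transcript is much longer than the reference.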
jinaai/jina-embeddings-v3

Feature Extraction • 0.6B • Updated Feb 24 • 4.83M • 1.1k

Note Among the best large embedding models, with flexible embedding sizes and long-document understanding, as determined by the Scandinavian Embedding Benchmark (SEB) (2025/11/04).
intfloat/multilingual-e5-large-instruct

Feature Extraction • 0.6B • Updated Jul 10 • 1.3M • 578

Note Among the best large embedding models that use instructions, as determined by the Scandinavian Embedding Benchmark (SEB) (2025/11/04).
intfloat/multilingual-e5-large

Feature Extraction • 0.6B • Updated Feb 17 • 2.64M • 1.08k

Note Among the best large embedding models that do not require instructions, as determined by the Scandinavian Embedding Benchmark (SEB) (2025/11/04).
intfloat/multilingual-e5-base

Sentence Similarity • 0.3B • Updated Feb 17 • 1.76M • 316

Note Among the best medium-sized embedding models that do not require instructions, as determined by the Scandinavian Embedding Benchmark (SEB) (2025/11/04).
intfloat/multilingual-e5-small

Sentence Similarity • 0.1B • Updated Feb 17 • 1.82M • 245

Note Among the best small embedding models that do not require instructions, as determined by the Scandinavian Embedding Benchmark (SEB) (2025/11/04).
facebook/seamless-m4t-v2-large

Automatic Speech Recognition • 2B • Updated Jan 4, 2024 • 41k • 920

Note Machine translation (and other tasks)
