Yiddish Whisper Training Collection Yiddish-based Whisper post-training - Crowd-Sourced Open Data • 10 items • Updated 12 days ago • 2
Scaling Low-Res MT via Synthetic Data Generation with LLMs Collection Synthetic baselines trained for our paper "Scaling Low-Resource MT via Synthetic Data Generation with LLMs", accepted to the main conference of EMNLP 2025. • 8 items • Updated Sep 16 • 1
Scaling Low-Resource MT via Synthetic Data Generation with LLMs Paper • 2505.14423 • Published May 20 • 1
DictaBERT Collection Collection of state-of-the-art language models for Hebrew, fine-tuned for various tasks, as detailed in the article: https://arxiv.org/abs/2308.16687 • 17 items • Updated Apr 4, 2024 • 5
Arabic-Nougat: Fine-Tuning Vision Transformers for Arabic OCR and Markdown Extraction Paper • 2411.17835 • Published Nov 19, 2024 • 3
ZeroGPU Spaces Collection ZeroGPU Spaces made by the community • 17 items • Updated Jun 6, 2024 • 246