Single-sample WER evaluation of various Whisper models
ASR benchmark comparing local and cloud models
Comparing STT models against audio
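
For reference, WER (word error rate) is conventionally the word-level edit distance between a reference transcript and a model's output, divided by the number of words in the reference. The sketch below is a minimal illustration of that standard definition, not necessarily the scoring code used in this evaluation. The function name `word_error_rate` and the normalization (lowercasing and whitespace splitting only) are assumptions made for the example.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference word count.

    Assumption: text is normalized only by lowercasing and splitting on
    whitespace; a real evaluation may also strip punctuation or expand numbers.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # deleting all i reference words to match an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # Hypothetical transcripts, for illustration only.
    print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Because insertions count as errors, WER can exceed 1.0 when a model outputs many more words than the reference contains.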