# Wav2Vec2-BERT Model for Oromo ASR
This model is a fine-tuned version of facebook/w2v-bert-2.0 for automatic speech recognition (ASR) in Oromo.
## Evaluation

The model was evaluated on a held-out speech recognition test set.

- WER (Word Error Rate): 0.3745
- CER (Character Error Rate): 0.0687
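WER is the word-level edit distance (substitutions + insertions + deletions) divided by the number of reference words; CER is the same computation over characters. The exact scoring tool used for the numbers above is not stated, so as a minimal, self-contained sketch of the metric itself:

```python
def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate: word-level Levenshtein distance / reference length."""
    r, h = reference.split(), hypothesis.split()
    # One-row dynamic-programming edit distance over words
    d = list(range(len(h) + 1))
    for i in range(1, len(r) + 1):
        prev = d[0]
        d[0] = i
        for j in range(1, len(h) + 1):
            cur = d[j]
            d[j] = min(d[j] + 1,              # deletion
                       d[j - 1] + 1,          # insertion
                       prev + (r[i - 1] != h[j - 1]))  # substitution / match
            prev = cur
    return d[-1] / len(r)

# One substitution out of three reference words -> WER = 1/3
print(wer("a b c", "a x c"))
```

CER follows the same recipe with `list(reference)` / `list(hypothesis)` in place of `.split()`.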
## Usage

```python
import torch
import librosa
from transformers import AutoModelForCTC, AutoProcessor

# Load model and processor; the Auto classes resolve to the
# Wav2Vec2-BERT architecture of the underlying checkpoint
model = AutoModelForCTC.from_pretrained("misterkissi/w2v-bert-2.0-oromo-colab-CV1.0")
processor = AutoProcessor.from_pretrained("misterkissi/w2v-bert-2.0-oromo-colab-CV1.0")

# Load audio and resample to the 16 kHz rate the model expects
audio, rate = librosa.load("audio.wav", sr=16000)
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")

# Perform inference without tracking gradients
with torch.no_grad():
    logits = model(**inputs).logits
pred_ids = torch.argmax(logits, dim=-1)

# Decode the predicted token ids into text
transcription = processor.batch_decode(pred_ids)[0]
print(transcription)
```
## Model Details

- Base model: facebook/w2v-bert-2.0
- Fine-tuned on Oromo language data
- WER: 0.3745
- CER: 0.0687
```yaml
model-index:
- name: w2v-bert-2.0-oromo-colab-CV1.0
  results:
  - task:
      type: automatic-speech-recognition
    metrics:
    - name: WER
      value: 0.3745
```
## Model tree for DarliAI/kissi-w2v-bert-2.0-oromo

Base model: facebook/w2v-bert-2.0