# parakeet-ctc-0.6b-with-meta
This is a multilingual Automatic Speech Recognition (ASR) model fine-tuned with NVIDIA NeMo. Unlike standard transcription models, it can tag intents, voice-bio attributes, and emotions directly in the streaming transcript.
## How to Use
You can use this model directly with the NeMo toolkit for inference.
```python
import nemo.collections.asr as nemo_asr

# Load the model from the Hugging Face Hub
asr_model = nemo_asr.models.ASRModel.from_pretrained("WhissleAI/parakeet-ctc-0.6b-with-meta")

# Transcribe an audio file
transcriptions = asr_model.transcribe(["/path/to/your/audio.wav"])
print(transcriptions)
```
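For larger workloads, the same API can transcribe a batch of files and run on GPU. The sketch below is a minimal example under stated assumptions: the file paths are placeholders, the `batch_size` value is illustrative rather than a recommendation from the model authors, and the printed results may include the model's metadata tags alongside the spoken words.

```python
import torch
import nemo.collections.asr as nemo_asr

# Load the model once and move it to GPU if one is available
asr_model = nemo_asr.models.ASRModel.from_pretrained("WhissleAI/parakeet-ctc-0.6b-with-meta")
if torch.cuda.is_available():
    asr_model = asr_model.cuda()

# Placeholder paths -- replace with your own (typically 16 kHz mono) WAV files
audio_files = ["/path/to/audio1.wav", "/path/to/audio2.wav"]

# batch_size is illustrative; tune it to your GPU memory
transcriptions = asr_model.transcribe(audio_files, batch_size=4)

for path, result in zip(audio_files, transcriptions):
    # Each result holds the transcript, which may carry the model's
    # intent / voice-bio / emotion tags alongside the recognized text
    print(path, result)
```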
This model can also be used with the inference server provided in the PromptingNemo repository.
For fine-tuning and inference scripts, see https://github.com/WhissleAI/PromptingNemo/scripts/asr/meta-asr.
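If you deploy the inference server, a client would typically send audio over HTTP. The sketch below is hypothetical: the `/transcribe` route, port, multipart field name, and JSON response shape are placeholder assumptions, not the server's documented API; check the PromptingNemo repository for the actual interface.

```python
import requests

# Hypothetical endpoint -- the real route, port, and payload format are
# defined by the PromptingNemo inference server, not by this sketch
SERVER_URL = "http://localhost:8000/transcribe"

with open("/path/to/your/audio.wav", "rb") as f:
    response = requests.post(SERVER_URL, files={"audio": f})

response.raise_for_status()
# Assumes the server returns JSON with a "text" field holding the tagged
# transcript; adjust to the server's actual response schema
print(response.json().get("text"))
```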
## Base Model

This model is fine-tuned from nvidia/parakeet-ctc-0.6b.