---
language:
- ar
- be
- bg
- bn
- cs
- cy
- da
- de
- el
- en
- es
- et
- fa
- fi
- fr
- gl
- hi
- hu
- it
- ja
- ka
- lt
- lv
- mk
- mr
- nl
- pl
- pt
- ro
- ru
- sk
- sl
- sr
- sv
- sw
- ta
- th
- tr
- uk
- ur
- vi
- zh
library_name: transformers
license: mit
metrics:
- bleu
pipeline_tag: audio-text-to-text
---

A test Ultravox model. More coming soon, I hope.

```python
import transformers
import librosa

# Load the pipeline; trust_remote_code is needed because the model ships custom code.
pipe = transformers.pipeline(model='AtAndDev/UVOX-50k-Llama-3.2-1B-Instruct', trust_remote_code=True, device="cuda")

# Load the audio as mono, resampled to 16 kHz.
path = "voice_input.mp3"
audio, sr = librosa.load(path, sr=16000)

# Earlier conversation turns; empty here, so the audio is the only input.
turns = []
result = pipe({'audio': audio, 'turns': turns, 'sampling_rate': sr}, max_new_tokens=100)
print(result)
```