Parakeet TDT 0.6B V3 pt-BR TAGARELA (ONNX)
This repository provides an ONNX conversion of NVIDIA's multilingual Parakeet TDT 0.6B V3 ASR model for use with onnx-asr.
The conversion is based on Alexandre Costa Ferro Filho's fine-tuned checkpoint, alexandreacff/parakeet-tdt-0.6b-v3-ptBR-plus, and is intended to be loaded directly by onnx-asr.
Overview
The original NVIDIA model is a multilingual speech recognition model covering 25 European languages. This version inherits that foundation, but it was further adapted for Brazilian Portuguese through fine-tuning on the TAGARELA dataset, a large-scale Portuguese speech dataset derived from podcasts.
Because this checkpoint was optimized for Portuguese, especially pt-BR, performance on other languages may differ from the original multilingual model and is not guaranteed.
Intended use
This model is suitable for:
- Automatic speech recognition (ASR)
- Portuguese speech transcription
- Fast ONNX-based inference pipelines
- Deployment with onnx-asr
Installation
Install onnx-asr with Hugging Face Hub support:
pip install "onnx-asr[cpu,hub]"
Download
You can download the model files locally with:
from huggingface_hub import snapshot_download

snapshot_download(
    repo_id="alefiury/parakeet-tdt-0.6b-v3-ptBR-TAGARELA-onnx",
    local_dir="./parakeet-tdt-0.6b-v3-ptBR-TAGARELA-onnx",
)
Usage
Load the model with onnx-asr and transcribe a WAV file:
import onnx_asr

model = onnx_asr.load_model(
    "nemo-conformer-tdt",
    "./parakeet-tdt-0.6b-v3-ptBR-TAGARELA-onnx",
)

print(model.recognize("test.wav", language="pt"))
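For more than one recording, the call above can be wrapped in a small loop. The following is a minimal sketch, not part of onnx-asr: `transcribe_folder` is a hypothetical helper name, and it relies only on the `recognize(path, language="pt")` call shown above, invoked once per file.

```python
from pathlib import Path


def transcribe_folder(model, folder, language="pt"):
    """Transcribe every .wav file in `folder` and return {file name: text}.

    `model` is any object exposing recognize(path, language=...), such as
    the one returned by onnx_asr.load_model in the usage example.
    """
    results = {}
    # Sort so the output order is stable across runs and platforms.
    for wav_path in sorted(Path(folder).glob("*.wav")):
        results[wav_path.name] = model.recognize(str(wav_path), language=language)
    return results
```

With the model loaded as in the usage example, `transcribe_folder(model, "./audio")` would return a dictionary keyed by file name; `./audio` is a placeholder path.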
Dataset-level benchmark results
The datasets below are grouped by speech style.
Prepared speech datasets: CETUC, Common Voice 21.0, MLS (Portuguese), MTEDx (Portuguese)
Spontaneous speech datasets: ALIP, C-ORAL Brasil I, NURC-Recife, SP2010, NURC-SP, MuPe, Private Dataset
Lower values are better for WER.
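WER is conventionally the word-level edit distance (substitutions, deletions, and insertions) divided by the number of reference words. The sketch below illustrates that definition. It is not the evaluation script behind these tables, and the lowercase, whitespace-split normalization is an assumption, since the text normalization used for the benchmarks is not described here.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    # Assumed normalization: lowercase and split on whitespace only.
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)
```

For example, `word_error_rate("o gato subiu no telhado", "o gato subiu telhado")` counts one deleted word out of five reference words, giving 0.2.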
Prepared speech (WER ↓)
| Model | CETUC | Common Voice 21.0 | MLS (Portuguese) | MTEDx (Portuguese) | Prepared Avg WER ↓ |
|---|---|---|---|---|---|
| Parakeet TDT V3 0.6B - ONNX (TAGARELA) | 0.006 | 0.051 | 0.108 | 0.133 | 0.075 |
| Parakeet TDT V3 0.6B | 0.027 | 0.081 | 0.064 | 0.186 | 0.090 |
| Qwen3ASR 1.7B | 0.028 | 0.068 | 0.077 | 0.157 | 0.083 |
| Whisper large-v3 | 0.021 | 0.065 | 0.073 | 0.176 | 0.084 |
| Voxtral-Small 24B | 0.019 | 0.055 | 0.054 | 0.168 | 0.074 |
| Canary V2 1B | 0.036 | 0.120 | 0.078 | 0.178 | 0.103 |
| Distil-Whisper large-v3 PT-BR | 0.030 | 0.094 | 0.092 | 0.168 | 0.096 |
| ElevenLabs Scribe v2 | 0.018 | 0.047 | 0.038 | 0.138 | 0.060 |
Spontaneous speech (WER ↓)
| Model | ALIP | C-ORAL Brasil I | NURC-Recife | SP2010 | NURC-SP | MuPe | Private Dataset | Spontaneous Avg WER ↓ |
|---|---|---|---|---|---|---|---|---|
| Parakeet TDT V3 0.6B - ONNX (TAGARELA) | 0.213 | 0.137 | 0.138 | 0.104 | 0.160 | 0.120 | 0.127 | 0.143 |
| Parakeet TDT V3 0.6B | 0.316 | 0.213 | 0.269 | 0.180 | 0.202 | 0.176 | 0.173 | 0.218 |
| Qwen3ASR 1.7B | 0.316 | 0.222 | 0.255 | 0.191 | 0.202 | 0.187 | 0.147 | 0.217 |
| Whisper large-v3 | 0.345 | 0.220 | 0.290 | 0.236 | 0.218 | 0.177 | 0.150 | 0.234 |
| Voxtral-Small 24B | 0.396 | 0.227 | 0.298 | 0.196 | 0.216 | 0.179 | 0.143 | 0.236 |
| Canary V2 1B | 0.415 | 0.290 | 0.366 | 0.273 | 0.247 | 0.229 | 0.174 | 0.285 |
| Distil-Whisper large-v3 PT-BR | 0.374 | 0.242 | 0.297 | 0.226 | 0.226 | 0.193 | 0.154 | 0.245 |
| ElevenLabs Scribe v2 | 0.329 | 0.170 | 0.307 | 0.155 | 0.207 | 0.205 | 0.121 | 0.213 |
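The averaged columns appear consistent with an unweighted mean of the per-dataset scores in each row, so every dataset counts equally regardless of its size. This is an inference from the published numbers rather than a documented procedure. The sketch below checks it for the TAGARELA rows; because the inputs are already rounded to three decimals, the comparison is only approximate.

```python
from statistics import mean

# Per-dataset WER for "Parakeet TDT V3 0.6B - ONNX (TAGARELA)",
# copied from the two tables above.
prepared = {
    "CETUC": 0.006,
    "Common Voice 21.0": 0.051,
    "MLS (Portuguese)": 0.108,
    "MTEDx (Portuguese)": 0.133,
}
spontaneous = {
    "ALIP": 0.213,
    "C-ORAL Brasil I": 0.137,
    "NURC-Recife": 0.138,
    "SP2010": 0.104,
    "NURC-SP": 0.160,
    "MuPe": 0.120,
    "Private Dataset": 0.127,
}

# Unweighted (macro) average: each dataset contributes equally.
prepared_avg = mean(list(prepared.values()))
spontaneous_avg = mean(list(spontaneous.values()))

# The tables report 0.075 and 0.143 for these two averages.
print(f"prepared: {prepared_avg:.4f}, spontaneous: {spontaneous_avg:.4f}")
```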
Notes
- This repository contains an ONNX-exported version of the model for inference.
- It is designed for compatibility with onnx-asr.
- For the original PyTorch/NeMo model, please refer to:
Training background
This ONNX checkpoint is derived from a Portuguese-adapted version of NVIDIA's multilingual Parakeet TDT 0.6B V3 model. The Portuguese adaptation was performed using the TAGARELA dataset, which was created to support ASR and TTS research in Portuguese.
Limitations
- Best performance is expected on Portuguese speech, particularly Brazilian Portuguese.
Acknowledgments
- Base multilingual model: NVIDIA Parakeet TDT 0.6B V3
- Portuguese-adapted checkpoint: alexandreacff/parakeet-tdt-0.6b-v3-ptBR-plus
- Inference framework: onnx-asr
- Dataset: TAGARELA