FireRedVAD -- GGUF

Available variants

| File | Variant | Size | Params | Notes |
|---|---|---|---|---|
| `firered-vad.gguf` | VAD | 2.4 MB | 588K | Non-streaming, lookback + lookahead |
| `firered-stream-vad.gguf` | Stream-VAD | 2.3 MB | 568K | Streaming (no lookahead) |
| `firered-aed-vad.gguf` | AED | 2.4 MB | 589K | Multi-label: speech/singing/music |

All variants are stored as F32. The models are small enough that quantization is unnecessary.
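As a quick sanity check, the listed file sizes are roughly what 4 bytes per F32 parameter would give. The sketch below uses the rounded parameter counts from the table and ignores GGUF header and metadata overhead, so treat the result as an estimate rather than an exact file size.

```python
# Approximate parameter counts, taken from the variants table (rounded values).
PARAMS = {
    "firered-vad.gguf": 588_000,
    "firered-stream-vad.gguf": 568_000,
    "firered-aed-vad.gguf": 589_000,
}

BYTES_PER_F32 = 4


def f32_size_mb(n_params: int) -> float:
    """Weight payload in MB (10^6 bytes), ignoring GGUF header and metadata."""
    return n_params * BYTES_PER_F32 / 1e6


for name, n in PARAMS.items():
    print(f"{name}: ~{f32_size_mb(n):.1f} MB")
# firered-vad.gguf: ~2.4 MB
# firered-stream-vad.gguf: ~2.3 MB
# firered-aed-vad.gguf: ~2.4 MB
```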

Architecture: DFSMN (Deep Feedforward Sequential Memory Network), 8 blocks with depthwise lookback/lookahead convolutions (k=20)
Parameters: ~588K (2.4 MB) for the non-streaming VAD variant
Languages: 100+ (language-agnostic voice activity detection)
F1 Score: 97.57% on FLEURS-VAD-102 (outperforms Silero-VAD, TEN-VAD, FunASR-VAD, WebRTC-VAD)
License: Apache 2.0
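To illustrate what a depthwise lookback/lookahead memory block does, here is a minimal NumPy sketch of a generic FSMN-style memory layer with a skip connection. It is not the FireRedVAD implementation: the function name `fsmn_memory`, the reading of k=20 as 20 taps in each direction, and the all-ones weights are assumptions made purely for illustration. Passing `w_ahead=None` gives a causal block, which is the idea behind the streaming variant.

```python
import numpy as np


def fsmn_memory(p, w_back, w_ahead=None):
    """Generic depthwise FSMN memory block with a skip connection (illustrative).

    p       : (T, C) input features
    w_back  : (K_back, C) per-channel taps; w_back[i] weights frame t - i
    w_ahead : (K_ahead, C) per-channel taps; w_ahead[j] weights frame t + j + 1,
              or None for a causal (streaming-style) block
    """
    p = np.asarray(p, dtype=np.float64)
    T = p.shape[0]
    out = p.copy()  # skip connection
    # Lookback: add weighted past frames (implicit zero padding at the start).
    for i in range(min(w_back.shape[0], T)):
        out[i:] += w_back[i] * p[:T - i]
    # Lookahead: add weighted future frames (implicit zero padding at the end).
    if w_ahead is not None:
        for j in range(min(w_ahead.shape[0], T - 1)):
            s = j + 1
            out[:T - s] += w_ahead[j] * p[s:]
    return out


# Impulse response: a single active frame at t=25, one channel.
x = np.zeros((50, 1))
x[25] = 1.0
taps = np.ones((20, 1))  # assumed: 20 taps per direction

full = np.flatnonzero(fsmn_memory(x, taps, taps)[:, 0])
causal = np.flatnonzero(fsmn_memory(x, taps)[:, 0])
print(int(full.min()), int(full.max()))      # 5 44
print(int(causal.min()), int(causal.max()))  # 25 44
```

With lookahead, the impulse influences frames before it (the output at frame 5 depends on input 20 frames later), so the block needs future context. Without lookahead, each output depends only on the current and past frames.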

To convert the VAD variant from the FireRedTeam/FireRedVAD checkpoint:

python models/convert-firered-vad-to-gguf.py --input FireRedTeam/FireRedVAD --variant VAD --output firered-vad.gguf
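After conversion, one simple way to confirm the output is a well-formed GGUF file is to read its fixed-size header. The sketch below assumes a little-endian GGUF file of version 2 or later, where the header is the magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. The helper name `read_gguf_header` is hypothetical and not part of the conversion script.

```python
import struct


def read_gguf_header(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF file.

    Assumes little-endian GGUF v2 or later (64-bit counts).
    """
    with open(path, "rb") as f:
        head = f.read(24)
    if len(head) < 24 or head[:4] != b"GGUF":
        raise ValueError(f"{path} does not look like a GGUF file")
    # uint32 version, uint64 tensor count, uint64 metadata KV count.
    version, n_tensors, n_kv = struct.unpack("<IQQ", head[4:24])
    return version, n_tensors, n_kv
```

For example, `read_gguf_header("firered-vad.gguf")` would return the version and the number of tensors and metadata entries written by the converter.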
