Hacker News

> There are like 10 models that are smaller and faster and outperform both of them.

As someone who is currently relying on Whisper for some things, what models are those exactly? I still haven't found anything that is as accurate as Whisper (large). Are those models just faster, or are they also as accurate or more accurate?



Nvidia parakeet and canary are better and faster. Here is a leaderboard: https://huggingface.co/spaces/hf-audio/open_asr_leaderboard


> Nvidia parakeet and canary are better and faster

Is that based on your own experience using those and also Whisper, comparing them side-by-side? Or is that based just on those benchmark results?


Yes for parakeet, but for canary I'm only comparing benchmark results. Whisper also hallucinates severely on silence and noise. WhisperX helps a lot: it adds voice activity detection, i.e. a model that detects when someone is speaking, to filter the input before running Whisper. https://github.com/m-bain/whisperX
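To make the "filter the input first" idea concrete, here's a toy sketch. It is not what WhisperX does internally (that uses a trained VAD model); this swaps in a plain energy threshold just to show the shape of the pipeline. The names `speech_segments`, `frame_len` and `threshold` are made up for illustration, and it assumes mono float samples in [-1, 1].

```python
import math

def frame_rms(samples, frame_len):
    """Yield (start_index, rms) for consecutive non-overlapping frames."""
    for start in range(0, len(samples), frame_len):
        frame = samples[start:start + frame_len]
        yield start, math.sqrt(sum(s * s for s in frame) / len(frame))

def speech_segments(samples, frame_len=400, threshold=0.02):
    """Return (start, end) sample ranges whose frame RMS reaches the threshold.

    Assumption: a crude energy gate standing in for a real VAD model.
    """
    segments = []
    current_start = None
    for start, rms in frame_rms(samples, frame_len):
        if rms >= threshold:
            if current_start is None:
                current_start = start  # a loud region begins here
        elif current_start is not None:
            segments.append((current_start, start))  # loud region just ended
            current_start = None
    if current_start is not None:
        segments.append((current_start, len(samples)))
    return segments
```

You would then pass only `samples[start:end]` for each returned range to the transcriber, so it never sees the pure-silence stretches where the hallucinations tend to show up. An energy gate won't reject loud noise, which is why a trained VAD model is the better choice in practice.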


Parakeet isn’t more accurate than Whisper large.



