
I use Apple dictation heavily for transcribing interviews. I've tried all the voice-to-text services out there and none have been reliable enough at transcribing an audio file. I've settled on playing audio in my headphones and pausing while I carefully dictate text into a document. If I could upload the audio file, get a first-pass transcription, and then go through and edit / make corrections with voice, that would be awesome.

A difference in error rate from 20-something percent down to less than 5 percent sounds incredible.




Have you tried using Whisper from OpenAI? Aiko [0] has Whisper-v2-large built in and allows transcription of audio files.

[0] https://apps.apple.com/fr/app/aiko/id1672085276


Is there anything like this for watching foreign television (or radio)? I don't want to create a document, I just want real-time translated subtitles, but I can't do it in advance for live shows.


This is amazing. Just tried really mumbling along for a while and it got every word.


Have you tried OpenAI Whisper? Last time I compared, it was quite a bit better than all the other options.


Check out Descript. It was awesome when I used it in the past.


Deepgram has been incredibly accurate for me.



