Yea, I use a cheapie voice recorder that only saves .wav files for ~10 memos per...

neverokay · on June 2, 2024

I’d add that I had better luck with accuracy using smaller chunks (about 20 seconds per .wav file). Whisper seems to go berserk if you pump in lengthy audio (30+ seconds).

I’d be tempted to at least try breaking the notes down into one-line images (about a sentence each) and give it a go with Gemini. I haven’t tested their OCR, but even if it has errors, I bet you could just ask Gemini again to fix up the sentence.
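For what it's worth, here is a rough sketch of how the "one line per image" split could be found. It is untested on real notes and rests on some loud assumptions: the page is already loaded as a grid of grayscale values (0 = black, 255 = white), the lines of writing are separated by fully blank rows, and `find_text_lines` and `ink_threshold` are just names made up for this example. Skewed scans or handwriting with overlapping ascenders and descenders would need something smarter.

```python
def find_text_lines(rows, ink_threshold=128, min_height=1):
    """Return (top, bottom) row ranges, bottom exclusive, for bands that contain ink.

    rows is a list of rows, each a list of grayscale values (0..255).
    A pixel darker than ink_threshold counts as ink (hypothetical default).
    """
    bands = []
    start = None  # top row of the band currently being scanned, if any
    for y, row in enumerate(rows):
        has_ink = any(px < ink_threshold for px in row)
        if has_ink and start is None:
            start = y  # a new band begins at this row
        elif not has_ink and start is not None:
            if y - start >= min_height:
                bands.append((start, y))  # blank row closes the band
            start = None
    # Close a band that runs to the bottom edge of the page.
    if start is not None and len(rows) - start >= min_height:
        bands.append((start, len(rows)))
    return bands
```

Each returned range could then be cropped out with whatever image library you already use and sent off as its own image.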

IanCal · on June 6, 2024

Whisper works on 30s chunks iirc. If your input is longer, you need to use something that automatically splits it up.
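If you'd rather do the splitting yourself, a minimal sketch using only Python's standard `wave` module might look like the following. The function name `split_wav`, the output naming scheme, and the 30-second default are my own choices for illustration (pass 20 to match the chunk size mentioned upthread), and it assumes plain uncompressed PCM .wav input. It also cuts at fixed offsets, so a word can get chopped at a boundary; splitting on silence would be nicer but is more work.

```python
import wave


def split_wav(src_path, out_prefix, chunk_seconds=30.0):
    """Split a PCM WAV file into consecutive chunks of at most chunk_seconds.

    Writes out_prefix_000.wav, out_prefix_001.wav, ... and returns the paths.
    """
    out_paths = []
    with wave.open(src_path, "rb") as src:
        channels = src.getnchannels()
        sample_width = src.getsampwidth()
        frame_rate = src.getframerate()
        frames_per_chunk = int(frame_rate * chunk_seconds)
        if frames_per_chunk <= 0:
            raise ValueError("chunk_seconds is too small for this sample rate")

        index = 0
        while True:
            frames = src.readframes(frames_per_chunk)
            if not frames:
                break  # reached the end of the input
            out_path = f"{out_prefix}_{index:03d}.wav"
            with wave.open(out_path, "wb") as dst:
                # Copy the audio format so each chunk plays back unchanged.
                dst.setnchannels(channels)
                dst.setsampwidth(sample_width)
                dst.setframerate(frame_rate)
                dst.writeframes(frames)
            out_paths.append(out_path)
            index += 1
    return out_paths
```

Something like `split_wav("memo.wav", "memo_part", chunk_seconds=20)` would then give you a list of files to feed to the transcriber one at a time (the file names here are placeholders).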