Hacker News | ldenoue's comments

Check out https://ldenoue.github.io/readabletranscripts/ and the website https://www.appblit.com/scribe, which use Gemini to post-correct the raw transcripts.


Unless you fetch directly from your browser. It works by getting the YouTube JSON, which includes the caption tracks. You then take a track's baseUrl to download the XML.
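A rough sketch of those two steps, written in Python for brevity rather than as the browser code itself. It assumes the JSON has already been fetched and that the caption tracks sit under captions.playerCaptionsTracklistRenderer.captionTracks, which is how the player response has commonly looked but is not a documented, stable API. It also assumes the older XML layout with <text start="..."> elements. The function names are made up for illustration.

```python
import html
import xml.etree.ElementTree as ET


def caption_base_url(player_response, language="en"):
    """Return the baseUrl of the first caption track for `language`, or None.

    The nesting below is an assumption about the YouTube player JSON,
    not a guaranteed schema.
    """
    tracks = (
        player_response.get("captions", {})
        .get("playerCaptionsTracklistRenderer", {})
        .get("captionTracks", [])
    )
    for track in tracks:
        if track.get("languageCode") == language:
            return track.get("baseUrl")
    return None


def parse_caption_xml(xml_text):
    """Turn caption XML into a list of (start_seconds, text) tuples.

    Assumes <text start="..."> elements; other caption formats
    would need a different parser.
    """
    root = ET.fromstring(xml_text)
    cues = []
    for node in root.iter("text"):
        start = float(node.get("start", "0"))
        # Caption text is often escaped twice, so unescape once more after parsing.
        text = html.unescape("".join(node.itertext())).strip()
        if text:
            cues.append((start, text))
    return cues
```

You would pass the parsed JSON to caption_base_url, download whatever URL it returns, and hand the response body to parse_caption_xml. The resulting cues are the raw transcript that still needs punctuation and paragraphs.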

I wrote this webapp that uses this method: it calls Gemini in the background to polish the raw transcript and produce a much better version with punctuation and paragraphs.

https://www.appblit.com/scribe

It's open source, so you can read the code to see how to fetch from YouTube's servers in the browser: https://ldenoue.github.io/readabletranscripts/





How does the turn detection work? Is it LLM prompting or a special AI audio-plus-text model?



Here's a video longer than one hour, and it works with Scribe: https://www.appblit.com/scribe?v=FQUo2r-ow-k


Looks great. I made a similar app called Scribe where you can highlight passages of the transcript. It works on the web and also as an iOS app. https://www.appblit.com/scribe

To work around YouTube sometimes blocking the server's IP, the app fetches the transcripts in the browser.


Same method as my open source lib https://github.com/ldenoue/readabletranscripts, but several folks asked for a hosted version, so here you go.

You get 200 free minutes on signup, so you can try it at no cost.

LLM-corrected transcripts are really good, and you can highlight text, which I find super useful for studying and sharing quotes.


This thing is amazing. Can you add recording?

Perhaps some samples you or visitors create?

Then add a little sampler for beats and it's a fantastic tool.

