
It took some time, but we finally got Kokoro TTS (v1.0) running in-browser w/ WebGPU acceleration! This enables real-time text-to-speech without the need for a server. Looking forward to your feedback!



Now that's what I call "server-less" computing!


Amazing! I'm interested in models that run locally, and Kokoro seems great. Are you aware of similar models for speech-to-text?



The realtime Whisper demo is amazing.

How can I understand what's in the compiled JS though? Is there some source for that?


whisper


This is brilliant. All we need now is for someone to code a frontend for it so we can input an article's URL and have this voice read it out loud... The built-in local voices on macOS are not even close to this Kokoro model.


There are a few already, and I assume MacWhisper will add it. That being said, I am also working on a cross-platform UI for this, in Flutter.


My understanding is that MacWhisper is a front-end for Whisper.cpp so... it does Speech-to-text? (transcribing what you dictate)

Here I'm talking about the model shared in this thread, which is text-to-speech (reading out loud content from the web)


Yes, I am saying they might add TTS features alongside their current STT feature set. It seems like many of these sorts of apps are looking to add both to be more full-fledged.



