Hacker News

I've been extremely happy with their generated voice!

I had GPT read something to me in the car. My wife was nonplussed with my "audiobook" until I told her that you can talk to it like Siri. Then she called it "scary" haha




The voices sound great! But the latency is too high, and it's clunky to use with voice alone: it only listens to you at specific times, and you can't interrupt it with your voice. I wanted something that felt more like having a casual conversation with a real person, instead of "Siri but smarter". And I know it's possible, because I built something closer to what I want.


I’m really curious how much of the speech-to-text is happening on-device; I’m fairly sure the answer is “none”. Moving it on-device would give a fairly immediate performance boost. Right now it’s cool as hell, just way too slow to be truly useful.


Yeah, I expect the speech recognition is happening in the cloud. But that doesn't mean it necessarily has to be slow: you can stream audio with very low latency on most connections. The problem is what they do with the audio once it reaches the server. I suspect it would be far too expensive to dedicate a GPU to each customer, so they need to run everything in batch mode, which increases latency.
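A toy model of why batching adds latency (purely illustrative; the batch window, arrival pattern, and function names here are assumptions, not anything from OpenAI's actual pipeline): if the server only dispatches queued requests at fixed batch boundaries, each request waits for the next boundary before any compute starts, adding up to a full window of queueing delay on top of inference time.

```python
# Toy model of queueing delay from batched inference (illustrative only).
# A request arriving at time t waits until the next batch boundary,
# i.e. batch_window - (t % batch_window) seconds, before processing begins.

def added_queue_latency(arrival_offsets, batch_window):
    """Per-request wait until the next batch dispatch, in seconds."""
    return [batch_window - (t % batch_window) for t in arrival_offsets]

# Ten requests arriving at 10 ms intervals into a 100 ms batch window:
arrivals = [i * 0.01 for i in range(10)]
waits = added_queue_latency(arrivals, 0.100)
avg_wait = sum(waits) / len(waits)
print(f"average added latency: {avg_wait * 1000:.0f} ms")
```

With these made-up numbers the average added wait is about half a batch window, before the model has even seen the audio, which is one way a perfectly fast network can still feel laggy end to end.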


Was this just ChatGPT or a custom implementation with access to some data?


Normal ChatGPT. I asked it a question about the Soviet Union, and the response was lengthy.




