I like this. I tried something similar ~10 years ago, but it didn't go very well. I'm sure an LLM can do much better than the nonsense I hacked together.
Very cool, just signed up. What advantages does this have over the one built into the ChatGPT app? Also, it would be great if I could see the text output in addition to the voice.
The main differences fundamentally come down to OpenAI treating it more like a party-trick demo rather than core functionality. I think it has a lot of potential if I can just fine-tune a couple of rough edges. (When you chat with someone in person, you don't pull out notebooks and write messages to each other. I see writing as a fallback medium.)
To answer your question more specifically,
Pro Bonamiko:
- Faster average first-response latency (though higher first-audio latency, since OpenAI plays a ding right away). This is the main focus currently: reducing latency as much as I can. I'd like to avoid adding a ding, but we'll see how low I can get it.
- Can be used anywhere with a browser, while OpenAI requires the mobile app to be installed (i.e., desktop support).
- In the future we can support deeper customization since we are focused on the audio medium. As soon as you have to run a function in the ChatGPT app there is a long response latency, which could easily be masked by something as simple as the AI saying "Let me perform a search to get the details" while the call runs (see the sketch after these lists).
Pro ChatGPT:
- Nice animation
- Already has built in tool support such as web search
- Supports automatic language switching between messages; Bonamiko requires manually changing the language.
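
To illustrate the latency-masking idea, here's a minimal sketch. The `speak`, `run_web_search`, and `answer_with_search` names are hypothetical placeholders, not any real API; the point is just to start the slow tool call and the filler phrase concurrently so the user never hears dead air.

```python
import asyncio

async def speak(text: str) -> None:
    # Placeholder: stream `text` through TTS to the user.
    print(f"[voice] {text}")

async def run_web_search(query: str) -> str:
    # Placeholder: the slow tool call (network round trip, ranking, etc.).
    await asyncio.sleep(2.0)
    return f"top results for {query!r}"

async def answer_with_search(query: str) -> None:
    # Kick off the slow tool call, then speak the filler phrase while it runs,
    # instead of waiting silently for the result.
    search_task = asyncio.create_task(run_web_search(query))
    await speak("Let me perform a search to get the details.")
    results = await search_task
    await speak(f"Here is what I found: {results}")

asyncio.run(answer_with_search("desktop voice assistants"))
```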
IP -> Intellectual Property, LUT -> Look-Up Table. IP is often used by Intel to describe a "unit" of proprietary technology, such as a chip or just an adder design. Look-up tables are, in essence, small ROMs, and they are the basis for how FPGAs work.
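
If it helps to see it concretely, here's a toy sketch (not vendor code, just an illustration): a 4-input LUT is a 16-entry truth table stored like a tiny ROM, with the four inputs forming the read address.

```python
def make_lut4(func):
    """Precompute the 16-entry truth table for a 4-input boolean function."""
    return [func((i >> 3) & 1, (i >> 2) & 1, (i >> 1) & 1, i & 1) & 1
            for i in range(16)]

def lut4_read(table, a, b, c, d):
    """'Evaluate' the function by indexing the ROM with the input bits."""
    return table[(a << 3) | (b << 2) | (c << 1) | d]

# Example: configure the LUT as a full-adder carry-out, (a & b) | ((a ^ b) & c).
carry_lut = make_lut4(lambda a, b, c, d: (a & b) | ((a ^ b) & c))
print(lut4_read(carry_lut, 1, 0, 1, 0))  # -> 1
```

Swapping in a different function fills the same table with different bits, which is essentially what reprogramming an FPGA's logic fabric does.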