Hacker News | 8ta4's comments

Thank you, joshxyz, you've just given me the perfect excuse to show off my work, all under the pretense of seeking feedback. Check it out!

https://github.com/8ta4/say/blob/7dad55cc3faf65e367cd9cbe18d...

But seriously, I do value your feedback.


this is somewhat appealing.

it seems like it's targeted to technical people.

it raises questions like.. how can non technical people use it? what other api provider options can users use? will it be extendable, can other developers create extensions on top of it? how could it perform on non-mac devices and for users in areas with non gigabit speed? are there business use-cases for this? what specific roles, teams, and industries can benefit from it? how soon can you ship a working product to your potential users? do you feel like going fully open source or open core? do you feel like building it in public on twitter so you can tweet about it and get more feedback along the way?

personally i'm not interested in it (i'm just a very private guy, and i prefer my notes written manually) - but that's me, and i don't speak for everybody else.

if you're scratching your own itch building it that will be cool as hell, i'm sure there are other people (users and developers alike) interested in these problems and solutions, and maybe you just need to reach them on the right platforms e.g. on twitter, slack, discord, or telegram group.. it varies.


You've got some great points there!

How about this for a killer biz idea: 24/7 employee surveillance.

And I'm intrigued by the idea of building in public.


Can Windows, Android, or iOS dictation run continuously 24/7?

Actually, I'm using the built-in macOS speech recognition to answer your que


I don't know if a built-in app does that, but if not, it should be possible to write one and use the OS APIs to do the actual recognition?


"Why didn't I consider it earlier?" was my first thought. Then, "Maybe I talk too much and think too little?"

Your idea is definitely easy on the wallet, but I should've mentioned before that accuracy is crucial.

When it comes to speech recognition accuracy:

- Windows might work well, since Office 365 has good speech recognition features.

- The speech recognition on macOS isn't as accurate, and iOS might have the same problem.

- As for Chrome and Google Docs, their speech recognition quality is lower, and Android might be similar.


Do you think an open-source solution that only uses Deepgram API and does not store any recordings would satisfy your privacy requirements?

How many hours per day or month do you actively use speech recognition?

60,000 minutes per month. I had to double-check my calculations. It seems you've found a 30th hour in your day.

Let me give you some context:

I saw your blog post about Deepgram. They charge $0.0059 per minute for pay-as-you-go.

- If you use it 24/7, it costs:

    - $8.496 per day

    - $254.88 per month

- If you use it 8 hours a day (with voice activity detection), it costs:

    - $2.832 per day

    - $84.96 per month

I know the 24/7 cost is too high for your budget ($60-80 a month). But voice activity detection can save you a lot of money.
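
For anyone who wants to check the arithmetic, here is a rough sketch in Python. It assumes a flat $0.0059 per minute with no free tier or volume discount, and a 30-day month. The function names (`transcription_cost`, `affordable_hours_per_day`) are made up for illustration and are not part of any Deepgram tooling.

```python
# Assumed flat pay-as-you-go rate in USD per minute, as quoted above.
RATE_PER_MINUTE = 0.0059


def transcription_cost(hours_per_day, days=30, rate=RATE_PER_MINUTE):
    """Return (daily_cost, monthly_cost) in USD for a given daily usage."""
    daily = hours_per_day * 60 * rate
    return daily, daily * days


def affordable_hours_per_day(monthly_budget, days=30, rate=RATE_PER_MINUTE):
    """Return how many hours per day fit in a monthly budget (hypothetical helper)."""
    return monthly_budget / (rate * 60 * days)


if __name__ == "__main__":
    for label, hours in (("24/7", 24), ("8 hours a day", 8)):
        daily, monthly = transcription_cost(hours)
        print(f"{label}: ${daily:.3f} per day, ${monthly:.2f} per month")

    for budget in (60, 80):
        hours = affordable_hours_per_day(budget)
        print(f"${budget} a month covers about {hours:.1f} hours per day")
```

The second helper just inverts the same formula, so you can see roughly how many transcribed hours per day a given monthly budget would cover under these assumptions.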

About privacy and trust, open-sourcing the solution might give you some confidence. Deepgram is backed by YC and has many users, which might also make you feel better.


> Out of curiosity, what do you do with the transcriptions after you record them?

I use them as a dictation tool. I speak out what I want to write and then I use a language model to polish it later.

> You mentioned that you usually talk out loud while working, but do you keep working as usual after you save the transcription for future use?

Yes, I continue working as "normal". But you see, there's this slight concern that if I keep talking all day, every day, someone might reserve a spot for me in a mental asylum.

> Do you ever stop your work to do something with the transcription right away?

Sometimes, yes. If I'm writing a specific message, I might pause my work to polish it immediately. But if I'm just voicing my random thoughts, I would like to access them later to write my messages or posts.


Thank you! With the CPU usage, it seems like every season will be summertime.


Do you care about what platform the solution is on? I.e., Linux/Windows desktop application, Chrome extension, mobile app, website in a separate tab, etc.?


Not seeing Mac really caught my eye! You see, Mac is the apple of my eye.

I want to try a standalone Mac app for the trial.

After the trial, I'll use a stationary device that can record all the time. It could be a Raspberry Pi, an old phone, or something cheap. I don't want to miss anything or record other people's talk.

I'll edit on my main Mac. I can't use it for recording because it's not always on and I might take it out.

For syncing, I prefer a cloud service that can transcribe well. The device will send my audio to the cloud and the transcription API will do its magic. The transcript will be ready on my Mac right away.

