
Can I suggest something? Integrate this with some TTS and LLM so I can say “send my tanks north”, that rough command should translate. I’m not a game dev, so it’ll take me forever to even figure out where in the code to do this interception, but it’s a dream of mine.


I'll assume you meant speech recognition and not TTS. I'm also assuming you intended for the LLM to act as the intent classifier which isn't totally unreasonable although it may be a bit overkill.

Given the precision you'd need to set the target unit(s)/coordinates/waypoints - I just can't see this being very useful.

https://www.youtube.com/watch?v=yt8XYV-IKxY

A fun use of voice recognition I've seen before (Skyrim had it for shouts) is using your voice to cast spells though.

EDIT: Also forgot to mention, at one point I had a utility that let you map voice commands to keyboard shortcuts, which I used in Microsoft Flight Simulator for commands like setting trim, adjusting fuel-air mixture, deploying flaps, etc.


Yes, my mistake. We’re talking about a feature for a game, so what counts as useful? I want to walk around my room like a general and shout orders.


Sure I get you - I think it'd be a fun gimmick. But the reality is you'd be a general where all of your subordinates are half-deaf and constantly misinterpreting your orders.

Good premise for an Abbott and Costello skit though.


You might want to try Odama on GameCube. Sort of an RTS with voice commands and pinball.

https://en.m.wikipedia.org/wiki/Odama


Pointing at where you want to go/who you want to attack/which thing you want to build is much less clunky in practice...

I remember a game like you describe: https://store.steampowered.com/app/319740/There_Came_an_Echo... (from 2015, so speech recognition wasn't that good)


Shouldn't be too hard to implement. The only hurdle would be that the user needs to host or pay for the speech recognition and LLM usage, so you'd need to ask users for an API key or add some payment solution.


Would you really need a complicated LLM just to bark some instructions at a game? There is a finite number of ways to express instructions to the game, and you don't want to get into a discussion about what 'my tanks' means in the middle of the action. Siri could do this in 2011, and later versions work fine on device without connectivity.
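To make the "finite number of ways" point concrete, here is a minimal sketch of a fixed grammar that turns an already-transcribed phrase into a structured order, with no LLM involved. Everything in it is an assumption for illustration: the unit names, the directions, the accepted verbs, and the `parse_order` helper are hypothetical and not taken from any particular game.

```python
import re

# Hypothetical vocabulary; a real game would generate these from its own unit list.
UNITS = ("tanks", "infantry", "scouts")
DIRECTIONS = ("north", "south", "east", "west")

# One fixed sentence shape: <send|move> [my|the] <unit> [to the] <direction>
PATTERN = re.compile(
    r"^(?:send|move)\s+(?:my\s+|the\s+)?(?P<unit>%s)\s+(?:to\s+the\s+)?(?P<direction>%s)$"
    % ("|".join(UNITS), "|".join(DIRECTIONS))
)

def parse_order(text):
    """Return a structured move order, or None if the phrase is outside the grammar."""
    match = PATTERN.match(text.strip().lower())
    if match is None:
        return None
    return {
        "action": "move",
        "unit": match.group("unit"),
        "direction": match.group("direction"),
    }

print(parse_order("Send my tanks north"))
print(parse_order("what does my tanks mean"))
```

Anything the grammar does not recognise comes back as `None`, so the game can ignore it or ask for a repeat instead of guessing.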


I would love to see the code of an on-device implementation that could accomplish a similar task. Pre-LLM, I have never seen an app that can take a variety of voice commands and execute them with a good user experience. Are there any good open source implementations that can run without depending on proprietary calls to the device?

In fact, I still haven't seen speech-to-command implemented well in an end-user app, regardless of technology. Maybe I have looked at the wrong apps. I haven't used Siri to any large extent. Google Home seems very gimmicky.

I recently tried doing a small bit of STT and I was underwhelmed by the quality I got on device. I used Whisper. It was bad enough that I concluded that I'd probably need to call a hosted service instead.

And even if the STT works really well, for most use cases you'll need to have an LLM reinterpret what you're saying into commands.


IMO voice control is a bad fit for things that can be expressed by a few mouse clicks.

If you just want to interpret a few commands you don't need an LLM, just hook up the speech recognizer to the possible commands and it will limit itself to the possible words from the context. That's what Siri does and it works fine for stuff like turning the TV off (provided every action is available to the voice assistant code). My not-connected 2015 car also uses that for hands-free navigation, and while it's a bit slow it works quite reliably. I really think you overestimate the amount of stuff you really need an AI service for.
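A rough sketch of that closed-vocabulary idea, applied after transcription rather than inside the recognizer: snap whatever text comes back to the nearest phrase in a fixed command list, and reject anything that is not close enough. The command table and the `match_command` helper are hypothetical, and fuzzy string matching with Python's standard `difflib` is a stand-in chosen for illustration, not how Siri or a car's navigation system actually does it.

```python
import difflib

# Hypothetical command table: known phrase -> (action, unit, target).
COMMANDS = {
    "send tanks north": ("move", "tanks", "north"),
    "send tanks south": ("move", "tanks", "south"),
    "build barracks": ("build", "barracks", None),
    "attack enemy base": ("attack", "all", "enemy_base"),
}

def match_command(transcript, cutoff=0.6):
    """Return the action for the closest known phrase, or None if nothing is close."""
    # Lowercase and collapse whitespace so casing and spacing do not affect the score.
    normalized = " ".join(transcript.lower().split())
    best = difflib.get_close_matches(normalized, list(COMMANDS), n=1, cutoff=cutoff)
    return COMMANDS[best[0]] if best else None

print(match_command("send my tanks north"))
print(match_command("xyzzy"))
```

The `cutoff` value is a tuning knob: raising it means fewer misheard orders get executed, at the cost of more commands being rejected.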


Mechanics are an integral part of real-time strategy games, though.



