Back in about 2000 I remember exchanging emails with a guy who had built a modified renderer for (I think) the Half-Life 1 engine. He was using audio rendering to enable blind players to play first-person shooters. My recollection is that he used a series of horizontal scan lines across the display region, with different octaves (or similar) indicating which scan line was being "rendered" and different tones indicating the brightness value along the sweep of each scan line. I think he was primarily using mono audio at the time, but it's been long enough that I can't entirely recall.
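His exact mapping may have differed, and I'm reconstructing from memory, but here's a minimal sketch of that family of scheme (pitch encodes the row, a left-to-right sweep encodes the column, and I've mapped brightness to loudness, as in the vOICe mapping mentioned below). All names, frequencies, and parameters are my own assumptions, not his implementation:

```python
# Toy sketch of a scan-line image-to-audio sweep (not the original
# implementation): each image row gets its own pitch (top row = highest
# pitch), the image is swept left to right over one second, and pixel
# brightness controls the loudness of that row's tone at that moment.
import numpy as np

SAMPLE_RATE = 22050            # audio sample rate in Hz (assumed)
SWEEP_SECONDS = 1.0            # time to sweep across the whole image (assumed)
F_LOW, F_HIGH = 200.0, 4000.0  # pitch range mapped onto image rows (assumed)

def image_to_sweep(image: np.ndarray) -> np.ndarray:
    """image: 2D array of brightness values in [0, 1], shape (rows, cols).
    Returns a mono audio buffer of the left-to-right sweep."""
    rows, cols = image.shape
    n_samples = int(SAMPLE_RATE * SWEEP_SECONDS)
    t = np.arange(n_samples) / SAMPLE_RATE

    # One sine oscillator per row; row 0 (top of image) gets the highest pitch.
    freqs = np.geomspace(F_LOW, F_HIGH, rows)[::-1]

    # Which image column is "under the cursor" at each audio sample.
    col_at_sample = np.minimum((np.arange(n_samples) * cols) // n_samples, cols - 1)

    audio = np.zeros(n_samples)
    for r in range(rows):
        amplitude = image[r, col_at_sample]   # brightness along the sweep
        audio += amplitude * np.sin(2 * np.pi * freqs[r] * t)

    return audio / max(rows, 1)               # crude normalization

# Example: a bright diagonal line is heard as a pitch rising over the second.
demo = np.zeros((16, 64))
for c in range(64):
    demo[15 - (c * 16) // 64, c] = 1.0
buffer = image_to_sweep(demo)  # write to a WAV with e.g. scipy.io.wavfile.write
```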
The blog post links to a demo app where you can practice navigating a maze with your eyes closed, using only binaural audio to guide you. Anyone can do it with very little practice!
I would love to spend more time on this in the future, and I'd like to consider adopting the vOICe scheme.
This feels really groundbreaking to me. It's like echolocation, although it also seems to be projecting 3D onto 2D, while true echolocation gives you a 3D picture(?)
There are stories of blind people using echolocation to get around, so why not do a synthetic version of that instead of this made-up encoding scheme (as cool as it is)?
I.e., model how sound would propagate in an environment and create the sound an echolocator would actually hear reflected back.
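To make that concrete, here's a toy sketch of what I mean, under heavy simplifying assumptions (a handful of point reflectors standing in for a real 3D scene, simple 1/d^2 attenuation, and a crude interaural delay instead of a proper HRTF). Every name and constant here is illustrative, not a real acoustic model:

```python
# Minimal sketch of synthetic echolocation: given a set of reflecting points
# (which in a real system would come from a depth camera or a 3D model of the
# scene), synthesize what a short click would sound like after bouncing off
# each one, with rough left/right cues. All parameters are assumptions.
import numpy as np

SAMPLE_RATE = 44100
SPEED_OF_SOUND = 343.0   # m/s
EAR_SPACING = 0.2        # metres between ears (assumed)

def click(duration=0.002):
    """A short broadband click to 'emit' into the scene."""
    n = int(SAMPLE_RATE * duration)
    return np.random.default_rng(0).uniform(-1, 1, n) * np.hanning(n)

def echolocate(points, buffer_seconds=0.1):
    """points: list of (x, y) positions in metres, listener at origin facing +y,
    positive x to the right. Returns stereo audio of shape (n_samples, 2),
    channel 0 = left ear, channel 1 = right ear."""
    emitted = click()
    n = int(SAMPLE_RATE * buffer_seconds)
    out = np.zeros((n, 2))
    for x, y in points:
        dist = max(np.hypot(x, y), 0.1)   # assume reflectors aren't at the listener
        # Round-trip travel time and a simple 1/d^2 attenuation (assumption).
        delay = int(2 * dist / SPEED_OF_SOUND * SAMPLE_RATE)
        gain = 1.0 / max(dist * dist, 0.25)
        # Crude binaural cues: extra delay and a quieter level on the far ear.
        itd = int(abs(x) / dist * EAR_SPACING / SPEED_OF_SOUND * SAMPLE_RATE)
        near, far = (1, 0) if x > 0 else (0, 1)
        for ch, extra_delay, level in ((near, 0, 1.0), (far, itd, 0.7)):
            start = delay + extra_delay
            end = min(start + len(emitted), n)
            if start < n:
                out[start:end, ch] += gain * level * emitted[: end - start]
    return out

# Example: a wall segment about 2 m ahead and slightly to the left.
audio = echolocate([(-0.5, 2.0), (-0.3, 2.0), (-0.1, 2.0)])
```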
What I was getting at is that that would be much more complex, because this uses a camera (2D), while echolocation (and our ears) gives you 3D when all is said and done. So to do what you're suggesting, you'd first need some way to create a 3D model of the world from the camera's 2D image, and then encode that into sound.
As I understand it, echolocation gives you a rough 3D model of everything around you, but no indication of flat textures on things. This project would seem to render a 2D image as sound.