How Surround Sound for Headphones Works (2014) (hajo.me)
86 points by ramenbytes on Aug 20, 2023 | 21 comments


FYI the resulting product is https://www.newaudiotechnology.com/products/spatial-sound-ca...

(but I moved on to speech recognition a few years ago)

We improved the algorithm to allow for more customization and real-time repositioning of the virtual speakers by mathematically decomposing the measured impulse data into components. Also, you need higher-order approximations to get great bass, but the usual sources (e.g. my website and Wikipedia) only talk about IRs, which are linear multiplications in frequency space.
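
To illustrate that last point: applying an IR is a convolution in the time domain, which becomes a pointwise multiplication in frequency space. A minimal numpy sketch (the signal and IR here are random stand-ins, not real measurements):

    import numpy as np

    fs = 48000
    rng = np.random.default_rng(0)
    dry = rng.standard_normal(fs)          # 1 s of test signal
    hrir = rng.standard_normal(512)        # stand-in for a measured IR

    n = len(dry) + len(hrir) - 1           # full convolution length
    wet_time = np.convolve(dry, hrir)      # time-domain convolution

    # Same result via a pointwise multiplication in frequency space:
    wet_freq = np.fft.irfft(np.fft.rfft(dry, n) * np.fft.rfft(hrir, n), n)

    assert np.allclose(wet_time, wet_freq)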

But the most expensive part needed to turn it into a proper professional product was knowing people and paying for access to Hollywood monitoring studios. Because you want cinema sound, not a realistic approximation of your living room ;)


How did your algorithm differ from what was offered by Aureal?


I'm not sure there is much overlap and I hadn't heard of them before.

Aureal A3D appears to be a method for determining acoustic reflection, filtering and delay parameters based on in-game geometry. That means they necessarily enforce a mathematical model of how the sound is to be modified, or else there wouldn't be fixed parameters to determine. They then apply that to in-game sound sources. So kind of like when you tell Unreal Engine to add reverb to a large room before streaming the data to real physical speakers.

This is about accurately simulating a 3D speaker array, so the source audio channels already contain potentially multiple objects with all the reflections, filtering and delays baked in. So kind of like converting real speakers to virtual ones.
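
If it helps to make the "virtual speakers" idea concrete, here is a rough numpy sketch: each source channel gets convolved with a measured left/right IR pair for its speaker position, and the results are summed into two ears. All names and data shapes here are illustrative, not our actual implementation:

    import numpy as np

    def render_binaural(channels, hrir_l, hrir_r):
        # channels: (n_speakers, n_samples) source audio
        # hrir_l / hrir_r: (n_speakers, ir_len) per-speaker ear IRs
        n = channels.shape[1] + hrir_l.shape[1] - 1
        left, right = np.zeros(n), np.zeros(n)
        for ch, hl, hr in zip(channels, hrir_l, hrir_r):
            left += np.convolve(ch, hl)
            right += np.convolve(ch, hr)
        return np.stack([left, right])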


I still miss my Aureal soundcard. I remember being able to pinpoint where people were in Counter-Strike by sound alone. I have a SteelSeries set now and it definitely isn't pinpoint.


Have you checked your audio settings in CS? I can't remember the option names, but the newer audio engine handles spatial positioning really well on its own, and getting reprocessed by something else sometimes makes it worse.


Interesting exercise, but for those looking for something closer to a finished product: look no further than openal-soft. Its HRTF support is what makes sound effects in Minecraft (Java Edition, on Windows and Linux -- headphone detection was very recently added for macOS) truly sound like they come from where they are. Plus there's no intermediate mixing through any surround system, so you can even tell up from down. (HRTF in alsoft is not quite complete: it lacks support for near-field effects.)

Oh, (2014). That was a different time -- alsoft had only just added HRTF, and the technique was less well known back then.


(The lack of near-field support is related to the lack of physical distance units in OpenAL, according to the author in the two issues tracking this feature. Games have always used whatever units they wanted; if OpenAL Soft, a plug-and-play replacement library, added near-field effects, you might get Zombie ASMR breathing down your neck in L4D2 or something similarly unexpected. The same reason was given for not doing head tracking.)


Do you know in which Minecraft Java version this was introduced?


Unfortunately no, but you should be able to find the bundled openal-soft version by unpacking lwjgl-native*.jar.
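
If you want to check programmatically: a jar is just a zip, so something along these lines should surface the version string. The file name and the "ALSOFT" marker are assumptions about how the natives jar and the openal-soft binaries of that era are laid out, so adjust as needed:

    import re
    import zipfile

    # Illustrative jar name; use the natives jar you actually have.
    with zipfile.ZipFile("lwjgl-platform-2.9.4-nightly-20150209-natives-linux.jar") as jar:
        for name in jar.namelist():
            if "openal" in name.lower():
                blob = jar.read(name)
                # openal-soft binaries embed a version string like "ALSOFT 1.15.1"
                for m in re.finditer(rb"ALSOFT [0-9.]+", blob):
                    print(name, m.group().decode())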


Good tip, thanks.

Prism Launcher's MC 1.12.2 lwjgl-platform-2.9.4-nightly-20150209 has OpenAL DLLs with metadata dates of 2013/04/07.

Do you think simply repacking the DLLs in this jar and adding an `alsoft.ini` with `hrtf = true` would work? I'll try soon.
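
For reference, the file itself would be tiny -- a plain ini in the game's working directory on Windows (or ~/.alsoftrc on Linux), if I recall the lookup rules correctly; key name per openal-soft's alsoftrc.sample:

    # alsoft.ini
    hrtf = true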

Edit: Looks like MC uses a paulscode OpenAL implementation dated 2010/08/24. Doesn't look trivial to replace.


I've been going down this spatial audio rabbit hole for a few weeks now, and the tooling has left a lot to be desired. Admittedly, my own ignorance about tooling and audio mastering has probably been the largest barrier.

Ultimately I want a 3D environment where I can place a dummy head (the listener), and manipulate sound sources as objects while making them follow programmed trajectories. The best solution I've found so far seems to be Unreal Engine with some plugins, but progress quickly stalls as I'm overwhelmed by all of the technologies.
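
As a toy version of what I'm after, the trajectory part is easy to sketch in plain numpy -- a source circling the listener, rendered with crude interaural time/level differences instead of real HRTFs (so no elevation; purely illustrative):

    import numpy as np

    fs, dur = 48000, 4.0
    t = np.arange(int(fs * dur)) / fs
    src = 0.2 * np.sin(2 * np.pi * 440 * t)      # the "object": a sine

    azimuth = 2 * np.pi * t / dur                # one full circle
    itd = (0.0875 / 343.0) * np.sin(azimuth)     # spherical-head-ish ITD
    pan = 0.5 * (1 + 0.8 * np.sin(azimuth))      # crude level difference

    delay = (itd * fs).astype(int)               # per-sample ear delay
    idx = np.arange(len(t))
    left = src[np.clip(idx - np.maximum(delay, 0), 0, None)] * (1 - pan)
    right = src[np.clip(idx + np.minimum(delay, 0), 0, None)] * pan
    stereo = np.stack([left, right], axis=1)     # write out as a 2-ch WAV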

Honestly, considering how freaking cool this spatial audio can be, I'm amazed that simple creation tools aren't more widely available.

I enjoy listening to 3D ASMR recordings on YouTube but after a while most creators get a bit same-y. It feels like you could generate much more immersive spatial audio experiences, but only a small handful of channels are doing it.


I used to work on this some years ago: https://www.newaudiotechnology.com/products/spatial-audio-de... You can drive the object positions from EMBER or Pro Tools consoles and then export audio plus 3D info as MPEG-H, which you can later render out into a variety of formats, including binaural.

Or combined with a DAW like Cubase/Reaper/Logic, you can record object paths as automation tracks with your mouse using this: https://www.newaudiotechnology.com/products/spatial-audio-de...


The article is from 2014, but even in the days of AC'97 codecs this sort of "3D sound" effect was already available in hardware. Realtek's drivers came with a utility to test this feature:

https://www.youtube.com/watch?v=WZ0aHFwuCRo

Older versions used the more technically precise term "HRTF":

https://infosys.beckhoff.com/content/1033/cx1020_cx1030_hw/I...


Yes, but this article was about a then-new method to use precisely measured filters, as opposed to mathematical approximations of those filters.

You can think of the AC'97 IIR filters as a cut-off Taylor series expansion of the actually measured causal FIR filter.



Solutions have improved over the last couple of years, but in general the issue I've found is that azimuth encoding doesn't work too well.

Maybe that can be done better by interpolating over a lot of dynamically recorded impulse responses.
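
E.g. with IRs measured on a fixed azimuth grid, the simplest scheme is a time-domain crossfade between the two neighbouring measurements. A sketch (real implementations usually align the onset delays first, otherwise you get comb filtering; all data here is assumed):

    import numpy as np

    def hrir_at(az_deg, hrirs, step=15):
        # hrirs: dict {azimuth in degrees: IR array}, measured every `step` deg
        lo = int(az_deg // step) * step % 360
        hi = (lo + step) % 360
        frac = (az_deg % step) / step
        return (1 - frac) * hrirs[lo] + frac * hrirs[hi]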


Tangentially related: you should really try listening to binaural audio recordings on headphones to experience music in a totally new way.


Is this the same as DTS Headphone:X (the one that's in some higher-priced gaming headphones these days) or different?


This one uses roughly 100k-tap FIR filters; my experiments suggest that DTS HP:X uses 32-tap IIR filters. But my impression is that Dolby Atmos works similarly.
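
At that length, direct convolution is impractical in real time; the usual trick is FFT-based partitioned convolution. A sketch with random stand-in data (scipy >= 1.4 for oaconvolve):

    import numpy as np
    from scipy.signal import oaconvolve

    rng = np.random.default_rng(0)
    ir = rng.standard_normal(100_000) / 300.0    # stand-in for a ~2 s IR
    block = rng.standard_normal(48_000)          # 1 s of audio at 48 kHz
    out = oaconvolve(block, ir)                  # overlap-add under the hood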


The other thing I've noticed about your work is that you don't use filters derived from lab-measured human HRTFs, but instead approximate them from first principles. Very interesting approach, although I do worry about missing out on the up/down asymmetry of the auricles -- not an issue for planar 5.1 though.
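
For onlookers, the classic first-principles ingredient here is the Woodworth spherical-head model, which gives the interaural time difference from azimuth alone -- which is exactly why it can't capture the pinna's up/down asymmetry. A sketch (head radius and speed of sound are the usual textbook values):

    import numpy as np

    def itd_woodworth(azimuth_rad, head_radius=0.0875, c=343.0):
        # ITD in seconds for a far-field source, 0 <= azimuth <= pi/2
        return (head_radius / c) * (azimuth_rad + np.sin(azimuth_rad))

    print(itd_woodworth(np.pi / 2) * 1e6)   # ~656 microseconds at 90 degrees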


Why so many taps? At 48 kHz, 100k taps is north of 2 seconds of impulse response.



