(but I moved on to speech recognition a few years ago)
We improved the algorithm to allow for more customization and real-time repositioning of the virtual speakers by mathematically decomposing the measured impulse data into components. Also, you need higher-order approximations to get great bass, but the usual sources (e.g. my website and Wikipedia) only talk about IRs, which are linear multiplications in frequency space.
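For anyone unfamiliar with that last bit, here is the plain single-IR case the usual sources describe, as a minimal numpy sketch (this is just the textbook "multiply in frequency space" step, not the decomposition or higher-order handling mentioned above, and the IR here is made-up noise):

    import numpy as np

    def apply_ir(signal, ir):
        # Zero-pad to the full linear-convolution length so the circular
        # convolution implied by the FFT equals linear convolution.
        n = len(signal) + len(ir) - 1
        return np.fft.irfft(np.fft.rfft(signal, n) * np.fft.rfft(ir, n), n)

    # Toy usage: one second of noise at 48 kHz through a decaying 100k-tap IR.
    fs = 48000
    ir = np.random.randn(100_000) * np.exp(-np.arange(100_000) / 8000.0)
    wet = apply_ir(np.random.randn(fs), ir)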
But the most expensive part needed to turn it into a proper professional product was knowing people and paying for access to Hollywood monitoring studios. Because you want cinema sound, not a realistic approximation of your living room ;)
I'm not sure there is much overlap and I hadn't heard of them before.
Aureal A3D appears to be a method for determining acoustic reflection, filtering and delay parameters based on in-game geometry. That means they necessarily enforce a mathematical model of how the sound is to be modified, or else there wouldn't be fixed parameters to determine. They then apply that to in-game sound sources. So kind of like when you tell Unreal Engine to add reverb to a large room before streaming the data to real physical speakers.
This is about accurately simulating a 3D speaker array, so the source audio channels already contain potentially multiple objects with all the reflections, filtering and delays baked in. So kind of like converting real speakers to virtual ones.
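To make "converting real speakers to virtual ones" concrete, the core rendering step is roughly this (a minimal numpy/scipy sketch, not the article's actual filters: one left/right HRIR pair per virtual speaker position, with all per-speaker convolutions summed into a headphone feed):

    import numpy as np
    from scipy.signal import fftconvolve

    def binauralize(channels, hrirs):
        # channels: dict of speaker name -> mono samples (the 5.1/7.1 feed,
        #           reflections/filtering/delays already baked in)
        # hrirs:    dict of speaker name -> (left_ir, right_ir) measured or
        #           modeled for that speaker position
        n = max(len(x) for x in channels.values()) + \
            max(len(ir) for pair in hrirs.values() for ir in pair) - 1
        out = np.zeros((n, 2))
        for name, x in channels.items():
            for ear, ir in enumerate(hrirs[name]):   # 0 = left, 1 = right
                y = fftconvolve(x, ir)
                out[:len(y), ear] += y
        return out                                   # stereo, for headphones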
I still miss my Aureal soundcard. I remember being able to pinpoint where people were in Counter-Strike by sound. I have a SteelSeries set now and they definitely aren't pinpoint.
Have you checked your audio settings in CS? I can't remember the names, but the newer audio engine handles spatial positioning bits really well on its own, and sometimes getting reprocessed by something else makes it worse.
Interesting exercise, but for those looking for something closer to a finished product: look no further than openal-soft. Its HRTF support is what makes sound effects in Minecraft (Java Edition, on Windows and Linux -- headphone detection was very recently added for macOS) truly sound like they come from where they are. Plus there's no intermediate mixing through any surround system, so you can even tell up from down. (HRTF in alsoft is not quite complete: it lacks support for near-field effects.)
Oh, (2014). That was a different time -- alsoft had only just added HRTF and the technique was less widely known back then.
(Lack of near field is related to the lack of physical distance units in OpenAL, according to the author in the two issues related to this feature. Games have always used whatever they wanted to use; if OpenAL-soft, a plug-and-play replacement library, added near-field, you might get Zombie ASMR breathing down your neck in L4D2 or something similarly unexpected. The same reason was given for not doing headtracking.)
I've been going down this spatial audio rabbit hole for a few weeks now, and the tooling has left a lot to be desired. Admittedly, my own ignorance about tooling and audio mastering has probably been the largest barrier.
Ultimately I want a 3D environment where I can place a dummy head (the listener), and manipulate sound sources as objects while making them follow programmed trajectories. The best solution I've found so far seems to be Unreal Engine with some plugins, but progress quickly stalls as I'm overwhelmed by all of the technologies.
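For what it's worth, the trajectory part is easy to prototype even without proper tooling. Here's a toy sketch of a source orbiting a dummy head, using only crude interaural time/level differences rather than real HRTFs (all the numbers are made up for illustration):

    import numpy as np

    fs = 48000
    t = np.arange(4 * fs) / fs
    src = 0.1 * np.random.randn(len(t))        # placeholder mono source

    azimuth = 2 * np.pi * t / 2.0              # programmed path: one orbit per 2 s

    # Crude cues: up to ~0.66 ms of interaural delay plus a gentle level difference.
    itd = 0.00066 * np.sin(azimuth)            # positive = source on the right
    gain_l = 0.75 - 0.25 * np.sin(azimuth)
    gain_r = 0.75 + 0.25 * np.sin(azimuth)

    left  = gain_l * np.interp(t - np.clip(itd, 0, None),  t, src)
    right = gain_r * np.interp(t - np.clip(-itd, 0, None), t, src)
    stereo = np.stack([left, right], axis=1)   # play back or write out as stereo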
Honestly, considering how freaking cool this spatial audio can be, I'm amazed that simple creation tools aren't more widely available.
I enjoy listening to 3D ASMR recordings on YouTube but after a while most creators get a bit same-y. It feels like you could generate much more immersive spatial audio experiences, but only a small handful of channels are doing it.
I used to work on this
https://www.newaudiotechnology.com/products/spatial-audio-de...
some years ago. You can drive the object positions using EMBER or ProTools consoles and then export audio plus 3D info as MPEG-H, which you can later render out into a variety of formats, including binaural.
Article is from 2014 but even in the days of AC97 codecs this sort of "3D sound" effect was already available in hardware. Realtek's drivers came with a utility to test this feature:
This one is about 100k-tap FIR filters; my experiments suggest that DTS HP:X uses 32-tap IIR filters. But my impression is that Dolby Atmos works similarly.
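Roughly, the practical difference between the two approaches (placeholder coefficients, just to show how each kind of filter is applied, not anyone's actual HRTF data):

    import numpy as np
    from scipy.signal import fftconvolve, butter, sosfilt

    fs = 48000
    x = np.random.randn(fs)                    # one second of test noise

    # FIR route: a very long impulse response, applied by (FFT-based) convolution.
    fir = np.random.randn(100_000) * np.exp(-np.arange(100_000) / 8000.0)
    y_fir = fftconvolve(x, fir)[:len(x)]

    # IIR route: a few recursive biquad sections, far cheaper per sample but
    # with much less freedom in the response it can realize.
    sos = butter(8, [300, 8000], btype="bandpass", fs=fs, output="sos")
    y_iir = sosfilt(sos, x)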
The other thing I've noticed about your work is that you don't use filters derived from lab-measured human HRTFs, but instead approximate them from first principles. Very interesting approach, although I do worry about missing out on the up/down asymmetry of the auricles -- not an issue for planar 5.1 though.