C++ audio mixing library design

Jasper_ · on Oct 17, 2022

I love audio programming; it's one of those things that has given me a lot of joy. I've been working on our in-house audio library for our game engine as well, and designing custom effects for it. In our case, we wanted an 80s DSP sound, so it was fun diving into the old literature. In the end, a simple design won out: https://twitter.com/JasperRLZ/status/1567546255454371842

throwaway17_17 · on Oct 17, 2022

Do you have any references for the old literature you looked at while trying to capture that sound? I would love to add some of it to my reading list. Also, the processing and track itself in your video are fantastic.

Jasper_ · on Oct 17, 2022

My code has these links as comments for reading material:

https://www.soundonsound.com/techniques/springs-plates-bucke...

https://www.dsprelated.com/freebooks/pasp/Schroeder_Reverber...

http://www.music.mcgill.ca/~gary/courses/papers/Moorer-Rever...

https://www.dsprelated.com/freebooks/pasp/Freeverb_Main_Loop...

Most reverbs these days are convolutional. I looked at those too! But they require an expensive FFT (usually run in a thread), and the quality sounded "too good" for the audio target I wanted to hit.

After trying a lot of stuff, my main primitive ended up being a delay line hooked up to a low-pass. The output sample is a delay line going through a low-pass. That output is mixed with the incoming sample, and shoved back into the delay. Effectively, this makes a comb filter. I run eight of these in parallel at slightly different delay lengths to comb different frequencies. I was trying to shoot for "digital blur" that had a bit of crunch and aliasing that I wanted, while still sounding like a reverb.
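A minimal sketch of the primitive described above, for readers who want to see it in code. This is not the actual engine code: the class names, the one-pole low-pass, the explicit `feedback` gain (needed to keep the loop stable) and the delay lengths are all assumptions made for illustration.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// One feedback comb filter with a one-pole low-pass inside the loop.
// All names and parameters here are hypothetical.
class LowpassComb {
public:
    LowpassComb(std::size_t delay_samples, float feedback, float damping)
        : buffer_(delay_samples, 0.0f), feedback_(feedback), damping_(damping)
    {
        assert(delay_samples > 0);
    }

    float process(float input)
    {
        // The output is the delayed sample run through the low-pass.
        const float delayed = buffer_[pos_];
        lowpass_ = delayed * (1.0f - damping_) + lowpass_ * damping_;
        const float output = lowpass_;

        // Mix the output with the incoming sample and push it back into the delay.
        buffer_[pos_] = input + output * feedback_;
        pos_ = (pos_ + 1) % buffer_.size();
        return output;
    }

private:
    std::vector<float> buffer_;
    std::size_t pos_ = 0;
    float lowpass_ = 0.0f;
    float feedback_;
    float damping_;
};

// Eight combs in parallel at slightly different delay lengths.
class CombBank {
public:
    CombBank(float feedback, float damping)
    {
        // Illustrative delay lengths in samples, not the values used in the game.
        constexpr std::array<std::size_t, 8> delays = {
            1116, 1188, 1277, 1356, 1422, 1491, 1557, 1617};
        combs_.reserve(delays.size());
        for (std::size_t d : delays) combs_.emplace_back(d, feedback, damping);
    }

    float process(float input)
    {
        float sum = 0.0f;
        for (auto& comb : combs_) sum += comb.process(input);
        return sum / static_cast<float>(combs_.size());
    }

private:
    std::vector<LowpassComb> combs_;
};
```

Feeding an impulse into a single `LowpassComb` with `damping` set to 0 gives an echo every `delay_samples` samples, each one scaled by `feedback`; raising `damping` makes every pass through the loop duller.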

And the track was done by Jake Kaufmann, our composer at Yacht Club Games! I can't take any credit for it.

emsy · on Oct 17, 2022

I'm a huge fan of Shovel Knight's sound so I'm looking forward to Mina the Hollower!

stodor89 · on Oct 17, 2022

> All my engine is in C++, and I’m really sick of SDL_DoThatThing(&options, pointer_to_void, size_in_magical_units, &callback, user_data, legacy_value) (sorry for all C admirers, I do respect you, it’s just that I’m not one of you).

I started as a C++ programmer (back when C++0x and boost were all the rage), but the more time passes, the more of a C admirer I seem to become.

embedded3 · on Oct 17, 2022

Same. I've written probably more C++ than any other language, and I have a lot of respect for the language overall, but I recently had to use only C for something, and I found it rather refreshing. My use case didn't hit a lot of the pain points of C, but it did feel refreshing as there are fewer ways of doing things, fewer options, compared to C++, and my code ended up feeling more straightforward and readable as a result.

pjmlp · on Oct 17, 2022

C23 says hello.

C is becoming into everything C++ has, minus OOP and the additional type safety.

jstimpfle · on Oct 17, 2022

Examples? Because the list of accepted proposals seems conservative. https://en.wikipedia.org/wiki/C2x https://thephd.dev/c23-is-coming-here-is-what-is-on-the-menu

pjmlp · on Oct 17, 2022

You have to look forward to what is on the plate post-C23, like lambdas, improved constexpr, more _Generic support, and so forth.

https://www.open-std.org/jtc1/sc22/wg14/www/wg14_document_lo...

jstimpfle · on Oct 17, 2022

Lambdas and some other complicated looking extensions have been on the plate for quite some time, and it seems like none of them made it into C23. Lambdas in particular have been promoted for the most part by a single person AFAICT. "C is becoming into everything C++ has" is a drastic misrepresentation.

throwaway17_17 · on Oct 17, 2022

I think I take the middle position between you and GP: C is slowly working its way towards unnecessary complexity. Most of the ‘X language is a C replacement’ posts and threads here on HN seem to disregard C’s development (particularly from 2017 onward) through committee standards releases. Everyone keeps arguing about Rust, Zig, Odin, etc and claiming one or the other is clearly more C like and the others are too complicated/expansive/growth-prone. But no one ever stops and claims that the simple C in their heads is not C in 2022, but C in 1990.

C is not yet at the point where added features, idioms, or concepts have totally removed the ability to return to or regularly restrict yourself to some ‘simple’ subset of the language. However, the push to add more and more to the standard does not inspire hope that this situation will remain.

So while I don’t think it’s there yet, it certainly looks like complexity via continual expansion is the path forward. I may not like it, but it does seem to be a reasonable supposition by GP.

jstimpfle · on Oct 17, 2022

I agree, there are some proposals in there that I doubt fit well with the culture. However, what has actually been accepted that makes you think C in the heads is not C in 2022? In my head, the biggest practical change from 1990 C is declare-anywhere (including declare-in-for-loop), which in my estimation has afforded a lot of added ergonomics at practically zero semantic cost. The next most relevant one is standardization of memory models, which is a good one AFAICT (but I used it only a little, so far). One unfortunate addition from 1999, VLAs, has even been demoted in the following revision.

In 2022 I can jump into basically any C codebase and be immediately productive. Which can't be said of some other languages.

throwaway17_17 · on Oct 17, 2022

I don’t think there is any one addition to C that has moved C too far up the complexity hierarchy. It is the cumulative effect of all the proposals submitted, some of which are approved, and the movement to continue to add more. Like I said, C still enables developers to restrict themselves to a simple and long existing subset of features and idioms, making coding in C a productive exercise for experienced developers. But the continued push from all directions, and with varying likelihood of being accepted, to change the language wears on me, to the point I continually want to just make a C99 clone with my preferred improvements and just use that instead.

Also, I suspect I phrased the ‘C in head vs C in 2022’ line backwards from what I thought I was saying. In my mind I was saying the arguments for the simplicity of C had less to do with the language as it exists now and more to do with the closeness of the language to some ISA/actual machine’s operating characteristic in the past. It was that closeness that enabled (or necessitated) C’s simplicity.

It wasn’t about the changes the committee has made to the language since the 1990’s. The tie in point was (in my mind) implied to be that now that C has become even less of a mapping to actual hardware the simplicity sought in a C replacement should look different, but commenters seem to be looking at a very surface level.

jstimpfle · on Oct 17, 2022

Consider that some of the C23 changes have in fact cleared up ambiguities and removed support for ancient hardware. So I find that the renewed pushes behind the standard are not all promises of eventual disintegration. There are a few accomplished and considerate people in WG14. If you visit the page I linked above you'll get a sense of that.

I never particularly cared about ISAs - when I think about the "simplicity" and "closeness to hardware" I mean the control over data layout and control flow. Data layout and architecture of the global data flow is often the key to 90-100% of the performance gains you can expect; and more or less stable binary interfaces mean interoperability and modularity, minimizing ecosystem lock-in. Where you need to use a certain instruction is the rare case, and you are recommended to code it in assembly or using compiler extensions.

pjmlp · on Oct 17, 2022

I did not assert that C23 was it, rather that this is how it looks going forward.

What did not land in C23, will land in C26, C29, ....

hgs3 · on Oct 17, 2022

It is conservative, except for nullptr which duplicates NULL. This violates C's own charter of "provide only one way to do an operation."

ohgodplsno · on Oct 17, 2022

Because NULL is dogshit. It's a #define for 0. That's not one way to do an operation, that's one way to do an operation, horribly badly. That int on your stack? Sure, it can equal NULL. Hope that wasn't the result of (2 - 2).

hgs3 · on Oct 17, 2022

In C the NULL macro can be defined as either (void*)0 or 0. It's only mandated as 0 in C++.

The nullptr concept was introduced into C to fix a type ambiguity when NULL is used with generic selection or varargs functions. The ambiguity could have been solved by mandating that NULL be defined as (void*)0. My issue with nullptr is that it's an overkill solution that unnecessarily duplicates the concept of NULL in the language.

jstimpfle · on Oct 17, 2022

I agree, it should have been (void*)0. I doubt that nullptr_t will see much use (as much as _Generic is a fringe addition), but we'll find out.

loup-vaillant · on Oct 17, 2022

Well, since 0 is guaranteed to compare equal to the null pointer, my current code compares my pointers to it directly:

  if (ptr != 0) { foo(*ptr); }

The type mismatch is ugly, but that saves me an include (this particular code minimises its dependencies to maximise portability).

qalmakka · on Oct 17, 2022

    - A stream is something you can play as a sound. It may be loaded from a file or generated on the fly. It may be finite or infinite. One important thing is that a stream is single-shot: it doesn’t have a restart method or anything like that.
    - A channel is something where a stream is actually played. You can request the channel to stop, or you can replace the stream this channel is playing.

Am I mistaken, or isn't this very close to what GStreamer already does with sinks and sources?
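For reference, the stream/channel split quoted above could be sketched roughly like this. It is a guess at the shape of such an API, not the article's actual code: `Stream`, `ConstantStream`, `Channel` and the "`read` returns 0 when finished" convention are assumptions, and thread safety between the game thread and the audio callback is ignored.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical single-shot stream: read() writes up to `count` samples and
// returns how many it wrote; 0 means the stream is finished. No restart method.
struct Stream {
    virtual ~Stream() = default;
    virtual std::size_t read(float* out, std::size_t count) = 0;
};

// A finite stream that emits a constant value for a fixed number of samples.
class ConstantStream : public Stream {
public:
    ConstantStream(float value, std::size_t length)
        : value_(value), remaining_(length) {}

    std::size_t read(float* out, std::size_t count) override
    {
        const std::size_t n = std::min(count, remaining_);
        std::fill(out, out + n, value_);
        remaining_ -= n;
        return n;
    }

private:
    float value_;
    std::size_t remaining_;
};

// A channel is where a stream is actually played: it can be stopped,
// or its stream can be replaced.
class Channel {
public:
    void play(std::shared_ptr<Stream> stream) { stream_ = std::move(stream); }
    void stop() { stream_.reset(); }
    bool playing() const { return stream_ != nullptr; }

    // Pulls samples from the current stream and pads the rest with silence.
    // The stream is dropped once it reports that it has nothing left.
    void render(float* out, std::size_t count)
    {
        std::size_t written = 0;
        if (stream_) {
            written = stream_->read(out, count);
            if (written == 0) stream_.reset();
        }
        std::fill(out + written, out + count, 0.0f);
    }

private:
    std::shared_ptr<Stream> stream_;
};
```

Replaying a sound in this model means constructing a fresh stream and handing it to `play` again, which is what "single-shot" amounts to in practice.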

tonyarkles · on Oct 17, 2022

Yeah, very similar model for sure. I say this having used GStreamer in anger though: the complexity of Glib and GObject makes OP’s API quite a bit nicer to use. There are tradeoffs though; while GStreamer is complex and error-prone, it does have two big pieces that OP’s approach doesn’t:

- metadata in the streams. GStreamer streams have rich data attached to them that allows both sides of the chain to automatically negotiate the specific format and properties of the data in the stream.

- introspection. Given an opaque/generic GObject pointer to an element in the processing graph, you can query its configuration and sources/sinks without knowing what the object is up front. This ultimately can allow you to walk the entire graph if you want and inspect all of the elements.

lisyarus · on Oct 17, 2022

True! But also my use case is quite different compared to GStreamer, and I don't really need all that metadata & introspection.

qalmakka · on Oct 17, 2022

> the complexity of Glib and GObject makes OP’s API quite a bit nicer to use

Yeah, Glib and GObject are a massive PITA, thankfully both have C++ bindings and can be used from Vala, so it's not that bad in practice. The C API is a nightmare though.

lisyarus · on Oct 17, 2022

It pretty much may be!

chaosprint · on Oct 17, 2022

"All my engine is in C++, and I’m really sick of SDL_DoThatThing(&options, pointer_to_void, size_in_magical_units, &callback, user_data, legacy_value) (sorry for all C admirers, I do respect you, it’s just that I’m not one of you)."

How about Rust audio (https://github.com/RustAudio) and then FFI?

I am developing the audio lib for the music live coding language I design in Rust and I think the experience of Rust is really good, because of `cargo` (the package management), the community, the syntax and compiler, the ergonomics, and so on. For one of my usages, the audio lib in Rust compiles to wasm and runs in browsers smoothly. I can also write VST plugins with the same audio library in Rust:

https://github.com/chaosprint/glicol

There are also many other Rust audio libs such as dasp, fundsp and hexosynth, all worth checking out. Before I became addicted to Rust audio, I had checked some C++ audio projects, such as the Maximilian lib (especially the JS bindings), the source code of SuperCollider, and JUCE for VST. But apparently I have made my decision for the reasons mentioned above.

lisyarus · on Oct 17, 2022

Adding a whole new language (meaning compiler, packaging, libs, ecosystem) into my engine just for the audio lib which I _wanted_ to implement myself in the first place? No, thanks.

UncleEntity · on Oct 17, 2022

I think you’re meant to rewrite your whole game engine in rust to make it easier.

secondcoming · on Oct 17, 2022

You insensitive clod!

pjmlp · on Oct 17, 2022

One takes the same approach as Rust: wrap the unsafe C stuff in type-safe C++ wrappers.

Which the author did quite alright, minus the issue I discussed in a separate comment.

You will have better luck moving the audio industry from bad C practices into modern C++, than making most learn a completely new language.

JUCE rules in the audio industry for example.

chaosprint · on Oct 17, 2022

This argument can apply to every application scene of Rust: no need to learn a new language. Yet the fact is that the Rust community is growing rapidly, from audio to embedded devices. I think the key is how much you can get from the investment, and I totally understand if people want to stick to the ecosystem they are already familiar with, a completely different story compared with a beginner.

Also, I don't think the JUCE "rules in the audio industry". First, the most typical use case for JUCE is to write VST plugins. Second, the company ROLI was trying to develop another language called SOUL. Although its progress seems to stop now, it somehow shows that they are trying to provide something different from C++.

pjmlp · on Oct 17, 2022

It is growing, it is like going through the C++ adoption as I did 30 years ago, and still there are domains it could never take over from C.

So while it is nice for the security of the IT stack, we also need to keep things in perspective.

Narew · on Oct 17, 2022

Won't you have the same issue, a C-like interface? Maybe I missed something, but doesn't Rust plus FFI mean you use a C interface from C++? The only C++<->Rust options I know are either going through intermediate C code or using cxx.rs, but isn't the support limited?

chaosprint · on Oct 17, 2022

Yes. But we can do it in the last step, right?

    #[no_mangle]
    pub extern "C" fn process(
        in_ptr: *mut f32,
        out_ptr: *mut f32,
        size: usize,
        result_ptr: *mut u8
    ) {
        let _in_buf: &mut [f32] = unsafe { std::slice::from_raw_parts_mut(in_ptr, size) };
    }

juunpp · on Oct 18, 2022

> The stream interface is the core of the library, so let’s discuss it first. It is remarkably simple

It's interesting (odd?) that the interface doesn't just expose a function to get the next value, similar to a lazy list in Haskell, which also goes well with the original statement in the post that a stream is basically a function. The 'sine_wave' stream for example is 90% filling the array and 10% actual stream logic, which I imagine you're duplicating in other stream implementations.

lisyarus · on Oct 18, 2022

That's for efficiency reasons: most streams have way more logic than just filling the array, which can hardly be represented in terms of "getting the next value", and loops are good for optimization by the compiler, etc. In fact, the actual sine wave implementation looks exactly the way you describe: it hands a generating function to a helper class that just fills the array https://bitbucket.org/lisyarus/psemek/src/master/libs/audio/...
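The general idea of a helper that takes a "next value" function and does the array filling can be sketched as follows. This is not the code behind the link; `GeneratorStream` and its `read` signature are hypothetical, chosen only to show how the loop can live in one place.

```cpp
#include <cstddef>
#include <utility>

// Hypothetical helper: wraps any callable that returns the next sample
// and exposes a block-based read() that fills a caller-provided array.
template <typename Generator>
class GeneratorStream {
public:
    explicit GeneratorStream(Generator generator)
        : generator_(std::move(generator)) {}

    // Fills `count` samples by calling the generator once per sample.
    // The loop sits here, so the compiler can inline the generator into it.
    std::size_t read(float* out, std::size_t count)
    {
        for (std::size_t i = 0; i < count; ++i) out[i] = generator_();
        return count;
    }

private:
    Generator generator_;
};
```

With C++17 class template argument deduction, `GeneratorStream stream([] { return 0.0f; });` is enough to get an endless stream of silence; a sine or noise source only has to supply a different callable.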

urban_winter · on Oct 17, 2022

I wonder if, in retrospect, the decision to use floats was a good one. The author mentioned issues with the time of each sample for the sine wave, which were float-related. I get that audio effects (compressor, reverb etc) are probably easier using floats but I don't immediately see why it's better to have float as the core data structure and convert to int at the end rather than having int as the core and convert to/from float only when needed.

lisyarus · on Oct 17, 2022

The alternative that my older code used was int16, and it has way bigger precision issues (32-bit float has 23 bits of precision, and int16 has 15). Fundamentally audio samples are just numbers; the fact that they have to be in the [-1..1] range in the end is a hardware technical limitation (the speakers' membranes can only vibrate so much), but there's nothing in the audio itself that says it should be constrained to some range. So, floating-points are a nice fit. Fixed-point math like 16.16 might be a good alternative, but floats have hardware support, so here we go.
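A small sketch of the precision point, assuming a quiet signal that is stored at low volume and brought back up later. The function names and the 32767 scale factor are illustrative choices, not taken from the library.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>

// Attenuate a sample, store it as int16, then undo the gain.
// Quantization happens at the attenuated level, so quiet signals
// keep only a few significant bits.
float roundtrip_int16(float sample, float gain)
{
    const float attenuated = std::clamp(sample * gain, -1.0f, 1.0f);
    const auto stored = static_cast<std::int16_t>(std::lround(attenuated * 32767.0f));
    return (static_cast<float>(stored) / 32767.0f) / gain;
}

// The same round trip with a float as the stored value.
// The exponent absorbs the gain, so the mantissa precision is preserved.
float roundtrip_float(float sample, float gain)
{
    const float stored = sample * gain;
    return stored / gain;
}
```

For a sample of 0.3 and a gain of 0.001, the int16 path stores the value 10 and comes back as roughly 0.305, while the float path comes back as 0.3 to within rounding error.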

PennRobotics · on Oct 17, 2022

The smallest nitpick: The accepted variable for frequency is f or nu (ν). The formula and units are correct in the article, but lowercase omega is understood as angular frequency and would take the place of 2πf. I also got 3 hours of sleep, so I could be wrong.

Also, thanks for the great article and a possible alternative to my own SDL_mixer woes!

lisyarus · on Oct 17, 2022

Indeed, thanks!

dspig · on Oct 17, 2022

Yes it's a good decision, and 'the standard' for audio in the same way as 32-bit ARGB makes everything easier for graphics. Of course there are some things that are better done with ints or maybe doubles, like keeping track of the time position, but for the sine wave generator it doesn't need to know how long it's been running in total, just the position in the current cycle, so floats are fine for that.
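A minimal sketch of that idea: a sine generator that stores only its position within the current cycle, so the float never grows with running time. `SineOscillator` and its interface are hypothetical and not the article's implementation.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical oscillator that tracks phase as a fraction of one cycle in [0, 1).
class SineOscillator {
public:
    SineOscillator(float frequency_hz, float sample_rate_hz)
        : phase_step_(frequency_hz / sample_rate_hz)
    {
        assert(sample_rate_hz > 0.0f);
        assert(frequency_hz >= 0.0f && frequency_hz < sample_rate_hz);
    }

    void fill(float* out, std::size_t count)
    {
        constexpr float two_pi = 6.28318530717958647692f;
        for (std::size_t i = 0; i < count; ++i) {
            out[i] = std::sin(two_pi * phase_);
            phase_ += phase_step_;
            // Wrap once per cycle so the phase stays small and precise,
            // no matter how long the oscillator has been running.
            if (phase_ >= 1.0f) phase_ -= 1.0f;
        }
    }

    float phase() const { return phase_; }

private:
    float phase_step_;
    float phase_ = 0.0f;
};
```

The alternative of computing `sin(2πft)` from an ever-growing float `t` loses low-order bits as `t` increases, which is the kind of timing issue mentioned earlier in the thread.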

p0nce · on Oct 17, 2022

Freedom from overflow, and noise floor stays level-dependent. Meaning that when you multiply your float signal by a certain volume the quantization noise is at most -140 dB RMS. If it were integer, the quantization noise would depend upon how loud your (signal x gain) is. Besides, top-level compressor/reverbs use double :)

KayEss · on Oct 17, 2022

One big reason is that when you add two samples together, if you overflow a signed int that's UB, but if you go over 1.0 on a float it's just business as usual. This means you can do a ton of audio processing and only have to deal with fixing the overflows in one place right at the end (they use a compressor, but other algorithms are also possible).
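To illustrate, here is a sketch of mixing in float and dealing with the range only once, at the final conversion. A plain hard clip stands in for the compressor mentioned above, and `mix_into` and `to_int16` are made-up names.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Adds `src * gain` into the mix bus. Going past 1.0 here is harmless.
void mix_into(std::vector<float>& bus, const std::vector<float>& src, float gain)
{
    const std::size_t n = std::min(bus.size(), src.size());
    for (std::size_t i = 0; i < n; ++i) bus[i] += src[i] * gain;
}

// Final stage: the only place that cares about the [-1, 1] range.
// A hard clip is used here as a simple stand-in for a compressor or limiter.
std::vector<std::int16_t> to_int16(const std::vector<float>& bus)
{
    std::vector<std::int16_t> out;
    out.reserve(bus.size());
    for (float sample : bus) {
        const float clipped = std::clamp(sample, -1.0f, 1.0f);
        out.push_back(static_cast<std::int16_t>(std::lround(clipped * 32767.0f)));
    }
    return out;
}
```

Mixing two loud voices can push the bus to 1.6 or beyond; nothing goes wrong until the conversion, where the value is brought back into range in one place.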

pjmlp · on Oct 17, 2022

Very nice article, however in 2022, why use C arrays as parameters in C++ code, when std::span and std::vector exist?

lisyarus · on Oct 17, 2022

std::vector would definitely be the wrong choice, since I typically want to pass random subsections of random arrays to be filled by audio samples. std::span would be a good choice, though (I actually have my own span in the engine, from before C++20). Why didn't I use it? No idea, just didn't come to my mind. Guess it might be the influence of the book I was reading, or of SDL_Mixer interface, etc. I may do some refactoring in the future and use spans instead.

pjmlp · on Oct 17, 2022

Thanks for the feedback.

I only raised the issue because this is how new programmers keep making mistakes in C++ when learning from others' code, especially from projects they find cool to learn from.

As you seem to use best practices in the rest of the code that seemed strange to me.

throwaway17_17 · on Oct 17, 2022

SDL takes array pointers as arguments for its audio output functions in general, so I would assume they were coding against the SDL Audio API.

pjmlp · on Oct 17, 2022

No reason to use them on C++ method definitions.

    void engine::callback(std::span<float> output)
    {
        std::size_t samples = 0;
        if (stream_)
        {
            samples = stream_->read(output.data(), output.size());
            if (samples == 0) stream_ = nullptr;
        }
        auto remaining = output.subspan(samples);
        std::fill(remaining.begin(), remaining.end(), 0.f);
    }

planede · on Oct 17, 2022

std::span is new enough to not be in muscle memory of C++ programmers.

Even the standard library commits a similar sin with `from_chars` and not taking a `string_view`.

gpderetta · on Oct 17, 2022

std::span is new, but I have seen some custom form of it in pretty much every codebase I've worked with for a very long time (and implemented some myself). Like it used to be the case for std::string in older codebases.

pencilguin · on Oct 17, 2022

string_view would definitely have been wrong.

std::span did not exist in the Standard at the time std::from_chars was adopted.

An overload that takes std::span would be easy, and harmless. But as std::from_chars is only ever used from within some other abstraction, any benefit would be minimal.

planede · on Oct 20, 2022

Why would string_view be wrong there?

pencilguin · on Oct 20, 2022

Because, then, to parse out a number you would first need to get it into a string.
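For context, this is roughly how the pointer-pair interface of `std::from_chars` gets used on a piece of a larger buffer. The `parse_int` wrapper is hypothetical and only shows one way to layer a `std::string_view` overload on top; it takes no side in the question of what the standard overloads should accept.

```cpp
#include <charconv>
#include <optional>
#include <string_view>
#include <system_error>

// Hypothetical convenience wrapper: parses the whole view as a base-10 int.
// Returns std::nullopt on a parse error or if any characters are left over.
std::optional<int> parse_int(std::string_view text)
{
    int value = 0;
    const char* first = text.data();
    const char* last = text.data() + text.size();

    // std::from_chars works on a [first, last) character range and
    // does not need a null terminator or an owning string.
    const auto [ptr, ec] = std::from_chars(first, last, value);
    if (ec != std::errc{} || ptr != last) return std::nullopt;
    return value;
}
```

Because both a pointer pair and a `string_view` are non-owning views, a number in the middle of a buffer, such as the `44100` in `rate=44100;`, can be parsed via `substr` without copying it into a `std::string` first.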