I love audio programming, it's one of those things that's provided me with a lot of joy. I've been working on our in-house audio library for our game engine as well, and designing custom effects for it. In our case, we wanted an 80s DSP sound, so it was fun diving into the old literature. In the end, a simple design won out: https://twitter.com/JasperRLZ/status/1567546255454371842
Do you have any references for the old literature you looked at while trying to capture that sound? I would love to add some of it to my reading list. Also, the processing and track itself in your video are fantastic.
Most reverbs these days are convolutional. I looked at those too! But they require an expensive FFT (usually run in a thread), and the quality sounded "too good" for the audio target I wanted to hit.
After trying a lot of stuff, my main primitive ended up being a delay line hooked up to a low-pass. The output sample is the delay line's output passed through the low-pass. That output is mixed with the incoming sample and shoved back into the delay. Effectively, this makes a comb filter. I run eight of these in parallel at slightly different delay lengths to comb different frequencies. I was shooting for a "digital blur" with a bit of crunch and aliasing, while still sounding like a reverb.
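For anyone who wants to play with the idea, here is a minimal C++ sketch of the primitive described above: a delay line read through a one-pole low-pass, mixed with the input and written back, with eight of them run in parallel. The class names, delay lengths, feedback and damping values are all assumptions picked for illustration, not the ones used in the video.

```cpp
#include <cstddef>
#include <vector>

// One feedback comb filter with a one-pole low-pass inside the loop.
// Hypothetical sketch: parameter values are illustrative only.
class LowpassComb {
public:
    // delay_samples must be greater than zero.
    LowpassComb(std::size_t delay_samples, float feedback, float damping)
        : buffer_(delay_samples, 0.0f), feedback_(feedback), damping_(damping) {}

    float process(float input) {
        // Read the oldest sample from the delay line and low-pass it.
        const float delayed = buffer_[pos_];
        lowpass_ = delayed * (1.0f - damping_) + lowpass_ * damping_;
        // Mix the filtered output with the incoming sample and store it
        // back into the delay line.
        buffer_[pos_] = input + lowpass_ * feedback_;
        pos_ = (pos_ + 1) % buffer_.size();
        return lowpass_;
    }

private:
    std::vector<float> buffer_;
    std::size_t pos_ = 0;
    float lowpass_ = 0.0f;
    float feedback_;
    float damping_;
};

// Eight combs in parallel at slightly different delay lengths.
class CombReverb {
public:
    CombReverb() {
        // Assumed delay lengths in samples, chosen only to be mutually detuned.
        static constexpr std::size_t lengths[8] = {1116, 1188, 1277, 1356,
                                                   1422, 1491, 1557, 1617};
        for (std::size_t len : lengths) combs_.emplace_back(len, 0.8f, 0.2f);
    }

    float process(float input) {
        float sum = 0.0f;
        for (LowpassComb& comb : combs_) sum += comb.process(input);
        return sum / static_cast<float>(combs_.size());
    }

private:
    std::vector<LowpassComb> combs_;
};
```

Feeding a single impulse into one `LowpassComb` gives a train of echoes spaced by the delay length, each quieter and duller than the last; summing several detuned combs smears those echoes into something reverb-like.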
And the track was done by Jake Kaufmann, our composer at Yacht Club Games! I can't take any credit for it.
> All my engine is in C++, and I’m really sick of SDL_DoThatThing(&options, pointer_to_void, size_in_magical_units, &callback, user_data, legacy_value) (sorry for all C admirers, I do respect you, it’s just that I’m not one of you).
I started as a C++ programmer (back when C++0x and boost were all the rage), but the more time passes, the more of a C admirer I seem to become.
Same. I've written probably more C++ than any other language, and I have a lot of respect for the language overall, but I recently had to use only C for something, and I found it rather refreshing. My use case didn't hit a lot of the pain points of C, but it did feel refreshing as there are fewer ways of doing things, fewer options, compared to C++, and my code ended up feeling more straightforward and readable as a result.
Lambdas and some other complicated looking extensions have been on the plate for quite some time, and it seems like none of them made it into C23. Lambdas in particular have been promoted for the most part by a single person AFAICT. "C is becoming into everything C++ has" is a drastic misrepresentation.
I think I take the middle position between you and GP: C is slowly working its way towards unnecessary complexity. Most of the ‘X language is a C replacement’ posts and threads here on HN seem to disregard C’s development (particularly from 2017 onward) through committee standards releases. Everyone keeps arguing about Rust, Zig, Odin, etc. and claiming one or the other is clearly more C-like and the others are too complicated/expansive/growth-prone. But no one ever stops and claims that the simple C in their heads is not C in 2022, but C in 1990.
C is not yet at the point where added features, idioms, or concepts have totally removed the ability to return to or regularly restrict yourself to some ‘simple’ subset of the language. However, the push to add more and more to the standard does not inspire hope that this situation will remain.
So while I don’t think it’s there yet, it certainly looks like complexity via continual expansion is the path forward. I may not like it, but it does seem to be a reasonable supposition by GP.
I agree, there are some proposals in there that I doubt fit well with the culture. However, what has actually been accepted that makes you think the C in people's heads is not C in 2022? In my head, the biggest practical change from 1990 C is declare-anywhere (including declare-in-for-loop), which in my estimation has afforded a lot of added ergonomics at practically zero semantic cost. The next most relevant one is standardization of memory models, which is a good one AFAICT (but I've used it only a little so far). One unfortunate addition from 1999, VLAs, has even been demoted in the following revision.
In 2022 I can jump into basically any C codebase and be immediately productive. Which can't be said of some other languages.
I don’t think there is any one addition to C that has moved it too far up the complexity hierarchy. It is the cumulative effect of all the proposals submitted, some of which are approved, and the movement to continue to add more. Like I said, C still enables developers to restrict themselves to a simple and long-existing subset of features and idioms, making coding in C a productive exercise for experienced developers. But the continued push from all directions, with varying likelihood of acceptance, to change the language wears on me, to the point that I continually want to just make a C99 clone with my preferred improvements and use that instead.
Also, I suspect I phrased the ‘C in head vs C in 2022’ line backwards from what I thought I was saying. In my mind I was saying the arguments for the simplicity of C had less to do with the language as it exists now and more to do with the closeness of the language to some ISA/actual machine’s operating characteristics in the past. It was that closeness that enabled (or necessitated) C’s simplicity.
It wasn’t about the changes the committee has made to the language since the 1990’s. The tie in point was (in my mind) implied to be that now that C has become even less of a mapping to actual hardware the simplicity sought in a C replacement should look different, but commenters seem to be looking at a very surface level.
Consider that some of the C23 changes have in fact cleared up ambiguities and removed support for ancient hardware. So I find that the renewed pushes behind the standard are not all promises of eventual disintegration. There are a few accomplished and considerate people in WG14. If you visit the page I linked above you'll get a sense of that.
I never particularly cared about ISAs. When I think about "simplicity" and "closeness to hardware", I mean control over data layout and control flow. Data layout and the architecture of the global data flow are often the key to 90-100% of the performance gains you can expect; and more or less stable binary interfaces mean interoperability and modularity, minimizing ecosystem lock-in. Needing a specific instruction is the rare case, and for that you are recommended to code it in assembly or use compiler extensions.
Because NULL is dogshit. It's a #define 0. That's not one way to do an operation, that's one way to do an operation, horribly badly. That int on your stack ? Sure it can equal NULL. Hope that wasn't the result of (2 - 2).
In C the NULL macro can be defined as either (void*)0 or 0. It's only mandated as 0 in C++.
The nullptr concept was introduced into C to fix a type ambiguity when NULL is used with generic selection or varargs functions. The ambiguity could have been solved by mandating that NULL be defined as (void*)0. My issue with nullptr is that it's an overkill solution that unnecessarily duplicates the concept of NULL in the language.
- A stream is something you can play as a sound. It may be loaded from a file or generated on the fly. It may be finite or infinite. One important thing is that a stream is single-shot: it doesn’t have a restart method or anything like that.
- A channel is something where a stream is actually played. You can request the channel to stop, or you can replace the stream this channel is playing.
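To make the two definitions above concrete, here is a rough C++ sketch of what such a pair of abstractions could look like. Every name and signature here (`Stream::read`, `Channel::play`, `ConstantStream`, and so on) is an assumption made up for illustration and is not taken from the article's library; a real implementation would also need synchronization between the game thread and the audio thread, which is left out.

```cpp
#include <algorithm>
#include <cstddef>
#include <memory>
#include <utility>

// Hypothetical stream interface: single-shot, no restart method.
struct Stream {
    virtual ~Stream() = default;
    // Writes up to `count` samples into `out` and returns how many were
    // written. Returning fewer than `count` means the stream has ended.
    virtual std::size_t read(float* out, std::size_t count) = 0;
};

// A finite stream that produces `length` copies of one value.
class ConstantStream final : public Stream {
public:
    ConstantStream(float value, std::size_t length)
        : value_(value), remaining_(length) {}

    std::size_t read(float* out, std::size_t count) override {
        const std::size_t n = std::min(count, remaining_);
        std::fill(out, out + n, value_);
        remaining_ -= n;
        return n;
    }

private:
    float value_;
    std::size_t remaining_;
};

// Hypothetical channel: the place where a stream is actually played.
class Channel {
public:
    // Replaces whatever stream the channel is currently playing.
    void play(std::shared_ptr<Stream> stream) { stream_ = std::move(stream); }
    void stop() { stream_.reset(); }
    bool playing() const { return stream_ != nullptr; }

    // Called by the mixer: pulls samples and pads the rest with silence.
    void render(float* out, std::size_t count) {
        const std::size_t written = stream_ ? stream_->read(out, count) : 0;
        if (written < count) {
            std::fill(out + written, out + count, 0.0f);
            stream_.reset();  // a finished stream is dropped, never restarted
        }
    }

private:
    std::shared_ptr<Stream> stream_;
};
```

Playing the same sound twice then means creating two stream objects, which is what keeps the stream interface itself so small.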
Am I mistaken, or isn't this very close to what GStreamer already does with sinks and sources?
Yeah, very similar model for sure. I say this having used GStreamer in anger though: the complexity of Glib and GObject makes OP’s API quite a bit nicer to use. There are tradeoffs though; while GStreamer is complex and error-prone, it does have two big pieces that OP’s approach doesn’t:
- metadata in the streams. GStreamer streams have rich data attached to them that allows both sides of the chain to automatically negotiate the specific format and properties of the data in the stream.
- introspection. Given an opaque/generic GObject pointer to an element in the processing graph, you can query its configuration and sources/sinks without knowing what the object is up front. This ultimately can allow you to walk the entire graph if you want and inspect all of the elements.
> the complexity of Glib and GObject makes OP’s API quite a bit nicer to use
Yeah, Glib and GObject are a massive PITA, thankfully both have C++ bindings and can be used from Vala, so it's not that bad in practice. The C API is a nightmare though.
"All my engine is in C++, and I’m really sick of SDL_DoThatThing(&options, pointer_to_void, size_in_magical_units, &callback, user_data, legacy_value) (sorry for all C admirers, I do respect you, it’s just that I’m not one of you)."
I am developing the audio lib for the music live coding language I design in Rust and I think the experience of Rust is really good, because of `cargo` (the package management), the community, the syntax and compiler, the ergonomics, and so on. For one of my usages, the audio lib in Rust compiles to wasm and runs in browsers smoothly. I can also write VST plugins with the same audio library in Rust:
There are also many other Rust audio libs such as dasp, fundsp and hexosynth, all worth checking out. Before I became addicted to Rust audio, I had looked at some C++ audio projects, such as the Maximilian lib (especially the JS bindings), the source code of SuperCollider, and JUCE for VST. But I have clearly made my decision, for the reasons mentioned above.
Adding a whole new language (meaning compiler, packaging, libs, ecosystem) into my engine just for the audio lib which I _wanted_ to implement myself in the first place? No, thanks.
This argument can apply to every use case of Rust: no need to learn a new language. Yet the fact is that the Rust community is growing rapidly, from audio to embedded devices. I think the key is how much you can get from the investment, and I totally understand if people want to stick to the ecosystem they are already familiar with, which is a completely different story from that of a beginner.
Also, I don't think JUCE "rules in the audio industry". First, the most typical use case for JUCE is writing VST plugins. Second, the company ROLI was trying to develop another language called SOUL. Although its progress seems to have stopped for now, it does show that they were trying to provide something different from C++.
Won't you have the same issue: a C-like interface?
Maybe I missed something, but doesn't Rust with FFI mean you use a C interface from C++?
The only C++<->Rust options I know of are either going through intermediate C code or using cxx.rs, but isn't the support limited?
> The stream interface is the core of the library, so let’s discuss it first. It is remarkably simple
It's interesting (odd?) that the interface doesn't just expose a function to get the next value, similar to a lazy list in Haskell, which also goes well with the original statement in the post that a stream is basically a function. The 'sine_wave' stream for example is 90% filling the array and 10% actual stream logic, which I imagine you're duplicating in other stream implementations.
That's for efficiency reasons: most streams have way more logic than just filling the array, which can hardly be represented in terms of "getting the next value", and loops are good for optimization by the compiler, etc. In fact, the actual sine wave implementation looks exactly the way you describe: it hands a generating function to a helper class that just fills the array https://bitbucket.org/lisyarus/psemek/src/master/libs/audio/...
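As a rough illustration of that split, here is a small C++ sketch where the buffer-filling loop is written once in a helper and each stream only supplies a "next value" function. The names `GeneratorStream` and `make_sine` are hypothetical and this is not the code behind the link; the sine generator keeps only its phase within the current cycle, which is one way (assumed here) to sidestep float precision loss over long playback.

```cpp
#include <cmath>
#include <cstddef>
#include <utility>

constexpr float kTwoPi = 6.28318530717958647692f;

// Hypothetical helper: turns a per-sample generator into a buffer-filling
// stream, so the loop is written (and optimized) in one place.
template <typename Generator>
class GeneratorStream {
public:
    explicit GeneratorStream(Generator gen) : gen_(std::move(gen)) {}

    // Infinite stream: always fills the whole buffer.
    std::size_t read(float* out, std::size_t count) {
        for (std::size_t i = 0; i < count; ++i) out[i] = gen_();
        return count;
    }

private:
    Generator gen_;
};

// The "10% actual stream logic": a stateful function returning the next
// sample. Only the phase inside the current cycle is tracked.
inline auto make_sine(float frequency, float sample_rate) {
    const float step = kTwoPi * frequency / sample_rate;
    float phase = 0.0f;
    return [phase, step]() mutable {
        const float value = std::sin(phase);
        phase += step;
        if (phase >= kTwoPi) phase -= kTwoPi;  // wrap to keep phase small
        return value;
    };
}
```

Usage would look like `GeneratorStream stream(make_sine(440.0f, 44100.0f));` followed by repeated `stream.read(buffer, n)` calls. Because the generator is a template parameter, the compiler can inline it into the loop, so the lazy-list style costs nothing at run time in this sketch.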
I wonder if, in retrospect, the decision to use floats was a good one. The author mentioned issues with the time of each sample for the sine wave, which were float-related. I get that audio effects (compressor, reverb etc.) are probably easier using floats, but I don't immediately see why it's better to have float as the core data structure and convert to int at the end rather than having int as the core and converting to/from float only when needed.
The alternative that my older code used was int16, and it has way bigger precision issues (32-bit float has a 23-bit mantissa, i.e. 24 bits of effective precision, while int16 has 15 bits plus sign). Fundamentally audio samples are just numbers; the fact that they have to be in the [-1..1] range in the end is a hardware technical limitation (the speakers' membranes can only vibrate so much), but there's nothing in the audio itself that says it should be constrained to some range. So, floating-point numbers are a nice fit. Fixed-point math like 16.16 might be a good alternative, but floats have hardware support, so here we go.
The smallest nitpick: The accepted variable for frequency is f or nu (ν). The formula and units are correct in the article, but lowercase omega is understood as angular frequency and would take the place of 2πf. I also got 3 hours of sleep, so I could be wrong.
Also, thanks for the great article and a possible alternative to my own SDL_mixer woes!
Yes it's a good decision, and 'the standard' for audio in the same way as 32-bit ARGB makes everything easier for graphics.
Of course there are some things that are better done with ints or maybe doubles, like keeping track of the time position, but for the sine wave generator it doesn't need to know how long it's been running in total, just the position in the current cycle, so floats are fine for that.
Freedom from overflow, and noise floor stays level-dependent. Meaning that when you multiply your float signal by a certain volume the quantization noise is at most -140 dB RMS. If it were integer, the quantization noise would depend upon how loud your (signal x gain) is.
Besides, top-level compressor/reverbs use double :)
One big reason is that when you add two samples together, if you overflow a signed int that's UB, but if you go over 1.0 on a float it's just business as usual. This means you can do a ton of audio processing and only have to deal with fixing the overflows in one place right at the end (they use a compressor, but other algorithms are also possible).
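A small C++ sketch of that workflow, under the assumption that the final output format is int16: the mix is accumulated in float, where going past 1.0 is harmless, and the range is fixed exactly once at the end. A hard clamp is used here purely to keep the example short; it stands in for the compressor mentioned above and is not what the article does. The function names are made up.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <vector>

// Adds `src * gain` into `dst`. Intermediate values may leave [-1, 1];
// with floats that is just a larger number, not an overflow.
inline void mix_into(std::vector<float>& dst, const std::vector<float>& src,
                     float gain) {
    const std::size_t n = std::min(dst.size(), src.size());
    for (std::size_t i = 0; i < n; ++i) dst[i] += src[i] * gain;
}

// The single place where the range is enforced: clamp, scale, round.
// (Hard clamp as a stand-in for a compressor or limiter.)
inline std::vector<std::int16_t> finalize(const std::vector<float>& mix) {
    std::vector<std::int16_t> out;
    out.reserve(mix.size());
    for (float sample : mix) {
        const float clamped = std::clamp(sample, -1.0f, 1.0f);
        out.push_back(static_cast<std::int16_t>(std::lround(clamped * 32767.0f)));
    }
    return out;
}
```

Any number of `mix_into` calls, gains and effects can run before `finalize` without each stage having to worry about staying in range.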
std::vector would definitely be the wrong choice, since I typically want to pass random subsections of random arrays to be filled by audio samples. std::span would be a good choice, though (I actually have my own span in the engine, from before C++20). Why didn't I use it? No idea, it just didn't come to my mind. Guess it might be the influence of the book I was reading, or of the SDL_mixer interface, etc. I may do some refactoring in the future and use spans instead.
I only raised the issue because this is how new programmers keep making mistakes in C++ when learning from others' code, especially from projects that they find cool to learn from.
As you seem to use best practices in the rest of the code that seemed strange to me.
std::span is new, but I have seen some custom form of it in pretty much every codebase I've worked with for a very long time (and implemented some myself), as used to be the case for std::string in older codebases.
std::span did not exist in the Standard at the time std::from_chars was adopted.
An overload that takes std::span would be easy, and harmless. But as std::from_chars is only ever used from within some other abstraction, any benefit would be minimal.