- When working with frequencies alone and ignoring the phases, you kill half of the entropy of the signal.
- Furthermore, when only the positive half of the frequencies is used and the negative half is mirrored with the complex conjugate, you kill half of the remaining entropy of the signal.
Your first point is spot on, but the second is not: for the signal to be real valued in the time domain, the negative and positive frequencies have to be redundant in the complex valued frequency domain.
You are right, let me rephrase it: you would lose half of the remaining potential to encode entropy.
Of course, if you start with a real valued signal, then you already lost that before even getting to the transform part. Or in other words you have to transform a signal twice as long as it needs to be, because half of it is mirrored and then discarded (I assume).
The goal is to generate a random spectrum that corresponds to a real-valued signal. If the positive and negative frequencies aren’t conjugate-symmetric then the resulting signal after the IDFT will be complex-valued.
You can think of it in terms of degrees of freedom. A real-valued length-N signal has N degrees of freedom. A length-N spectrum is complex-valued, meaning 2 degrees of freedom per frequency bin. When you constrain it to be conjugate-symmetric you bring the degrees of freedom back to N, which matches the real-valued signal.
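To make the degrees-of-freedom counting concrete, here's a rough numpy sketch (my own illustration, not from the article): fill only the non-redundant half of the spectrum with random values, keep DC and Nyquist real, and irfft gives you back a real-valued length-N signal.

    import numpy as np

    N = 16
    rng = np.random.default_rng(0)

    # Free parameters: 2 real bins (DC, Nyquist) + (N/2 - 1) complex bins
    # = 2 + 2*(N/2 - 1) = N real degrees of freedom, matching a real length-N signal.
    spec = np.zeros(N // 2 + 1, dtype=complex)
    spec[0] = rng.normal()                    # DC must be real
    spec[-1] = rng.normal()                   # Nyquist must be real
    spec[1:-1] = rng.normal(size=N // 2 - 1) + 1j * rng.normal(size=N // 2 - 1)

    x = np.fft.irfft(spec, n=N)               # conjugate symmetry is implied by irfft
    print(np.iscomplexobj(x))                 # False: the time-domain signal is real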
Basically yes, but his statement mixes things up a bit. The Discrete Fourier Transform transforms a complex vector of length N into another complex vector of length N. In the latter, each of the N elements corresponds to a frequency "bin" and, like complex numbers in general, can be represented either by a real and an imaginary part (like Cartesian coordinates), or by a phase (== atan2(Im, Re)) and a magnitude (== sqrt(Re^2 + Im^2)) (like polar coordinates). Obviously, whichever you choose, you need both components for an unambiguous representation.
The FFT segment starts at 4:30 (linked), the specific part about phase and magnitude is at 5:45, and the part about negative vs. positive frequencies starts at 6:35.
Very nice video, the 3d animations and colors make it look very professional. What tool(s) did you use to create the video? Ruby? OpenGL? A custom synthesizer?
All of the above. I have a custom animation system written in Ruby that talks to some visualization software I wrote using C and OpenGL. The synthesizer is also in Ruby, and the core audio code for that is on GitHub. I use Inkscape for overlay graphics, and reboot into Windows for VEGAS for editing and compositing.
You can try the synthesizer and other audio code if you're using Linux.
Here's an earlier version of the synthesizer (licensed under AGPL3). The MIDI CCs for controlling different parameters are listed in the source code. You'd want to clone the repo, run through the installation instructions in the mb-sound repo, do a bundle update mb-sound, then run bundle exec bin/complex_synth.rb. https://github.com/mike-bourgeous/mb-surround/blob/3823de44a...
I don't plan on making the visualizations available, in part because the system is too convoluted and they probably only work in my specific environment.
No, just MIDI CC controls. I control it with a hardware keyboard, but anything that can generate note events and MIDI CC changes can be wired in with jackd and a patchbay like Patchage.
To add a DIFFERENT point on handling phase:
Overlap-add STFT actually has phase dependencies between adjacent patches, so generating independent random phase in frequency space tends to produce incompatible phase in adjacent patches. In the audio domain, this leads to audible distortion.
What's (often) used in the audio domain is Griffin-Lim, in which you apply the ISTFT and STFT repeatedly to 'smooth out' the phase inconsistencies. It typically takes a long time to converge and is still not quite right. The main alternatives for audio are the new neural vocoders. But they are expensive to train, and it's not terribly clear to me that it's any better than using an existing blue noise algorithm for this specific problem.
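For reference, a bare-bones version of that loop might look like the following (a sketch only; it assumes target_mag came from scipy's stft with the same parameters, and real implementations add better initialization, momentum, etc.):

    import numpy as np
    from scipy.signal import stft, istft

    def griffin_lim(target_mag, n_iter=100, nperseg=512, fs=1.0):
        """Estimate a signal whose STFT magnitude approximates target_mag."""
        rng = np.random.default_rng(0)
        phase = rng.uniform(-np.pi, np.pi, target_mag.shape)
        for _ in range(n_iter):
            # Impose the target magnitudes, keep the current phase estimate.
            spec = target_mag * np.exp(1j * phase)
            _, x = istft(spec, fs=fs, nperseg=nperseg)
            # Re-analyze; the overlap-add in istft has smoothed out phase conflicts.
            _, _, spec = stft(x, fs=fs, nperseg=nperseg)
            phase = np.angle(spec)
        return x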
I studied physics when I was young, and my first reaction was 'sure you can!' and 'I did that kind of stuff some 20 years ago'. What looks to be the problem is that he wants a rather non-physical kind of noise. The noise is supposed to be distributed uniformly between 0 and 1. That really is not the kind of noise one expects or wants in physics in most cases; values distributed in a Gaussian way would seem much more sensible. Oh well, I suppose the thing he calls a 'blue noise texture' really cannot be generated very well in frequency space...
If you do the phases wrong you will get pulses instead of noise. Obviously if you do the phases right you can do anything, because the Fourier transform is one to one.
Sure for any specific signal x you can craft the corresponding signal X so that ifft(X)==x, but the idea here is to generate a random signal with certain properties (blue spectrum and uniformly-distributed time-domain values). This is the tricky bit.
The "Composers Desktop Project" also exists standalone with a paid front-end I think. Not sure anyone but him can use it well though :). It also has a loooong history (starting 1986) https://www.composersdesktop.com/history.html
edit: his UI, sound loom, in "use" (at least you see how transforming to spectral space and re-synthesis are a thing, not much else though :).) : https://youtu.be/LypM6-WDjL8?t=620
No time to play with this at the moment but I suspect the problem is that the author didn't try generating the real and imaginary parts. You can't treat the magnitude the same way you treat amplitudes.
Also, I don't see whether the DC and Nyquist terms are forced to be real (otherwise the IFT won't be real).
> First we make N complex values from polar coordinates that have a random angle 0 to 2pi and a random radius from 0 to 1.
This is what jumped out to me as suspicious. This looks just like a naive (and wrong) algorithm for generating random points in a circle. The simplest correct way to do this in the circle case is rejection sampling. In this context, that would mean: generate a random real and imaginary part to get z, and retry until |z| ≤ 1.
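For what it's worth, here's a quick numpy sketch of the two approaches (my own illustration, not the article's code); rejection sampling gives points uniform over the disk, while uniform angle + uniform radius piles points up near the origin:

    import numpy as np

    rng = np.random.default_rng(0)

    def disk_rejection(n):
        """Uniform points in the unit disk via rejection sampling."""
        pts = np.empty((0, 2))
        while len(pts) < n:
            cand = rng.uniform(-1, 1, size=(2 * n, 2))
            cand = cand[np.hypot(cand[:, 0], cand[:, 1]) <= 1.0]
            pts = np.vstack([pts, cand])
        return pts[:n]

    def disk_naive(n):
        """Uniform angle and uniform radius -- NOT uniform over the disk."""
        theta = rng.uniform(0, 2 * np.pi, n)
        r = rng.uniform(0, 1, n)    # would need sqrt(uniform) to be area-uniform
        return np.column_stack([r * np.cos(theta), r * np.sin(theta)])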
The goal is not well-defined here. If the goal is to generate uniformly-distributed points in a circle then this algorithm is wrong, but it’s not clear that’s actually what they want.
Generating a spectrum with a given magnitude distribution and uniform phase distribution is pretty common (at least in audio).
Yes, though e.g. if you want white noise you'd use a Rayleigh distribution for the magnitude rather than the Gaussian you would use for the real and imaginary parts.
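A quick numpy check of that point (my own, not from the thread): complex values with i.i.d. Gaussian real and imaginary parts have Rayleigh-distributed magnitudes and uniform phases, so drawing (magnitude, phase) that way is equivalent to drawing Gaussian real/imaginary parts.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    n = 100_000

    # Cartesian construction: Gaussian real and imaginary parts.
    z_cart = rng.normal(size=n) + 1j * rng.normal(size=n)

    # Polar construction: Rayleigh magnitude, uniform phase.
    z_polar = rng.rayleigh(scale=1.0, size=n) * np.exp(1j * rng.uniform(0, 2 * np.pi, n))

    # The two constructions should be statistically indistinguishable.
    print(stats.ks_2samp(np.abs(z_cart), np.abs(z_polar)).pvalue)   # large p expected
    print(stats.ks_2samp(z_cart.real, z_polar.real).pvalue)         # large p expected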
> I suspect the problem is that the author didn’t try generating the real and imaginary parts.
I’m not sure I understand what you mean. He definitely does describe generating a complex number, and ensuring the imaginary parts are designed to produce an output image with real-only values after the FT. Can you elaborate on how magnitudes and amplitudes are being conflated?
The problem, as I understand it, is that the output is Gaussian distributed rather than uniform, not a simple bug or misuse of the DFT like you assume. Perhaps this implies that using a white noise source in the frequency domain is the issue: maybe the forward transform of blue noise is not white along the outside of frequency space, and so generating a frequency-space image using white noise might not be expected to work?
All I mean is that the statistics of the magnitude are different than the author expects, probably. You can certainly generate complex numbers with the magnitude, but picking from a uniform distribution is unlikely to generate what you want.
Also, it will certainly be a problem if the DC/Nyquist terms aren't pure real, though maybe that is actually being done and I missed it.
> picking from a uniform distribution is unlikely to generate what you want.
I see now, and I think this is a good point. Maybe I just said almost the same thing. Without looking at frequency plots, if I just think about what I expect to get with a DFT of high-pass filtered white noise, I’d assume it’s blue-ish in the sense of having less low frequency, but I don’t think I would expect it to produce the same perfectly even spread that the void-and-cluster algorithm produces. It seems like this implies that a forward DFT of a void-and-cluster blue noise texture has some structure that might be hard to see in frequency space: we can’t assume frequency noise that looks white really is white, and maybe there is a relationship between the magnitude and phase components that just isn’t captured by picking a random angle & radius.
Yeah, looking up void and cluster, I think it's clear the phases won't be uncorrelated between frequencies, though that may not be easy to describe in the frequency domain.
This is for the 1D case, but since the Fourier transform is separable, it works identically in 2 (and any N) dimensions by performing the transform sequentially in each dimension.
You're pointing to something saying that power-law distributed noise can be generated by filtering white noise (and doing the filtering in the frequency domain, but that's really an implementation detail). OP actually wants to only do the inverse transform, generating the noise in the frequency domain already.
There is an interesting relationship between frequency domain filtering and the distributional properties of a signal, which I believe the author encounters:
> So interestingly, the IDFT method makes noise that is gaussian distributed. This kind of makes sense because we are filling out frequencies as uniform random white noise, which are turning into uniform random white noise sinusoids that are being summed together, which will tend towards a gaussian distribution as you sum up more of them. In contrast, the void and cluster method makes uniform distributed values which are perfectly uniform.
One of the papers I'm most proud of co-authoring explains some aspects of this phenomenon [0] through the use of higher order spectra (the bispectrum, trispectrum, etc...) and how the geometry of frequency-domain filters affects skewness and excess kurtosis.
The author seems to be looking to generate a blue noise texture for image sampling. I'm not familiar with those, but it seems to be blue noise that also has uniformly distributed values in the time domain. Generating white noise in the frequency domain and multiplying it by a frequency shaping mask can generate noise with any frequency distribution, but it does not satisfy the uniformly-distributed-values-in-the-time-domain requirement.
If there are no other requirements than the frequency spectrum, then generating the noise in the frequency domain works fine.
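To illustrate both points with a rough numpy sketch (my own, with an arbitrary choice of mask, not the article's code): shaping a white complex spectrum with a rising mask does give a blue-ish spectrum, but the time-domain histogram comes out roughly Gaussian rather than uniform.

    import numpy as np
    from scipy.stats import kurtosis

    rng = np.random.default_rng(0)
    N = 1 << 16

    # White complex spectrum for the non-negative frequencies.
    spec = rng.normal(size=N // 2 + 1) + 1j * rng.normal(size=N // 2 + 1)
    spec[0] = 0.0                              # no DC offset

    # A simple "blue" mask: amplitude rising linearly with frequency (arbitrary choice).
    freqs = np.arange(N // 2 + 1)
    spec *= freqs / freqs.max()

    x = np.fft.irfft(spec, n=N)                # irfft assumes conjugate symmetry -> real output
    x /= np.abs(x).max()

    # Excess kurtosis near 0 means Gaussian-like; a uniform distribution would give about -1.2.
    print(kurtosis(x))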
What are the use cases for generating different kinds of noise (e.g. white, blue and red)? I've only heard about them as background noises to fall asleep or focus while working/studying.
Is there something in information or signal theory that benefits from creating truly random noise (or random but with some structure to it, like blue noise not having lower frequencies)? As a chemist, I always model real-world noise with a Gaussian distribution, so I don't really get where this could be used.
One reason to use blue noise in graphics / games / images is any time you need a random number per pixel for, say, some Monte Carlo process that will have a visible effect on the image and result in visible noise in the output. Using blue noise, the output will look visibly less “noisy” than when using white noise, due to blue noise having no low frequencies.
Blue noise is best for situations where you only have the budget for 1 or 2 samples per pixel. Very low sample counts are where it performs considerably better than white noise. If you integrate with many samples, tens or hundreds or more, then the advantage of blue noise over other “colors” diminishes.
Blue noise is useful for procedural generation, because it's what the human eye will naturally consider the “most random” and the most pleasant. IIRC red noise appears when you have random walks or Brownian motion.
> As a chemist, I always model real-world noise with a gaussian distribution
Gaussian distribution is what you get when you have a lot of independent random events adding up (it's the central limit theorem), so it applies a lot in real life but it can also cause modeling issues if it's not how randomness appears in your system. (Please note that I have absolutely no knowledge on chemistry so I have no idea if it's something that could be relevant in your field)
Blue noise is useful in games for generating more natural looking scenes. Here's an example of how to algorithmically place grass in the game "The Witness".
Chemist here too, but my ex was a sound consultant for an architectural firm. IIRC, different kinds of noise are essentially defined by the average amplitude (energy) vs. frequency function of the RNG. For architects, it can be important to model noise to ensure your structure complies with, for example, OSHA regulations. There can be code regulations too if you're in an urban place and, say, putting residential on top of street-level commercial spaces, especially bars that might be open late. Finally, for gathering/event spaces you want to have a situation where a PA/presentation system doesn't have to work hard to go over conversation noise.
The energy profiles for all of these scenarios are empirically determined and "well known" in the community.
The ex helped model the "death star" auditorium space at the academy of motion picture arts and sciences in LA. IIRC it sounds really aquarium-ey during normal times and during events they roll out strategically placed dampeners.
Also chemist here. Think about it the other way around, not synthesis but analysis. For example: "The term "Brown noise" does not come from the color, but after Robert Brown, who documented the erratic motion for multiple types of inanimate particles in water." (https://en.wikipedia.org/wiki/Brownian_noise)
Side note: what does a self-described chemist do these days? Seems like Breaking Bad spurred a lot of interest in the field, but as someone who minored in chemistry many years ago, the actual job prospects seemed limited. And research seemed very stodgy (to me at least). If you don't mind sharing, what's your background and what types of things are you working on?
Chemists do the same things they always have, just for much less money than the same people would make today in software engineering. Here are some types of applied chemistry:
- Deciding what ratio to mix things in for the countless liquid products that are combinations of already-discovered chemicals. Everything at the grocery store that comes in a jug or a bottle (cleaning products, hair conditioner, drain cleaner) falls into this category.
- Doing industrial research to improve existing processes. This would include discovering new catalysts, and trying out the endless permutations of solvents and conditions in which existing reactions take place, to optimize them for whatever the biggest operating costs are.
- Figuring out how to recycle industrial chemicals and get the valuable stuff back out of sludge and effluent. This is a surprisingly big field with important consequences.
- Working on specialty materials, like plastics and synthetic rubbers, that are not completely dissimilar from existing products but require a chemist to design them for specific, demanding applications.
The fact that everything involving the physical world gets you paid way less for work that's way more difficult than programming will come back to bite us somehow, but it's hard to say when.
> for work that's way more difficult than programming
That just reads as “I am better than you”.
Programming is arbitrarily difficult, and it rarely has an objective reality to measure against. I have some kind of “Peter Principle” in mind where people continuously reach their own personal limits of complexity. However they then expand their skills in an ill-defined problem space to level up and beat their prior limits. Programmers display a massive variety of talent, and most of the talent is invisible because the problem space is heinously deeply complex and outside observers only see glimmers on the surface.
Or is your implication that idiots and/or lazy people choose to be programmers? Certainly there is a share of them in the discipline too: perhaps smart for chasing money and perhaps even content with the challenges they face personally.
Regardless, you are being very judgmental towards programming.
Your own descriptions of your discipline could be perceived as rote work or make work, on the same level as making another CRUD app for a business, that doesn’t require deep applied intelligence. How people apply their skills and intellect to the problem defines how hard the problem is.
I personally have experienced the more-money-less-difficulty science-to-coding gradient. You can say that programming can be arbitrarily difficult, which is true, but the reality is that it usually isn't arbitrarily difficult, and corporations are happy to pay non-geniuses lots and lots of money to reliably design systems that work with zero marginal cost. Whereas if you want to make $160k as a ground-level researcher, making 1% improvements to products with very substantial marginal costs, using million-dollar equipment to do your job, you'd have to be a rare intellect indeed, and beyond that be great at selling your own value.
In software, there's no such thing as marginal cost, the only tool you need is a $300 laptop, and all you need to know can be read for free online. That's an amazing world, from an economic standpoint, and it's no surprise at all that it's so much easier to make money that way.
I'm not a chemist. (Worse, I'm a physicist). But I have several relatives who are chemists. And I work for chemists. I think there's an issue right now, which is that market demand for computer programmers has created a distorted view of all other occupations. That's not going away any time soon, and if it does, there will be some other "hot" occupation.
I think what makes people want to become natural scientists is a genuine interest in how things work, plus either an innate or learned ability to "think" in a certain way that works for their field of study. I don't have a good way to describe it, but a sense that a chemist thinks like a chemist. A scientist is obsessed with learning how things work. The different fields are different approaches to finding that out, that work for their respective domains. Trying to figure out how a frog works by thinking like a physicist will result in a lot of dead frogs.
The other post mentions food science. Stodgy, yes. Fascinating, you bet. Food isn't going away. The problems of making food abundant, pure, healthy, safe, and ecological, are going to get more and more challenging. It can be stodgy because we have to control our impulse to try dangerous experiments on human subjects, or make a mistake that brings down a production line or triggers a recall. But oddly enough there are people who get their excitement out of working within that constrained environment.
You have to embrace the stodge. Something I've noticed about chemists, is that they tend to have the best discipline about running controlled, repeatable experiments. They keep the best notebooks.
Chemistry is closely related to materials science. Any realistic development of a material beyond the basic research phase will require the involvement of a chemist. Likewise drug manufacturing, etc.
I don't think Chemistry in and of itself is a thing, nor has it ever been. It always has to be applied to something, and then the possibilities are endless. After all, everything is made of chemicals, isn't it? Personally, I'm in food; I also have an MSc in Food Engineering, and I'm currently preparing to start a PhD in that area (NIR spectroscopy, Hyperspectral Image Analysis). But there are also the oil, pharma, bio, environmental, etc. areas. (Also, in British English "Chemist" means "Pharmacist". I always wondered, what they called a Chemist?)
> Also, in British English "Chemist" means "Pharmacist". I always wondered, what they called a Chemist?
(Brit here). A shop that sells pharma products is often colloquially called 'a chemists' but the people who work there are typically referred to as pharmacists. 'Chemistry' as studied at school and Uni is totally about chemicals in general and not drugs (which would be pharmacology). Someone who described themselves as a chemistry student or professor would almost inevitably be perceived as someone who is working with chemicals.
Most jobs in industry are in quality control for medicines, food, paint, cleaning products, etc. Most industries need at least one chemist to analyse input and output to your processes. Then you have academia/teaching.
Myself, I'm about to finish my PhD in materials science and discover what I can really pivot to, whether related to materials/chemistry or using what I've learned about data analysis in any other area.
The difference is the spectral power distribution of the noise.
White noise has a flat spectrum from low to high frequencies. It has the standard noise sound that most of us recognize from digital systems.
Pink noise is shaped with a decreasing power as frequency increases. This results in more low frequency noise and less high frequency noise. This noise pattern occurs frequently in natural systems.
Different colors of noise have different sounds to our ears, of course. The color naming scheme is loosely intended to map to light spectral distributions. For example, blue noise has a rising power with increasing frequency, similar to how blue light has more energy at higher frequencies (shorter wavelengths).
Different spectrums can be tailored for signal processing schemes that want to add more noise at frequencies that need it and remove the noise with a filter when it has served its purpose.
For example dithering adds enough extremely high frequency noise to obliterate unwanted patterns, then applies a lowpass filter that keeps all the "good" signal and removes the noise.
For one real application of blue noise in particular, see https://youtu.be/Ge3aKEmZcqY?t=1350 (22:30 is a good starting point). Here Casey explains a grass planting algorithm for games, where white noise wouldn't be suitable.
Sometimes certain measuring instruments can be assumed to have such profiles for noise, in this case generating noise is useful for Monte Carlo based modeling or uncertainty propagation.
I wonder if there’s some iterative algorithm that would work here.
When synthesizing audio from the short-time Fourier transform (STFT) sometimes you have unknown or noisy phase, and there’s an algorithm called Griffey-Lim that’s pretty common for finding the corresponding time-domain signal. You start in the STFT-domain and continually swap between time and STFT domains, each time fixing the STFT magnitudes. Eventually the phase tends to converge (but not sure if that’s guaranteed).
Maybe there’s something similar here where you keep swapping back and forth while applying the blue spectrum and uniform sample histogram constraints (or partially applying them in a gradient-descent fashion).
(Side note that there are other/better phase estimation algos for this problem, but Griffin-Lim is simple and relatively common)
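Something in that spirit could be sketched like this (entirely my own guess at such a loop, with an arbitrary radial-ramp target spectrum; I haven't checked how good the resulting noise actually is): alternate between imposing the target magnitude spectrum and imposing a perfectly uniform histogram via rank remapping.

    import numpy as np

    rng = np.random.default_rng(0)
    N = 64                                     # texture is N x N

    # Arbitrary "blue" target: magnitude proportional to radial frequency.
    fy = np.fft.fftfreq(N)[:, None]
    fx = np.fft.fftfreq(N)[None, :]
    target_mag = np.hypot(fx, fy)

    x = rng.uniform(0, 1, size=(N, N))         # start from white noise
    uniform_levels = (np.arange(N * N) + 0.5) / (N * N)

    for _ in range(200):
        # Projection 1: keep the phases, impose the target magnitude spectrum.
        spec = np.fft.fft2(x)
        spec = target_mag * np.exp(1j * np.angle(spec))
        x = np.real(np.fft.ifft2(spec))
        # Projection 2: force an exactly uniform histogram by rank remapping.
        flat = x.ravel()
        flat[np.argsort(flat)] = uniform_levels
        x = flat.reshape(N, N)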
Can you link to a paper, summary, or textbook with this algorithm? I'd like to learn more, but searching for "Griffey-Lim" only returns this HN thread, and a bunch of very unrelated news items about money laundering.
That is Griffin-Lim, which matches the second mention but not the first. It's a fairly well known algorithm for reconstructing speech from spectrum, though I believe is largely superseded these days by vocoders based on neural techniques. (Actually I just checked and the original Tacotron paper cites Griffin-Lim; I do think newer neural TTS approaches have gone beyond it, however)
The prior post I link below suggests that what’s going on here is we are trying to find good points to sample an image. A challenge here is that the algorithms to generate the blue noise distributed sample points are slow. This motivates using an FT signal constructed by hand and then inverting this somehow to get sample points more quickly. But… he’s finding that the result doesn’t place sample points uniformly throughout an image. Is that right?
How does one construct discrete sample points from the inverted FT?
Yep, you’ve got the right idea - the implied goal of trying to use the FT is to make blue noise texture generation faster, but it doesn’t seem to work using the DFT. (Or maybe there are unmet constraints on how you need to generate the frequency space texture.)
> How does one construct discrete sample points from the inverted FT?
This is a good question! So one example the author has (in a different post) is how to use blue noise textures for ray traced shadows. The idea is at every pixel of your output image, trace a ray into the scene, then when it hits something, trace a ray toward the light to see if the pixel is lit or shadowed. Normally, you’d use a random number generator to pick a random sample on the hemisphere of the surface the first ray hits, and shoot a new ray in this random direction. You can instead grab your two random values from the red and green channels of a blue noise texture. The blue noise texture could have the same size as your desired output image, and you would use the same pixel id in the blue noise texture as the pixel id of your output image.
That example is pretty easy, but I found out it can be surprisingly tricky to use blue noise textures in other ways. There are a limited number of ways you can use a blue noise texture effectively, and it’s not always an easy drop-in replacement for a white noise random number generator. It’s much harder to use blue noise for multiple samples per pixel, for example, and harder to create a sequence of blue-distributed random numbers for use in a single integral. Another way to say that is that it would be difficult to generate your camera rays using blue noise.
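As a rough sketch of that per-pixel lookup (hypothetical names, assuming a tileable blue noise texture loaded as an (H, W, C) numpy array in [0, 1); not the author's actual renderer code):

    import numpy as np

    def shadow_ray_randoms(px, py, blue_noise):
        """Fetch two 'random' values for pixel (px, py); the texture tiles, so wrap with modulo."""
        h, w, _ = blue_noise.shape
        return blue_noise[py % h, px % w, 0], blue_noise[py % h, px % w, 1]

    def cosine_hemisphere_dir(u, v):
        """Map two values in [0, 1) to a cosine-weighted direction about the +Z normal."""
        phi = 2.0 * np.pi * u
        sin_theta, cos_theta = np.sqrt(v), np.sqrt(1.0 - v)
        return np.array([np.cos(phi) * sin_theta, np.sin(phi) * sin_theta, cos_theta])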
I didn't read it in detail and had a hard time understanding what their goal was, but there are important things to understand regarding Fourier features (eigenfunctions, sin, cos, etc.) and the Gaussian kernel.
Fourier theory is built on the assumption of everything being linear, and I suspect it can be hard to fit the constraint of 0 <= v <= 1 into that framework. In audio they solve this by having headroom (not applicable here) or using dynamic range compression (might work), but that will change the frequencies a little bit. Histogram equalization could be something to try. It forces the values to be uniformly 0 to 1.
This could also solve the issue of
> the problem with the IDFT method is though… you get gaussian distributed values, not uniform, and the noise seems to be lower quality as well. If these issues could be solved, or if this noise has value as is, I think that’d be a real interesting and useful result
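A minimal sketch of that histogram equalization step (my own, assuming the noise comes in as a numpy array): replace each value by its normalized rank, which forces an exactly uniform distribution on [0, 1) at the cost of perturbing the spectrum a little.

    import numpy as np

    def equalize_to_uniform(noise):
        """Remap values so the output histogram is exactly uniform on [0, 1)."""
        flat = noise.ravel()
        ranks = np.argsort(np.argsort(flat))          # 0 .. n-1, preserving ordering
        return ((ranks + 0.5) / flat.size).reshape(noise.shape)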
So frequency-domain blue noise has a higher crest factor than void-and-cluster?
Also I'm interested to see how closely the mean intensity (at each radius) of void-and-cluster blue noise actually matches a sinc or sinc-squared subtracted from a constant.
I thought this was straightforward like in the audio domain? [1]
Now, not all noises of the same amplitude spectrum are created equal. For instance LFSR sequences are only ±1 yet they have a white spectrum. The difference lies in phase.
I wonder if generating random phases, then putting it through an optimizer to find lowest crest factor could work.
Oh, and can someone chime in on whether void-and-cluster masks guarantee that all values are covered, just like a Bayer mask does?
For what it's worth, in digital communications you could say OFDM "is designed in frequency space": the signal is constructed in the frequency domain, passed through an FFT/IFFT, and the output time-domain signal is what is sent over the air.
Think about the direct problem: you start with white noise with a distribution g such that you can compute its power spectrum (e.g. if g is Gaussian), and then convolve it with a positive kernel k. You can write down explicitly the histogram and the power spectrum of the result, in terms of g and k. Now work backwards from it. I can find a reference if you need it.
But after you convolve in the time domain it will change the distribution, right? Every point in the convolved signal will be the sum of a bunch of points in the original signal multiplied by the kernel, which would make their distributions more gaussian, right?