No. "Dimensions" is the right word, because they are three orthogonal scales along which musical notes can be measured[1]. The author could also have suggested timbre and other possible dimensions, but the three stated apply to all sound, including (importantly) sine waves, the simplest type of sound.
[1] Technically, frequency is a function of time too (and timbre a function of the interaction of multiple frequencies and envelope changes, another function of time) but these are all independent uses of time.
Technically there are two complementary sets of dimensions: time and amplitude vs frequency and phase. Both are complete encodings of the waveform.
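To make the "complete encodings" claim concrete, here is a minimal NumPy sketch (the signal is arbitrary random data, just for illustration): a waveform sampled as amplitude over time converts to frequency/phase via the FFT and back again with no loss.

```python
import numpy as np

# A waveform as amplitude-over-time samples (arbitrary toy signal).
rng = np.random.default_rng(0)
waveform = rng.standard_normal(1024)

spectrum = np.fft.fft(waveform)       # frequency domain: complex values
magnitude = np.abs(spectrum)          # per-frequency amplitude
phase = np.angle(spectrum)            # per-frequency phase

# Rebuild the spectrum from magnitude and phase, then invert it.
rebuilt = np.fft.ifft(magnitude * np.exp(1j * phase)).real

assert np.allclose(rebuilt, waveform)  # same information, both encodings
```

Either representation alone reconstructs the other exactly, which is what makes them complementary rather than one being primary.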
The article is extremely muddled from a technical point of view. When dealing with perceptions it is extremely important to distinguish physics and physiology. In optics we have radiometric (physical) vs photometric (perceived) values: https://en.wikipedia.org/wiki/Photometry_%28optics%29#Photom...
It appears that in the article they are doing some kind of implicit averaging over the ear's response function at each frequency, which may make sense in terms of perception but makes very little sense in terms of physics.
A much better visual analog would be a blurred photograph rather than a cropped one. "Turning up the volume" simply increases the brightness of the images, which doesn't do a damned thing to reduce the blurring.
One thing that people with normal hearing don't appreciate is how much information is carried in the high frequencies, which is where most loss normally occurs. There are also "notch" losses that happen to people whose ears are routinely exposed to loud noise in narrow frequency bands.
We tend to think of "high frequency" sounds in terms of single notes, but in speech the high frequencies matter most in the unvoiced consonants, the "s" and "th" sounds and similar. Losing the high frequencies blurs the edges of speech, often making the shape of it unrecognizable. Frequency-dependent enhancement sharpens those edges and brings speech back into useful focus.
> Technically there are two complementary sets of dimensions: time and amplitude vs frequency and phase. Both are complete encodings of the waveform.
Minor note, the "frequency and phase" is actually frequency and complex amplitude, which encompasses both phase and scalar amplitude as we think of it intuitively.
In the mathematical theory there is also provision for complex amplitude in the time-domain, but this is rarely needed in practice (and never found in real-world signals).
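A small sketch of the "complex amplitude" point, using a pure cosine with made-up amplitude and phase values: each FFT bin is a complex number whose modulus gives the scalar amplitude and whose argument gives the phase.

```python
import numpy as np

n = 1000
t = np.arange(n)
k = 50                                   # whole cycles over the window
amp, phi = 2.5, 0.7                      # arbitrary example values
x = amp * np.cos(2 * np.pi * k * t / n + phi)

bin_value = np.fft.fft(x)[k]             # the complex amplitude at frequency k
recovered_amp = 2 * np.abs(bin_value) / n   # modulus -> scalar amplitude
recovered_phase = np.angle(bin_value)       # argument -> phase

assert np.isclose(recovered_amp, amp)
assert np.isclose(recovered_phase, phi)
```

The factor of 2/n is just the FFT's normalization for a real-valued input, where the energy splits between the positive- and negative-frequency bins.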
Not really. A sound can be described by one or more [time, volume, frequency] triples. I think the author's use of the word "dimensions" is perfectly suited.
They are complementary. You can fully describe a sound either as a function of volume over time or as a complex amplitude (i.e. ordinary amplitude plus phase) over frequency. So you could just as easily say volume is an emergent property of frequency.
Your intuition is in the right direction, but not quite correct.
Think of a Cartesian plot. Now, think of the Y axis "tilted" toward the X axis. With those 2 "vectors" you can still describe the whole plane (for whatever "describe" means =P)
If you tilt it so much that the "Y" axis becomes parallel to the X axis, then you lose something.
Instead of "orthogonal", what you need is a mathematical property called independence (as in linear independence), which basically means "not redundant to express a space".
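The tilted-axis picture can be checked numerically: two basis vectors span the plane as long as they are linearly independent, i.e. the determinant of the matrix they form is nonzero (the specific vectors below are made up for illustration).

```python
import numpy as np

x_axis = np.array([1.0, 0.0])
tilted_y = np.array([0.3, 1.0])      # Y axis "tilted" toward X, still independent
parallel_y = np.array([2.0, 0.0])    # tilted all the way: parallel to X

# Nonzero determinant: the pair still describes the whole plane.
assert np.linalg.det(np.column_stack([x_axis, tilted_y])) != 0

# Zero determinant: the pair collapses onto a line; a dimension is lost.
assert np.linalg.det(np.column_stack([x_axis, parallel_y])) == 0
```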
Is anyone else as irked by the author's choice of the word "dimensions" as I am? I can't read past it. Wouldn't "factors" be a better fit?