https://en.wikipedia.org/wiki/Frequency-shift_keying#Audio_F...
(The 0 and 1 version is binary audio FSK, but you could also choose to have more frequency levels, such as 4 or 10 or 20 different levels.)
People studying this don't usually distinguish between waves emitted into the air as sound and waves encoded as electrical signals. The main reason is that we have devices (speakers and microphones) that convert directly between sound and electricity in both directions, so a system that represents the wave as an electrical signal and one that represents it as sound are "equivalent" in design terms (even if they face somewhat different practical considerations, like different kinds of background noise).
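To make that concrete, here's a rough sketch of FSK with a configurable number of levels, in plain Python. The sample rate, tone frequencies, and symbol length are numbers I picked for illustration, not values from any standard, and the function names are made up. The output of modulate() is just a list of samples, so the same code applies whether those samples end up driving a speaker or a voltage on a wire.

```python
import math

# All constants below are assumptions chosen for illustration only.
SAMPLE_RATE = 8000         # samples per second
SAMPLES_PER_SYMBOL = 160   # 20 ms per tone at 8000 samples/s
BASE_HZ = 1000.0           # tone used for symbol 0
STEP_HZ = 100.0            # spacing between adjacent tones


def tone_frequency(symbol):
    """Map a symbol value (0, 1, 2, ...) to its tone frequency in Hz."""
    return BASE_HZ + STEP_HZ * symbol


def modulate(symbols):
    """Turn a list of symbol values into a list of waveform samples."""
    samples = []
    for s in symbols:
        f = tone_frequency(s)
        samples.extend(
            math.sin(2 * math.pi * f * t / SAMPLE_RATE)
            for t in range(SAMPLES_PER_SYMBOL)
        )
    return samples


def tone_energy(chunk, freq):
    """Correlate a chunk with a cosine and a sine at freq, return the energy."""
    i = sum(x * math.cos(2 * math.pi * freq * t / SAMPLE_RATE)
            for t, x in enumerate(chunk))
    q = sum(x * math.sin(2 * math.pi * freq * t / SAMPLE_RATE)
            for t, x in enumerate(chunk))
    return i * i + q * q


def demodulate(samples, levels):
    """For each symbol-length chunk, pick the candidate tone with most energy."""
    out = []
    n = SAMPLES_PER_SYMBOL
    for start in range(0, len(samples) - n + 1, n):
        chunk = samples[start:start + n]
        out.append(max(range(levels),
                       key=lambda s: tone_energy(chunk, tone_frequency(s))))
    return out
```

With levels=2 this is the binary 0/1 case; with levels=4 each tone carries two bits, and so on. This sketch assumes the receiver already knows where each symbol starts and that there is no noise, which a real system would have to deal with.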
Edit: oh, there's a more detailed Wikipedia article on non-binary FSK, called "multiple FSK".
https://en.wikipedia.org/wiki/Multiple_frequency-shift_keyin...
It gives some examples of practical applications (again, without fundamentally distinguishing between the kinds of waves used).