That said, it'll be able to match modern overcompressed human cliche soup pretty easily. There's a lot of production out there that is really low-hanging fruit for AI.
It's not unlike how visual AI can do 'Greg Rutkowski' but has a hell of a time being an actual concept artist in a functional way. If the cliche soup is well defined, you're pretty much all set, particularly if it's not a genre that requires a lot of character.
> The information density of music is much higher than that of text or still images.
That depends on how you encode it. As a sound file (.wav, .mp3 or something like that) it's hard to compress, but as, for instance, a MIDI file it can be very compact. Music is hard to make and hard to reverse-engineer from a recording, but it is relatively compact in terms of source material if it can be expressed as MIDI.
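To put rough numbers on that, here's a back-of-the-envelope sketch. Everything in it is an assumption for illustration: CD-style PCM (44.1 kHz, 16-bit, stereo), about 4 bytes per MIDI event, and a made-up piece with 8 notes per second. The function names `pcm_bytes` and `midi_bytes` are hypothetical helpers, not from any library.

```python
# Rough size comparison: raw audio vs. a MIDI-style note list.
# All figures are illustrative assumptions, not measurements.

def pcm_bytes(seconds: float, sample_rate: int = 44_100,
              bit_depth: int = 16, channels: int = 2) -> int:
    """Uncompressed PCM payload size (what a .wav holds), ignoring the header."""
    return int(seconds * sample_rate * (bit_depth // 8) * channels)


def midi_bytes(notes: int, bytes_per_event: int = 4) -> int:
    """Rough MIDI track size: one note-on plus one note-off per note.

    Assumes ~4 bytes per event (1-byte delta time + status + key + velocity),
    with no running status and no headers or meta events.
    """
    return notes * 2 * bytes_per_event


seconds = 180            # a three-minute track
notes = seconds * 8      # assumed density: 8 notes per second

pcm = pcm_bytes(seconds)
midi = midi_bytes(notes)
print(f"PCM:  {pcm:,} bytes")
print(f"MIDI: {midi:,} bytes")
print(f"ratio: roughly {pcm // midi}x")
```

The catch is that the MIDI side only describes which notes to play; timbre, performance nuance and production all live in the audio and aren't counted here.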
The information density of music is much higher than that of text or still images. So something like this is still tech overreach.