
To my ear, they all sound about okay for 4 seconds, until my brain recognizes that there's no tension being built or story being told. It's like every track is 4 seconds of music followed by 4 seconds of music followed by 4 seconds of music rather than a track with a real sense of progression.

Many in this thread have already said that we ought to expect an ML approach to get much better in the next few months or years. I'm not so confident it will happen that soon. Audio might end up being a much harder problem than visuals, for a variety of reasons. Having the time domain built into the medium requires some concept of memory, and even modern neural nets seem to struggle to remember what they said before the most recent prompt.

Once again though, it's not impossible. It just requires the right techniques and enough people focused on it.




The thing is, even if you can make a machine reproduce it, it's missing the human component, and the fact that you (I) know it's not human-made already degrades the experience.

What AI gives you is a mash-up, a mix of people's intents, a mix of people's feelings. What I want is the result of a singular person expressing his singularity through his work; I don't want the "average of the best" music or the "average of the best" picture. This is good for content creation, when you need to pump out the maximum amount of "content" for people to "consume" (see Marvel, Netflix & co), but not for art.

Art that leaves a mark is always weird/quirky/personal/deep/etc. The fact that a machine can replicate the result removes the most interesting part of the equation, the human part. It's like making your own bread vs. buying supermarket bread: the latter is cheaper and faster, and it might even taste better if you fucked yours up, but it's a completely different experience.


Not sure why this is downvoted, I think it’s exactly right. A huge part of what makes music feel meaningful is the parasocial relationship with the artist, and the cultural context the music captures and expresses.


That's... such a different way of relating to music!

Some of the most meaningful experiences I've had with music involved DJs whose names I didn't know, playing tracks produced by musicians whose names I didn't know and had no way to discover.


Didn’t say it’s the only way! But I think you would agree that most popular music is made by artists with a very prominent public persona, expressed in different ways depending on the genre and subculture the music appeals to. As a fan you’re not just listening to the music as it is; you’re interpreting it through your thoughts and feelings about the artist. That context can make the music feel more meaningful.


Several years ago I implemented some very basic rules defining distance metrics between two chords, then ran some multiobjective evolutionary algorithms to generate, say, a 16-bar progression while trying to minimize those distances between any two consecutive jumps. I added two or three other objective functions judging the progressions against my idea of structure (e.g., starting and ending on the same chord), and found the results very promising.
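
The gist, reconstructed from memory (the chord representation, the distance, and the GA parameters below are simplified stand-ins, not what I actually used):

    import random

    # Chords as pitch-class sets (diatonic triads in C, purely illustrative)
    CHORDS = {
        "C":  {0, 4, 7}, "Dm": {2, 5, 9}, "Em": {4, 7, 11},
        "F":  {5, 9, 0}, "G":  {7, 11, 2}, "Am": {9, 0, 4},
    }
    NAMES = list(CHORDS)

    def chord_distance(a, b):
        # Crude "how many notes change" metric: symmetric difference of sets
        return len(CHORDS[a] ^ CHORDS[b])

    def smoothness(prog):
        # Objective 1: total distance across consecutive jumps (minimize)
        return sum(chord_distance(x, y) for x, y in zip(prog, prog[1:]))

    def structure(prog):
        # Objective 2: penalize not ending on the starting chord (minimize)
        return 0 if prog[0] == prog[-1] else 1

    def fitness(prog):
        # Weighted sum standing in for a real multiobjective (Pareto) ranking
        return smoothness(prog) + 10 * structure(prog)

    def mutate(prog, rate=0.15):
        return [random.choice(NAMES) if random.random() < rate else c
                for c in prog]

    def crossover(a, b):
        cut = random.randrange(1, len(a))
        return a[:cut] + b[cut:]

    def evolve(bars=16, pop_size=200, generations=300):
        pop = [[random.choice(NAMES) for _ in range(bars)]
               for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness)
            elite = pop[:pop_size // 4]
            pop = elite + [mutate(crossover(random.choice(elite),
                                            random.choice(elite)))
                           for _ in range(pop_size - len(elite))]
        return min(pop, key=fitness)

    print(" ".join(evolve()))

The actual runs used proper multiobjective selection rather than the weighted sum above; that's just to keep the sketch short.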

With a sufficiently sophisticated rule system (which could be built from existing music), an AI should be able to optimize for tension building or storytelling quite easily. Of course, it will only optimize for the definition of tension building or storytelling that it understands, whether learned via statistical methods or specified explicitly by the programmer. In the latter case the programmer is just doing one of the things composers have always done, while in the former, the generated content is interesting, or not, in much the same way (IMHO) as transformer-based language generators like GPT-3.


Link?


Unfortunately I don't have the full working codebase anymore, although I think I have enough to recreate it. It was pretty rudimentary, and I've always intended to revisit the idea one day and flesh it out as at least a blog post or something (at which point I will submit it here). I only have a couple of the progressions, but they don't mean much on their own since I hand-picked them out of the result set based on personal taste rather than any formal criterion.


I'm curious about your distance metrics for the chords. This problem comes up when trying to constrain the size of a vocabulary of musical tokens.
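
For reference, the kind of thing I've been toying with is a minimal voice-leading distance over pitch classes (my own toy sketch, not assuming it resembles yours):

    from itertools import permutations

    def pc_interval(a, b):
        # Smallest semitone distance between two pitch classes (0-11)
        d = abs(a - b) % 12
        return min(d, 12 - d)

    def voice_leading_distance(chord_a, chord_b):
        # Assumes equal-sized chords: try every voice assignment,
        # keep the cheapest total movement in semitones
        return min(
            sum(pc_interval(a, b) for a, b in zip(chord_a, perm))
            for perm in permutations(chord_b)
        )

    print(voice_leading_distance((0, 4, 7), (7, 11, 2)))  # C -> G, prints 3

It's O(n!) in chord size, but chords are small enough that it hardly matters.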


Most real tracks don't have tension buildup or progression; judging music by that mostly just speaks to your preferences. As far as I could hear, the tracks were coherent, not just 4-second snippets glued together. That said, I don't think they were exceptional or anything.


Besides ambient music, what music doesn't have some sort of buildup of tension and subsequent release? Do you have any particular example tracks with no tension/release at all?

I feel like most of the music people commonly listen to has that; it's an essential part of what makes music feel "human".


I think maybe you two aren't on the same page. I know you're referring specifically to tension and release as a music-theory concept, which is indeed very common; even a single "tense" chord in a chorus resolving to the tonic is tension and release. I think the other person means "tension" in the sense of progression: a song building up to a crescendo, a dubstep drop, a metal breakdown, etc. (Those are also tension and release in music-theory terms, but I think they're speaking broadly, in layman's terms.)


True so far. But there are also better examples: BachBot was rather good, as were some of the MuseNet demos (https://openai.com/blog/musenet/).





