It's interesting to me that for critiques of AI, one of the major arguments is "...

snickerdoodle12 · 2025-07-17T12:16:17 1752754577

Many people, including myself, object to companies violating copyright on a massive scale without any consequences whatsoever while people like this, who cannot possibly have the same impact, get their lives ruined.

ronsor · 2025-07-18T02:14:09 1752804849

Well, I object to copyright in general, regardless of the parties involved. We should not promote regression in the name of fake "fairness."

gcau · 2025-07-17T12:34:36 1752755676

Which companies are violating copyright on a massive scale? And what impact? (a bigger, badder impact sounds implied by you)

beezlebroxxxxxx · 2025-07-17T12:40:57 1752756057

One example: basically all of the major AI players have used the Anna's Archive/LibGen database to unlawfully access millions of books.

awongh · 2025-07-17T12:39:37 1752755977

To be clear, scraping the entire internet so that ChatGPT knows what Mickey Mouse is may not be fair use of copyrighted material. Or, to be more specific, being able to generate images of Mickey Mouse may not be legal; that is, it's the ingestion of those images that gives the model the ability to generate images of copyrighted material. I guess the courts will decide that soon-ish?

awongh · 2025-07-17T12:32:18 1752755538

My main point was that if you are against AI scraping, are you also against this guy being able to post this video?

That's separate from the level of consequences for an AI company or this guy, for example if he were forced to simply take the video down or pay a small fine proportionate to the level of piracy he was encouraging.

snickerdoodle12 · 2025-07-17T12:49:04 1752756544

My personal opinion is that, since the laws haven't changed and society is still harshly punishing individuals for copyright infringement, all the companies that have downloaded e.g. Anna's Archive should be dismantled, or at the very least their executives should be jailed.

Maybe the laws should be changed, maybe not, but the fact is that they haven't been.

RIP Aaron Swartz.

awongh · 2025-07-17T15:42:11 1752766931

It's not as if individuals are the only ones who bear the consequences though- huge companies sue each other all the time over copyright.

I guess it's a function of the legal system at all levels that money buys you more access to justice- not sure if that's a copyright issue specifically.

snickerdoodle12 · 2025-07-17T18:30:29 1752777029

The consequences are not at all proportional. Individuals get their lives ruined, some even driven to suicide. Often they're not even profiting, at least not directly.

Companies shield the executives and workers breaking the law (and profiting from it!) and barely get fined, if at all. Often they're just told to cease & desist.

beezlebroxxxxxx · 2025-07-17T12:37:02 1752755822

> Ultimately I hope AI will force us to decide on an updated paradigm of who owns ideas and it won't be a case of me receiving a cease and desist letter if I type a ChatGPT prompt that includes Mickey Mouse or "Miyazaki".

The principle of copyright is fine for artists. AI and ChatGPT aren't fundamentally changing the underlying logic: artists should have their intellectual property protected and be able to receive compensation for their work free from getting ripped off. The problem is stretching copyright to absurd timelines when the underlying logic also recognizes that novel ideas emerge out of the public commons and ultimately return to them after a certain amount of time. 7 or 8 years is reasonable. 10 tops. Decades or even hundreds of years is absurd.

ethagnawl · 2025-07-17T13:42:23 1752759743

> Ultimately I hope AI will force us to decide on an updated paradigm of who owns ideas and it won't be a case of me receiving a cease and desist letter if I type a ChatGPT prompt that includes Mickey Mouse or "Miyazaki".

I've been thinking a lot about this lately since I've had some ... questionable images generated by Gemini. If it outputs infringing material is that on me, them, both of us? Does it depend on my prompt/context, what I do with the output, etc.? My instinct (in opposition to your comment about C&Ds) says it's on them because they're charging money for the service and it's _clearly_ been trained on copyrighted material. I think this question and related ones are going to be answered fairly quickly, especially because of how egregious some of the output I've seen is.

I don't want to get into specifics right now because, IMHO, this particular "trick" is an exploit, as it's reproducible and systemic. Google has a bug bounty for Gemini but this scenario (i.e. output containing copyrighted material) is "out of scope" and they request that you submit individual tickets for every infringing instance. It's not clear to me if end users are supposed to do that or copyright holders but that's not a scalable or practical solution to a systemic problem. I would prefer to be responsible and be compensated for my trouble but I may wind up writing a blog post or something about this if I can't get their attention.

cubefox · 2025-07-17T14:45:33 1752763533

The case is arguably even more clear cut: copyright protects more-or-less exact copies. So copying old video games is not allowed. However, making something that is merely similar, but clearly different from the original, is not a copyright violation. For example, you are allowed to make a "Zelda clone" if you only copy broad game principles and vibes, but not specific art or level designs.

Generative AI mostly works by copying fuzzy styles instead of specific texts or images. There are some exceptions where models actually memorize specific material, but these seem to be relatively rare and probably require that the piece in question occurs frequently in the training data.

So in general, training on copyrighted material is probably legal as long as the model is not able to exactly reproduce the training data, while copying video game ROMs is clearly always illegal.

Of course, whether these things are morally okay or not is a different question...

Edit: Of course, to train on copyrighted material, you have to download it first. If you don't pay for the copies, this is arguably still illegal, even if the resulting model doesn't distribute any copies! (An exception might be content that is directly embedded in websites, because copying websites into the browser cache is allowed, even if they are under copyright protection.)

awongh · 2025-07-17T15:39:35 1752766775

> The case is arguably even more clear cut: copyright protects more-or-less exact copies.

What about songwriting? Or even music performance- me performing a song doesn't produce a more or less exact copy.

cubefox · 2025-07-17T17:44:35 1752774275

For covers, the recording is not considered the same as the original, but the underlying composition (notes and lyrics) is. Both composition and recording are separately covered by copyright.

awongh · 2025-07-17T19:52:07 1752781927

> making something that is merely similar, but clearly different from the original, is not a copyright violation

Is a cover a copy? For music, it's not like I'm selling you sheet music- I'm still outputting something you listen to that won't be the same as the original.