Hacker News
An unwilling illustrator found herself turned into an AI model (waxy.org)
726 points by ghuntley on Nov 1, 2022 | 751 comments



I'm thinking a little bit of empathy doesn't hurt. Reason from Hollie's point of view. She didn't ask for this and was working on cool stuff:

https://holliemengert.com/

Next, somebody grabs her work (copyrighted by the clients she works for), without permission. Then goes on to try and create an AI version of her style. When confronted, the guy's like: "meh, ah well".

Doesn't matter if it's legal or not, it's careless and plain rude. Meanwhile, Hollie is quite cool-headed and reasonable about it. Not aggressive, not threatening to sue, just expressing civilized dislike, which is as reasonable as it gets.

Next, she gets to see her name on the orange site, reading things like "style is bad and too generic", a wide series of cold-hearted legal arguments and "get out of the way of progress".

How wonderful. Maybe consider that there's a human being on the other end? Here she is:

https://www.youtube.com/watch?v=XWiwZLJVwi4

A kind and creative soul, who apparently is now worth 2 hours of GPU time.

I too believe AI art is inevitable and cannot be stopped at this point. Doesn't mean we have to be so ruthless about it.


There's nothing inevitable about it. Laws exist to protect people.

It's a bit like saying we can't stop music piracy, now that Napster exists.

Remember the naive rallying cry among those who thought everyone should have the right to all music, without any compensation for the artist?

"Information wants to be free!" (https://www.theguardian.com/music/2013/feb/24/napster-music-...)

Napster was a peer-to-peer file sharing application. It originally launched on June 1, 1999, with an emphasis on digital audio file distribution. ... It ceased operations in 2001 after losing a wave of lawsuits and filed for bankruptcy in June 2002.

In that world, use of the output of systems like Copilot or Stable Diffusion becomes a violation of copyright.

The weight tensors are illegal to possess, just like it's illegal to possess leaked Intel source code. The weights are like distilled intellectual property. You're distributing an enormous body of other people's work, to enable derivative work without attribution? Huge harm to society, make it illegal.

If you use the art in your product, on your website, etc., you risk legal action. Just like if I publish your album on my website. Illegal.

The companies that train these systems can't distribute them without risking legal action. So they won't do it. It's expensive to train these models. When it's illegal, the criminals will have to pay for the GPU time.

It will always exist in the black-market underground, but the civilized world makes it illegal.

That's where this is going, I hope. Best case scenario.


>It's a bit like saying we can't stop music piracy, now that Napster exists.

We effectively didn't, though, at least as far as artists are concerned. Streaming revenue is abysmal for artists. https://www.latimes.com/entertainment-arts/music/story/2021-...

Piracy made music acquisition too convenient for the consumers, so an alternative had to be created - but this alternative really only helps the labels, and not the people actually making the music.

It's not clear to me that the streaming world is better for artists than the Napster one. At least anyone wanting to legally listen to music then would buy albums, rather than just having a Spotify subscription. Not that royalties on physical CDs were great, but my understanding is they did work out better for most artists than we see with streaming royalties.

I don't know what a potential analogy would be here with stable diffusion or dall-e or whatever, but I don't know that people were able to immediately identify the potential downsides with "winning" against piracy, either.


> We effectively didn't, though, at least as far as artists are concerned. Streaming revenue is abysmal for artists.

But that's not Napster's fault. Spotify pays a lot of money for each play of a song; of that, the artist only sees a tiny percentage, due to music middlemen trying to relive the '90s.


Sure.

And that's why I buy music off Bandcamp whenever I can, and thankfully most of the music I listen to is on smaller labels, so usually even more money goes to artists.

I'm just saying that the solutions that pop up once you "win" are not necessarily ones that provide a win for the people you are trying to protect.


Spotify pays a lot of money for playing a song? Out of a $10/month fee, that clearly can't be true.

If P2P was somehow impossible, we would still be buying DRM'd tracks for $1 each on iTunes. Spotify would, at best, cost $80/month.


> Spotify pays a lot of money for playing a song? Out of a $10/month fee, that clearly can't be true.

Spotify takes in €9.5 billion and pays out 70% of that as fees to record companies, who take their slice before the artist sees anything.

I would consider that a lot of money, even if you disagree.


I distribute my music through CDBaby, and looking at transaction history I've been getting $3.65 per thousand streams. That's not nothing, and is much higher than I'd get from radio.

Spotify is taking in a lot of money and paying 70% to labels, which adds up to a lot of money for artists depending on their agreement with their label/distributor. But the per stream rate is still very low because there are trillions of songs streamed annually.
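The per-stream figures quoted above can be sanity-checked with a bit of arithmetic. This is a rough sketch using the $3.65-per-thousand rate reported in this thread; real payouts vary by country, subscription tier, and label deal, and the $10 album comparison is a hypothetical, not an industry figure:

```python
# Rough check of the per-stream rate quoted above.
# $3.65 per thousand streams, as reported via CDBaby; real rates vary.
payout_per_thousand = 3.65
per_stream = payout_per_thousand / 1000  # dollars per single stream

# Streams needed to match the artist-side revenue of one hypothetical
# $10 album sale at that rate.
streams_per_album = 10 / per_stream

print(round(per_stream, 5))      # 0.00365
print(round(streams_per_album))  # 2740
```

So at this rate, roughly 2,700 streams of a track equal one album sale, which is why "per 1000 streams" numbers look so small next to per-unit sales.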


I'm sorry, I don't want to sound dense but I'm not clear what your point is.

Are you saying that CD Baby is a better distribution technique than standard labels because you get good margins? I didn't know CD Baby until I just looked them up but they appear to be a distributor, so your $3.65 metric is still being paid by Spotify/Amazon/Apple. Please correct me if I'm wrong but that is much higher than the normal published numbers by 10-100x.

Is this an RIAA moment where labels are trying to make other people look like jerks rather than accepting what they do, or are people using the "per 1000 streams" metric poorly because it will always look worse on successful platforms?


> that is much higher than the normal published numbers by 10-100x.

$3.65 per thousand is at the low end of what I see elsewhere. For example https://twostorymelody.com/spotify-pay-per-stream/ has $3-$5 per thousand.

> I'm not clear what your point is.

I think the distribution of streaming revenues is generally reasonably fair, and people who say things like "Spotify pays artists nothing" are confused about either (a) how much money there is to divide up or (b) where it is going.


You are right on those numbers; I miscounted when I looked at them.

I'm glad that there is someone creative on here that can educate and confirm.


The point is that a third of a cent a play isn't a lot. Argue against that claim if you like, but engage with it instead of using weird rhetoric to avoid it.


I don't think that was the point that anyone was making.

In fact Jefftk has stated the exact opposite to your position.


Anything is a lot of money if you offer no basis for comparison. A million dollars seems like a lot until you say that it's what you paid to build a downtown skyscraper.


The math here makes a flawed presumption: that you play a song only once after buying it from iTunes.

I obviously don't know your listening habits, so for you that may be the case. But people will listen to a single song far more often than once. Otherwise there would have had to be 77,946,027 new users on Spotify [1] last month, all playing Ed Sheeran once. Clearly nonsense.

If you play every $1 iTunes song eight times on Spotify, the costs (and therefore fees) will be on par: $10/month.

[1] https://open.spotify.com/artist/6eUKZXaKkcviH0Ku9w2n3V


70% of Spotify's revenue goes to the artists (content owners, to be precise); I doubt that was the case when you bought a CD (I have found numbers closer to 40%). It is not abysmal; revenue has never seemed better.


> 70% percent of Spotify revenue goes to the artists (content owners to be precise)

That's a pretty important difference. If it is going to a record company that pays a pittance to artists, no wonder "streaming revenue is bad".


From the article I linked:

>The actual recording artists? “They’re keeping anywhere between 5% and a quarter.”

That is a far cry from 70%


That is not Spotify's fault. It is a deal made by artists that splits revenue between them and the producer, record label, etc.

70% of Spotify's revenue is to be split that way compared to ~40% of revenue of CD sales.


It doesn't matter if it's Spotify's fault - I'm not saying they are the evil empire. I am saying that streaming is how we "beat" piracy, and it was not a panacea for the people it was supposed to protect - unless we consider the labels the people it was supposed to protect.

You're also comparing apples to oranges on revenue. 40% of the revenue of a $10-$18 CD sale is a lot different than 70% of a $10/mo subscription being split out over however many artists someone might listen to on spotify.

https://www.nytimes.com/2021/05/07/arts/music/streaming-musi...

Lots of artists talk about how they simply can't make a living off streaming royalties - artists that were able to do so in the era of album sales. Obviously any sort of comfortable living requires merch sales and touring.
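To make the apples-to-oranges point concrete, here's a back-of-envelope sketch. Every figure here is an assumption chosen for illustration (the CD price, the payout percentages, the number of artists a subscriber streams), not real industry data:

```python
# Hypothetical comparison: one CD sale vs. one month of streaming.
# All numbers below are illustrative assumptions, not real data.
cd_price = 15.00                  # a CD somewhere in the $10-$18 range
cd_artist_side = cd_price * 0.40  # ~40% of the sale reaches the artist side

sub_fee = 10.00                   # monthly streaming subscription
payout_pool = sub_fee * 0.70      # ~70% of revenue is paid out
artists_listened = 50             # assumed artists one subscriber streams
per_artist = payout_pool / artists_listened

print(round(cd_artist_side, 2))   # 6.0
print(round(per_artist, 2))       # 0.14
```

Under these assumptions, the larger percentage of the smaller, heavily split pot still comes out far behind a single album sale for any one artist.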


Comparing artists that could make record sales earlier to all current streaming artists is comparing apples to oranges.

I agree that Spotify's revenue split is not perfect (it is in fact worse than what you described), but it is still much fairer than record sales. I'd consider having a tape/CD sold in physical shops (i.e. not just at concerts) to have been a success on its own in the '90s.

Now every artist can publish their work on Spotify and start to earn money, possibly getting noticed through it. It is much more feasible to not be part of a record label now.


>unless we consider the labels the people it was supposed to protect

You've hit the nail on the head there. The same record companies that got up to very dodgy copy protection schemes [1].

[1] https://en.wikipedia.org/wiki/Sony_BMG_copy_protection_rootk...


But even taking those %s at face value: it's 40% of a much larger amount of money (a whole CD versus some streams).


If we treat 1999 revenue as 100% CD sales and 2021 as 100% streaming, we get something like this (in inflation-adjusted $bn):

(1999) 40% of 23.7 = 9.48

(2021) 70% of 14.9 = 10.43

So while gaming got much more popular during that time, music still brings a lot of revenue to the artists, thanks to streaming.

(1) https://www.statista.com/chart/17244/us-music-revenue-by-for...
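The two lines of arithmetic above check out, taking the comment's inflation-adjusted figures at face value:

```python
# Verifying the revenue comparison above (inflation-adjusted $bn,
# figures as quoted in the comment and its Statista source).
cd_share_1999 = 23.7 * 0.40         # 40% of 1999 revenue (CD era)
streaming_share_2021 = 14.9 * 0.70  # 70% of 2021 revenue (streaming era)

print(round(cd_share_1999, 2))         # 9.48
print(round(streaming_share_2021, 2))  # 10.43
```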


> The weights are like distilled intellectual property. You're distributing an enormous body of other people's work, to enable derivative work without attribution? Huge harm to society, make it illegal.

The thing is that you're distributing only the instructions for making other peoples' work. There are art books and articles explaining the styles of certain artists and what techniques they use to achieve them; you could probably recreate "the Mona Lisa but in the style of Marc Chagall with cool glasses on" with real paint if you had previously stared at both the Mona Lisa and Marc's art for hours at a time. Are you infringing upon either of their copyrights by combining them? Probably not. But if you just recreated the Mona Lisa after having stared at it for hours, and it turned out nearly identical, then it would be. So where is the line?


Well, think: what if you took the left side of the Mona Lisa and combined it with Van Gogh's Starry Night? Would that be OK? Of course not.

But if you took 100 paintings from 100 artists and clipped all into small jigsaw pieces then recombined them randomly to create a new picture, that would probably not be considered "derived art".

What matters is how much creativity you put into the process. Do the creative aesthetics of the work derive from your efforts, or from the existing author's art?

But if I took 100 paintings from a single artist and combined them all into new works of art, that would probably be copyright infringement in my view, and is what seems to be happening here.

Consider the history of trademark lawsuits. You can be sued if you create something that somehow resembles an existing trademark, say use the exact same color as Coca-Cola for something similar.

So I think the guiding principle is or should be whether what you create can be confused with the work of some other highly original artist. It doesn't matter if you painted it all, if it looks similar enough that people could confuse it with the original artist's work, you are infringing.


>But if I took 100 paintings from a single artist and combined them all into new works of art, that would probably be copyright infringement in my view

What makes you say that? A work is considered derivative with regards to another work, not to an author. If we take your jigsaw example and accept that the final result would not be derivative of any of the works that contributed each individual piece, and then pretend as if in actuality all the sources were from the same artist, what would change that would suddenly make the result derivative from some or all of the original works?


You are probably right there, I was just assuming that courts would consider it a factor if all pieces came from the same author.

As I understand it copyright infringement is not just a pure "crime" in itself. It is about the financial harm caused. I think the word they use is "tort". It is always about violating somebody else's right(s).

Oracle sues Google for Java copyright infringement. It is not just about "Hey here's a copyright infringement ... put Google in jail".

It is about "We lost a billion dollars, because of your infringement". So Oracle claims not just a single copyright infringement of a single work, but billion dollars worth of infringement. It is not black and white, it is quantitative. How much there is of it determines the seriousness of the violation.

I'm not a lawyer of course, don't take my advice.


That's kind of hilarious, because ALL artists copy/imitate other artists' styles during their learning process before settling into a style all of their own.

The number of people who learned to draw by redrawing/imitating Disney stuff is countless.

The thing people aren't seeing with AI art is that it's the same as mass manufacturing, compare: buying a mass produced knife vs buying a handmade artisanal knife. I think exactly the same thing applies; generating machine made art in a given style vs buying/commissioning an artist.

I think taking someone's work to train an AI is fine, as long as you obtained legal access to the material in the first place. There is no copyright for art styles; if there were, we would have no artists, because even this artist in question would've started out by imitating other artists' styles.

As an update after taking a closer look at the article rather than the discussion: her art style is 100% inspired by Disney (and a few others) and there is nothing wrong with that.


It seems very strange to use the existing rules of copyright as a defense of the use of this new technology.

The concept of copyright was created in response to the development of the printing press. It was a reaction to a disruptive technology. It was possible to laboriously copy written works before the printing press existed, but the new technology made it incomparably cheaper and faster to do so, and societies reacted by creating new protections for content creators.

We are now at the threshold of a new disruptive technology that is likely to bring about profound economic changes in the arts. It makes no sense to me to take the old rules and try to use them to justify this disruptive technology, when the old rules were initially created in response to a different disruptive technology.

It seems uncontroversial that this new generative technology is built on the backs of human artists. It only functions by drawing from their works. Is it so inconceivable that we might need a totally new set of protections for those human artists?


It is true that generative ai technology is often trained on human artists' work. But how is that different from human artists taking inspiration/learning and adapting the style of other human artists? I suppose the argument is that humans should get special treatment in the copyright domain?

I wonder if it is possible to get a machine to learn a style without input. Likely a "room full of typewriter monkeys searching for Shakespeare" scenario, but a human would still be involved in the loop to "confirm" the desired style - which is technically a creative decision in itself.

Which I guess shows the true nature: machines could generate stuff for machines without any external input. But we built them, so we've tasked machines to generate stuff for humans. And therein lies the answer I guess.

I 100% believe machines can be creative. Creativity isn't something unique to humans or to living things. For me it's a concept.


>It is true that generative ai technology is often trained on human artists' work. But how is that different from human artists taking inspiration/learning and adapting the style of other human artists?

It's different in the same way that making a copy of a book by hand, where it might take weeks or months to make a single copy, is different than making a copy with a printing press in a few minutes. It was the technological development of the latter process which led to the concept of copyright being created in the first place.

There is a fundamental difference between a human being taking years to acquire artistic skill, then using that skill to create individual works inspired by other artists, vs. using a generative AI system to "learn" a particular artist's style in minutes or hours, then create infinite iterations of that style nearly instantly.

There's a tendency for people in tech to search out broad, overarching, universal principles that can be applied to all behavior. But sometimes, simply being able to do something tens of thousands of times faster or tens of thousands of times more cheaply is enough of a difference to require new rules, new moral frameworks, new modes of thinking.

"The computer is just doing what a human could do" simply isn't a compelling enough argument, any more than "the printing press is just doing what a scribe could do" would be.


You raise some interesting points.

> The concept of copyright was created in response to the development of the printing press. It was a reaction to a disruptive technology.

Absolutely, one of the major factors was that it allowed individuals to benefit directly off someone else's work without having made substantial changes. The protection was intended for the original work itself and for derivatives too close to the original content.

> We are now at the threshold of a new disruptive technology that is likely to bring about profound economic changes in the arts.

This already happened with photography taking over portraits and tracing: the response wasn't to outright ban it, or really prevent it either. When technology made photography more accessible, to the point it was going to be disruptive to professionals in the field, the response again wasn't to outright ban it, or really prevent it either. This is despite the fact that it has literally destroyed a significant number of jobs to achieve conveniences that we now all enjoy.

I feel like the AI issue is a parallel to the above situation. People are now given better tools to generate/create art themselves, and as long as the output isn't a blatant copy, a derivative too close to the original content, it should probably be governed by similar rules, in my opinion.

> It only functions by drawing from their works.

You can train AI models by taking photos and then vectorizing/toonifying/paintifying them, depending on what you're aiming for, with various widely available non-AI filters. Stylistic ideas are possible to implement in these filters; I have some experience, having done so with plugins for processing my photos. So that isn't even a strict requirement for generation. Even in the case where you ban AI from learning from human-made art (or where artists would allow it), there are ways to train the models regardless and achieve a similar result.

There is another problem that hasn't been discussed: enforcement is going to be a very interesting challenge, considering how international borders for information/data are virtually non-existent now and it's becoming relatively difficult to even distinguish whether a piece was generated by an AI or by a person. The economic changes are likely coming regardless, from my point of view. It's going to be either people using it illegally if banned, or people using it legally if it isn't - I just do not see this changing either way.


> Taking someone's work to train an AI is fine, as long as you obtained legal access to the material

That is the big question here: what kind of legal access does Copilot etc. have to the training materials? When they use the training material they must copy it to their computer. According to most open source licenses, they then also have to retain the copyright notice wherever they copy it. But now it seems that Copilot skips that part. It copies everything else but not the copyright notice.


You can trademark a style, but you can't copyright it. IANAL, but that is what my corporate IP compliance training tells me. As long as I am regurgitating non-legal advice, I suspect half Mona Lisa, half Starry Night might be considered a transformative work. If a single human artist painted both perfectly onto the same canvas it could be construed as a statement about changes in the culture between the two contexts, so if you do it with Photoshop, it might very well get ruled the same way.

As for the morality of it? I don't like the idea of copilot replacing me, but I don't think it was wrong to make it. I'll eventually have to retrain myself to retrain copilot models I suppose. Or we'll have to decide to care for each other as we all go unemployed.


If you gave Copilot a license to copy your code, part of that license is that they have to include your license and copyright notice in every derived work they make.

And Copilot, it doesn't just copy "style", it copies code.


Andy Warhol did something very similar to this with Avril Harrison's computer illustration of Venus. He just used the clone tool to add a third eye then called the result his own work. It even still had her signature on it.


>But if I took 100 paintings from a single artist and combined them all into new works of art, that would probably be copyright infringement in my view, and is what seems to be happening here.

If I look at 100 paintings by Pablo Picasso and then paint a new one in his style, did I commit copyright infringement?


That's a good question. My immediate answer would tend to be no.

But consider you produced a comic-book about Mickey Mouse where the character Mickey Mouse looked exactly like the one in the several Disney books and movies. You would probably get sued. Right?


Trying to take a strong form of OP's position, one obvious line would be the automation and mass-reproduction aspects, in addition to how much unique creativity you specifically added to the process.

It gets hairier, of course, because what happens to experts in the field who are able to do that and then just use this as a boosting tool? Still, I don't think copyright law has many clear bright lines, so much as guidance that the courts just try to muddle through as best they can. Certainly one can make an argument that just stealing an artist's style like this could be considered a copyright violation, just like sampling even a few seconds of someone else's track can be in music.

Again, not saying these are a net good for society, but clearly existing copyright laws do try to take this tack. I think the things working against her are that a) it's early days, so laws haven't caught up, and b) US laws generally favor corporate interests over individuals, so she might never get any relief, even if deeper pockets start to protect themselves as this becomes a bigger problem for them.


The process for a human to copy her style in an original work would be similar, and legal. I don't think it's a good idea to prevent the automation of human-capable tasks, because it's anticompetitive: it protects an industry (albeit a small one of starving artists) at the cost of consumers.

The harms to artists are obvious and immediate, but limited and small. The benefits of letting an ML model train in the same way as a human are vague and in the future, but might be capable of massive transformative changes in the way we work. I think it's right to be careful about "protecting" a limited number of people at the cost of enormous future potential.


Enormous future potential for derivative work gets created. Enormous future potential for original work gets erased.

Why would anyone in their right mind choose to put effort into creating original art if there is "one easy trick" to get around copyright by simply turning their art into a model that can be used to churn out things they could have produced?


>Why would anyone in their right mind choose to put effort into creating original art if there is "one easy trick" to get around copyright by simply turning their art into a model that can be used to churn out things they could have produced?

Why do some artists still paint on canvas instead of using photoshop or krita, where you can easily ctrl+z any mistake, never need to mix any paint, can move layers up and down, etc. etc. etc.?

Why do some photographers still shoot anything smaller than large format with film when medium format and full frame digital cameras exist?

Why do some people still use analog synths when Native Instrument's Komplete exists?

Why do some guitarists still use amplifiers when they could use an AxeFx/Kemper/Neural DSP?

Most of those options are also more expensive, on top of being more inefficient/difficult/generally burdensome, yet people still do them.

People do a lot of things that are not necessarily the most efficient way to do something. They like the minor differences, or enjoy the process, or many other things.

I also don't see how SD and similar get around copyright. Even if training these models on copyrighted images is legal, that doesn't mean that the output they produce necessarily is. It doesn't matter how I create a depiction of Iron Man, be it SD or a paintbrush and canvas, I do not have the rights to reproduce him. And for things that can't be protected by copyright, such as style, I am not hindered by it no matter if it is created with SD or colored pencils on a sketchpad.


If you think about future business cases, my guess is if I'm in the content creation business I'd hire some artist to create inputs for my ML model to train. And I'd be the only one with access to these inputs (in the beginning). Or think about it the other way around. If I'm an artist I buy a commodity AI-art-generation-engine and feed it with my work and I can create infinite items in my own style for (digital) sale.

It'll all be about time to market and brand building. I could even see a world where the originals of the input creator would sell quite well as classic modern artworks. Imagine for a second a world where 3D assets get created this way. I'm pretty sure fans of popular games would shell out good money for originals from "the artist behind the Witcher 7 asset engine" if the trajectory of human development goes as I see it going.

Also...artists are going to create art no matter if it makes financial sense or not. In fact I'd argue that's the difference between art and design :P


> if I'm in the content creation business I'd hire some artist to create inputs for my ML model to train

That's a reasonable way to go about things. The problem is that right now the status quo is that you just take artists' work without their consent and use it to train your model.


Because they want to do it? The motivation for creating art isn't purely financial.

Plus, we humans all built our skills and works on the shoulder of giants. Artworks and cultural artifacts are never created in a vacuum. Maybe it's time to acknowledge that.


> The motivation for creating art isn't purely financial.

Yeah, but getting financial compensation can certainly help. The opportunity cost of putting bread on the table means that the output of most professional artists today would drop significantly, if they needed to pick up another profession (especially full time).

> Plus, we humans all built our skills and works on the shoulder of giants. Artworks and cultural artifacts are never created in a vacuum. Maybe it's time to acknowledge that.

You're acting as if artists don't already acknowledge and understand this. https://www.muddycolors.com/2017/12/some-thoughts-on-master-....


Financial compensation does help. But certain industries become marginalised or relegated to history given enough time. People then keep them alive because they choose to.

Where are the tears for horseback couriers? Or blacksmiths? Or thatchers?


That just proves the original poster's point, that the potential for future original work will be erased.

How often are we seeing innovative advancements in the field of horseback couriers, blacksmithing and thatching nowadays?


I guess you didn't get my point which was: those industries died apart from specialists keeping them alive today and that's just the nature of the world.

The same thing will happen to human generated creative content whereby it becomes something that people are involved in because they want to be, not because it's a necessity/it's the only way to do it.

Yes the potential for future art work done by a human today will be erased in the future when it can be performed by a machine, but that has always happened & yet somehow it's surprising to people.

An artist being indignant towards machine-generated art while using mass-produced tools, eating food farmed by mechanised equipment, wearing clothing woven on automatic looms, taking a digital photo themselves instead of hiring a portrait painter, owning a car instead of a horse that supported many sub-industries, and sending emails instead of letters is just hypocrisy.

Technology has always brought us forward and these new AI powered tools will assist us as the tools we produce have always assisted our species. And as always those who refuse to change will eventually be left behind.

And yes, if this was happening to the industry I'm in I would currently be going through the 5 stages of grief about it, too. But then I'd just have to change up what I'm doing to reflect the changing times. As she herself said, it still doesn't capture what she puts into her art & so there is still that avenue to pursue.


That's begging the question. I don't agree that a model is one easy trick to get around copyright, any more than paying another animator to draw in the same style would be.

In terms of creating original art, I think that in ten to twenty years artists will see models as another tool for creative expression; one that lets an individual artist be more productive but can produce a generic feel, like thin-line animation or sticking difference clouds everywhere or using a palette of pre-made drag and drop body parts.


People will still put effort into creating original art in styles that don't yet exist.


> The process for a human to copy her style in an original work would be similar

It wouldn't be similar at all. It takes years to get skills good enough to even copy stuff like that. With AI, a person who never did any art in their life can get hundreds of copies in a few hours.


The engineer stumbled onto the least sympathetic, least transformative, most obnoxious use case for the AI. He was trading on the artist's name, confusing people and even arguably devaluing her work by reproducing it in a clumsy and low-value way. Folks in the industry would do better to acknowledge, as he did, that this was wrong and establish standards so everyone knows this is not considered a proper practice.


Taking longer doesn't mean it's dissimilar.


Here's why I believe it is inevitable...

It's already out there. On people's local computers and soon their mobile devices. People are tinkering with it at warp speed. This point addresses that it technically cannot be stopped.

I don't expect it to be possible to detect that the art is AI-generated. This becomes further impossible when using a personal input image as well as many follow-up edits or composite works. It blends into normal image creation. The only way to prove that it's not AI-generated is to record in-progress "human art" as is sometimes done in art contests, but this isn't reasonable to legally require of every single piece of art to be created.

You can't enforce what you can't detect.


As a society we have gone to great pains to protect the software developer's income and job: source code was given both copyright and patent protection. No other industry gets both protections at once.

Now you seek to deny others such protection while taking advantage of it yourself.


Let's not kid ourselves. Those protections exist for the benefit of corporations, not software developers. If those corporations could have robots write the software and copyright that software and patent it, they absolutely would.


Ironically, the high salaries the software industry is able to pay exist precisely because of the copyright protection that prevents the value of software from being diluted by rampant copying.

This is also why open source projects often struggle with funding, and why many databases (among other OSS projects) are moving towards stricter licenses such as the AGPL.


I don't think so. The highest paying companies don't distribute software, they provide access to a remote service. Even if copyright didn't exist, you couldn't copy the Google executable.


> Even if copyright didn't exist, you couldn't copy the Google executable.

If copyright did not exist, you could take it as a Google employee and start your own Google without going to jail.


Pretty sure you could sue them using other means, such as through contract law, if they signed an appropriate contract. You really don't need copyright if you aren't broadly distributing information.


If Guy A copied your code, and I got it off him, you can sue him for violation of contract, but you can't stop me, a third party.

Your ability to sue him will be limited: he can't go to jail, he can declare bankruptcy, and you have to be specific about what is protected. "Idea" is a vague term that can't be protected.


You're making my point though. The fact that Google is successful, in part, is because of the fact that you can't copy their trade secrets and methods; which is one reason no solid competitor has come up to challenge Google.

(There are infrastructure challenges as well, but this thread is about intellectual property.)


My understanding is that trade secrets are distinct from patents. For patents, you tell everyone how to do it but they're not allowed to for 20 years. For trade secrets, you don't tell anyone how to do it, but if someone else figures it out for themselves, it's fair game. Most of Google's search IP is protected as trade secrets rather than patent/copyright, I believe.


Your point is that you can't copy Google's secrets because of copyright, and therefore copyright is valuable to its employees. I'm saying that the reason you can't copy that information is different from copyright.


Many (most?) software developers believe that software shouldn't be patentable at all. Many also believe that copyright terms should be way shorter.


They only believe this because this won't (immediately) affect their material prospects.


Or maybe they genuinely believe that the content they make today shouldn't be copyrighted for the next 100 years? Lifetime of the author +70 years is a very long time.


Given a binary choice between unemployability and extending patent protection to (even) snippets of code, I am quite confident that 90+% of salaried developers today will choose the latter.


That's not the argument we are discussing; the question is whether these protections should exist at all. We are talking about denying artists all protection, so it's only fair to confront developers with the same dilemma.

Whether they last 3 years or 300 is a finer point, and is only worth discussing after the necessity and legitimacy of such protections is established.


What a load of nonsense. The last 40 years have made the barrier to entry for software development lower than it's ever been.

We have applications like Unity for game development, low-code solutions that let you generate a CRUD dashboard for a database in a few clicks, etc.

As a developer with an actual degree in comp sci, I can guarantee I'd be a helluva lot better paid if everyone had to do their software development in low-level C.


That wasn't to protect software developers; it was to protect companies and corporations. Twisting law and politics to their advantage is what companies under capitalism have always tried to do.

Software developers aren't protected at all; we are simply in demand for the time being. There will probably come a time in the far-flung future when our jobs are phased out, too.


It doesn't matter what any of us think. The genie is out of the bottle and cannot be put back, because unlike pirating existing media, this new technology pirates the entire style.

Visual arts as a career will be dead soon. Visual arts as a hobby will live on.


>Visual arts as a career will be dead soon

Put a fine of 10% of annual turnover if a company cannot prove that images used in its product/ads are human generated. Make payment to visual artists part of the process of proving it. Boom, visual arts as a career saved.

It's one thing to say we shouldn't do it, it's quite another to say we cannot.


That's a profoundly stupid idea. There are low code solutions that generate fully compilable pieces of programs and applications, so let's make sure we ban those.

Plenty of music is procedurally generated so we gotta make sure we ban those as well.


What's profoundly stupid is to assume every piece of technology is good so we should do nothing about its proliferation. Yes, we should ban low-code tools based on deep learning over "fairly used" (not) datasets and we should ban the same for music, writing, whatever. AI bros can go cry in the corner, I don't care.


Think of it in a non-monetary way, ignoring job security for a moment. Why would anyone (artists, programmers) spend their time doing something a machine can do? It would be a terrible way to spend one's life. Perhaps the datasets can be licensed (if that's the sticking point) and the AI embraced?


>Why would anyone (artists/programmers) spend their time doing something a machine can do

Why do humans play chess after 1997? Why do they play checkers after the 70s?

AI capability is destroying human enjoyment of activities because it also destroys the economic rationale for engaging in them and/or allows other humans to cheat.

The obvious conclusion of this position is that we should all just kill ourselves if strong AI ever starts to exist. No, thank you; I'd rather do everything possible to prevent it from being created.


> The obvious conclusion of this position is that we should all just kill ourselves if strong AI ever starts to exist.

Indeed, that is the logical conclusion. I expect some of the people advocating this line of logic to follow through.


It's the difference between a bus and hiking: With a bus, you can arrive at lots of places fast but you will never experience the place as the hiker does.


With the amount of power the copyright lobby has, who knows, it might be true. However, advertising is not the only industry where the visual arts will be affected: gaming, comics, animation, VR, movies, etc. will be affected as well. And regulation isn't going to keep up across all countries.

Even if we assume the above rule is made, there is no way to prove it, because even if a human does it, he might still use the help of Photoshop etc., which are currently planning to integrate such tools. The most favorable outcome I see is for famous/competent artists to license out their style to these generation companies, which then train their models on it (which sounds win-win, but won't be as profitable as it is today).


>Even if we assume the above rule is made, there is no way to prove it because even if a human does it, he might still use the help of photoshop etc which are planning to integrate such tools currently.

It's unclear to what extent tool producers will support AI in this context at all. Also, this is a problem for safeguarding the integrity of the artists' product, not for safeguarding their income.


The companies that pay artists' salaries won't be willing to secretly break the law to save money.

Microsoft won't legally be able to continue its abuse of GitHub. Copilot will be dead. OpenAI and Stability will not legally be able to profit from large-scale intellectual property theft. All these violations will end.

These are the most significant digits. The residual amateur piracy doesn't matter. It doesn't matter if random guy gets some leet neural net warez and uses it to make his desktop wallpaper.


First, you didn't address my point that you cannot detect AI output.

I already have Stability running on my PC. I generate an image with it. How do you know the output comes from Stability? Answer that, please.

Second, a worldwide draconian ban on AI image training and generation just isn't going to happen. Very few legal things are coordinated worldwide, and copyright law is incredibly low on the list.

Even training without consent can be addressed. Google trained some of their AI from Google Photos. Which they made free for unlimited use so that us fools would produce billions of images, accept the terms we don't even read and voila: AI legally trained.


Just because you can buy a knife to kill someone discreetly, doesn't mean that we should just give up on the idea of making murder illegal.

Of course "killing someone" and "copying someone's artwork" are at different levels, however, I'm sure artists would beg to differ.


I don't think there is any work on that yet, but if the model is known it should be possible to derive the probability that a particular image is the output from it.


I've produced over 5,000 images on my local SD install. Currently, there are many dead giveaways if you produced art with it, specifically around hands, feet, holding things, and pupil directions. Of course these things will get better with time, but currently there are many tells that expose generative art.


It's inevitable that someone comes up with methods to detect generated images, since a lot of political (Edit: and financial) capital hinges on that. If AI image generation is inevitable, then methods to analyze images wrt. known generators are even more inevitable.


> GAN

And your point is invalid in the long term.


I've noticed there's a lot of newly created luddite accounts.

Reddit luddite brigade?


I'm sure I've still got a bunch of MP3 files from Napster on a hard drive somewhere, but yeah, that genie was put back in the bottle. This one can too.


It was put back in the bottle by Apple legitimizing the piracy business model, cutting deals with the major labels to digitize their music.

The analogue here is that AI art will be legitimized, and the artists who profit will be the ones who let their work be used as input for a cut of the profits. And nobody will be able to compete with them, and the owners of the machine will be able to set the profit rate as they choose.

... That does sound like a new stable equilibrium, actually.


And as we all know, music piracy over the Internet ended the day Napster stopped working. /s


How was it put back in the bottle? The Pirate Bay still exists. Bittorrent tools still exist. Name any song and I'm sure we can find it.


Napster was on millions of machines and now it isn't. It's like you skipped that part.


Napster, the centralized service, was stopped.

Napster, P2P sharing of music, was not.

Piracy is held at bay only by the ease & affordability of legally obtaining media and difficulty accessing the technical means to pirate.

... and well-packaged piracy solutions and modern broadband bandwidth likely sink the maximum price (the only remaining term) below the cost of production.

It took me 20 minutes to go from nothing to an entire season saved locally and streaming to a Roku. That's finding the software, installing, configuring, finding torrents, downloading, and then playing. And that's not having pirated in a decade or so.


Napster had single points of failure, and later P2P networks had poisoned seeds for tracking.

Stable Diffusion is math and cannot be stopped now that the toothpaste is out. You can attempt to regulate, assign draconian requirements by force of law, but ultimately these are as unenforceable as legislating that pi=3.

Ironically, what could help is NFT type tech. Signed with a private artist key, your copy is "original". Even if knockoff generative copies are produced, the digitally signed produced-bys are still authentic.


>Ironically, what could help is NFT type tech. Signed with a private artist key, your copy is "original". Even if knockoff generative copies are produced, the digitally signed produced-bys are still authentic.

That solves a completely different problem, though. I don't think anyone is saying that the problem is one of false attribution, where people are claiming generated images are the work of a particular person. What's being discussed is artists having less work because people generate art computationally rather than commission artists to do it.


Aye, and on your concern about the different problem, the toothpaste is out of the tube never to truly be returned.

We can evolve the market (in my view, into luxury goods with NFT type tech) or we can wait for artists to truly starve. I'm a proponent for solving the problem that can be solved to help folks move forward.


We can try to evolve it, sure. I don't think that's an option that will interest enough people to matter.

While it's possible that these AI tools will leave some (certainly not all) artists without work, what I think is really going to happen is that artists will harness them to do new things that were simply impossible before, or to make their work easier. Technology rarely destroys jobs; it more frequently changes their nature. Just like how at some points animators needed to know how to use 3D tools when in previous decades they didn't, in the near future graphical artists will need to know how to use AI. It's possible that where there were previously two artists working there will then be only one, but such is life. Demand for art is finite.


I worry that the ability to effortlessly conjure a "good enough" image may drown out efforts to thoughtfully create a great one.

Your comparison to computer animation is apt. Rewatching animated classics recently, I can see what we've lost now that every film is plastic.


I agree that traditional animation was better than modern CGI, but I don't think it's as simple as CGI being an inherently worse medium; rather, films are produced more cheaply. Some weeks ago a friend and I were watching and comparing some scenes of Snow White and Cinderella in English and Spanish and were stunned by the singers in both languages. How often do you hear actual opera singers in modern Disney films?

So, yes, what you say may definitely happen, but it's a trend some graphical industries have been on for decades. It's why there are so many fewer professional animators now. I wouldn't be surprised if some techniques of traditional animation have been lost by now.


Yeah, sure, but that's because streaming was merely more convenient than Napster. No more downloading bad songs with bad metadata. No more lugging around and hand curating an mp3 library. And for a lot of people: no more having to choose what to listen to.

In two years people will be generating novel music of any style with any vocals and vocalists they want. That'll be even more of a fit to consumers' wants. They'll never run out of music that will appeal to them.

I'm currently working in this space and it's wild the things that are possible.


True, but the post-Napster media negotiations very much priced it in.

Nobody is paying $10 per CD anymore.


Looks like Taylor Swift's new release on CD is $14 at Target right now and $12 for digital download.

They're not paying $10, they're paying more.


According to this site [1], $14 in 2022 Dollars is $7.19 in 1995 Dollars.

So, about $3 less.

[1] https://www.usinflationcalculator.com/

(Who knows if that site is reliable though, the point is inflation.)
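For what it's worth, the adjustment is just a ratio of CPI values. A minimal sketch in Python, using approximate annual-average CPI figures (my assumption, not from that site, so the result differs slightly from the $7.19 above):

```python
# Deflate a nominal 2022 price into 1995 dollars via the CPI ratio.
# The CPI values below are approximate annual averages, not official figures.
CPI_1995 = 152.4
CPI_2022 = 292.7

def to_1995_dollars(price_2022: float) -> float:
    """Convert a 2022 price to its 1995 purchasing-power equivalent."""
    return price_2022 * CPI_1995 / CPI_2022

print(round(to_1995_dollars(14.00), 2))  # 7.29, in the same ballpark as the $7.19 quoted
```

Whichever CPI series you pick, the point stands: $14 today buys what roughly $7 bought in 1995.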


> It's a bit like saying we can't stop music piracy, now that Napster exists.

Curious choice of example, because it was never stopped. It just moved somewhat out of the mainstream because the industry offered pricing options, like Spotify, that were acceptable to most people, so they no longer had an incentive to resort to piracy. Not wanting to pay at all was always a minority position; most people just found it ludicrous to pay full album price for the one or two songs they liked.

And still, if you do want song X for free you can still obtain it easily. The industry just no longer makes a fuss about it.


> There's nothing inevitable about it. Laws exist to protect people.

Amen.

I think one thing we're going to have to look at is the expectation of a separate agreement for having one's work go into a training set. Maybe equity should even be the standard here.

And informed consent associated with it. People need to know they're training something else to do their job as well as doing the job, selling the cow instead of the milk.


> There's nothing inevitable about it.

Everything you can come up with as a "solution" is really just a stop gap measure. Instead of specifying the style by name, you could specify it by example image. Instead of training the AI on her images directly, you could train a second generation AI on images drawn in that style generated by another AI that was trained on her images. Thus your second generation AI would be free of any copyrighted work. And of course the whole copyright thing only comes into play when people redistribute the AI. If AI is easy to train yourself locally, even that doesn't matter anymore.

If you want to go all Butlerian Jihad on the world, you might be able to stop it. As long as AI is allowed, this ain't going away, it's only getting easier, cheaper and faster.


For all we know, this could already be happening. Every digital image produced yesterday could have been AI-generated for all anyone knows. The original artist in this story could already be using AI to create their own work. Of course, I don't actually believe that's what's happening in this case, but the fact that it could means it's probably impossible to return this genie to its lamp.


It honestly sounds like you're making a good argument for a Butlerian Jihad in that case.


HN is suddenly filled with a bunch of crazy luddites. Why don't we institute the death penalty for artists who have taken inspiration from other artists while we're at it.


I know, right? Maybe all those FBI warnings worked, and the new generation of geeks thinks IP is actually a moral thing instead of a corporate money-grab.

Also, Herbert wasn't against AI I don't think. I suspect he simply recognized he couldn't comprehend the world that far in the future if AI was a part of it. Instead, he used space magic to explore his very present reality of resource wars, and went on to make a point I'm not sure I understand, about too much political order and resultant stagnation causing self annihilation.


I was joking, but HN at the same time is filled with people that believe regulation only stifles innovation.

So just because AI is inevitable doesn't mean that we should abandon all regulation. There would be good merit in slowing down some progress, so we can actually maintain a good transition to new industries.


I'm pretty sure if you steal a diamond from a diamond thief it is still considered theft from the original owner.

Your "indirection" hasn't ever been a valid argument.


Except you're not stealing anything and the "thief" didn't steal anything, either. You both just made a copy.


In fact, in that case, the indirection does the opposite of protect: even receiving the diamond from the original thief as a gift is illegal.


> Use of the output of systems like Copilot or Stable Diffusion becomes a violation of copyright.

That really should depend upon the output.

Many, if not most, people learn an art by imitating the style of established artists. Some will carry on with that style. Others will develop their own, though it will probably always carry elements from those they imitated. Should injecting a machine into the process automatically make it illegal?

There are going to be clear cut cases where it should be, cases where so much is imitated that it goes beyond style and into substance. Yet that means we should have a human looking at the output to determine if it is too close to a copy, rather than banning AI generated art altogether. To do so would put the creative process in peril. This is not because machine learning reflects our definition of creativity. Rather it is because it is difficult to define what human creativity itself is.

(That said, I do believe that using the artist's name as a way of promoting their own work is stepping over a line.)


> That really should depend upon the output.

Except, in copyright law it depends on the _input_.

These models would not exist if they were not first fed the source material.

Until we have systems that are not trained on a pre-existing corpus this will remain true. No matter how clever the algorithm, without the source material you have no output. Zilch. Nada.

Now, when the source material is someone else's property this means that without - someone else's property - you would have had no output.

So, when you want to use someone else's property, which you do not own, the general rule is that you a) first ask them if you may and b) pay them for the right to use their property.

In this sense it is no different than using a photocopier.

It's the copyright ownership of the material you put into the machine that will interest the judge not the quality of the copy.

I'm really looking forward to the first court cases and predict that much hilarity will ensue!


Trained models don't have the actual images inside, they have summed up gradients. So what they are doing is far from a copy&paste job, it's more like decomposing and recomposing from basic concepts, something made clear by the "variations" mode.

Among the things the model learned are some un-copyrightable facts, such as the shapes of various objects and animals, how they relate, their colours and textures - general knowledge for us, humans. Learning this is OK because you can copyright the expression, but not the idea.

Trained models take little from each example they learn from. The original model shrunk 4B images into a mere 4GB file, so about 1 byte of information learned per example, a measly pixel's worth. The DreamBooth finetuning process only uses 20-30 images from the artist; it's more like pinning the desired style than learning to copy. Without DreamBooth it's harder, but not impossible, to find and use a specific style.

And the new images are different, combining elements from the prompt, named artists and general world knowledge inside. Can we restrict new things - not copies - from being created, except in patents? Isn't such an open ended restriction a power grab? To make an analogy: can a writer copyright a style of writing, and anything that has a similar style be banned?
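The 1 byte/image claim is straightforward division. A quick sanity check of the arithmetic (the 4B-image and 4GB numbers are the parent comment's figures, taken at face value):

```python
# Back-of-envelope: model capacity per training image.
training_images = 4_000_000_000  # ~4B images in the training set (parent's figure)
model_size_bytes = 4 * 1024**3   # ~4 GB checkpoint (parent's figure)

bytes_per_image = model_size_bytes / training_images
print(bytes_per_image)  # ~1.07 bytes per image: far too little to store a copy of anything
```

Whether that division tells us anything about what the model memorizes is, of course, exactly what's being argued here.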


> Trained models don't have the actual images inside, they have summed up gradients. So what they are doing is far from a copy&paste job, it's more like decomposing and recomposing from basic concepts, something made clear by the "variations" mode.

Doesn't matter. A JPEG of a work is just a bunch of numbers fed into an equation, and that doesn't change the fact that it's copyright infringement.


A digital photo of the Eiffel tower at night doesn't have the real Eiffel tower inside, only weights and pixels - still you don't have the rights to publish your photo of the Eiffel tower in France.


Drawings and art including the Eiffel tower are still ok, right?


>Except, in copyright law it depends on the _input_

I am not a lawyer, but my understanding of copyright law is that this is explicitly the opposite.

https://www.wipo.int/edocs/pubdocs/en/wipo_pub_909_2016.pdf

> • reproduction of the work in various forms, such as printed publications or sound recordings;
> • distribution of copies of the work;
> • public performance of the work;
> • broadcasting or other communication of the work to the public;
> • translation of the work into other languages; and
> • adaptation of the work, such as turning a novel into a screenplay

None of these rights, to me, indicate that copyright protects the input. The AI model is not reproducing any specific works, distributing copies of it, performing it in public, broadcasting it, translating it to another language, or adapting the work from one format to another.


>Now, when the source material is someone else's property this means that without - someone else's property - you would have had no output.

Exactly the same happens with artists. The only artists who can claim not to have been influenced by seeing the work of other artists lived tens of thousands of years ago. So what makes it okay to process artwork via some processes but not others, when the ultimate output may in some way copy the input anyway?


Personally, I'm on-board with protecting artists incomes, however, I think there's a middle-ground.

First, I'd like to correct a fact you omitted: Napster, Limewire and the like didn't come out of nowhere. They were created because artists and their recording companies forced consumers to buy entire CDs at inflated prices that kept rising. Now, what they got in return from their consumers after that wasn't fair either.

I don't think making AI generation illegal for everyone makes sense. That's how you get the Metallicas of the world bank-rolling professional grifters to hold people's grandmothers financially hostage.

I do think it makes sense to bar AI generated products from making money if the works used to train it did not belong to that company or individual. If you create a program using CoPilot, you should not earn money. If you make a comic using Stable Diffusion, you don't deserve money. This keeps the power players in check while allowing artists paths to use these AIs if they own their own work outright. Imagine if you could train CoPilot on your own code and then use it to help you. That to me sounds like the framing for a new and responsible form of innovation.


Yeah! I think AI tools must be transparent about their inputs, period.

It feels too unfair to lead a new art style and simply be copied with machine precision and speed. Opting out of contributing to the neural network's dataset should be a thing, but I don't know how reverse engineering of the output could be done.


Yes, there should be opt outs for ML training. They could take many forms - robots.txt rules, special HTML tags, http headers, plain text tags or a centralised registry. You can take any work out of the training set without diminishing the end result. But doing so would mean being left out of the new art movement. Your name will not be conjured, your style not replicated, your artistic influence thinning out.

If an artist wants her works to have the fate of the BBC archives, which removed millions of hours of radio and TV shows from the internet, then go ahead. The historic BBC content was never shared, liked, or commented on, and has had no influence since the internet became a thing: a cultural suicide to protect ancient copyrights.


Music piracy didn't stop because of the Napster shutdown. It just manifested itself in different ways. Now all you need to do is use youtube-dl to download the youtube video or soundcloud track or bandcamp album with the -x flag to extract audio. Both the software and the original media sources are legal. In fact, GitHub was forced to take a public stance on youtube-dl after a DMCA takedown request on the repo.

The biggest reason the laws can't possibly hope to stop the practice is:

> [MysteryInc152] told me the training process took about 2.5 hours on a GPU at Vast.ai, and cost less than $2.

As those costs are driven down and the software is accessible to more people, distribution of the weights will not be needed.


I think that one of the main reasons YouTube became so huge was music piracy.


Modern intellectual property law, specifically copyright, is brazenly slanted to maximize profit for American corporations at the expense of the average person, with zero consideration for the rule of law or the democratic process.

Year after year entrenched media interests lobby the US government to make IP policy more corporate friendly and those policy changes are forced on the citizens of countries around the world through strong armed free trade agreements.

We don't get to discuss these things as citizens of sovereign states; they just happen to us.

Maybe we want to live in a world with a substantially shorter copyright term; is that so wrong? Maybe that would be better for individuals and society as a whole, but we'll never know, because American companies won't risk the chance of losing money or power to find out.

How long do you think copyright should last before it reverts to the public domain?

5 years? 25? 50? 100? 500? 1000?

How much is too much copyright?


>It's a bit like saying we can't stop music piracy, now that Napster exists.

You can't. Soulseek is alive and well. It started with Napster and now others are carrying the torch.


Except now the vast majority of people pay for music via Spotify.


>It's a bit like saying we can't stop music piracy, now that Napster exists.

perfect example, as piracy has _never_ been effectively stopped and _never_ will be effectively stopped by legislative means


Sure, on a small scale, but Spotify, Apple Music, Tidal and I'm sure many other platforms exist and are quite successful.

I'm a big pirate myself, with multi-TB hard drives full of pirated music accrued over the years, but even I choose to use Spotify and Tidal a lot of the time, out of pure convenience.


Exactly though, the fears of the industry of the time were met, one way or the other. Spotify and others came around and basically destroyed the album / CD model, led to independent publishers having way more power than ever before. It is a record company hell that we're living in right now. Despite spending as much as they did to kill Napster, they weren't able to stop the "inevitable."


> Sure, on a small scale, but Spotify, Apple Music, Tidal and I'm sure many other platforms exist and are quite successful.

They are not stopping the piracy.

They are providing better service than piracy. As Gabe Newell said, piracy is a service problem.

Offer something convenient and worth the price and a lot of people will pay it just fine.


I think it may fall along very similar lines to "sampling" in terms of use... the AI model obviously used copies/samples of original/copyright works.

I'm not saying I support the argument or that it will stand up in court, but definitely some merit to making it.


Napster distributed whole songs. If it had sampled thousands of songs and then created original compositions that sounded kind of like the "style" of those songs, what would the legalities be? That is a huge difference. I'm a professional artist who has been able to make a good living and support a family; what does it mean when someone with an algorithm and some keywords can produce good-enough work in a fraction of the time, for pennies? There is a huge swath of professional artists whose livelihoods are at stake.

Is this like the stagecoach makers when automobiles were invented? Or is this like Napster stealing copyrighted material? This is new territory.


Very much the first.

(1) Ubiquitous new technology

(2) New domains the technology opens

It's even more fundamental than stagecoach --> automobile. It is more like cipher --> RSA: a fundamental change based on basic math and ubiquitous, readily available technology.

The toothpaste is out of the tube!


At this point, I don't think the law can stop it. We're looking at a technology that can easily become illegal but ubiquitous, like Napster in the heady days of flouting audio copyright.

If the entire Western copyright sphere of influence unites on it being an illegal system, Russia and China have no incentive to ban the tech. Especially if it makes their entertainment industries more competitive with the Hollywood machine.


Not just entertainment, whoever bans this will lose out everywhere and become a backwards shithole.


- Copyright expires 70 years after an artist's death.

- Corporations / contributors can buy images, or draw their own.

- Getty Images already has the rights to 477 million images.

This is inevitable.


>It's a bit like saying we can't stop music piracy, now that Napster exists.

Trivially, you still can't. Lawbreaking when it comes to copyright is enabled at scale by computers (like everything else); so unless you manage to win the war on general-purpose computing there's nothing you can do.

Sure, streaming has taken the place of piracy (growing the pie is better than strict conservatism), and Patreon (and its offshoots) has made it possible to be paid for recurring content that's inevitably going to be pirated, but file-sharing (torrents) and alt.binaries (abusing free storage sites as a backend for streaming video) still work just as well as they ever did.

The only reason people pay for content is that they want to, provided the price isn't usurious or infinite ("not sold in your region"); those that continue to work with said want prosper, those that fight it fail, and that's just the way it is.


If artists were required to exclusively sign with globe-spanning conglomerates that pay them in loans and take a cut every time they teach an art class (there's no good art-world comparison for record companies taking cuts of concert revenue), you'd see a society-breaking, unjustifiable level of protection for an artist's "style."

As it is, artists don't have massive teams of lawyers and billions in assets, which makes their concerns irrelevant to the people who would normally be bribed to advocate for them.

For Copilot, I'd like to see more models trained on stolen and leaked proprietary code from hacks, or an organized movement to leak code from businesses and feed it into a freely-shared model. If transformation into the model is enough to launder copyright, it ceases to be stolen code. I'm sure it would be helpful in cloning proprietary products.


How exactly do you define a single person's style versus a genre? While any artist might specialize in a single style, as is the case here, distill it, and create a large body of work in that specific style, do you think no one before has created a similar work of art in the same style?

Naming a recognizable artist is the current "lazy" way of doing this, instead of naming every possible visual style; and sure, we could ban name and surname, but should an artist own "dreamy flat pastel colored illustrations of cities or characters with high contrast, no lines, children's illustrations" in perpetuity? Definitely not; for the style itself there are likely hundreds of artists that have done something similar, before and after.


I think the advantage of using them is too great; companies that use the networks will outcompete ones that don't, and even if it were made illegal in the US, it won't be everywhere. I imagine when it comes down to it, the law is going to be pragmatic: what happens to US industries if we allow this, and what happens if we don't? My guess is that it ends up being allowed.

I'm not saying that's good or right - I really don't know how I even feel about the AI networks morally... I just think money is going to win out.


Copyright exists to incentivize authors, artists, scientists, etc. to create original works by providing a temporary monopoly.

The arguments suggesting that people shouldn't benefit from their work on an individual level, and pointing to music piracy as an example of why we shouldn't try, strike me as arguments for general inaction and fatalism. Not sure what the goal is, there...


The goal is to get these people to face reality. The fact is we are in the 21st century, the age of information. Their creations are just data, and data can be copied, processed and transmitted worldwide at negligible costs. There is no controlling it.

The goal is to make them stop trying to control it. Because their attempts to control it are ruining computers for all of us. We already have harmful stuff like DRM on every chip because of these people. Platforms are getting more locked down, our freedom as users and programmers is decreasing. They will destroy free computing as we know it if this keeps going unchecked.


That someone may see a version of reality where people are incapable of benefiting from their own work does not mean that it's by any means a settled issue or indicative of "Reality". I doubt these conversations would exist if it were. It is indeed the current year, but that doesn't mean that, because things can be metaphorically distilled into false and reductionist equivalencies, it should all be free for the taking to benefit a few people who outran regulation.

Regarding the concept of control, artists were first put in a defensive position by the individuals who started using their work without their consent, and who are trying to exercise their own control over the artwork produced by others by monetizing outputs. Are only companies like Stability.AI, OpenAI, and Midjourney exclusively permitted to use and control the artwork of others, and allowed to charge for access to models which use this artwork without compensation or accreditation to the original authors? Are those artists' computers not also being ruined? Do they not deserve representation?

We need to stop demonizing the idea that someone can benefit from their work because there are some companies that have fought to extend copyright for their own benefit.

Copyright REFORM is generally a much more supportable issue than the idea that everything should be free in perpetuity...


> does not mean that it's by any means a settled issue or indicative of "Reality". I doubt these conversations would exist if it was.

It is the reality of computing. Anyone trying to deny that is going to discover that bits are bits and there is no control unless you end computer freedom. It takes tyranny such as mandating that computers only run government signed software to change this reality. This is the sort of thing that will happen if this copyright insanity continues and it will also pave the way to absurdities like regulation of cryptography.

> We need to stop demonizing the idea that someone can benefit from their work

Nobody is doing that. They can benefit from their work as much as they want. Plenty of creators are benefiting right now from patronage via platforms like Patreon. They're getting paid for the act of creating, not for sales of an artificially scarce product. Copyright is not necessary.


The reality of physics and biology is that if someone is bigger and stronger than someone else, they can beat them up and take their things. Anyone trying to deny that is going to discover there is no control unless you end the freedom of unlimited violence. It takes tyranny such as mandating that beating people for no reason and taking whatever they have using the tool of your superior physical strength results in punishment imposed by collective agreement of society.


You're actually comparing these copyright issues to physical violence? I don't even know what to say. They're not even in the same conceptual space.


I don't think this is freedom: as long as some company with a million times the resources that I have can train a better model, I'm only ever using the models someone with power gives me, no matter how small a device the model runs on.

Having larger models and adapting the weights is one thing but the innovation is mostly on the side of large entities.


> Copyright exists to incentivize authors

It's ideas (memes) copying themselves, making variations, evolving. Until now ideas could only jump from human to human, intermediated by various media. Now they can be unified and distilled into a model, a more efficient medium of replication for ideas, and more helpful in general because it can be adapted to new situations on the fly.


So the same argument should advocate for having no patents: the advantage of just stealing everyone's patents is too great, and not all countries enforce patents.


Patents are nonsense and are an impediment to progress. They are a government granted monopoly.


You can argue that patents are an INCENTIVE to progress, since people are INCENTIVIZED to create newer and better things knowing that they will be able to enjoy the results of their labor without copycats leeching on their work and ingenuity. I think the pharma model of short-time allowed patents is the best, something like a 10 year competition free period is completely fair to INCENTIVIZE people to create the iPhones and cancer cures of tomorrow.


There are so many arguments here about big C copyright (Disney etc.) and how it is evil and that it shouldn't be an argument - but what I'm seeing is that small artists, freelancers are getting hurt by the output mostly at this point.

If this is about big C copyright, where is the Mickey Mouse dreambooth concept? Disney property is seen as property but the labor of some random freelancer is just seen as nothing.


https://imgur.com/a/BzOt61v 1s to google and 14s for image gen (of 4). It's already all out there.


No chance. IMO the best-case scenario I can see is artists getting together to lobby for some sort of labeling system, like the food industry's, to label non-synthetic art for those interested in supporting bespoke human-created works. Then watch said artists get called out as fakers for using AI-assisted features like Content-Aware Fill.


I think the tech evolution will drive a niche luxury market for authenticity.


And just how will these new laws be passed? HackerNews upvotes?


> It's a bit like saying we can't stop music piracy, now that Napster exists.

AI art is unknown territory. Comparing this to media piracy (e.g. copying music) leads to a fallacy.

Specifically: where does fair use stop? And consider: good artists copy; great artists steal. Any art historian will be able to show you how true this is across all epochs and styles (and types of art no less, i.e. including e.g. music)[1].

Anything that follows is OT for the debate at hand. It is merely to point out that, besides not applying here (AI art is derivative/remixed work, not simple copies), the notion that the P2P crackdown of the early 2000s and its legal repercussions had anything to do with how much the people actually creating the music got paid is a myth perpetuated by the music industry. Specifically, that part of the music industry that is not the artists.

> Remember the naive rallying cry among those who thought everyone should have the right to all music, without any compensation for the artist?

The only naivety is thinking that compensation of artists played a role in this. Piracy was never noticeable for musicians who weren't already stinking rich. And for those, while noticeable, it wasn't an issue. One may argue it was/is for people high up in the food chain of the music industry. But even that stands on feet of clay. From [2]:

> The main finding of the present study is that in 2014, the recorded music industry lost approximately €170 million of sales revenue in the EU as a consequence of the consumption of recorded music from illegal sources. This total corresponds to 5.2% of the sector’s revenues from physical and digital sales. These lost sales are estimated to result in direct employment losses of 829 jobs.

There are approx. two million people being employed by this industry in the EU[3]. Go figure.

For further reading on the funny idea that artists got compensated before P2P and didn't after there is Courtney Love's classic debunking piece on musician's revenue around the time Napster was a thing[4].

And some comparable numbers from what this means for artists trying to make a living of digital music today [5][6].

[1] My father was an art historian. My opinion is mainly based on spending every holiday of my youth looking at art from all epochs across Europe, first hand. Nolens volens I may add. I.e. I'm saying: take my word for it. :]

[2] https://euipo.europa.eu/tunnel-web/secure/webdav/guest/docum...

[3] https://www.ifpi.org/music-supports-two-million-jobs-contrib...

[4] https://www.salon.com/2000/06/14/love_7/

[5] https://www.hypebot.com/hypebot/2019/12/how-many-spotify-str...

[6] https://www.rollingstone.com/pro/features/spotify-million-ar...

Edit: typos


I know it doesn't sound nice, but the harm to the artists is similar to the harm you do to hole diggers when they see you bringing an excavator to your plot instead of hiring them.

Art is not digging holes. But some of it is, and more of it will be in the future.


It's hardly comparable. The excavator does not owe its creative influence to the hole diggers. The quality of its work does not result from someone else's intellectual labor. It's 100% the digger doing the digging.


It's technology putting people out of work.

We care a lot about Copilot, we care somewhat about artists, we care little about manual workers, and less so if they are in another country.


Not quite. It still relies on their inputs. It doesn't just put them out of work, it feeds off them.

I'm not sure of what I'd call this relationship, but parasitic is close to it.


It’s a teacher-student relationship, except the teachers don’t do anything specifically for the students. Let our jobs be taken by Copilot. Are you that afraid?


> The excavator does not owe its creative influence to the hole diggers

But it does. While not as complex as AI art generation, the excavator is mimicking the hole digger. It takes a human action, generalizes it, and offers it in a more efficient manner.


As you allude to, this analogy works if you're creating art purely because there is a demand for it and you only put in the effort required by the customer.

But that is generally not the case with illustrators, and certainly not in this case.

Also creating a new model is dependent on artists (at least right now!) while excavators are not dependent on hole diggers.


I think the problem is that ethics models kind of fall apart here. You can trivially make an argument both for and against this on the grounds of ethics. Legally it seems pretty clear nothing wrong is being done here. A human can train on a particular artist's style and lift it just fine. Which they regularly do. We just made it way, way easier now.

So sure, we can empathize with someone feeling kind of off about the situation. But at the same time it's kind of eh, that's how the world is.


> Legally it seems pretty clear nothing wrong is being done here.

It most certainly doesn't. Just because a human can eyeball an art style and copy it eventually does not translate into "I can take your copyrighted work and feed it to a machine". Ethically you may argue one way or the other, but legally you are using somebody else's work without permission.


The machine in question is the human brain.

And besides, there's nothing illegal about "using" a copyrighted work without permission (for example, if an artist wanted to use the pixel values in an image for a color palette, that's totally fine), only reproducing it, which no image generation model does.


I just think the law hasn’t caught up with tech again here. This is derivative work, essentially by definition, and just because the styles are being created without “patching together existing IP,” doesn’t mean they are in the clear.

We can trust that a human creator who apes the style of another human creator will do so with a preponderance of flaws such that their works are distinguishable. AI doesn’t operate like that and the case can’t be made that somehow both the AI and the person spontaneously landed on a certain style like it could with two human beings. As the commenters say in the article, the AI couldn’t generate anything without the original works.

Apropos of nothing, her art style really isn’t that original. Her style itself clearly apes illustration styles of the 40s and 50s. I guess copying never goes out of style.


> AI doesn’t operate like that

Wrong. Various regularization schemes are used in AI models which essentially introduce “flaws” and noise into the process. The flaws are more optimal than what the brain does, but they are there.
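To make that point concrete, dropout is one common regularization scheme. The sketch below is illustrative only (plain NumPy, not tied to any particular image model): a random mask injects noise into every pass, so the model never trains on exactly the same signal twice.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout(activations, p=0.5):
    """Zero a random fraction p of activations and rescale the survivors.

    The mask changes on every call: deliberate "flaws" and noise in the
    training process, which discourage rote memorization of inputs.
    """
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)

a = np.ones(10_000)
out = dropout(a, p=0.5)

# About half the units are zeroed, but rescaling keeps the mean near 1.0,
# so the expected signal is unchanged even though each pass is noisy.
print(f"zeroed: {(out == 0).mean():.2f}, mean: {out.mean():.2f}")
```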


This may only be the case right now but I find current AI art sits in the uncanny valley. At first glance it looks impressive but after you spend a few hours looking at the output of current algorithms you start to recognize the same quirks and shortcomings in every image. At this point if I needed art for a project I really cared about I'd still spend the money for a human artist.


> Doesn't matter if it's legal or not

See, I don't understand this. Training an AI using this content is legal, but how is copying it in the first place not illegal? This is what I never understand about these AI cases. If I copy a YouTube video I'm breaking the law, but if I then use my illegal copy to train an AI, it retroactively makes my copying legal?


I think you're using the word "copy" incorrectly. What does it mean to "copy" a Youtube video? You are literally making a copy of it on your computer as you watch it, that's what watching a video is. You're also making a "copy" of it in your brain.


Right, but you don't train a model by pointing a camera at a screen. You download the video file. You deliberately bypass the copy protection. I'm not saying it should be illegal, I'm saying it is.


Hmm, is that how it works? We're mostly talking about images in the op, not video, and her images are freely available to view / download from her website.

I'm not sure about literally downloading YouTube videos, you might be right about that.


> and her images are freely available to view / download from her website.

At least under US law, the person downloading the images has to be the same one who trains the model, because giving them to another person is copyright-violating distribution.

To be clear, I'm just sick of corporations being free and clear to do things that would get the rest of us stomped.


The LAION dataset, which SD was trained on, is just a list of URLs and textual descriptions. There was no illegal copying going on when StabilityAI trained SD. It's also not illegal for you to do the same thing.
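For what it's worth, a LAION-style row really is just metadata. A hedged sketch (the field names below loosely follow LAION's published parquet schema, but treat them as illustrative):

```python
# A LAION-style dataset row: a URL pointing at an image hosted elsewhere,
# plus the alt-text scraped alongside it. No pixel data is stored.
# (Field names loosely follow LAION's parquet schema; illustrative only.)
record = {
    "URL": "https://example.com/some-image.jpg",
    "TEXT": "an illustration of a city street, flat pastel colors",
    "WIDTH": 1024,
    "HEIGHT": 768,
}

# Training pipelines resolve the URL themselves at download time; the
# dataset that actually gets distributed contains no image bytes at all.
has_pixels = any(isinstance(v, (bytes, bytearray)) for v in record.values())
print("metadata only:", not has_pixels)  # prints "metadata only: True"
```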


> If I copy a Youtube video I'm breaking the law […]

No. Only if you redistribute your copy are you violating copyright law.


I like how her style has a storybook thing going on. She has a knack for expressions.

edit: That's because she is a storybook illustrator! I should have checked the video too.


No difference at all from how Disney, a giant corporate machine, has treated artists and art over the last few decades. This is just the next stage in the mindless machine's evolution.

Artists have had less and less influence on anything beyond mindless consumption during our generation. So no surprise where the story goes. Without influence you can't control what the machine does.


If your art style can be ripped off by an AI looking at 32 images, your art style is too pop to be considered "yours" imo.

To take that criticism further, the original artist took Disney art subjects and applied a Nickelodeon animation style to them.

Hopefully the artist can appreciate the irony of claiming that it is _her_ style being ripped off when everything in the examples shown is clearly pop art, categorically defined by predominant social influences, and not something which came from her artistic perspective.


Is there any artist whose style cannot be imitated by an AI given selected images? I'm pretty sure an AI can correctly mimic Picasso if you give it 32 blue period paintings. Does that mean Picasso has no artistic value?


No, but it probably means that Picasso is also insufficiently unique/inscrutable to expect not to have his style imitated.


I very much sympathize with her because it isn't fair, but I do find it slightly hypocritical that, looking at the work on her website, much of it is drawings of other copyrighted works.

I suspect the only acceptable answer here is to disallow AI training of copyrighted material, but this only delays the models supplanting the actual artists (because people will contribute to and build up a pool of copyright-free training material), it doesn't prevent the ultimate issue of people being replaced by AI.


I believe Disney paid her for those.


But her style is still defined/drawn from Disney/Nick/other cartoons anyway. At the end of the day she said that she didn't see herself in the AI images and felt distanced from it - isn't that good enough?

Her art still has value to her and her clients as long as the human in the loop still has something to offer buyers (ie being able to have a conversation/work on the design for something tailor made).

When the AI is smart enough to generate art from an ongoing conversation about a piece, making adjustments etc then she will have to draw art for herself and those that appreciate it, in the same way that there are still some blacksmiths around and people buy their works because they love that it was handmade/a piece of history.


Indeed!

And another important point that's buried, is that most of the art used to train the model is actually owned by her clients such as Disney. So, she could not, even if she wanted to, give permission to use that content.

IOW, the person training the model was just fine with stealing the artwork.

While the artist seems to not be litigious, it'll be interesting to see if the major rights-holders like Disney start going after the AI model companies and/or the people that train the models, if they find that there is output of their properties.

This automated generation of code, text, art, etc., is really nothing more than sophisticated sampling/mashup, and when you use snippets in your work output, it should be credited and properly compensated. This is rapidly amounting to automated creation theft engines.

Worse yet, thinking ahead a bit, once they've all been trained on the available works, and all the writers, artists, & coders have been put out of work, progress will stagnate, because the "AI"s will generate nothing new, only continuously regurgitating bits and mashups of what is now old stuff.


If a human looked at her art and then made new illustrations in the same style would that be ok? If so, why is it not ok for a computer to do it?


I appreciate the argument. It made me think that we might have a “photographs steal your soul” kind of moment here with new technology.

Nonetheless, the difference is pretty clear here I think. An AI makes the artists style infinitely reproducible by all. A single artist copying an artist’s style is basically what copyright is all about, including the relative straightforwardness of enforcement through ordinary litigation.

Whether copyright is or is not being breached by the AI, there’s a paradigm shifting difference in the nature of said AI, perhaps not unlike downloadable digital music on the internet compared physical media.


I'm not a moral philosopher, but I'd say the difference is effort.

I mean no shade on the illustrator herself, but her style looks derived itself from other styles.

Anyway, another illustrator would need to put in the effort to learn the style, and then X hours to create each piece. An AI, once trained, can churn out thousands of artworks in that style per second (with enough computing power); it makes the illustrator obsolete, and like mass production of low cost knockoff products, it forges competition and cheapens the brand / style.

Is that good or not? I don't know; again, I'm not a moral philosopher.


If a human artist sold work created by copying her style, they would definitely get called out by other artists.


>Next, somebody grabs her work (copyrighted by the clients she works for), without permission.

Isn't this what a lot of these AI models use though? Pretty much anything trained on data from the internet is going to be largely copyrighted, no?


When it's using a lot of different types of work from different creators, I think the output is more a sum of its parts; it's a little bit more of a gray area. When it's specifically trying to copy one person's style, that's very personal, and very real to the person being copied. I think it's made weirder by how low-effort the copy is. Someone learning your art style and painting in it is a bit different from just making a computer do it.

My second thought is that this reminds me of the attitude I saw rampant in data collection back when I was getting into tech. The casual attitude towards consuming other people's information, be it private data or, in this case, work they've labored over, has led us down the path of exploitation and profiteering. I'm sure it will be no different here.

It's all fun and games when it's free and open and we're all just making toys, but the commercialization has already begun, and these precedents will end up being exploited by companies willing to profit off things that others had too strong a moral compass to touch.


Jaron Lanier's 'Data dignity' idea really seems like the best solve here. Her work was indispensable to whatever value this algorithm produces in the future - it would make sense if she got partial ownership of some kind. It's what share of ownership she should have that we get hung up on. In some sense she's already won, she'll definitely get more traffic to her own site now, because she was part of an interesting story about the early days of artistic AI. But we intuit that for every Hollie Mengert or Metallica out there who benefits from the attention sluice, there are a number of other artists who don't get those benefits, and by definition, we don't know who those artists are.

In an ecosystem we might say 'fit data is what makes it into the next generation no matter the species'. In a 20th century economy, we might say 'the creator should benefit from her work'. But we're not in either of those. We care more about the Hollie Mengerts of the world than their impact on the future evolution of art. Or more precisely, we care more about the right incentives being present for Hollie Mengert than how those incentives play out in this individual case. But that influence on future art is also undeniably part of the incentive structure for an artist today.

This seems like a classic wicked problem - does anyone know of a group engaging with it?


Maybe not ruthless enough. Society needs to evolve past this notion that people have control over data and information just because they created it. The faster this happens, the better. Are we seriously gonna have to put up with the good old copyright industry forever? They keep destroying perfectly good technology just because it threatens their existence. I say let them disappear.


Don't worry, if the West doesn't come around to your way of thinking, China will.


If this was to happen, and we created a world where there was no form of copyright: why would somebody spend their life in a creative industry making new things?


Most of the entertainment I have consumed in recent years has been made by amateurs that at most got donations. Besides, there can be subsidies. The current copyright law is already a subsidy, but it’s selling off the society’s natural rights instead of petty tax money.


Without copyright people can make even more things. Bring out your own Mickey Mouse - Star Wars crossover, NSFW!

Imagine the entire corpus of human creation, free to be remixed and extended. For now we hope they make a good sequel, someday...


Most of that corpus was only created because it was profitable to do so due to the protections offered by copyright law.


Intrinsic motivation to create. Also, there's no reason they can't make money some other way. Patronage and crowdfunding seem to be the answer.


They don't seem like a very good answer.

Allowing creative output to be freely used, while forcing creators to subsist on the crumbs thrown back is a two-class system. It seems unfair to those doing the work in such a system.

Copyright is far from perfect (far far...) but it is still an improvement on patronage. At least a creator has control over the use of their work.


> It seems unfair to those doing the work in such a system.

It's the only thing that makes sense in the 21st century. Copyright is unenforceable in the age of ubiquitous networked computers.

In order to enforce copyright, every computer will have to be locked down so that they only execute "compliant" software. Surely everyone browsing this site can appreciate the unfairness of that outcome. I for one do not want such a future under any circumstances. If the copyright business model is killed, so be it.

> At least a creator has control over the use of their work.

An illusion. They have no control. Their copyrights are infringed every single day. Most of the time people don't even realize they are infringing someone's copyright.


> In order to enforce copyright, every computer will have to be locked down so that they only execute "compliant" software.

This is not true. While it is one possible approach to enforcing copyright, it is not the only one. Network surveillance of distribution is another possibility that has been used against p2p networks.

Copyright has never been completely enforceable. It has always been a partial solution aimed at preventing organized / profitable distribution, i.e. it is a legal fallback rather than a prevention. But a partially working solution is better than nothing.


What is a "copyright industry"? This is one human trying to make a living by making kids smile.


> her work (copyrighted by the clients she works for)

There you go.


If the artist is posting personal work, then it is covered by copyright automatically per the Berne Convention.

Copyright exists to incentivize authors, artists, scientists, etc. to create original works by providing a temporary monopoly.

If an artist works with a client, the legal stipulations are going to be different for each commissioned work, generally.

Depending on the client, they may be better able to protect an artists work than the artist themselves, and with credit...


> temporary monopoly

What a joke. That "temporary" monopoly lasts centuries and gets extended whenever some rich company's imaginary property is about to enter the public domain. Copyright duration is functionally infinite, you will be long dead before your culture is returned to you.


Right. I doubt many people would disagree copyright REFORM is sorely lacking. Or are you suggesting that an artist is not allowed to benefit from their work because Disney has extended copyright?


Creators can benefit as much as they want. Just not through artificial scarcity. That ship sailed the second computers were invented and they need to stop trying to put that genie back in its bottle.

Either copyright remains unenforceable or computing as we know it today will be destroyed. There is no middle ground and I know which side I'm on. Computers are among the greatest inventions of humanity, they are too precious to be jeopardized because of such concerns as invalidated business models.


Well yeah, I don't know if people have noticed, but Disney has started using the classic Mickey Mouse animation at the start of all their works now, because they know their already-extended copyright is about to expire.


It should be illegal (if it isn't already) to use other people's work to train AI models. In the future, all artworks will have a license attached to them, some fair usage clause.


Won't this lead to issues like secretly trained models and proving artwork is not AI generated?


The empowerment of man is more important than his ego. Pay UBI, subsidize creators with the taxpayer’s money, demolish all copyright.


The guy should have tried to use his tool on Disney.



I know this website is not a hivemind, but it's interesting that every time an article like this gets posted, the majority opinion seems to be that training diffusion models on copyrighted work is totally fine. In contrast, when talking about training code generation models, there are multiple comments saying this is not ok if licenses weren't respected.

For anyone who holds both of these opinions, why do you think it's ok to train diffusion models on copyrighted work, but not co-pilot on GPL code?


If I were to steel man both sides it'd be something like this:

1. Training an AI model on code (so far) makes it regurgitate code line-for-line (with comments!). This is like "learning to code" by just cut and pasting working code from other codebases, you have to follow the license. The AI doesn't "understand the algorithm" at all (or it hasn't been told "don't export the input you fool"). Obviously a bog-simple AI could make all licenses moot by dumping out what it input, and the courts wouldn't permit that.

2. Training an AI model on illustrations so far produces "style parodies" which may look similar to an untrained eye (the artist here is annoyed because she wouldn't make art like that, even though to us it looks similar enough). Drawing a picture that looks like Mickey Mouse is a trademark violation, but tracing a picture of the Mouse is both a trademark and a copyright violation.

The first violates some pretty clear legal concepts; the second is closer to violating moral concepts but those are more flexible - if an artist spends years learning to paint in the style of Michelangelo is that immoral?


The problem with this argument is that it's founded in how the AI is used, not how it is made. It's not a compelling reason to ban the tool, it's a compelling reason to regulate its use.

Copilot can produce code verbatim, but it doesn't unless you specifically set up a situation to test it. It requires things like "include the exact text of a comment that exists in training data" or "prefix your C functions the same way as the training data does".

In everyday use, my experience has been that Copilot draws extensively from files that I've opened in my codebase. If I give Copilot a function body to fill in in a class I've already written, it will use my internal APIs (which aren't even hosted on GitHub) correctly as long as there are 1-2 examples in the file and I'm using a consistent naming convention. This isn't copypasta, it really does have a clear understanding of the semantics of my code.

This is why I'm not in favor of penalizing Microsoft and GitHub for creating Copilot. I think there needs to be some regulation on how it is used to make sure that people aren't treating it as a repository of copypasta, but the AI itself is pretty clearly capable of producing non-infringing work, and indeed that seems to be the norm.


Please let’s not start dictating how people should use a piece of software. It would be like "regulating" Microsoft Word just because people might use it to duplicate copyrighted works.


I'm not saying we should regulate the software, I'm saying we need some rigorous method of ensuring that using the AI tools doesn't put you in jeopardy of accidental copyright infringement.

We most likely don't need new laws, because infringement is infringement and how you made the infringing work is irrelevant. Accidental infringement is already illegal in the US.


I would argue that we _do_ need new laws. AI generated code is quite different from any other literary work - after all, it was not created by a human.

My own personal opinion is that the AI generated code (or pictures in the case of the article) should be under a new category of literary works, such that it does not receive copyright protection, but also does not violate existing copyright.


This is meaningless though. The majority of AI generated art you see out there is either hand tweaked or post-processed or both. There's human input involved and drawing a line is going to absolutely backfire.


If you presented both the generated image and the "original" to a jury of peers (or even a panel of experts in the field), they would be able to make a determination as to whether the generated image violated the copyright of the presented "original".

Humans tweaking the image is immaterial to this determination - if the human tweaked it so that it no longer seems to violate copyright, then said panel would make that same determination.


You are arguing that AI generated means no copyright protection. So you can't tweak it to "not violate copyright" because there literally isn't any.

Of course you have no way to prove whether any image was or was not generated by AI so welcome to a new scam for law firms to aggressively sue artists claiming they suspect AI was used in their works.


The vast majority of paintings weren't created by a human either, but by a paintbrush. We should really ban those too. Just think of all the poor finger-painters who've been put out of a job!


I think it's worth pointing out that Adobe has been doing this for a long time. You can't open or paste images into Photoshop which resemble any major currency.


> Copilot can produce code verbatim, but it doesn't unless you specifically set up a situation to test it.

It does not matter what a service can or cannot do. We do not regulate based on ability, but on action.

The service has an obligation to the license holders of the training data to not violate the license. The mechanism for which the license is violated is irrelevant. The only thing that matters is the code ended up somewhere it shouldn’t, and the service is the actor in the chain of responsibility that dropped the ball.

The prompting of the service is irrelevant. If I ask you to reproduce a block of GPL code in my codebase and you do it, you violated the license. It does not matter that I primed you or lead you to that outcome. What matters is the legally protected code is somewhere it shouldn’t be.


> It does not matter what a service can or cannot do. We do not regulate based on ability, but on action.

Whether we agree with it or not, intellectual property laws have historically been regulated by ability as well as action. Hence why blank multimedia formats would often have additional taxes in some jurisdictions just in case someone chose to record copyrighted content onto them. And why graphics cards used to include an MPEG royalty in their consumer cost, regardless of whether that user planned to watch DVDs on their computer.

Not saying I agree with this principle. Just that there is already a long history of precedent in this area.

Like a lot of politics, ultimately it just comes down to who has the bigger lobbying budget.


> If I ask you to reproduce a block of GPL code in my codebase and you do it, you violated the license. It does not matter that I primed you or lead you to that outcome. What matters is the legally protected code is somewhere it shouldn’t be.

This isn't accurate. If I reproduce GPL code in your codebase, that's perfectly acceptable as long as you obey the terms of the GPL when you go to distribute your code. In this hypothetical, my act of copying isn't restricted under the GPL license, it's your subsequent act of distribution that triggers the viral terms of the GPL.

The big question that is still untested in court is whether Copilot itself constitutes a derivative work of its training data. If Copilot is derivative then Microsoft is infringing already. If Copilot is transformative then it is the responsibility of downstream consumers to ensure that they comply with the license of any code that may get reproduced verbatim. This question has not been ruled on, and it's not clear which direction a court will go.


> The big question that is still untested in court is whether Copilot itself constitutes a derivative work of its training data.

Microsoft has a license to distribute the code used to train Copilot, and isn't distributing the Copilot model anyway, so it doesn't matter whether the model itself infringes copyright.

Whereas that same question probably does matter for Stable Diffusion.


Given that there's AGPL code in Copilot's training data, it does still matter if Copilot is derivative.


Technically this GitHub license term means you grant an extra license to GitHub whenever you upload it:

https://docs.github.com/en/site-policy/github-terms/github-t...

As in " including improving the Service over time...parse it into a search index or otherwise analyze it on our servers" is the provision that grants them the ability to train CoPilot.

(also, in case you're wondering what happens if you upload someone else's code: "If you're posting anything you did not create yourself or do not own the rights to, you agree that you are responsible for any Content you post; that you will only submit Content that you have the right to post; and that you will fully comply with any third party licenses relating to Content you post.")


But you may not have the rights to grant that extra license if CoPilot is determined to violate the GPL, they can yell at you all they want but they will have to remove it, as nobody can break someone else's license for you.

It'll have to be tested in court, but likely nobody actually gives a shit.


> But you may not have the rights to grant that extra license if CoPilot is determined to violate the GPL

Which is why that second provision is there to shift liability to you. You MUST have the ability to grant GitHub that license to any code you upload. If you don't, and MS is sued for infringing upon the GPL, presumably Microsoft can name you as the fraudster that claimed to be able to grant them a license to code that ended up in copilot.


Microsoft is selling a service to put potentially copyrighted works into one's code, stripped of and disregarding its original license.


How is that different from a consultant who indiscriminately copies from Stack Overflow?

Tangent to that is the "who gets sued and needs to fix it when a code audit is done?"

Ultimately, the question is then "who is responsible for verifying that the code submitted to production isn't copying from sources that have incompatible licensing?"


The consultants would have to knowingly copy from somewhere. One can hope they're educated on licensing, at least if they expect to get paid.

If Microsoft is so confident in Pilot doing sufficient remixing then why not train it on their own internal code? And why put the burden of IP vetting on clients who have less info than Pilot?


> How is that different from a consultant who indiscriminately copies from Stack Overflow?

and how is that different from a student learning how to code off stackoverflow (or anywhere else for that matter), then reproducing some snippets/learnt code structure, in their employment?


That's also an excellent example.

Or a random employee copies some art work that is then published ( https://arstechnica.com/tech-policy/2018/07/post-office-owes... ). You will note all the people that didn't get in trouble there - neither the photographer who created the image, nor Getty in making it available, nor the random employee who used it without checking its provenance.

In all of these cases, it is (or would be) the organization that published the copyrighted work without doing the appropriate diligence on checking what it is, if it would be useable, and how it should be licensed.

> The Post Office says it has new procedures in place to make sure that it doesn't make a mistake like this again.

... which is what companies who make use of AI models for generating content (be it art or code) should be doing to ensure that they're not accidentally infringing on existing copyrighted works.


Microsoft just calls others' code «public code». Public code is not the same as public domain.


Pilot is regurgitating snippets of code still under copyright and not in the public domain. Some may consider publicly available code fair use, but the fact that they're selling access for commercial use may undercut that argument.


How would you regulate this?


There is a part of Deep Learning research (Differential Privacy) which focuses on making sure an algorithm cannot leak information about the training set, and this is a rigorous concept, you can quantify how much privacy-preserving a model is, and there are methods to make a model "private" (at the cost of performance I think for now)


Differential Privacy only proves that the model cannot leak more than a certain amount of information about individual samples of the training set. This only guarantees the input is not leaked back exactly; any composition of the training set is still valid output, although in image generation this usually means a very distorted image.

An example of DP in image generation (using GANs): https://par.nsf.gov/servlets/purl/10283631
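To make the mechanism concrete: the standard recipe (DP-SGD) bounds each training example's influence by clipping its gradient, then adds Gaussian noise calibrated to that bound before averaging. A toy numpy sketch (the function name and constants are mine, not from any particular DP library):

```python
import numpy as np

def dp_sgd_step(per_example_grads, clip_norm=1.0, noise_multiplier=1.1, rng=None):
    """One differentially private gradient step (schematic).

    Clips each example's gradient to bound its individual influence,
    then adds Gaussian noise scaled to that bound before averaging.
    """
    if rng is None:
        rng = np.random.default_rng(0)
    clipped = []
    for g in per_example_grads:
        norm = np.linalg.norm(g)
        # Bound any single training example's contribution to clip_norm.
        clipped.append(g * min(1.0, clip_norm / max(norm, 1e-12)))
    total = np.sum(clipped, axis=0)
    # Noise calibrated to the clipping bound masks individual examples.
    noise = rng.normal(0.0, noise_multiplier * clip_norm, size=total.shape)
    return (total + noise) / len(per_example_grads)

grads = [np.array([3.0, 4.0]), np.array([0.1, -0.2])]  # per-example gradients
update = dp_sgd_step(grads)
```

The performance cost mentioned above shows up directly here: the clipping biases large gradients and the noise degrades the update, which is the privacy/utility tradeoff you tune via `clip_norm` and `noise_multiplier`.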


AI image generators also often churn out near-exact replicas of their inputs. For example:

Original: https://static-cdn.jtvnw.net/ttv-boxart/460636_IGDB-272x380....

Copies: https://lexica.art/?q=bloodborne


The AI image generator is revealed to be a lossy compression algorithm which can recall near-identical images to the ones it was trained with. Therefore, the software is conveying copyrighted works. If somebody gave you the model, they violated copyright in doing so. If somebody runs the model on a server, they violated copyright in transmitting the image to you. If you, the recipient of that copyrighted work, go on to redistribute it, you have also violated the copyright. I don't see any difference between these image generators and code generators.


Exact replicas are an issue. If you are using AI image generation to replicate a near-exact image, then that's illegal. But nobody cares if you copy a nice code pattern from GPL code and apply it to your own code base. In the same fashion, nobody should care if you make an image in the same art style.


Inexact replicas are also an issue, otherwise there would be no issue with distributing MP3s of an Audio CD, as it's a lossy format that is only close to the original.

I suspect the courts will treat AI more like a "black box" - they won't care how or why your black box can perfectly play Metallica, only that it does.


> But nobody cares if you copy a nice code pattern from a GPL code and apply it to your own code base

this is not true, smh. See Oracle vs. Google.


Oracle v. Google involved actual Java (declarations) code being copied; that it was a derived work wasn't seriously disputed there.


CoPilot will return actual GPL code verbatim.


Yes, and that's why I personally believe that the model itself should be considered a derived work of such code. But OP was specifically talking about "code patterns".


Copying the code verbatim would fall under copying the code pattern. You talking about changing the name of the variables or something?


> if an artist spends years learning to paint in the style of Michelangelo is that immoral?

I'd say that artist has gained a lot by studying Michelangelo, including an appreciation for what Michelangelo himself accomplished and insights into how to paint as well or better, and maybe even how to teach that to other people. I don't think we get those benefits from AI models doing that (at least not yet!)


I think we're kidding ourselves to think that some nebulous concept of "the artist's journey" somehow informs the end result in a way that is self-evident in human-produced digital art. Just as with electric signals in the "brain in a vat" thought experiment, with digital art it's pixels. If an algorithm can produce a set of pixels that is just as subjectively good as a human artist, then nobody will be able to tell - and most likely the average person just won't care.

On the other hand, I would say that traditional mediums (especially large format paintings) are relatively safe from AI generation/automation - for now.


> On the other hand, I would say that traditional mediums (especially large format paintings) are relatively safe from AI generation/automation - for now.

Why do you think that? I think large format paintings might be in just as much danger.

There’s a large industry of talented artists in China, Vietnam, etc who copy famous artworks by hand for very low prices. They’re easily accessible online: you upload an image and provide some stylistic details and the artist does the hard work of turning the image into brush strokes. It’s not “automated” but I’ve already ordered one 4’x2’ AI generated painting in acrylic relief for less than the cost of a 1’x1’ from a local community gallery. I put in quite a bit of work inpainting the image to get what I want but it would have been completely impossible to get what I want even six months ago.

I’ve only ever purchased half a dozen artworks in my life and they were all under a few hundred bucks but with this new tech, it just doesn’t make sense to buy an artists’ original work unless it’s for charity. The AI can do the creative work the way I want and there are plenty of artists who are excellent at the mechanical translation (which still requires a lot of creativity, mind)


You don't even have to go to China - I had a very nice painting painted from a photograph for a friend done by another friend's mom who just like painting landscapes.

It looked great and all I had to do was pay for supplies, which was still less than the cost of the framing.


I didn't know there was an industry for that, I guess I should have figured. I might look into that for my own purposes. Although, for what it's worth, when I said "large format paintings" I was thinking of very large paintings - like Picasso's Guernica - larger than something the average person would have hanging in their home. To the point that the cost of producing and transporting it is large enough that a buyer is more likely to take a personal interest in the artist, and much less likely to knowingly purchase something AI-generated or otherwise automatically produced.


That is simply a version of the GP's "artists who are excellent at the mechanical translation".

Want someone to paint the ceiling in your mega mansion? Sure.

But now the creative bit can be done by you - or your 8 year old - if you like.


I think we're kidding ourselves to think that clustering features of existing works and iteratively removing noise based on that clustering is somehow comparable to building up human experiences and expressing them through art.

Using the "brain in a jar" thought experiment, you're making the assumption that the iterative denoising process is equivalent to the way the "brain in the jar" would generate art. Since the question is whether or not the processes are equivalent, it seems nonsensical to have to assume their equivalence for your argument.


I don't think the artist's journey necessarily informs the end result in some way - but I believe it can be an important experience for the artist. Then again, artists can still do this in the era of generative art - there's just not as much chance of being rewarded for it. If this leads to fewer people wanting to explore art, then I think we've lost something. But it's not clear to me where things are headed, I guess. This could be a huge boon in letting people who otherwise lacked the artistic ability explore new ways of expressing themselves.


In retrospect I think I may have been overly pessimistic.


And perhaps more importantly regarding (1) than simple regurgitation: code does things. There's a real risk that if you just let Copilot emit output without understanding what that output does, it'll do the wrong thing.

Art is in the eye of the beholder. If the output looks correct as per what you're looking for, it is correct. There's no additional layer of "Is it saying what I meant it to say" that is relevant to anyone who isn't an art critic.


Art is in the eye of the beholder, but it still needs a creator.

That creator had a vision in mind that's unique to them because of their experiences, and I think it's wrong to say that this image can be quantified as a location in an abstract feature space.

So to say "there is no additional layer [to judge goodness] that is relevant to [most people]" assumes there is an algorithmic measure of "goodness" that can be applied to art, which is an assumption you need to make to believe that there's any similarity between AI generated art and human generated art other than "they look kinda similar".


Until 100 years from now, when more general-purpose AIs are having what could be described as experiences, and can be asked to draw a picture of how they feel when thinking about being unplugged/death.

We hoomans love to think we're special, but quantum tubules etc besides, we really are just biological computers running a program developed over our evolutionary/personal histories.


Sure, in a future 100 years from now when AI is an actual general AI and not the specialized algorithms we have today, one might be able to argue that it does things the same way as a person (although one would hope it does them better instead, since that's the goal).

Until then we are special and we're just pretending these specialized algorithms are replicating the things we don't even understand. Anthropomorphising the algorithms by saying they "learn" and "feel" and "experience" is us as humans trying to draw parallels to ourselves as we find our understanding inadequate to explain what's actually going on.


I'm pretty sure that there's a considerable amount of art hanging in museums that was done by students of great artists. I think there are several Mona Lisas, done by da Vinci's students, and they are almost identical to the original.


In fact it's well known that successful artists would have studios with their students churning out art that they'd apply the direction and final touches to.

Which are which has been lost to time in some cases, and the art world is filled with dissertations on it.


> an artist spends years learning to paint in the style of Michelangelo is that immoral?

This is a deceptive comparison. A human learns a style and adds their own ideas. Their ideas are affected by their mood, schooling, beliefs and coffee this morning.

AI has only the training dataset. If you trained an AI on 1000 copyrighted pictures, the AI can't add its own ideas, it can only remix pixels from the stolen work of other artists.

This is basically like money laundering: as if you melted down stolen gold coins, minted new coins, and then claimed the gold is yours because you made them.


I wouldn't dismiss it so fast. I've seen SD generate some quite creative images, and original ones, as far as I've been able to determine by searching the training dataset. One example was asking for a picture of someone riding a Vespa, and one of the images had the rider wearing the Vespa fenders as a helmet, louvers and all. I don't see what else to call that but the AI's "own idea".


By deconstructing the "decisions"(to use a disgusting anthropomorphism) that led to either image we can dismiss the "I don't understand, so it must be doing something greater than it is" rhetoric.

The decisions leading up to the human art is the entire human experience leading up to the creation of the art(and possible context afterwards), which we as people tend to put value on.

The "decisions" leading up to the AI art are a series of iterative denoising steps that attempt to recover an image from noisy data by estimating how much the noise differs from the "good looking" image.

So for your "vespa fenders as a helmet" drawing, I don't think that constitutes an algorithm being "creative". If a human were to make the same picture we could rationalize that they're being creative because we can imagine a path where their human experiences led to a new idea. Since the algorithm was only ever made to denoise an image based on its abstract feature-space representation I don't see any way we could rationalize that it created a new idea. The algorithm never "thought" it should use a fender as a helmet, it only found that the best way to denoise the current image to the one described in feature-space was to remove pixels that resulted in the image.

Don't humanize algorithms. They're applied statistics, not a sum of human experiences.
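For reference, the loop being described is roughly the following. This is a toy sketch with a stand-in "noise prediction"; a real sampler uses a trained network conditioned on the prompt and timestep, plus a proper variance schedule:

```python
import numpy as np

def toy_denoise_loop(shape=(8, 8), steps=50, rng=None):
    """Schematic DDPM-style sampling: start from pure noise and
    repeatedly subtract an estimate of the remaining noise."""
    if rng is None:
        rng = np.random.default_rng(0)
    x = rng.normal(size=shape)   # start from pure Gaussian noise
    target = np.zeros(shape)     # stand-in for what the network was trained toward
    for t in range(steps, 0, -1):
        predicted_noise = x - target            # a real model predicts this from (x, t, prompt)
        x = x - (1.0 / steps) * predicted_noise  # take a small denoising step
        if t > 1:
            x = x + 0.01 * rng.normal(size=shape)  # fresh noise keeps sampling stochastic
    return x

img = toy_denoise_loop()
```

Each iteration is just an estimate-and-subtract update over pixel values; whether that process deserves words like "idea" is exactly the disagreement in this thread.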


If a calculator adds 2 and 2 and shows 4, is that disgustingly anthropomorphizing the word "add"? If we need a separate word for every informational process, it's going to get awfully messy.

When an idea "pops" into your head, how was that made? Couldn't it also be a similar denoising of patterns in synaptic potentials? We know from many experiments that what something feels like can be quite different from what it actually is.

Is it only that we don't know the exact brain process that makes humans special? And once we inevitably do figure it out, does all human art become meaningless too? I think we need to learn to disconnect process from result and just enjoy the result, wherever it came from.


Those "own ideas" are mostly affected by other things the human has seen. Way more than by the coffee.

The difference between a human and an AI model is that the AI can focus entirely on seeing, while the human has to do non-art stuff as well.


If you are willing to throw out the moral reason, then the legal reason is just an empty rule.


There are many legal reasons without moral force behind them beyond "we need to agree on one way or the other" - such as which side of the road to drive on.


In your example, both sides are equally acceptable and we just pick one. How does this apply to the present case?


We made a decision years ago around copyright (we've modified it since, but the general concept is "promote the arts by letting artists have reproduction rights for a time"). We could change that in various ways if we wanted to, even removing copyright entirely for "machine-readable computer code" and leaving protections to trade secrets. Even if you argue "no copyright at all is immoral" or "infinite copyright is immoral", it's hard to argue that exactly the author's life + 50 years is the only moral option.

Switching the rules on people during the game is what annoys/angers people, and is basically what these AIs have done (because they've introduced a new player at low effort).


But haven’t we seen examples of generative art that are substantially similar to original artwork and examples where AI regurgitates blocks of art (with watermarks!?)


Re. Point 2:

Artists are granted copyright for their work by default per the Berne Convention. These copyrighted works are then used without consent of the original author for these models.

Additionally, the argument that you can't copyright a style is playing fast and loose with most things that are proprietary, semantically.


A key part of the concept of copyright is that having copyrighted works used without consent is perfectly fine. Copyright grants an exclusive right to make copies of the work. It does not grant the author control over how their work is used, quite the opposite, you can use a legitimately obtained copy however you want without the consent of the author (and even against explicit requirements of the author) as long as you are not violating the few explicitly enumerated exclusive rights the author has.

You do not need an author's consent to dissect or analyze their work or to train a ML model on it, they do not have an exclusive right on that. You do not need an authors consent to make a different work in their style, they do not have an exclusive right on that.


I feel there's a lot missing from this, and some terminology would require clarification (What constitutes "used"?).

Generally speaking, this supposition skirts around the concept of monetizing the work of others, and seems at odds with what the Berne Convention stipulates in that context; arguably it seems in violation of points 2 and 3 of the three-step test.

That's to say nothing regarding the various interpretations on data scraping laws that preclude monetizing outputs.

I don't feel it's that black and white, personally...


By "used" I mean any use that does not involve copying or reproduction.

The Berne three-step test specifies when reproduction is permitted; however, any use that does not involve reproducing the work is not restricted, and monetization does not matter. It's relevant for data-scraping laws because there you are making copies of the protected work.


> Additionally, the argument that you can't copyright a style is playing fast and loose with most things that are proprietary, semantically.

This has been true for as long as copyright has existed: Braque couldn't copyright cubism. Picasso saw what he was doing and basically copied the style, with nothing to be done aside from not letting him into the studio.


But if I train my own neural network inside my skull using some artist's style, that's ok?

Either a style is copyrightable or it's not. If it's not, then I can't see any argument that you can't use it yourself or by proxy.


The brain-computer metaphor is not a very good one, it's a pretty baseless appeal. Additionally, it's an argument that anthropomorphizes something which has no moral, legal, or ethical discretion.

You do not actively train your brain in remotely similar ways, and you, as an individual, are accountable to social pressures - an issue these companies are trying to avoid with ethically questionable scraping/training methods and research loopholes.

Additionally, many artists aren't purely learning from others to perfectly emulate them, and it's quickly spotted if they are, generally. Lessons learned do not implicitly mean you perfectly emulate that lesson. At each stage of learning, you bias things through your own filter.

Overall, the idea that these two things are comparable feels grotesque and reductionist, and feels quite similar to the "Well, I wasn't going to buy it anyway" arguments we've been throwing around for decades to try to justify piracy of other materials.

At the end of the day, an argument that "style can't be copyrighted" ignores a lot of aspects of its definition, including the means, and can be extrapolated into an argument that nothing proprietary should be allowed to exist...


> Overall, the idea that these two things are comparable feels grotesque and reductionist

I agree with you there but the alternative - that they’re not comparable - I find equally grotesque and full of convenient suppositions rooted in romanticism of “the artist”. We’re in uncharted territory with AI finally lapping at the heels of creative professionals and any analogy is going to fall apart.

This feels like something that we should leave to the courts on a case by case basis until there's enough precedent for a legal test. The question at the end of the day should be about harm and whether an AI algorithm was used as a run-around of a specific person's copyright.


Good points.

I was actually just sitting in an AI Town Hall hosted by the Concept Art Association, which had 2 US Copyright Lawyers who work at the USCO present, and their thinking was currently along similar lines.

Basically, like you specified, legal precedent needs to be built up on a case by case basis, and harm can pretty readily be demonstrated, at least anecdotally, especially as copies are made during training of copyrighted work.

Unfortunately, historically, artists do not generally enjoy the same legal representation or resources that unionized industries with deeper pockets enjoy. It's probably one of the reasons Stability.Ai are being so considerate with their musical variant.

It would have been great if artists were asked before any of this. I could see this going in such a different direction if people were merely asked...


I'm an artist and I work in tech - I'd be very interested in working with the models if I didn't find the idea of using something made out of the labor of my peers repulsive.

Call me a training-set vegan, any model made from opt-in and public domain images I'd use in a heartbeat.


> But if I train my own neural network inside my skull using some artist's style, that's ok?

How well can the network inside your skull manipulate your limbs to reproduce good-quality work in some artist's style?

Our current frameworks for thinking about "fair use", "copyright", "trademark" and the like came into existence during an era when the options for the "network inside the skull" were to laboriously learn to draw, or to learn to use a machine like a printing press/photocopier that produces exact copies.

Availability of a machine that automates previously hand-made things much more cheaply or is much more powerful often requires rethinking those concepts.

If I copy a book putting ink on paper letter by letter manually, that's ok, think of those monks in monasteries who do that all the time. And Mr Gutenberg's machine just makes that ink-on-paper process more efficient...


>How well the network inside your skull can manipulate your limbs to reproduce good-quality work in some artist's style?

An experienced artist can probably do this in a couple weeks, depending on how complex the style is.

>If I copy a book putting ink on paper letter by letter manually, that's ok, think of those monks in monasteries who do that all the time.

According to copyright, no, that's not okay. Copyright does not care about the method of reproduction, it just distinguishes between authorized and unauthorized reproduction. A copyist copying a book by hand without authorization is just as illegal as doing it with a photocopier. Likewise, if you decide to copy a music CD using a hex editor and lots of patience, at the end of the process you will end up with a perfectly illegal copy of the original CD.

So the question stands. Why is studying artwork with eyeballs and a brain and reproducing the style acceptable, but doing the same with software isn't?


Unless you are in fact a living and breathing cyborg [in which case, congratulations], the wetware inside your head is not analogous to the neural networks that are producing these images in any but the most loosely poetic sense.


No? The mechanisms are different but the underlying idea is the same: identify important features and replicate those features in a new context. If an AI identifies those features quickly, or if I identify them over a lifetime, what's the difference? If I do that, you might say my work is derivative, but you won't sue me. Why is it different if an AI does it?


This comment answers your questions:

https://news.ycombinator.com/item?id=33425414


Not particularly. Parent post is not concerned with or making any claims to special knowledge of the internal details of the modelling in the mind or in the machine, only the output.


> The mechanisms are different but the underlying idea is the same

no.

they are the same as asking a person to say a number between 1 and 6, then asking the same question of a die and concluding that people and dice work the same.

> identify important features and replicate those features in new context

untrue

if you think that that's what people do, obviously you can conclude that AI and humans are similar.

But people don't identify features. People first of all learn how to replicate the strokes mechanically, using the same tools as the original artists, until they are able to do it. Most of the time people fail and reiterate the process until they find something they are actually very good at, and only after that do the good ones develop their own style.

That style is based either on some artistic style or some artistic meaning.

But the first difference we learn here is that humans can fail to replicate something and still become renowned artists.

An AI cannot do that.

Not on its own.

For example, many probably already know, but Michelangelo was a sculptor.

He was proficient as a painter too, but painting wasn't his strongest skill.

So artists, first of all, are creators, not mere replicators, in many different forms. They are not good at everything in the same way, but their knowledge percolates into other fields related to theirs: if you need to make preparatory drawings for a sculpture, you need to be good at drawing and probably painting (lights, shadows, mood, expressions are all fundamental for a good sculpture).

Secondly, the features artists derive from other art pieces are not the technical ones needed to make an exact replica of the original, but those that make it special.

For example, in the case of Michelangelo, the Pietà has some features that an AI would surely miss.

First of all, the way he shaped the marble was unheard of; it doesn't mean much if you don't contextualize the work and immerse it in the historical period in which it was created.

An AI could think that Michelangelo and Canova were contemporaries, while they were separated by 3 centuries, which makes a lot of difference in practice and in spirit.

But more importantly, Michelangelo's Pietà is out of proportion: he could not render the two figures at the correct scale, proving that even a genius like him could not easily create a faithful reproduction of two adults, one in the lap of the other, with the tools of the 16th century.

The Virgin Mary is very, very young, which was at odds with her role as a grieving mother and, the most important of them all, the Christ figure is not suffering, because Michelangelo did not want to depict death.

An AI would assume that those are all features of Michelangelo's way of sculpting, but in reality they're the result of a mix of the complexity of the work, the time when it was created, the quality and technology of the tools used, and the artist's intentions, which makes the work unique and, ultimately, irreproducible.

If you used an AI to reproduce Michelangelo, everybody would notice, because it's literally something a complete noob or someone with very bad taste would do.

So to hide the difference, you would have to copy the works of lesser-known artists, making it even more unethical.


Respectfully, you're raising a whole lot of arguments here that have nothing to do with any point I was raising, and it doesn't seem to be moving this discussion forward in any significant way. The point of this subthread was a user saying the following:

>But if I train my own neural network inside my skull using some artist's style, that's ok?

This post and others use a lot of flowery language to point out that we train artificial neural networks and real neural networks in different ways. OK, great. I don't think anyone is saying that's not true. What I am saying is that it's irrelevant.

If I am an exceptional imitator of the style of Jackson Pollock and I make a bunch of paintings that are very much in that style but clearly not his work, I'm not going to be sued. My work will be labeled, rightfully so, as derivative, but I have the right to sell it because it's not the same thing. Is that somehow more acceptable because I can only do it slowly and at a low volume? What if I start an institute whose sole purpose is training others to make Jackson Pollock-like paintings? What if I skip the people and make a machine that makes a similar quality of paintings with a similarly derivative style? Is that somehow immoral / illegal? Why?

There's a whole lot of hand-wavey logic going on in this thread about context and opera and special human magic that only humans can possibly do, and that somehow makes it immoral for an AI to do it. I have yet to see a simple, succinct argument for why that is the case.


> This post and others uses a lot of flowery language to point out that we train artificial neural networks and real neural networks in different ways. OK, great. I don't think anyone is saying that's not true. What I am saying is that it's irrelevant

Maybe I was too aulic.

The point is: you don't train "your artificial intelligence", because you're not an artificial intelligence, you train your whole self, that is a system, a very complex system.

So you can think in terms of "I don't like death, I don't want to display death"

You can learn how to paint using your feet, if you have no hands.

You can be blind and still paint and enjoy it!

An AI cannot think of "not displaying death" in someone's face, not even if you command it to do it, because it doesn't mean anything, out of context.

> Jackson Pollock

Jackson Pollock is the classic example to explain the concept: of course you can make the same paintings Jackson Pollock made.

But you'll never be Jackson Pollock, because that trick works only the first time, if you are a pioneer.

If you create something that looks like Pollock, everybody will tell you "oh... it reminds me of Jackson Pollock..." and no one will say "HOW ORIGINAL!"

Like no one can ever be Armstrong again, land on the Moon and say "A small step for man (etc etc)"

Pollock happened, you can of course copy Pollock, but nobody copies Pollock not because it's hard, but because it's cheap AF

So it's the premise that is wrong: you are not training, you are learning.

They are very different concepts.

AIs (if we wanna call them "intelligent") are currently just very complex copy machines trained on copyrighted material.

Remove the copyrighted material and their output would be much less than unimpressive (probably a mix of very boring and very ugly).

Remove the ability to watch copyrighted material from people and some of them will come up with an original piece of art.

It happened many times throughout history.


You're typing a lot in these posts but literally every point you're making here is orthogonal to the actual discussion, which is why utilizing the end product of exposing an AI to copyrighted material and exposing a human to copyrighted material are morally distinct.


> which is why utilizing the end product of exposing an AI to copyrighted material and exposing a human to copyrighted material are morally distinct.

sorry for writing in capital letters, maybe that way they will stand out enough for you to focus on what's important.

WE ARE NOT AIS

an AI is the equivalent of a photocopier, or of sampling a song to make a new song. There are limits on how much you can copy/use copyrighted material, limits that do not apply TO YOUR EARS, because you hearing a song does not AUTOMATICALLY AND MECHANICALLY translate into a new song. You still need to LEARN HOW TO MAKE MUSIC, which is not about the features of the song, it's about BEING ABLE TO COMPOSE MUSIC.

which is not what these AIs do: they cannot compose music, they can only mix and match features taken from copyrighted material into new (usually not that new, nor good) material.

If we remove the copyrighted material from you, you can still make music.

You could be deaf and still compose music.

If we remove copyrighted material from AIs they cannot compose shit.

Because the equivalent of a deaf person for an AI that create music CANNOT EXIST - for obvious reasons.

So AIs DEPEND ON copyrighted material, they don't just learn from it, they WOULD BE USELESS WITHOUT IT.

and morally the difference is that THEY DO NOT PAY for the privilege of accessing the source material.

They take, without giving anything back to the artists.

They do not even ask for the permission.

is it clearer now?


I'll try to address your underlying thought, and hope I'm getting it right.

I think you are right to be skeptical and cautious in the face of claims of AI progress. From as far back as the days of the Mechanical Turk, many such claims have turned out to be puffery at best, or outright fraud at worst.

From time to time, however, inevitably, some claims have actually proven to be true, and represent an actual breakthrough. More and more, I'm beginning to think that the current situation is one of those instances of a true breakthrough occurring.

To the surface point: I do not think the current proliferation of generative AI/ML models are unoriginal per se. If you ask them for something unoriginal, you will naturally(?) get something unoriginal. However, if you ask them for something original, you may indeed get something original.


> If we remove copyrighted material from AIs they cannot compose shit.

I wonder in what way you mean that? In any case, the latest stable diffusion model file itself is 3.5 GB, which is several orders of magnitude less than the training dataset.

It probably doesn't contain much literal copyrighted data.


You're making much more concise arguments now, I think that makes the discussion more useful and interesting.

I would take the position that it's self evident that if you take the 'training data' away from humans they also can't compose music. If you take a baby, put it in a concrete box for 30 years (or until whatever you consider substantial biological maturity), and then put it in front of a piano it's not going to create Chopin. It might figure out how to make some dings and boops and will quickly lose interest.

Humans also need a huge amount of training data and we, at best, make minor modifications to these ideas to place them into new context to create new things. The difference between average and world class is vanishingly small in terms of the actual basic insight in some domain. Take the greatest composers that have ever lived and rewind them and perform our concrete box experiment and you'll have a wild animal, barely capable of recognizing cause and effect between hitting the piano and the noise it makes.

That world class composer, when exposed to modern society, consumed an awful lot of media for 'free' just by existing. Should they be charged for it? Did they commit a copyright infraction? Why or why not?


You are romanticizing brains. Please stick to logical arguments that can be empirically tested.


I feel like a broken record on this topic lately, but I strongly believe that training ML models on copyrighted works should be legal.

It is clear to anyone that understands this tech that it is not simply "memorizing" or "copying" the training data, even if they can be coaxed into doing this for certain inputs (in the current iteration of the tools).

Ultimately, I think the problem of reproducing certain popular works or code snippets will be solved. One interesting direction here is the toolset of information theory and differential privacy, e.g., proving that certain training inputs cannot be recovered from the weights, or that there is a threshold on how much information can be gleaned from a single input.

It is easy to imagine (because it's nearly there already) a future version of StableDiffusion (or CoPilot) which provably compresses all training data beyond any possibility of recovery, and yet still produces extremely convincing results which disrupt the creative professions of art and programming.
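To make the differential-privacy idea concrete, here is a toy sketch of the classic Laplace mechanism: add noise calibrated to a query's sensitivity so that no single training input's contribution can be confidently recovered. (This is my own minimal illustration, not any library's API; real private training works differently, e.g. per-example gradient clipping plus noise as in DP-SGD.)

```python
import math
import random

def laplace_mechanism(true_value, sensitivity, epsilon):
    """Release true_value plus Laplace noise with scale sensitivity/epsilon."""
    scale = sensitivity / epsilon
    u = random.random() - 0.5
    # Inverse-CDF sampling of a Laplace(0, scale) variate.
    noise = -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))
    return true_value + noise

# A counting query over a training set ("how many images match X?").
# Sensitivity is 1: adding or removing one image changes the count by at most 1.
print(laplace_mechanism(true_value=1000, sensitivity=1, epsilon=0.5))
```

Smaller epsilon means more noise and a stronger (provable) bound on what any observer can infer about one input.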

Until we get to that point, it feels like the only consistent and sensible place to apply regulations is with the model end user. When I use CoPilot, I accept the small (honestly overblown) risk that _maybe_ I won't have the license to use some small snippet of code it spits out. But I'm happy to wear that responsibility, because the boost to productivity is so great that I dread a return to pre-CoPilot world. That is, a world where everyone keeps reinventing the same solution to simple problems over and over and over again.


Training being fair use is something I can buy into.

As for actually using the model... personally I still find that unacceptable. Even if the risks are low, we have lots of works in the model, so the risk can still add up.

The idea you have in your head of training without regurgitation is likely not possible. The underlying technology treats the training set as gospel: the system is trained to regurgitate first, and generalization is a happy accident. Likewise, we can't look into a model to check and see what it's memorized, nor can we trace back an output to a particular training example. Which has ethical implications with the way that AI companies crawl the web to get training data; such models almost certainly hold someone's personal information and there's no way to get it out of there aside from curating your training set to begin with.


I mean, in a sense yes the training set is gospel. But these systems are also (generally) tested against held out data.

When you have to model 100 TB of images with 4 GB of weights, there is no way this is possible without learning some kinds of patterns and regularities that generalise outside the training set. Most generated items will be novel, and most training items will not be reproducible.

It doesn’t seem radical to suggest that the copying issue will continue to recede as we get better models.

And there are areas of research specifically concerned with _provably_ showing that you cannot identify which items a model was trained on.

Lots of reasons to be optimistic in my view.
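For what it's worth, the capacity arithmetic behind that claim fits on a napkin. The figures below are assumptions (roughly 100 TB of source images, a LAION-scale count of ~2 billion images, a ~4 GB weights file), not official numbers:

```python
# Back-of-envelope: how much model capacity exists per training image?
# All three figures are assumptions, not official dataset statistics.
dataset_bytes = 100e12   # ~100 TB of source images
num_images = 2e9         # ~2 billion images (LAION-scale)
model_bytes = 4e9        # ~4 GB of weights

bits_per_image = model_bytes * 8 / num_images
avg_image_kib = dataset_bytes / num_images / 1024

print(f"~{bits_per_image:.0f} bits of model capacity per training image")
print(f"~{avg_image_kib:.0f} KiB per source image on average")
# A handful of bits cannot store a tens-of-KiB image, so wholesale
# memorization of typical inputs is ruled out; only shared patterns fit.
```

Under these assumptions that's about 16 bits per image, which is why memorization can only happen for inputs that are heavily duplicated in the training set.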


Yup. There’s also this, FTA:

> the original images themselves aren’t stored in the Stable Diffusion model, with over 100 terabytes of images used to create a tiny 4 GB model

Is jpeg compression transformative then? Should a compressed image of something not be copyrightable because “it doesn’t store” the “real image”? How about compressed video? Where do we draw the line?


The difference is that JPEG does store the real image, at least close enough to within the given tolerance (determined by the compression factor). That image is as real as say an image on film (also not exact, nor in "original" form).

With Stable Diffusion it's storing the style, but can't reproduce any single input image-- there aren't enough bits [0]. (except by luck, but that's really true for any storage).

[0] https://en.wikipedia.org/wiki/Shannon–Hartley_theorem


The weights of a NN are just a compressed representation of the training data; think of a lossy zip.

Rank all generated images by similarity to the training data (etc.) and you can see what's stored.

The Shannon-Hartley theorem isn't relevant. A 4 GB zip of 100 TB of text data can exactly reproduce the initial 100 TB for some distributions of that initial dataset.


If you reproduced an exact image (to the same lossy degree as JPEG) using the NN, then you are violating copyright.

But if you reproduced an image whose style matches another copyrighted image (e.g., blah in the style of starry night), then how does that new image (which didn't exist before) violate existing copyright? You cannot copyright a style.

The NN containing information which _could_ be used to reconstruct an exact image doesn't itself constitute copyright violation - because the right to use information for training NN is not an exclusive right that the original holder of the training set has.

So either a new law has to come into existence, vis a vis the right to use copyrighted works to train a NN, or the current copyright laws should apply (which implies that NN generated images which are not "exact" copies of existing works don't violate copyright).


If a given model can consistently reproduce an exact image given the same input prompt, why shouldn't the model itself be considered a compressed form of that image?


Determinism is not a copyright violation.

("If a human artist...")


Right, but underneath your premise is a scam, right?

A NN has not learnt to paint: it doesn't coordinate its sensory-motor system with its environment through play, it hasn't developed any taste, it does not discern the aesthetic good from the bad, it has no judgement, and so on ad infinitum.

A NN is just a kNN with an extra compression step. The way all gradient-based "learners" work is to compute distances to pre-given training data. In the case of kNN that data is used exactly; in a NN it's compressed.

There is no intelligence here, there is no learning: it's a trick. It turns out that interpolating a point between prior examples can often look novel and often fool a human observer.

This is largely due to how incredibly tolerant to flaws we are in the cases where NNs are used to perform this trick. We go to great lengths to impart intention, fix communicative flaws, etc. and this is exploited by "AI" to make simple crap seem great by having the observer fill-in the details, perceptually. I see it as a kind of proto-schizophrenia that all people have which usually works if we're dealing with a human, but on everything else produces religions.

In any case, a NN is just a case of a kNN, which is capable of fooling people in exactly the same way, and which clearly violates copyright and constitutes theft of intellectual work to make a product you can sell. Adding compression seems irrelevant.


I don't think this interpretation of NNs is correct. There have been a few papers purporting to show this, but as far as I recall they used a very tortured definition of "interpolation".

Stable Diffusion is certainly capable of differentiating good from bad. That's why you can tell it to draw good or draw bad.

Not that this point is relevant to my comment. "Play", "taste" and "judgment" can be just as deterministic as a sequence of large matrix operations interspersed with nonlinear layers.


Sure, but then who is torturing matrices to turn them into organic bodies which adapt their musculature to their environment?

Interpolation is forced in the case of NNs, it's a training condition.

And the "kNN interpretation" isn't an interpretation; kNNs define what "ideal learning" is in the case of statistical learning, and hence show that it doesn't count as actual learning.

In actual learning we're not interested in whether you can solve prespecified problems but how well you cope when you can't. This is, by definition, not a problem which can be formulated in statistical learning terms and the particular "learning" algorithm here is irrelevant.

In other words, accuracy isn't a test of learning. Accuracy is a "non-modal condition": being fit to a history that actually took place. Learning "in the usual sense" is strictly a modal, "what if" phenomenon, and is assessed by the quality of failure under adverse conditions, not of success.

If one gave any AI/ML system in existence adverse conditions, posed relevant "what ifs" and observed the results, they'd be exposed as the catastrophes they are. None survives even a basic test of "coping well" in these cases.

This is why all breathless AI public relations, i.e., academic papers published in the last decade, do not perform any such tests.


so does the number pi constitute a copyright infringement?


No, since it's not a derived work.

And if you can come up with a model that can reproduce images exactly without first getting trained on them, it wouldn't be a derived work, either.


> No, since it's not a derived work.

but why is the set of numbers in a matrix considered derived?

I can trivially derive the number pi.

My original point is that just because something contains the information does not imply that it violates copyright.


Because said set of numbers is produced via a training process that has the original as an input, and a different input would produce a different set of numbers.

You're correct that merely containing the information would not violate copyright - it's all about how that information was produced.


Because that new image wasn't in the training set?


With the examples that we're seeing, the images were in the training set.


I'm actually seeing plenty of new images that are in the same style but are different from any of the images in the train set, like wonder woman in front of a mountain that looks like the setting of "frozen".


> there aren't enough bits

you can create a compressed copy of a file containing 100 TB of the letter "A" in much less than 4 GB

so there could be enough bits in there to reproduce some of the inputs.
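That's easy to demonstrate with any general-purpose compressor. Here's zlib on a scaled-down version of the "file full of A's" example (10 MB instead of 100 TB, just to keep it runnable):

```python
import zlib

# Pathologically redundant input: 10 MB of the single letter "A".
data = b"A" * 10_000_000

compressed = zlib.compress(data, level=9)
print(f"{len(data)} bytes -> {len(compressed)} bytes "
      f"(~{len(data) // len(compressed)}:1)")

# The compression is lossless: the original is fully recoverable,
# which is exactly the worry for heavily duplicated training inputs.
assert zlib.decompress(compressed) == data
```

The ratio here is well over 100:1, so "model smaller than dataset" alone doesn't rule out exact reproduction of highly redundant or duplicated inputs.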


What this analogy is saying is that if an image is generic and derivative enough (or massively overrepresented in the training data) it may be possible to reconstruct a very close approximation from the model. If the training data is unbiased, I question the validity of copyright claims on an image that is sufficiently derivative that it can be reproduced in this manner.


By that logic all art should be copyrighted, since our brains store a highly compressed version of everything we've seen.

I wager 100 TB => 4 GB is different from JPEG compression and more similar to what happens in our brains: "neural compression", so to speak.


> By that logic all art should be copyrighted, since our brains store a highly compressed version of everything we've seen.

Good thing this is already the case! https://en.wikipedia.org/wiki/Berne_Convention

> The Berne Convention formally mandated several aspects of modern copyright law; it introduced the concept that a copyright exists the moment a work is "fixed", rather than requiring registration. It also enforces a requirement that countries recognize copyrights held by the citizens of all other parties to the convention.


This doesn't make any sense in this context and I think you know it bud.

I was making the point that stable diffusion != JPEG compression.


The fact that it is a different form of compression doesn't change that it's compression. What is the argument here, that a numerical method should have the same rights as a person?


The author of the original post (Andy Baio) found exactly where the line was. He released a (great) chiptune version of the jazz classic Kind of Blue (named Kind of Bloop), fully handled the music copyrights, and was promptly sued by the photographer behind the Kind of Blue cover, who believed that the pixel-art cover of Kind of Bloop was not adequately transformative.

https://waxy.org/2011/06/kind_of_screwed/


> Where do we draw the line?

This is what we have courts and legislation for. I expect there's existing legislation here about what constitutes a different work versus an exact copy but it may need some updates for AI.


If you can't reproduce a quantitatively (not qualitatively) similar likeness of an image then it is not just a compressed image.


Possibly controversial opinion: I think the biggest reason why so many people hold conflicting views on this is because of who the victim is in each case.

The loudest voices complaining they were directly hurt by Copilot's training are open source maintainers. These are exactly the kind of people who we love to root for on here. They're the little guy involved in a labor of love, giving away their work for free (with terms).

On the other hand, the highest-profile victims of Stable Diffusion and DALL-E are Getty Images and company. They're in most respects the opposite of open source maintainers: big companies worth millions of dollars for doing comparatively little work (primarily distributing photos other people took).

Because in the case of images the victim is most prominently faceless corporations, I think our collective bias towards "information wants to be free" shows through more clearly when regarding DALL-E than it does with Copilot.


> On the other hand, the highest-profile victims of Stable Diffusion and DALL-E are Getty Images and company. They're in most respects the opposite of open source maintainers: big companies worth millions of dollars for doing comparatively little work (primarily distributing photos other people took).

It's puzzling to me that you acknowledge people are taking these photos and getting a cut from their use through the marketplace, yet still see Getty as the biggest victim.

If Getty could AI-generate their whole portfolio and keep 100% of the sales to themselves, they'd do it in a heartbeat (and I'd expect them to partially go that route). The most screwed people are the photographers ("the little guy" in your comparison).


I said Getty is the highest profile victim, not the only victim. They're the ones making waves.


If you're in the freelance art-sphere, the victims are also small artists who have been hustling hard to be able to live from their art.


With the caveat of "Strong opinions, weakly held", my personal take is that creating artificial scarcity is inherently immoral and thus copyright itself is immoral. Training AI on someone's non-private work is then completely fine IMO.

Copyleft is a license that weakens copyright (and is thus inherently good :)), so using machine learning to weaken copyleft, by allowing you to copyright "clones" of copyleft code, is bad.

If I try to generalize here, the problem in both cases arises only if you produce copyrighted works, especially if you trained on copyleft works. If instead both models stipulated that all produced works are copyleft, I would be much more fine with it (and I feel it would respect the license of the copyleft works they were trained on, even if that may be legally shaky).


> With the caveat of "Strong opinions, weakly held", my personal take is that creating artificial scarcity is inherently immoral and thus copyright itself is immoral. Training AI on someone's non-private work is then completely fine IMO.

Why do you hold that opinion? There's a few very clear benefits to creating artificial scarcity, mostly around incentivizing creation (and sharing!) of innovations.

If we can't create artificial scarcity around ideas, then ideas are in a sense less monetizable than, e.g., creating a piece of furniture. But this is just an accident of the way the world works - I can physically prevent you from taking a chair that I made, but I can't prevent you from taking an idea I had. Why does it make sense that the world works this way? Isn't it a whole lot better to encourage innovation, vs. encouraging more people to make physical objects, just because those are inherently scarce?

(The other side is that innovations also have the great property that copying them isn't depriving anyone else of use of the original idea, but that's a side issue to the encouraging innovation one, IMO.)


> Why do you hold that opinion? There's a few very clear benefits to creating artificial scarcity, mostly around incentivizing creation (and sharing!) of innovations.

I think it is self-evident why creating artificial scarcity is immoral. The point you are trying to make is that, from a utilitarian standpoint, it is preferable to behave in a way that is immoral at the micro scale because it will create a greater good at the macro scale. If you agree, then I don't think there's any need to justify my belief here :)

That said, I also think that it is unclear that copyright is a net positive increase to creation and sharing of innovations. The current state of monetization is not inspiring since the actual creators usually are not well compensated and money tends to stay with large corporations that are essentially just "right holders". There's also many factors that actively stifle innovation and creativity.

Patents are the most well known example, but being unable to borrow chord progressions or characters or storylines from other works is also stifling (you can't exactly release your "edanm's cut of Spider-Man: Homecoming" publicly, nor can you create your own sequel or alternate interpretation of the story). Quite a few fan games or fan remakes have also met their demise at the hands of aggressive copyright enforcement.

My own suspicion is that if the current models of creation will be disrupted by copyright abolition, we'll just end up seeing that a lot of the money that was spent on them will move to other avenues of funding like Patreon style or Kickstarter style funding for works. We may even see some new models created. I'd also expect that it will actually shift the balance away from large corporations (whose primary value is having lots of money that allows them to hoard rights) to smaller creators which will now have more direct funding available to them.

I also think an interesting case to look at is video games, where the consensus is that a game's mechanics aren't copyrightable. So whenever a new interesting indie game comes out on Steam, there is a rash of other cool indie games with their own takes and remixes of the same concept, like the glut of roguelike deckbuilders after Slay the Spire or the current glut of "Vampire Survivors"-likes that have their own interesting takes on the core idea. Eventually, when there's enough buzz around such ideas, they can even penetrate the AAA sphere (where roguelike elements have slowly started to appear). It is also quite common to see games stay in Early Access for a very long time, essentially letting the community fund the future creation and expansion of the game.

> If we can't create artificial scarcity around ideas, then ideas are in a sense less monetizable than, e.g., creating a piece of furniture. But this is just an accident of the way the world works - I can physically prevent you from taking a chair that I made, but I can't prevent you from taking an idea I had. Why does it make sense that the world works this way? Isn't it a whole lot better to encourage innovation, vs. encouraging more people to make physical objects, just because those are inherently scarce?

So as I said, I'm not sure it really will make ideas less monetizable (or that if it will, that it will do so significantly). Even today I can get any video game I want for free (illegally, but that effectively doesn't matter since no one will prosecute me for it), and yet I still buy video games. In fact, a strong reason why I buy video games today is because as a child I had a friend with easy access to pirated CDs, and I'd play a lot of games at their house, fueling my passion for them.

And relatedly, it is no accident that "software is eating the world", it is exactly because it is so easy to share, the low marginal costs make it easy to have a strong worldwide impact without too much real world effort :)


> I think it is self evident why creating artificial scarcity is immoral, but the point you are trying to make is that from a utilitarian standpoint you think it is preferable to behave in a way that is immoral in the micro scale since it will create greater good in the macro scale. If you agree, then I don't think there's any need to justify my belief here :)

I'll start with the end, I don't think the "immoral on the micro scale" idea makes much sense. Like, you can say it's "bad" or "annoying" on the micro scale, but if it is good for society as a whole to create artificial scarcity, it just isn't immoral for society to provide mechanisms to create it.

I also don't think it's self-evident that it's "immoral" on the small scale (though not sure what that means, since artificial scarcity is kind of a society-level mechanism.)

That said, maybe I'm just bumping on your use of the word immoral and we're not really disagreeing.

> That said, I also think that it is unclear that copyright is a net positive increase to creation and sharing of innovations. The current state of monetization is not inspiring since the actual creators usually are not well compensated and money tends to stay with large corporations that are essentially just "right holders". There's also many factors that actively stifle innovation and creativity.

This has been a talking point of people against copyright for a long time (I've been having these discussions for at least 20 years, personally).

But I think that you're basically wrong. It's pretty easy to see that you're wrong too - just look at the state of news, the state of music, etc. In most cases, artists today make far less money than they made before the rise of pirated alternatives. Patreon/other models/etc have helped some, but nowhere near where things were before.

In fact, you talk about the current state of compensation being bad, but I think that it would make more sense to listen to actual artists about whether or not copyright helps them or not. I've listened to a bunch, and most of them couldn't come close to doing what they love without copyright.

Also, personally, I'm a software dev. Most of what I do on a day-to-day basis is create IP. I'm fairly happy that someone can't just come along and repurpose everything I built, for free. Otherwise, I'm fairly sure I'd be out of a job.

(I'm fairly sure that without copyright/IP, most software we use wouldn't exist either.)


How do you propose to keep artists able to pay their bills and live a decent life if you're completely cool with training AIs on them?

Bonus points if you have any actionable scheme beyond waving your hands and talking vaguely about "basic income".

Keep in mind that the life of a professional artist is currently very perilous, anyone working freelance is constantly battling against the social media giants' desire to keep everyone scrolling their site forever. Words like "patreon" and "commission" and links off-site to places an artist can exchange their works for money are poison to The Algorithm and will be hidden.

And also if I am reading this right, you have absolutely no problem with an image generator that's been trained on copyrighted work producing work that's either copyrighted or copylefted? You are utterly fine with disregarding the copyrights of the original artist and/or whoever they may have assigned the copyright to as part of their contract?


> How do you propose to keep artists able to pay their bills and live a decent life if you're completely cool with training AIs on them?

Why isn't it a concern for any other automation?

How do we progress, exactly, if we randomly decide that nothing can disrupt any of the current ways of earning income?

What about, IDK, coal miners?

> And also if I am reading this right, you have absolutely no problem with an image generator that's been trained on copyrighted work producing work that's either copyrighted or copylefted? You are utterly fine with disregarding the copyrights of the original artist and/or whoever they may have assigned the copyright to as part of their contract?

Copyright maximalism is bad. It also doesn't make any sense. Someone learning to reproduce your capabilities by looking at your stuff isn't violating copyright. If we allowed copyright to somehow mean that someone's skills can't be reproduced...


Paul Samuelson[0], the first American winner of the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel, makes the argument here[1], while discussing the economics of lighthouses, that charging any price other than free for a good with zero marginal cost is by definition an economic loss. Therefore you should find other ways to fund lighthouses, and by extension, all media and software.

If an economic loss is currently occurring, that means that, if copyright is abolished, an economic gain will accrue. Where that economic gain is captured is therefore the core focus. What we want to occur is for society and the author to share in this increased economic gain, what we don't want is for monopolistic rent-seekers to grab all of this value for themselves.

I am not yet sure of this, but I think a land value tax could accomplish this goal. One would then need some market-based method, even if only approximate, to find a weight for each artist or piece of software, and distribute a share of the land value tax revenue to artists and other creators accordingly. A decent way to find this value might be a revenue-neutral (or slightly negative) opt-in sortition process: when you go to use an artist's work, you are entered into an auction in which the 50% who bid more than the median value get to use the work at the median price, and the 50% who bid less do not get to use the work, but receive their bid back in cash. This is surely not the full system, but it is just me working out how we can move forward from such an unjust system as copyright.

[0]: https://en.wikipedia.org/wiki/Paul_Samuelson

[1]: https://courses.cit.cornell.edu/econ335/out/lighthouse.pdf - page 359, first paragraph
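The median-bid sortition mechanism described above can be sketched in a few lines of Python. This is purely illustrative and hedged: the function name, the use of the statistical median, and the tie-breaking rule for bids exactly at the median are my own assumptions, not part of any worked-out proposal.

```python
import statistics

def run_median_auction(bids):
    """bids maps bidder -> bid amount.

    Bidders at or above the median bid win access to the work at the
    median price; the rest don't get access but are refunded their bid
    in cash, per the scheme sketched in the comment above.
    """
    price = statistics.median(bids.values())
    winners = {b for b, amount in bids.items() if amount >= price}
    refunds = {b: amount for b, amount in bids.items() if amount < price}
    return winners, price, refunds
```

For example, with bids of 10, 20, 30 and 40, the median price is 25: the two higher bidders pay 25 each for access, and the two lower bidders get their 10 and 20 back in cash.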


Speaking as another copyright abolitionist,

> How do you propose to keep artists able to pay their bills and live a decent life if you're completely cool with training AIs on them?

1. Software being able to imitate artists doesn't mean people will stop wanting art from other people.

2. If it does, I would say that it's unfortunate, but it's the world we live in. Nobody has a guarantee on being able to make a living doing any conceivable profession whatsoever, and artists are no different. I would very much like to make a living from working on my personal projects, but people don't seem to want to take me up on my offer.


Because when I go to a live performance, watch a movie, browse an art gallery, etc., I am training my brain on copyrighted work. Every artist has done the same. No artist has developed their style in a vacuum.

(See my other comment though, I am not sold on any of this being right).


> Because when I go to a live performance, watch a movie, browse an art gallery, etc., I am training my brain on copyrighted work

you paid for it and enjoyed it in the way the artist intended.

> I am training my brain on copyrighted work.

too bad your brain alone is useless.

You need good hands to replicate much of the copyrighted works you "trained" your brain on.

> No artist has developed their style in a vacuum.

some absolutely did, indeed.

Just look at the school of film and animation that artists in USSR developed while separated from the rest of the World.

They are unique and completely different from what the west was used to (Disney)

https://www.youtube.com/watch?v=2qWBZattl8s

https://www.youtube.com/watch?v=1qrWnS3ULPk


Both links contain great examples of late 1960s Polish animation. Poland was never a part of the USSR.

Moreover, just like Czechoslovakia, Poland did not develop its animation style in a vacuum. They had strong artistic connections with many other European countries, especially France.

> They are unique and completely different from what the west was used to (Disney)

Not sure if you are an American... but Europe has a very old and very rich animation tradition. Growing up in Europe I was only vaguely familiar with Disney. The vast majority of animation I watched as a child was European.


Correct, I wrongly used the term USSR to mean the Eastern Bloc.

> Not sure if you are an American... but Europe has a very old and very rich animation tradition

No, I am Italian.

I grew up with these kind of animations. [1] [2]

Nonetheless, the "animation studio" of the West, the one that won Oscars and was distributed virtually everywhere in the Western Bloc, was Disney.

[1] https://www.youtube.com/watch?v=GV3BqbsyaUk

[2] https://www.youtube.com/watch?v=8Adqk9KD6Fk


> Just look at the school of film and animation that artists in USSR developed while separated from the rest of the World.

So they were developed in a complete vacuum including traditional things like theater, poetry and story telling?


They pioneered new techniques in a vacuum, because they were segregated.

A complete vacuum is a silly argument: we owe dinosaurs the oil we used to build up our modern societies. Would you say that AI would be impossible without dinosaurs?

Do we owe the big bang a debt of gratitude?


> In contrast when talking about training code generation models there are multiple comments mentioning this is not ok if licenses weren't respected.

I think one of the differences is that people are seeing non-trivial amounts of copyrighted code being output by AI models.

If a 262,144 pixel image has a few 2x2 squares copied directly, you can't tell.

If a 300 line source file has 20 lines copied directly from a copyrighted source, well, that is more blatant.


As an artist you can spot parts though that have the same 'visual language' that are much larger than 2 pixels. E.g. how someone uses their brushes, how someone does texture on corrugated metal etc. Those are footprints as large as a matrix multiplication method - they just can't be that easily quantified, because we need an AI model to quantify them.


I personally feel the bar for copyrighting code should be considerably higher than 20 lines.


> I personally feel the bar for copyrighting code should be considerably higher than 20 lines.

That highly depends on the lines of code.

One of my (now abandoned) open source react components essentially does some smarter-than-it-probably-should state management in just a handful of LOCs. At least a few hundred people found the clever solution I came up with useful enough to integrate into their own projects.

I've seen a talented graphics programmer hand-optimize routines to gain significant speed boosts, speed boosts that helped save non-trivial amounts of system resources.

And where do you draw the line? That same gfx programmer optimized maybe a dozen functions, each less than 20 lines, but all quite independent of each other. The sum total of his work gave us a huge performance boost over everyone else in the field at the time.

And of course you also have super terse languages like APL, where non-trivial algorithms can easily be implemented in 20 LOC.

But let's move to another medium, the written word, also one of the less controversial aspects of copyright (ignoring the USA's penchant for indefinite extension of copyright)

Start with poems, plenty of artistically significant poems that come in under 20 lines, deserving of copyright for sure.

https://tinhouse.com/miracles-by-lucy-corin/

That is a short story, around 22 lines.

The problem is, it is complicated, which is why these are the types of things that get litigated all the time.

Heck, as a profession we cannot even agree on what a line of code is. A LOC in Java is, IMHO, worth less than a LOC in JavaScript, and if you jump to embedded C, wow, that is super terse, unless you count the thousands of lines of #defines describing pinouts and such, but domain knowledge is needed to know that those aren't "real" lines of code.


I think that your code examples are not (and should not be) copyrightable.

Quoting US copyright law (but the same principle is global) "In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work." - when a work combines an idea and its expression, copyright protects only the expression but not the idea itself, no matter how clever or valuable it is. Copyright does not prohibit others to freely copy the method/process/system/etc expressed in the copyrighted work.

Also, there is a general rule of thumb (established in law and precedent) in copyrightability that functionally required aspects can not be copyrighted - in essence, if you create a new, optimized routine that's superior to everything else, then only the "arbitrary" free, creative parts of that code are copyrightable, but you absolutely can't get an exclusive right to that algorithm or technique. If the way you wrote it is the only way to write it, that can't be protected; and if not, others must be able to write stuff that's functionally the same (i.e. gets all the same performance boosts) and varies only in the functionally irrelevant parts. That's one of the reasons, for example, for recipe sites having all that fluff - because you can't get copyright protection on the functional part of the recipe, or some technique in another domain like architecture or computer science. Perhaps you can get a patent on that, but copyright is not applicable for that goal.

So, going back to your examples:

People reimplementing that "smarter-than-it-probably-should state management in just a handful of LOCs" is absolutely permitted by copyright law. If the same state management can be written in many different ways, then copying your code would be a violation and they would have to reimplement the same idea in different words, but copyright law definitely allows them to copy and reuse your idea without your permission, it doesn't protect the idea, only its specific expression.

Hand-optimized graphics routines may fall into the area where there is only one possible expression for that idea which implements the same method with the same efficiency. If that happens to be the case, the routine is not eligible for copyright protection at all - you can't get a monopoly on a particular effective technique or method using copyright law; patent law covers the cases in which that can or can't be done.

For APL implementations of algorithms - again, the key principle is that copyright definitely allows others to implement the same algorithm. If an obvious reimplementation of the same algorithm results in the same terse APL code, then that's simply evidence that this particular APL code is solely the "idea" (unprotectable), not "creative expression" which would be eligible for copyright protection.


It's not the training of the models that's the problem, it's when the AI spits out "substantial portions of code", an important term in the GPL and with regard to fair use law, that are exact, sometimes even including exact comments from specific codebases. This does violate the licenses.

There's something quantitative in code that you don't get in drawings; in a drawing the unique quality is purely qualitative, so it is hard to demonstrate what exactly was ripped off. When you find your exact words being returned by a code helper AI, it's hard to pretend that it's not directly and plainly just copy-pasting code snippets.


I am going to go ahead and say it, some of you people are so far up your own ass that you don't realize it is the exact same thing. All of you are saying it's unique because you code but you don't understand art and can't pick out the things that are clearly copied from artists because it isn't exactly the same.


I don't understand art, you're right, at least from an artist's perspective. Maybe you can enlighten me.

So my current perspective is this: if a picture is drawn in a style similar to yours for example, that's not infringing, but if it's just a scrapbook collage of cutouts of your art it is, except where it's fair use. Would that be right?

So the same applies, if an AI actually can help people write code by learning from existing code, that's fine, but if an AI just copy pastes code blocks that's not.

Where am I going wrong here?


It’s a dumb hill to die on. Doomed to fall to a layer of minor refactoring. If you say that’s the problem, you’ll have nothing to stand on later.


On the contrary - the part that is problematic is the verbatim reproduction of copyrighted code. If that's fixed by a "minor refactoring" then there's no hill to die on. It's not AI code generation per se that's problematic - it's when it does things that break current IP law.

If you want to debate expanding IP law - that's a different discussion and one I would be rather sceptical about. I'd prefer that IP law in general was rolled back - not forward.


Which is a fine point of view. But it’s not the one that many (most?) detractors actually hold.

It would imply that you cannot offend open source licenses by doing something as simple as recreating a codebase in another language, thus eliminating the exact matches.


> Doomed to fall to a layer of minor refactoring

Not quite - there is a reason why https://wikipedia.org/wiki/Clean_room_design exists as a concept to workaround copyright, and the same concept could hold for ML models.


Clean room design? Did the models train on art the trainers drew themselves?


I've always held the opinion that the GPL etc. are copyleft licenses intended to make sure the code stays free (free as in freedom, not as in beer), and that in an ideal world you wouldn't need the GPL or any licenses at all. At this point I really don't care what Copilot or any of its derivatives result in, and I think in the not too distant future we will have machine-code-to-readable-code translation, which will enable more freedom. That is, it really won't matter if the code is compiled or not, when you can "AI decompile" it into human-readable code, do your modifications, and then do with it what you will.

From that view let the data be free.


As long as this copyright violation laundering isn't reserved for the big guys, I'm happy for anything that confuses and delegitimizes the concept of copyright. But it is reserved for the big guys, you're going to get sued to death if you copy any of their work.


GPL folks are completely OK with something like Copilot when the GPL license is obeyed, so that all emitted code generated by AI trained on GPL code is licensed under the GPL again. It's not OK to call our code «public code» and ignore our license.


But by repeating this argument you are strengthening copyright, which is the fundamental evil GPL was made to fight. There surely will be FOSS clones of Copilot in the near future. There is no need to feed the copyright lobby.


Some languages need to compile but others don't


I think both are inevitable and I'm OK with both. I think a sticking point is that it's considered normal to make your own art in the style of another, but abnormal to copy code verbatim. Art seems to be clearly the former, while there are instances that probably stick in people's minds where Copilot has produced verbatim examples.

Indeed, it seems like code will be vastly more prone to this problem compared to art, because changing a single pixel is merely a question of aesthetics, whereas code is constrained tightly by the syntax of the language. With a much smaller space of correct results, duplication is likely inevitable.


This is my thinking too. A maximally useful code AI would include verbatim reproduction (since presumably the code it was trained on was written that way for a reason relevant to its function). A maximally useful art AI has comparatively little reason to ever want to output verbatim training inputs.


I've been thinking of a possible resolution to this. For Copilot and similar systems: keep the training data, and in addition to the text generation, add a search function. Find generated sequences that Copilot puts out, and send pointers to the source for close matches in strings over a threshold length. Example: if Copilot produces the Quake fast inverse square root routine, you'd get a pointer to the source AND to the license. This would allow credit for the author for permissive free licenses, and would allow the user to dump that code if it's GPL and they aren't willing to distribute under those terms.
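A minimal sketch of that matching step in Python (entirely my own illustration, not anything Copilot actually does; the function name, whitespace tokenization, and threshold are all assumptions): flag any generated run of tokens that appears verbatim in a training document at or above a threshold length, so the source and its license can be surfaced.

```python
def find_long_matches(generated, corpus_docs, min_len=6):
    """Return (start_index, length, doc_id) tuples for runs of at least
    min_len whitespace-separated tokens from `generated` that appear
    verbatim in any document of `corpus_docs` (a dict of id -> text)."""
    matches = []
    gen = generated.split()
    for doc_id, doc in corpus_docs.items():
        doc_tokens = doc.split()
        # Precompute all n-grams of the threshold length for this document.
        doc_ngrams = {
            tuple(doc_tokens[i:i + min_len])
            for i in range(len(doc_tokens) - min_len + 1)
        }
        for i in range(len(gen) - min_len + 1):
            if tuple(gen[i:i + min_len]) in doc_ngrams:
                matches.append((i, min_len, doc_id))
    return matches
```

Each hit would then be mapped back to the source file and its license text; a real system would merge overlapping hits into maximal runs and use an index rather than scanning every document, but the principle is the same.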

For art, train on contributed images whose authors agree to use them for that purpose. There could be some organization, perhaps a nonprofit, that would own the images and users of the model could credit that organization, and perhaps contribute back their own generated work. That way a legally clean commons could be built and could grow.


Of course people today are training on copyrighted images, chanting fair use because people on the internet tell them it's okay, instead of finding groups of artists who consent to having their art be trained on.


I hold neither of those opinions. My take is that agricultural civilization is being digested by technocapital and we're all along for the ride.

That said, the application of copyright to text, vs, code, vs images, has salient differences. The concept of plagiarism in visual artwork exists by analogy, but it's hard to call it coherent.

Music is somewhere in the middle, people have sued successfully over melody and hooks.

There are things like characters in animation, it's not unheard of, but the balance is more on the side of "great artists steal".


The vast majority of HN's patronage is tech aligned. Try asking a community of artists the same question and see what their responses are. The results might surprise you.


What's funny is HN was majority OK with AirBNB "disrupting" hotels and Uber/Lyft "disrupting" taxi services by bending the rules and exploiting legal loopholes, but when AI starts "disrupting" their artwork and code by bending the rules suddenly disruption becomes a personal problem.

Disrupt onward I say. Humans learn and remix from prior copyrighted work all the time using their brains (consciously chosen or not). So long as the new work is distinguishable enough to be unique there's nothing wrong with these new AI creations.


Because the models are not creating a 1:1 replacement of the original work.

As mentioned before, "style" is not something subject to copyright, and the model creates a model of that style. The process of finetuning a model generally means that one would not want to recreate the original images, as that would overfit the model and render it essentially useless.

When it comes to code, there is a higher chance of getting a one-to-one clone of the input as the options used in creating an algorithm, or even a simple function are dramatically reduced imo.


> Because the models are not creating a 1:1 replacement of the original work.

Since when did that become a requirement? If those are the rules now, then cutting the final credits is good enough to start torrenting movies.

> When it comes to code, there is a higher chance of getting a one-to-one clone of the input as the options used in creating an algorithm, or even a simple function are dramatically reduced imo.

If you're going to consider each function within a larger work as an individual work, that makes the 1:1 replacement claim more dubious. In order to recognizably imitate a style, one or more features of that style have to be recognizably copied, although no single area of the illustration would have to be. A function is a facet of a complete program just like recognizable features of a style are facets of each work an artist produces. If it helps, consider an artist's style as their own personal utility library.


If I made a scene for scene remake of a Disney movie, with an ugly woman for a princess and social commentary/satirical injections, it would be defensible as fair use in court.


That's because it is parody, which is explicitly protected as fair use. NNs are not only used for parody.


I think when it comes to art, less than one-to-one clones are often still functionally equivalent in the mind of many viewers. Stylistic and thematic content is often just as, if not more, important than the exact composition. But currently the law does agree that this is not copyrightable. And sometimes independent artists profit and make a name for themselves copping other styles, and I think that's great.

But could it be considered an intellectual and sociological denial-of-service attack when it's scaled to the point where a machine can crank out dozens of derivative works per minute? I'm not sure this is a situation at all comparable to human artists making derivative works. Those involve long periods of concentration, focus, and reflection by a conscious human agent to pull off, thus in some sense furthering the intellectual development of humanity and fostering a deeper appreciation for the source work. The machine does none of that; it's sort of just a photocopier one step removed in hyperspace, copying some of the artists' abstractions instead of their brush strokes.


> when it comes to code, there is a higher chance of getting a one-to-one clone of the input.

I'm not so sure. There's a generated image in the article that I think looks enough like Wonder Woman to cause a lawsuit.

That's just one of a handful of images in the article, and doesn't seem to have been chosen for its similarity to Wonder Woman.


Code is hundreds to many thousands of lines. A line of code is analogous to one color pixel in digital art.


Depends on which lines of code.

I have written projects where I'd consider a handful of lines of code to be the central tenet of the entire project that everything else is built up around. Copy those lines, and everything else is scaffolding that falls out naturally from the development process.


But one line of similar code is easier to find, because Copilot works at the one-line/small-function level.


Style is not protected by copyright. You can create your own art in the style of any living artist and this is allowed. AI is automating that process. Some works that the models produce may be too close to an original work and probably be guilty of copyright violation if that ever goes to court. It'll be up to a judge to look at the original and the AI output to weigh in on if it's different enough or if it's an elaborate copy.

Code is different: there isn't a style to it other than perhaps indentation and variable naming conventions. Entire sections (that are protected by the GPL) are copied. This by itself isn't the issue; it's contaminating your codebase that is the problem. If your work ends up with the same license as the code sources, and those are properly documented per those license agreements, you're fine. But if you end up violating the GPL and someone knows their code is in use, you are in a tough situation. Again, it'll end up in an expensive courtroom session where a judge is going to have to determine if enough code was copied to be construed as a license violation. That's the one scenario you would want to avoid in the first place, because for a lot of businesses that kind of lawsuit is too expensive to fight.


Humans used to learn to code from copyrighted works (textbooks) without much reference to OSS or Free Software. Similarly, teaching ML models to code from copyrighted works isn't going to violate copyright more frequently than a human might; and detecting exact copies should be pretty easy by comparing with the corpus used to train it. Software houses already have to worry about infringement of snippets, and things like Codex are just one more potential source.


Those books were purchased and a license granted for such use.


> Those books were purchased

Sometimes. People also borrowed them, read them in libraries, or in later years looked at free textbooks online.

> and a license granted for such use

I never heard of such a thing, and it was never seen as necessary. No student checked the licenses on their textbooks before deciding whether they could read them.


Cognitive dissonance, different persons, hypocrisy.


People often hold conflicting views; they don't like to think about it, though, because it can lead to cognitive dissonance.

That’s one reason why it is probably better to have a derived world view than a contrived world view.


Because there’s more programmers than artists in this hive mind.


> why do you think it's ok to train diffusion models on copyrighted work, but not co-pilot on GPL code?

Probably worth pointing out that GitHub has a license to the code on its site (read the fine print) that is independent of other licenses that the code may available under.

Whether that license applies to training ML models is legally uncharted waters.

Whether it’s right to train those models is another matter.


It's the hypocrisy of it all. Multi-billion dollar corporations whose empires were built on copyright, violating the licenses of other people's code. Why do they get a pass for that while simultaneously shoving DRM and trusted computing down our throats? I hope they get sued for ridiculous sums.


I think any code that's posted in public should be considered free to use by anyone for anything and its corresponding license be ignored and invalid. If you want restrictions on how people use your code, don't post it publicly, or have a proprietary portion that's required for compilation


Your argument is tantamount to expecting Michael Bay to have never seen a Scorsese film or derived influence from it.


Is Michael Bay cloning entire scenes of Scorsese films and copyrighting them as entirely his own?


Pretending it's not clearly more complicated than that will not convince anyone; it will make them feel condescended to. While Scorsese is Scorsese, a deep learning model is not Michael Bay.


I used that comparison on purpose. Michael Bay leans on a lot of computer driven technology to shoot movies inspired by other more traditional directors. The comparison is direct, if you feel condescended to, I did not intend that.


It's actually fine in both cases


It is totally okay to train Copilot on GPL code, but the resulting generated code should also be released under GPL license, clearly being a derivative work. I don't even know why it is being discussed.


> the majority opinion seems to be that training diffusion models on copyrighted work is totally fine

Well, maybe they were all just downvoted into invisibility. But up to your post, I have seen none.


Can humans learn from copyrighted work?

ps. I'm surprised music has not yet been leveled by AI; can't wait for "dimmu borgir christmas carols" prompts.


Music is somewhat more challenging because you have a few other problems that have to be solved in the pipeline, and source separation is still not a 100% solved problem. Beyond that, audio tagging beyond track level artist/genre is a lot harder than image tagging.

Once you have separated sources for a training data set, it's like text generation, except that instead of a single function of sequence position, you have multiple correlated functions of time. Text generation models can barely maintain self consistency from paragraph to paragraph, which is a sequence difference of maybe 200 tokens, now consider moving from token position to a time variable, and adding the requirement that multiple sequences retain coherency both with each other, and with themselves over much larger distances.

There are generative music models, but it's mostly stuff that's been trained on midi files for a specific genre or artist, and the output isn't that impressive.

I am also eagerly awaiting "hark the bloodied angel screams" with blastbeats, shrieks and blistering tremolo guitar, though.


I don't think the law will get hammered out until the AI models generate 'major recording artist' inspired songs. Anyone claiming that artists can't invoke 'style' in defense against AI-generated works is in for a rude awakening.


Stealing stuff so you can get rich selling it is worse than stealing for private use.

Stealing from idealist volunteers is worse than stealing from a random person.


I don't think copyrighting things is fine.


To be honest, the majority opinion on this just demonstrates how narrow-minded and uncritical many people here are, given that the clear and obvious juxtaposition doesn't get their minds churning. It's hard not to notice that half of them merely use AI tools and don't really understand how they work, which is why the silly and incorrect phrase "your mind is a NN!" keeps occurring here.


Code generation models tend to regurgitate code from the training data much more often than these image-based models regurgitate images from theirs.

Code generation models need to have special handling for checking if the generated code falls under copyright.
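One hedged sketch of what that special handling could look like (the file names, tokenizer, and window size below are all illustrative, not any vendor's actual filter): fingerprint the training corpus once, keep the source file and license next to each fingerprint, and vet every suggestion against the index so a verbatim match can be suppressed or attributed.

```python
import hashlib

N = 6  # window size in tokens; a real filter would tune this carefully


def fingerprints(code: str, n: int = N):
    """Yield a hash for every n-token window of the code."""
    toks = code.split()  # placeholder tokenizer
    for i in range(len(toks) - n + 1):
        yield hashlib.sha256(" ".join(toks[i:i + n]).encode()).hexdigest()


def index_corpus(files: dict[str, str],
                 licenses: dict[str, str]) -> dict[str, tuple[str, str]]:
    """Map each window hash to the (source file, license) it came from."""
    idx: dict[str, tuple[str, str]] = {}
    for path, code in files.items():
        for h in fingerprints(code):
            idx[h] = (path, licenses[path])
    return idx


def vet_suggestion(generated: str,
                   idx: dict[str, tuple[str, str]]) -> list[tuple[str, str]]:
    """Return the (file, license) pairs the suggestion overlaps; empty means clean."""
    return sorted({idx[h] for h in fingerprints(generated) if h in idx})


# Toy corpus: one hypothetical GPL-licensed file.
files = {"gpl_utils.py": "for i in range(10):\n    total += i * i\nprint(total)"}
licenses = {"gpl_utils.py": "GPL-3.0"}
idx = index_corpus(files, licenses)
```

A suggestion that overlaps the indexed file comes back with its provenance attached, so the caller can either drop it or surface the license to the user; a novel suggestion comes back with an empty list.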


I do open source on Github and I believe both are totally fine.


Maybe we can have a GPL code generator model, where all generated code is under the GPL license?


I think this is a great question, but I think answers should rest on a slightly more detailed understanding of how copyright actually works.

IANAL, but to a first-order approximation: everything is "copyrighted" [1]. The copyright is owned by someone/something. The owner gets to set the terms of the licensing. The rare things not under copyright may have been put explicitly into the public domain (which actually takes some effort), or have had their copyright expire (which takes quite a while; thanks Disney).

So: this is really a question about fair use [2], and about when the terms of licensing kick in, and it should be understood and discussed as such. I don't think anyone who has really thought about this is claiming that the models can't be trained on (copyrighted) material; the consumption of the material is not the problem, is it? The problem is that the models: (1) sometimes recreate particular inputs or identifiable parts of them (like the Getty watermark), or recreate some essential characteristics of their inputs (like possibly trademark-able stylistic elements), AND ALSO, (2) have no way of attributing the output to the input.

Without being able to identify anything specific about the input, it is impossible to know with certainty that the output falls within fair use (e.g. because it was sufficiently transformative), and it is impossible to know how to implement the terms of licensing for things that don't fall within fair use. There's just no getting around that with the current crop of models.

The legal minefield is not from (1) or (2), but from (1)+(2), at the moment of redistribution, monetized or not. Even if Copilot was only trained on non-reciprocal licenses (BSD, MIT), there are very likely still licensing terms of use, which may include identifying the original copyright owner. Reciprocal licenses like GPL have more involved licensing terms, but that is not the problem: the problem is failure to identify the original licensing terms. We should not use these models as an opportunity to make an issue about GPL or its authors, or about the business model of companies like Getty; both rest on copyright, and come to our attention because of licensing.

Sorry about the rant. As for your question: I think it may be as simple as: to what extent are readers here the producers of inputs to the ML models, versus consumers of outputs. It gets personal for coders when models violate licensing terms of FOSS code, but it feels fun/empowering to wield the models to make images that we'd otherwise be unable to access. From my rant above you can tell that whether it's for code or images, I think the whole thing is an IP disaster.

[1] https://en.wikipedia.org/wiki/Berne_Convention

[2] https://en.wikipedia.org/wiki/Fair_use


I am a graphic artist. In the recent months I've read dozens of articles and threads like this. I still can't see what the big deal is.

Graphic artists don't have trade secrets or unique impossible techniques. If someone can see your picture, he can copy its style. It becomes publicly available as soon as you publish it. For the vast majority of graphic styles, if one author can do it, then hundreds of his colleagues can do it too, often just as well. If one author becomes popular and expensive - then his less popular colleagues can copy his style for cheaper. The market for this is enormous and this was the case for probably hundreds of years.

I personally am a non-brand artist like that. More often than not, clients come to me with a reference not from my portfolio and ask me to produce something similar. I will do it, probably five times cheaper than the artist or studio who did the original. It may not be exactly as good, but it won't be five times worse.

Some clients are happy to pay extra for the name brand, and will pay. Some want to spend less, and will settle for a non-brand copy.

The clients that are willing to pay for the name brand will still be there for the same reason they are now, and the existence of Stable Diffusion changes nothing to them. And the ones that just want the cheap copy would never contact the big name artist in the first place. The copy market will shift, but the big name artist doesn't even have to be aware of it.


The main thing people are worried about is the fact that food costs money and you need to eat in order to live. People are afraid that their illustration jobs are at risk because of AI illustrations being _good enough_.


This is the hundredth time in history that technology has progressed and artists have had to learn new ways to make money. I learned art in art school - none of the jobs that my art teachers had in their youth are relevant right now. The tools, the pricing, the workflow, the client requests and expectations are all different. You can keep some of the skill, but you still need to learn and adapt to the new reality. Sometimes it takes 5, sometimes 15 years, but the job of the artist is always transforming.

The illustrator from the article is probably drawing in Procreate on an iPad, probably doing her promotion on social media and doing business with her clients remotely. All of those are recent technological advancements that appeared in her lifetime and completely outperformed the previous way to do commercial illustration. Illustrators who worked before that had to learn those new ways, or lose their jobs. This has happened dozens of times in history. Now it is the turn of current illustrators to adapt.


But none of that addresses fundamental changes to the market structure. How can a beginner artist possibly get traction in a marketplace where people only pay for premium names, or pay next to nothing for beautiful art that's 90% of what they want? You said yourself most clients are willing to settle if the price is right, and you can't really beat free.


When we get to $0 for any possible artwork of any quality - yeah, it's game over for everyone, the end of the industry. Right now we are far from it, thankfully. AI still can't produce usable commercial quality files. Most of them are simply not good enough, and even those that look kind of good have to be fine-tuned and reformatted by a human artist. Which takes real skill and effort and costs money.

And as long as someone does a job for some amount of money, a beginner can start his career by doing the same job for less money. This will still be the case no matter what technology comes next.


Something very similar to what happened to working musicians when we developed the technology to record and replay their performances.

Now there are a lot less people playing live music.


That's funny to think about: if music playback technology didn't exist, every cafe would be in need of a mediocre guitarist.


They don't. They will either need to find a way to stand out in an increasingly competitive market, or get pushed out of it. That's how all labor markets work, but creative fields are especially cutthroat and very few people get to do art as their full time job.


Same problem for wheelwrights and loom weavers and chimney sweeps. Occupations go obsolete, people have to adapt. There is nothing special about artists in that regard, if technology supersedes them then they'll go away and people will have to do other things.


This doesn’t directly answer your question, but when I was in university about 20 years ago, I was in a digital arts program, but I focused on algorithmic art and using programming to generate images. I came to the realization that "style" doesn’t matter, and the body of work I produced really looked quite different from one project to the next. I could generate countless numbers of images in a particular style but then moved on. The art in my case was thinking of style as parameters and certain constraints.


> But none of that addresses fundamental changes to the market structure.

There is no fundamental change. Only an incremental one.

Even if AI-generated imagery takes over the market for “drawings in the style of someone else”, these AIs will still need to be trained and operated by human beings. It will not bring the cost to zero, it will just lower it — which already happens continuously in all markets due to human ingenuity.


Your profile states you're a logo designer. With regards to your above post, "Now is the turn for current illustrators to adapt":

Are you happy for me to feed your work into an AI model and generate logos based on your illustrations, then post them online to be sold? How would you feel? How would you adapt to that?


I am not happy or unhappy about it. I just don't have a moral problem with it.

I did the same thing to get into the logo industry. In the process of learning I analysed hundreds of logos made by other artists. I tried hard to understand how they work, copied the best practices and styles, and did my best so my logos could be as good. I trained on this dataset and got to be successful enough to become a part of it.

I don't have a moral problem with AI doing the same. It will probably be hard to compete with it, but for now I manage. If I won't be able to compete anymore - I will adapt and apply my skills elsewhere.


If that is the main issue, why are artists hiding behind the pique of "when I create art, it is full of soul, experience, blood and sweat"? Just say that you need a way to make money and these models are replacing you.


Because that argument is as old as the written word, and it works exactly as well now as every time in the past (not at all).

Every job being automated requires its own "we're unique" pitch to get any pity points.


I wonder how many illustrators already lost their jobs once clipart took off starting in the 90s.

Many newsletters/newspapers of bygone era had an artist/doodler to do little sketches which got replaced by clipart in many cases.


I dislike the drift of this "need to work for food" phrase that I'm hearing so often. Job automation never reduced our ability to produce food. The harvest is not in any danger, not even if we suddenly produce twice the art with the same amount of work.


Hi. I'm a professional artist. I have a lot of friends who are also professional artists.

Most of us live in cities, and go to the store to buy food. We have specialized in being good at making images, which we trade for money, which we can trade for other goods and services such as "food" or "entertainment" or "rent". Some of us are doing well enough to have room for a garden, and the time to tend it. This is by no means the majority.

How many of your peers would know one end of a modern combine harvester from the other? Probably very few, if you live in the city.


It's not about food production, it's about capitalism.

If artists could simply ask for food and be given it from the overflowing cornucopia, then yes, this wouldn't matter and in fact would be a net benefit.

Unfortunately though, artists must sell their art to get money, then exchange that money for food. Now, if a robot produces free art that's almost as good, most of those buyers won't pay those artists anymore, and the artists will starve (or stop being artists).

I do believe that job automation will quickly eliminate scarcity for basic life necessities, while also displacing more and more jobs in our economy, and that therefore UBI or some equivalent will be imminently necessary - but that's a much larger topic.


I see a pretty clear analogy to the various industries that felt threatened by home video and audio recording improving to the point of being able to make copies quickly and without significant degradation--particularly when disc ripping at 20x+ became a thing and time wasn't even a barrier.

A person who can clone a style and crank out illustrations at human speed is a very different thing than an automated process that can do it immediately on request, in minutes or seconds. If nothing else, the latter is a huge efficiency gain for being able to self-serve, as it would allow an editor to trial different illustrative approaches without all the back and forth contracting out to a human would require.

Personally, I think what this will do most is convince artists not to put galleries of their work suitable for training online.

The Redditor identified in the article posted a new comic art model based on James Daly III (this is mentioned at the end of the article, with a link). The Redditor's comment in that post implies Daly was chosen specifically because he had a gallery of easily consumable training images all in one place.

I have no idea what the minimum effort would be to make the images less useful for training, but I foresee a lot of obnoxious watermarks in our future as people try to do so.


The big deal is that now I can copy someone's style in less than 30 minutes, and it doesn't require the intermediation of any professional artist. That was a major source of friction that was just lifted. Not even mentioning that I can generate tons of samples in a matter of minutes. There are so many differences here that I can't imagine how you can be asking this question.


It might not be a _big_ deal. But isn't it at least a small deal that, on top of having her style lifted, her personal name is the trigger in the prompt to apply it back?


Her style isn't original and unique to her from the art world perspective. There are dozens of people who draw exactly like this, and hundreds who can draw like it and just choose not to. This is not a criticism against her personally - it's practically impossible to have a truly unique style in this world with millions of other artists.

The fact that pictures in that style can be meaningfully described by her brand is only a result of the success of her personal branding effort. She kept a consistent style, she promoted her work, established a website, personal portfolio, publicised her career. She didn't invent this style, but made an effort to claim it as hers. This is a regular path for an artist. It didn't just happen to her like a robbery against her will - it took years of effort for her to establish her name like that in the public consciousness, culture and search engines. Stable Diffusion just builds out of those things.

If this artist didn't exist, the style would still exist in works of other artists. It just wouldn't have this useful tag of her name. We would have to put something like Modern-colorful-flat-vector-cute-disney-textured-cartoon-illustration to get the same results. But since she claimed this style as hers, we can just use her name to effectively describe it. I don't see it as a tragedy, I see it as a success story.


Thanks for the very nuanced take. I think part of the reaction is on the tension between these years of efforts to embrace and establish this style as hers, have it associated with her name; versus a project emerging from nowhere to take that name and art style and run away with it.

On a factual/legal level these are nothingburger events, and we'll probably forget about it if two months from now she has a big boost to her career. But I'm kinda skeptical much good will come out of this for her. In our world it would be close to raising an open source project for a decade, having it succeed and shine in the world with some support money coming in, only to get it cloned by AWS and be left wondering what you'll do next. This is part of the game, but it sure sucks.


Feels like there's a difference between artists as drops in a vast ocean of training data, vs explicitly creating a model on one person's work. And I think the conversation would benefit from not conflating the two.

I'm sort of a copyright 'moderate' I suppose. I think people should get paid for their work, and trying to just rip off a single person's style (and I'm not at all saying this particular example was nefarious in intent) just feels gross. But I also think too much baggage and we stifle new ideas and innovations.

However, I also think that most of the conversation around large models like StableDiffusion lacks an understanding of how these models actually work. There's this misconception that they're a kind of 'collage machine'. The contribution of individual artists in these base models is like drops in a vast, vast ocean. [edit: I repeat myself; recovering from Covid, forgive me.] They take this incredibly large set of digitized human creativity, and in turn we all get this amazing tool: a synthesizer for imagination.

Anyway, just my personal opinion. It's become a very 'us vs them', lines-in-the-sand argument these days, and it'd be great if the conversation could be less heated and more philosophical.


Screw one person? A great offense. Screw lots of people at once? A great innovation.


You have a point, but it's also how art works in general. Most artists draw inspiration from a conglomeration of hundreds/thousands of other artists. If an artist draws inspiration from one and only one artist, they're just a plagiarist.

(Not necessarily saying that clears up any potential legal or ethical issue with generative image models training on artists' work.)


Why do you think artists draw inspiration from other artists?


An artist who's never seen art is like an AI with no training data


We probably have different definitions for what constitutes an artist. In my experience most artists spend time looking at the world around them rather than looking at other artists' work.


All art is derivative. Even "outsider art", which is made by artists who are ostensibly naive to the work of other artists. (Personally I think that category is bullshit gatekeeping, but that's another discussion entirely.)


Not my college-of-arts experience.


And based on what I've seen in the article, the AI images look to be notionally better than the original examples.


But these aren't artists and this isn't art. This is fast food. This is content for articles, and technology for startups to become the middleman for yet another thing.

There will be art coming from these tools at some point, but right now it's creatively bankrupt illustrations driven by curiosity and lots of creatively bankrupt people.


Yet. As "real" artists start using these generations to leapfrog their projects, at some point, how is this different from someone studying a style and producing something in that style with that addition of actual "art?"


It's not any different if there's true craft to it. The person in this article, and many of the examples, are who I'm referring to as not being artists or making art. I would imagine the person who wrote the article doesn't call himself an artist either.

I'm not arguing that the tools can't be used to create art and I'm not trying to say "real" artists don't use these. What I'm trying to say is the images I've mostly seen look like illustrations and content machines.

Personally, these tools are incredible and are almost magic in a sense.

Edit: To put it another way a writer can still write if his keyboard stops working, a carpenter can still build if his saw breaks, an illustrator can still draw if their tablet dies.


That's like expecting a chef to take inspiration from a fast food menu. It might happen but it's more likely that the chef already knows what makes fast food taste good and can develop a new menu based on their fundamental knowledge.


The fundamental knowledge they gained by looking at thousands upon thousands of previous bits of information; and, after today, that collection of insights is going to include auto-generated artwork, as well.


Great innovations do tend to "screw lots of people". Cars put most buggy makers out of business. Light bulbs, candlemakers. The computer, human computers. The internet, journalists.

AI promises to be a particularly disruptive innovation, but I don't think anyone can or will stop it. Instead, we should think about improving society so that the promise of AI benefits everyone and not simply a select few.


> Anyway, just my personal opinion. It's become a very 'us vs them', lines-in-the-sand argument these days, and it'd be great if the conversation could be less heated and more philosophical.

It's heated and less philosophical because many artists are worried about their livelihood while a multi-billion dollar company is working towards making them obsolete often using their own work.

I don't understand the confusion people have towards this issue.


> many artists are worried about their livelihood while a multi-billion dollar company is working towards making them obsolete often using their own work.

You do realize that most commercial art is "work for hire", and in this very story some of the examples trained on were not owned by the artist.

Multi-billion dollar companies already do this. Hire artist to draw Corporate IP. Corp owns and can do whatever they want with it. Maybe they hire that artist again. Maybe they hire someone else and they share the work in a reference folder.


The distinction I think is that the multi-billion dollar company working towards making artists obsolete by using their own work didn't pay any of these artists for that IP. At least with hiring an artist to draw corporate IP, an artist has to relinquish their rights to that work explicitly and are paid for those rights.


But now they (maybe) get to skip the "hire artist" step. Which, from the point of view of the artists, is the most important part.


> It's heated and less philosophical because many artists are worried about their livelihood while a multi-billion dollar company is working towards making them obsolete often using their own work.

In that case, odds are we should be outraged about the existence of 99% of the people here, because they work for tech companies whose sole goal is to create software that tries to make human employees redundant.


Are we talking about Stable Diffusion or GitHub Copilot here?


I’m not convinced there’s a difference.


It doesn't really matter: the AI model is a compressed representation of copyrighted works processed by a generic, non-artistic algorithm specifically engineered to extract the artistic features of such works.

Conceptually, it's very similar to running lossy JPEG over a single copyrighted work versus a giant folder of works from different artists, then shipping that compressed collection to your customers so they can cut and paste sections of those works into their collages. The output your customers create with this tool might be sufficiently original to warrant copyright protection and fair use, but your algorithmic distribution of the original works (the model) is a clear copyright violation; it doesn't matter if it affects a single artist or thousands.


> your algorithmic distribution of the original works (the model) is a clear copyright violation

Depends on if it's considered "transformative" enough. If Google can cache thumbnails for search, how is an AI model not a "search database+algorithm?"

That said, I do expect that SD will be shut down for copyright infringement at some point because you are right, the model does have a bunch of copyrighted material in it and Disney's lawyers will probably come swifter and more prepared than the defense.


As a general rule, the fair use thumbnails enjoy is very limited, only in certain jurisdictions, and only for very specific use-cases.

A universal art-production machine that can compete in the marketplace with the original artist - and indeed crush them on productivity and price - certainly does not qualify as fair use.


>AI model is a compressed representation of copyrighted works

But it's not, because you can't get a copy of the copyrighted works back out.


How do you copyright a style though?


Misconception? I mean, if it talks like one and quacks like a collage machine, that's kinda what it is. It's just using an infinite (well, 4 GiB) magazine reel to cut out from.


In the end, you are not going to be paid for drawing everything all over again. You are going to get paid for devising a unique style to match the creative vision of the movie, series, book or whatever and for tweaking the machine outputs.

There is a chance that style will become copyrightable. Well, eventually common style banks will appear.


I think painters also copy each other's styles, or make copies of popular works. Artists don't like this, but I have not seen people demanding "do not use this art style because XYZ created it".

For me, just a regular person, art is not something a soulless machine can generate, or a monkey with a camera; the intent and mind are important. So a guy can ask an AI to create a malformed portrait of some subject in some artist's style, but the value of the art is in the subject and theme, not the style, IMO. Like, if you ask for a portrait of X stepping on Y's dead body, dressed in Z and with M, N, P, the artistic value is in your idea behind this and not in the pixels.

I remember similar complaints when digital art started to get popular: that it is not real art, that you just move pixels around.


> Feels like there's a difference between artists as drops in a vast ocean of training data, vs explicitly creating a model on one person's work.

I don't.

Stability.ai did the big training sets, and it is coincidental that it remembers the names and categories accurately.

If you want to leverage this tool in a more fine-tuned way, then you add these modules with more accurate naming.

I would be for some way to compensate artists, like if the use of a module gave them a royalty, but I don't think it is an ethical, legal, or social norm to enforce. If it happens in an uncircumventable way, I would be for it. If it doesn't, I'm for that too.


I've seen that with good models (I think it was Stable Diffusion, not sure) you could get imitation of a specific person's style; even though it was trained on an ocean of works, the model was still successfully fishing out the exact drops of a specific artist (or close enough that I couldn't tell the difference). And thus the model was able to rip off dozens of different styles.


Hey, we want to use machines to steal people's creative work, take away their jobs in the future, and create more algorithm generated garbage since it worked so well for news sites and promoting videos/posts/social media. We also want to pretend that computer generated graphics are created by an 'artist' for the sole purpose of being able to assign copyright to our machine works, but be able to ignore artists and copyright in every other way.

Why do people seem to have a strong opinion against this? It's just stealing the most intimate thing humans create (art) so that we can create soulless algorithmic versions of it for our own uses because evil artists won't let us just do what we want with their works and won't let us profit on their work. Artists are jerks.


I can't quite follow what you're saying due to your choice of subjects. "It's just stealing the most important thing humans create so we can create soulless algorithmic versions of it for our own uses..." Who is the 'we' there if not 'humans?' I can tell you the machines do not care one way or the other about art.

What we're looking at is humans who have not trained as artists being able to create something similar to trained output. That's what technology does: augment human capacity to do something. It's what it always does.


> We also want to pretend that computer generated graphics are created by an 'artist' for the sole purpose of being able to assign copyright to our machine works

The current IP rights system allows for work for hire and assignments. Reframing it to "I commissioned an AI" obliterates any debate about who owns what. When you commission, the rights to what is created are assigned to you; for this machine model, just add that clarity in the TOS and it's a done deal.


Creating art similar to another artist's work isn't stealing when humans do it, why would it be stealing when machines do it?

(Unless the result is too similar, like the Wonder Woman image generated by Stable Diffusion shown in the article, in which case it's "stealing" whether created by human or machine.)


Because machines are not humans, and shouldn't enjoy the benefits of fair use.

Copyright law exists to balance the interests of the people that make up the society, not some abstract caricatural embodiment in algorithmic form.


Should humans using machines enjoy the benefit of fair use?


Sure, if you build your own model, train it on copyrighted works, then use it to create art; or if you use someone else's model which properly licensed its copyrighted sources, and use that to create art. In both cases your output is a new creative work, sufficiently different from its parents to not constitute infringement and to enjoy its own copyright protection.

However, the model creator/distributor will never be able to claim fair use on the model itself, which is chock-full of unlicensed material and can only exist if trained on such material. It's not really a subtle or particularly difficult legal distinction; in traditional terms it's like an artistic collage (model output) vs. a database of copyrighted works (trained model).

The trained model is not a sufficiently different work that stands on its own, in fact it is just a compressed algorithmic representation of the works used to train it, legally speaking it is those works.


In what way is the model chock full of unlicensed material? It was trained on unlicensed material, but I don't think you're ever going to be able to find a forensic auditor who can tease individual works out of the weights in a model.

You can't reasonably assert that a model encodes individual works of copyrighted material in any way meaningful for copyright. Not without a change to the law.


Obfuscation is not a valid defense against copyright infringement. If my database contains full encrypted copies of unlicensed works and I distribute keys to my customers for parts of those works, no forensic auditor will ever prove the full extent of my infringement without learning the full keyset. But I would argue that reproduction of even a single non-trivial fragment of an unlicensed copyrighted work would taint that entire database.

In the same way, in AI, a crafted prompt that creates striking similarities to a well-known work, like the example here, is sufficient proof that the model embeds unlicensed works; using a copyrighted work for training models is just another form of commercial exploitation that the original author should be compensated for.


But we're not talking about obfuscation; we're talking about the data you're describing not being there. If you ask the AI to spit out a 1-for-1 copy of Hollie Mengert's work, it can't. I suspect it can't spit out coherent individual pieces of it either (I might be wrong in that assertion, as I haven't run this Stable Diffusion). It spits out content in her style.

You generally cannot copyright a style.

If it spits out entire chunks of pre-existing works, that's an entirely different story; but what it seems to do is (via the learning and subsequent diffusion process) receive an input like "Wonder Woman on a hill" and (to falsely anthropomorphize a giant math puzzle) say "I know what a Wonder Woman looks like, and I know that 'correct pictures' have some certain ratios of straight lines and angles and tend to use some particular color triplets, so I'm biasing the thing that matches to my 'Wonder Woman' shape structure with those lines, angles, and colors." The result is a picture Hollie Mengert has never drawn, which an observer could assume is done by her because the style is so spot-on.

And aping an artist's style is not illegal for humans and we have no law to make it illegal for machines. Should it be illegal is an interesting question, but it will require new law to make it so.


I'm not claiming the problem is identical to pre-existing problems in the copyright space, just that it's sufficiently similar not to pose a significant challenge for legal scholars, IMHO. Existing copyright laws not only forbid verbatim reproduction, but also require that derivative works not prejudice the original author, and grant those authors the power to authorize or reject derivation: https://en.wikipedia.org/wiki/Derivative_work

Your anthropomorphic analogy falls flat on its face because the algorithm does not "know" anything, not in any sense of the word "know" that applies to sentient and rational creatures. The algorithm embeds an association between the text "Wonder Woman" and actual artistic representations of Wonder Woman included in the prior art it is trained on. When prompted, it can reproduce one (see the Copilot fail where it spat out verbatim copyrighted code, comments included) or a remix of such representations and integrate them into the output. That's plain as day a derivative work.

The particular case you are referring to, style extraction, could be considered fair use, assuming you can technically separate the base visual model from the output style and you can prove the training data for the output module is distilled into abstract, statistical quantities pertaining to that style, such as color palette, stroke weight, etc. That sounds like a tall order, and I would consider any AI model trained on copyrighted works as tainted until that burden of proof is satisfied.


Isn't the fact that it can faithfully simulate, in the style of the author, works the author has never created proof enough that the style is disjoint from the trained content?

Hollie Mengert never rendered the streetscape in the article, but DreamBooth did it in her style.

If we're talking criminal copyright infringement, why is the burden of proof on the defendant to show statistical abstraction if the plaintiff can't prove the AI generates works she has made? (Again, if it is possible to get DreamBooth to kick out Hollie's original work, or substantial portions of it, I'd be inclined to agree with your way of thinking, but I haven't seen that yet).

> embeds an association between the text "Wonder woman" and actual artistic representations of Wonder woman included in the prior art it is trained on

Not if I understand how it works correctly, no; it does not. In fact, Mengert's rendering of Wonder Woman differs from the one DreamBooth kicked out if you look up the work she's done for "Winner Takes All! (DC Super Hero Girls)". This is because DreamBooth's approach is to retrain Stable Diffusion with new information but preserve the old; since Stable Diffusion already had an encoding of what Wonder Woman looked like from a mélange of sources, its resulting rendering is neither Mengert's nor the other sources, but a synthesis of them all.


I feel like there are three issues at play here:

-Using her name to describe/advertise the fine-tuned model.

-Using her illustrations to fine-tune the model.

-Using a larger body of potentially unlicensed images to train the base model.

For the first, if we had decided that the other steps were fair use or whatever, would it be better or worse if the fine-tuned model had been made available with no mention of the identity of the author of the training images? I'm not sure.

For the second, there is surely a limit after which this sort of thing becomes unambiguously unacceptable. Suppose you fine-tune so aggressively on a small dataset that eventually the model simply reproduces the training images exactly. Now you're obviously violating copyright. But where exactly is the line before that? If I have a base model that was trained on fully licensed images, and I make one single gradient descent step using a copyrighted image, making imperceptible changes to the model's output, surely the resulting images are not suddenly in violation. It seems to me that the standard should be: if a human were to draw the output by hand after looking at the training images, would we consider it a violation of copyright? As a thought experiment, imagine someone who lacks the ability to draw but can instead hand-write the weights of a neural network to produce the desired output; it shouldn't matter which process they use.

For the third, what if I spent a long time prompt engineering on a model trained entirely on properly licensed data and was able to generate a prompt format that produced the outputs we see from this fine-tuned model? In other words, for any generative model, there is a space of reachable outputs, and it's not so clear that these images did not already lie in that space before fine-tuning.


I think this is the right way of chopping the problem space. In this case, the artist expressed a preference to not have her name used, which seems reasonable, and the author renamed the repo to accommodate that.

One could easily imagine the opposite scenario, where the artist objects to not being credited.

We are in a weird stage where the indexes into this style space are quite weird and ad-hoc; look at all the prompt engineering black magic. It’s worth noting though that this problem only showed up because there weren’t many examples of this style in the data set; when every frame of every Disney animation is in there, it won’t need you to refer to one artist’s name.

I do wonder if a PR and UI improvement would be to layer a language model on top to build the prompt, or perhaps even bake this into the model itself, so you can use more generic (and possibly iterative) style descriptions instead of having to refer to an artist by name. Basically use AI to solve the prompt engineering problem too.


I mean, they're literally out here training models on Disney's copyrighted works. People seem to miss the point that the copyright ambiguity can work both ways and the courts are almost always on the side of corporate copyright holders.

I posit that with a few good NSFW scandals, so the legislation can be dressed up as protecting the children, Disney will get the legislative intervention it wants re: copyright and Stable Diffusion.

I realize that seems pretty US-centric, but it's surprising the ways Disney can reach internationally to protect its IP. It'll be interesting to see if Stable Diffusion ends up relegated to the parts of the Internet outside of the big three "jurisdictions," so to speak: US, EU, and China.


> But where exactly is the line before that?

Unfortunately the line of "fair use" is very blurry and only gets clarified in specific instances, with lawyers and humans debating if there was damage done.

It's extremely frustrating and these advances are going to pressure fair use for sure.

People will generate infringing work with AI and people will generate newly copyrightable work with AI. The danger is, as an artist, it's hard to trust if the model has given you something that is copyrighted as you have not necessarily seen it before. (You the human can cite your references, and it's possible you were subliminally influenced, but with SD, generic prompts may give you "copyrighted details" that you don't know.)


Yeah, definitely. It's more starkly clear with Copilot, where you can point to verbatim reproduction of code. And it probably helps Stable Diffusion that images are fuzzier, so you're less likely to generate something that's close enough to be a copyright violation, but there's really no way to be certain that your output is not a memorized training example, even if it's statistically very unlikely.


> In other words, for any generative model, there is a space of reachable outputs, and it's not so clear that these images did not already lie in that space before fine-tuning.

I'm trying to understand what you are saying here. It sounds like you are saying that something that doesn't yet exist must already exist simply because a model has been created which can yield this non-existent thing as an output?

Your example uses properly licensed data as an input but seems to imply that creating a model that is capable of producing a particular output means that the output in effect already exists as a prior work before it has ever been created by virtue of the model's ability to create it on command.

I'm probably over-thinking all this.


Well, at some point it's like the monkeys writing Shakespeare. Instead of a complicated neural network, we could have written a program that just outputs random pixel values. We'd definitely be able to find all of the outputs of Stable Diffusion, as well as all of the works of the original artists, coming out of that program. It'd just take us a lot of waiting and watching.

I don't know that that's enough to constitute "prior art" in any meaningful sense, though.

The way I look at it is that the function of Stable Diffusion or any other diffusion model is to pare down the output space of the "random pixel machine". It learns which regions of image space are likely and which are unlikely, and so when you sample an image, you tend to get ones that people like rather than random noise.

You could imagine an idealized diffusion process whose output space is truly continuous and which assigns non-zero probability to the entire image space (all possible pixel arrangements), with higher probability assigned to "good" regions, and lower probability assigned to "bad" regions. If I sample from such a model repeatedly, I will eventually (in the mathematical sense of probability 1 in the limit) get out an image that looks exactly like a Hollie Mengert illustration, even if the model has never seen one during training. I'll even eventually get out an image that looks exactly like an illustration that an unborn artist will create in the year 2047.

Now, in practice, it's a little less clear. Are there regions of the image space that Stable Diffusion assigns exactly zero probability, such that no amount of sampling will ever generate that image? Are there real, "good" images that Stable Diffusion assigns very low probability such that generating them is no better than sampling from the "random pixel machine"?
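To put rough numbers on the "random pixel machine" intuition above, here's a back-of-the-envelope sketch (the image dimensions are just the 512×512 RGB example from the thread, not anything measured from Stable Diffusion itself):

```python
import math

# Odds that a uniform "random pixel machine" emits one specific
# 512x512 RGB image with 8 bits per channel.
width, height, channels = 512, 512, 3
pixels = width * height * channels          # 786,432 byte values per image
log10_outcomes = pixels * math.log10(256)   # log10 of the number of possible images

print(f"possible images: 10^{log10_outcomes:,.0f}")
# The chance of hitting any one target image is 1 in roughly 10^1,900,000:
# every image is "reachable" with non-zero probability, yet unreachable in
# practice -- which is why training matters: it shrinks the haystack rather
# than creating the needle.
```

So "the image lies in the output space" and "the image can actually be sampled" are wildly different claims, which is exactly the gap between the idealized non-zero-probability model and a usable one.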


>you tend to get ones that people like rather than random noise.

And you have to consider that your idea of and tastes in "art" may be quite different from mine. People actually buy and display canvas with random bullshit colors or simple shapes or even lines on them that to me, is not art in the pure sense but is instead just the output of a lazy person marketed to dweebs who need something different for their living rooms.

Back to the original discussion - if your earlier point is valid, I don't agree myself, then the only one who owns any art, product, process, etc is the one who designed that model that can be used to create it even if the model was created years after the thing they created. None of the people who actually created the product or process, etc own anything as a result of their efforts. That is wrong.

It is a bit hand-wavey and bullshit mystic like the dumbass marketing used by some sculptors who claim they didn't really do anything except remove all the wood, stone, clay, etc that was hiding the sculpture they created. It was always right there carefully concealed in the tree trunk and anyone who dinked around with that trunk would've ended up creating the same sculpture under that logic. LOL. Jesus put it there, I'm only the guy picked to uncover it.

>Are there real, "good" images that Stable Diffusion assigns very low probability such that generating them is no better than sampling from the "random pixel machine"?

That depends on your definition of "good". Do you mean "good enough" that an observer will see a resemblance or is it something else? I would think that the output from any model will always have an upper limit to what it can reproduce that is related to the properties of the combined inputs to the model. With more inputs one should see higher precision outputs. Like you say though, Stable Diffusion probably designates some outputs as near zero probability because the input set supports that conclusion for the requested output.


>I would think that the output from any model will always have an upper limit to what it can reproduce that is related to the properties of the combined inputs to the model.

I look at it from the other direction. A diffusion model, at its heart, is a function that takes an image (initially random noise), and produce a slightly "better" image, according to its learned concept of "better". You turn this crank over and over and out comes a good image. But the simplest diffusion model of all is the identity function. You feed it random pixels and it outputs them unchanged. That model - the "random pixel machine" - can trivially output any possible image. The training process is paring down its output space to produce more outputs that are like the training images, and fewer outputs that are unlike them.

So it's not the "upper limit" that training is addressing. Creating a model with an upper limit that includes all the great works of art humanity will ever create is trivial. It's the "lower limit" that's the problem - if those outputs are lost in a sea of uninteresting noise, you'll never find them. Training is the process of raising that lower limit.


>according to its learned concept of "better".

I'm gonna assume that all the input images fed to the process form its "learned concept of better" (LCoB), so that in effect there is nothing random about the outputs. Indeed, each successive output becomes an optimization of the model fit to the LCoB. Following that, it may start with an initial output that strongly resembles random noise, but inside that first output will be a non-random component that initially fits some part of the LCoB model that it is trying to achieve.

Also following that, the lower limit of the process is the output of the first step, since each successive iteration is an optimization towards a model. You already have the model noise floor when you run the first step. Struggling to take that lower is not smart, and it could be accomplished simply by excluding some part of the collection of images used to form its LCoB.

Does that make sense?

There is nothing random about any of the outputs of this if it is model-driven. Stable Diffusion output is model driven therefore it is not random at any stage.

In geophysical processing we have to carefully monitor outputs from each process to make sure that processing artifacts from mathematical operations can not create remnant waveforms that can be mistaken for geological data that could be used as a basis for drilling an expensive well. Data-creata is a real thing. Models are used.

Thanks for the discussion.


So the simplified view of a diffusion model is like the following (this leaves out the role of the prompt):

-Sample random noise.

-Ask the neural network "Here is a noisy image. What do you think it looked like before I added all this noise?" (note that you did not form this image by adding random noise to an initial image)

-Adjust the pixels of your random noise towards what the network said.

-Repeat until there is no noise left.

During training, we take a training image and add noise to it. This way we know the "correct" (in scare quotes because there are many possible clean images that lead to the same noisy image given different realizations of the noise) answer to the question in step 2. This is used to update the weights of the neural network.

Ultimately, a diffusion model is just a denoiser. A denoiser implicitly represents an underlying distribution of clean data. The diffusion process used to sample from the diffusion model is a clever way of drawing samples from that underlying distribution given access only to the denoiser.

At sampling time, we have no training image that we add noise to. We just sample random noise out of thin air. This works because in the limit of large amounts of noise, the distribution of "initial image plus lots of noise" and "just lots of noise" are the same. You can certainly draw an analogy between this initial random noise and the uncarved block of marble that the sculptor says "contains" a sculpture waiting to be uncovered. Given the same noise, the neural network is deterministic - it will always produce the same output.
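The sampling loop described above can be sketched as a toy in a few lines. Everything here is a stand-in: the "denoiser" is hard-coded rather than a trained network, and the single 8-value "image" plays the role of the entire learned distribution. It only illustrates the crank-turning structure (start from noise, repeatedly nudge toward the denoiser's prediction, deterministic given the same seed):

```python
import random

random.seed(0)

# Toy "learned distribution": the model's entire training data is this one 8-pixel image.
target = [0.1, 0.9, 0.4, 0.4, 0.9, 0.1, 0.5, 0.5]

def denoiser(noisy):
    # Stand-in for the trained network: given a noisy image, predict the clean one.
    # A real model learns this mapping from (clean + noise, clean) pairs during
    # training; here it is hard-coded so the sketch stays self-contained.
    return target

# Sampling: start from pure noise "out of thin air", then repeatedly move a
# fraction of the way toward the denoiser's current prediction.
x = [random.gauss(0, 1) for _ in range(8)]
for _ in range(50):
    pred = denoiser(x)
    x = [xi + 0.2 * (pi - xi) for xi, pi in zip(x, pred)]

print([round(v, 2) for v in x])  # has converged to the learned image
```

With a fixed seed the initial noise is the same every run, so the output is the same every run, mirroring the determinism point: the "creativity" lives entirely in which noise sample you start from.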

You could even imagine an oracle who can unwind the process of the neural network being cranked and tell us exactly what initial noise sample would produce any desired output. Just like an artist might just draw the image that they want rather than waiting for the random pixel machine to output it, the oracle could simply set the precise values of Stable Diffusion's noise input to produce a Hollie Mengert work, rather than sampling repeatedly until it found one.


I think that we just described the same process.

>-Sample random noise.

>-Ask ... ...what the network said.

>-Repeat until there is no noise left.

Initially we have noise (random or not doesn't matter), and we are trying to find inside that noise a matching function to an image that we already have: our model, or training image. That image may or may not also contain some residual noise. As you describe in your iterative steps, noise is added to it, decreasing the signal-to-noise ratio, and the neural network compares its initial or updated image to the "correct answer" image and weights the next iteration accordingly so that a better match can be obtained. The function itself consists of an optimization process designed to minimize the residual noise (a denoiser, like you say) between the most recent image and our target image.

When you say that you begin with a sample of random noise, created by an array-populating function and assumed to be random, but that you have no training image to which you are adding noise, that fits the whole process. But it ignores that you are using the random noise image to iteratively produce an output that fits a model image, one created by compiling statistics from multiple images into a target output that is assumed to be near noiseless, or to have a very high signal-to-noise ratio.

If you are trying to use this to "find" Alfred E. Neuman in your output the process needs to know what Alfred E. Neuman looks like so it can effectively optimize to that result. Each iteration denoises based on the known output in the model created from the images that it must ingest in order to build the model. If you only have a few images of Alfred E. Neuman but you have thousands of Homer Simpson in your model input dataset, then you will have to fight through the tendency of the process to converge on Homer Simpson. No matter, you always have a priori information that is used to verify the integrity of the output. The input is irrelevant whether it is random or not since you are looking at an optimization process that matches, denoises, and weights iteratively until it minimizes an error function in the match and can be said to be an optimum, or good match.

This is not particularly new or novel or anything else. It is a typical iterative modeling exercise like those that have been used for decades but now you have the compute power to build a near noise-free target model that fits the known data from every source at your disposal.

The user who created the Hollie Mengert styled outputs could not have hit that target without using a model that was designed to create or mimic that type of output. That is why he chose to use her work in his process. He liked it. Then when he found out that she was not pleased about not being consulted and that she didn't have the rights to use some of those images then I think he had a come-to-Jesus moment that ultimately led him to rename it so he could feel better about it. Guilt-tripped him.

Anyway, ethics should be a required part of every computer science curriculum especially when private personal information is involved.

I'm in the oil and gas industry. It sucks sometimes. Fortunately there has been a push to include or require ethics training. Maybe one day it will clean up that industry. I'm only holding my breath though when I pass a refinery.


>The user who created the Hollie Mengert styled outputs could not have hit that target without using a model that was designed to create or mimic that type of output.

This is the part that I'm not so sure about. The value of fine-tuning on Hollie Mengert's work is not so much that it enables the model to create that type of output, it's that it makes it far less likely to create other types of outputs. It narrows down the haystack, but it doesn't create the needle.

Similarly, if I set out to find Alfred E. Neuman, but my training data has no images from MAD magazine due to licensing concerns, will it be possible? It may not be possible to use the prompt "Alfred E. Neuman", but maybe it's possible to use the prompt "A cartoon drawing of a grinning red-headed boy with a gap in his front teeth". Images recognizable as Alfred are likely still in the model's output space, even if they are not so easily found. They are certainly in the output space of the "random pixel machine". It's just a question of how hard they are to find.


Basically, from what I have read about Stable Diffusion, a model can be created to replicate a particular style of output by incorporating images illustrating that style into the model space. Once that is complete SD can use that model to create new images in that style because it has a huge textual-image trained model space to use where features in images are tagged to provide contextual clues to SD so that when the user inputs "A cartoon drawing of a grinning red-headed boy with a gap in his front teeth" the process will know how to parse the requested image parameters. In short, it already understands what a cartoon drawing is from having multiple images tagged as cartoon drawings incorporated. It also knows the difference between a grin and a frown or a look of disapproval, can recognize colors, differentiate gender, along with the other parms in the text prompt used to build the image.

However, if it has never seen Alfred E. Neuman, it is very unlikely to be able to produce an output that resembles him. This is part of the reason for the huge popularity of SD. Not only is it open source, free to use, and very fast (since it limits image size to 512x512, from what I read), but it also allows tuning (training) of existing models with new images, so that the user can easily train it to produce images with specific features or characteristics that may not have been in the original training set. You can steer it to produce variations of any type of image that you can train it on.

As the author mentioned in the article, he trained SD to insert his image into the outputs by adding only 30 properly configured images of himself to the model set. It worked great after that. Without those images it did not work, because the model had no context for part of his prompt.

>Yesterday, I used a simple YouTube tutorial and a popular Google Colab notebook to fine-tune Stable Diffusion on 30 cropped 512×512 photos of me. The entire process, start to finish, took about 20 minutes and cost me about $0.40. (You can do it for free but it takes 2-3 times as long, so I paid for a faster Colab Pro GPU.)

>The result felt like I opened a door to the multiverse, like remaking that scene from Everything Everywhere All at Once, but with me instead of Michelle Yeoh.

Without knowing anything about him it would not be able to produce images of him. He uses Garfield as an example where it does not do well.

>...it really struggles with Garfield and Danny DeVito. It knows that Garfield's an orange cartoon cat and Danny DeVito's general features and body shape, but not well enough to recognizably render either of them.

SD is a model-based image optimization process which uses a very deep (100 languages recognized) training dataset of millions of tagged, open source images scraped from the internet to produce a relatively light-weight and thus, fast image generation tool. It has to know what something looks like from contextualized a priori data in order to be able to create an image that uses the object, characteristic, feature, etc in the output.

Thanks for spurring me to look into this. It looks very interesting though I am unlikely to find time to work with it myself.



This is going to be a somewhat disjointed and nonsensical comment as I'm finding it difficult to find the proper words to put down regarding this whole issue.

Never in my life did I expect myself to become somewhat of a luddite when it comes to AI, especially considering how much I love the concept in general. Something about the way it seems to be working out in the real world has me questioning my opinions, though.

I think more than anything I'm fearful for the future and what advanced versions of these tools will mean for everyone. Will the world simply become overflooded with these A"I"s for every little thing imaginable? Will every billboard I see and every sound that comes out of a speaker be some weird, uncanny computer-generated thing engineered for the maximum possible user engagement? Will anything anyone ever touches be examined by a dozen different AIs, all looking for ways to cheaply and efficiently replicate it en-masse? Will we all just become a meat shell of fried dopamine receptors, with various AIs and algorithms dictating every facet of our lives?

A lot of people call the recent and probably future AI explosion some sort of progress, but from where I'm sitting it barely feels like progress at all, considering it's already being used in the ways I've just described, albeit at a less effective scale. Watching my partner use TikTok, to my dismay, and seeing the efficacy of its algorithms and just how good it is at sucking away countless hours, I can only expect things to get worse once every single platform we interact with starts doing similar things.

It's strange, because AI could truly go in any number of possible directions, but so far what I'm seeing is it's going to lead us to ruin, and a lot of people seem to dismiss it or are even accepting of it (even in this very thread) simply in the name of so-called "progress".


> Will we all just become a meat shell of fried dopamine receptors, with various AIs and algorithms dictating every facet of our lives?

I've been thinking about this lately, in terms of dopaminergic activities like advertising, drugs, sex, gambling, bright colors, etc. When someone is too susceptible to addiction by one of these, their life is destroyed by it.

Pursuing dopamine, purely for the feeling of dopamine, results in this evolutionary dead end. So, every time modern society optimizes a product to produce more dopamine response, anyone susceptible is selected against, evolutionarily.

Your description is poignant. I can only see this going a couple of ways: 1. A society of people with fried dopamine receptors, too uninterested in food to grow it or eat it (Idiocracy). This collapses, and less dopamine-advanced people inherit the earth (until they do the same thing). Or 2. a society of people who can control, by whatever means, their dopamine urges.


The Supreme Court of the US is currently deciding if Andy Warhol's orange prince[1] violates copyright.

It is essentially a cropped photo, defeatured, plus the color orange.

This could be done easily without the help of AI at all, just cropping and filtering.

The case is very interesting and grapples with the same questions. I highly recommend listening to the oral arguments.[2]

I doubt the Andy Warhol Foundation will prevail, but it raises all these same questions, without the AI: What constitutes a transformative use of prior work?

Can you imbue existing art with new ideas and make it your own?

https://en.m.wikipedia.org/wiki/Orange_Prince_(1984)

https://www.oyez.org/cases/2022/21-869


I have been a member of a local musicians union for more decades than I care to admit. If someone was offering to replace me with an AI after a career of dedication to the art, I would worry there is a real risk of a devastating loss of income if there is no financial cushion. Even if I had not lost any gigs yet, I would call a couple of the local union board members to discuss it, and I believe this is a concrete situation which the local board would take up for discussion.

I'm sure there are good things to come out of AI in the arts, especially if it becomes a tool for the artist. But offering to put a financially struggling artist out of work with low effort, even temporarily, is a nightmare for the artist. Guilds and unions [0] [1] have talented artists on boards of directors and lawyers on staff who can help them codify some of the issues into the standard contract used by member artists. I have seen bands fired with no notice when they had used a standard union contract. The contract and access to the union's attorney were the only things that protected them from lost work and not making rent that month.

[0] https://graphicartistsguild.org/ [1] https://www.usa829.org/About-Our-Union/Categories-Crafts#Sce...


A great artist can take these same prompts, look at the same input images, and produce the same or even better results.

No one would call that stealing. Copying, yes. Stealing, no. A great artist can copy other artists' styles expertly. A large part of being an art student is learning how to do exactly this.

Stable Diffusion is just a truly great artist.


I find these frequently brought-up comparisons of robots to humans pretty funny, personally. I think we can and absolutely should delineate between a human drawing inspiration from their peers and an AI taking a trillion images (far more than any human could possibly ever draw inspiration from) at once, shuffling them all together and spitting out the resulting amalgamation. I see no reason why the AI should be treated the same way we treat humans - it definitely shouldn't.

As an example of someone with negative artistic talent of any kind, even if I were to take any one of their works and tried to trace them, it'd come out looking a mess, especially once I came to the portion that required any actual artistic skill like coloring and shading.

With an AI, I save a few of their works, stick them in some blackbox A"I" that gives me 8 trillion different derived images, which are in truth just whatever makes statistical sense stuck together, and boom, I've just managed to take a dozen artists' works and gain something from it.


You are drastically underselling the amount of emergent intelligence that comes from $500,000 of Stable Diffusion training.

This is not merely "averaging" input images. It can literally understand human speech.


Maybe I'm being too pedantic, but I'd feel it's far more accurate to say it can interpret human speech within a specific context. There's no understanding going on.


The difference is that the great artist would spend time studying, learning and researching the style. Which could take days, months or even years. That would be impressive.

AI can produce the same result in seconds, allowing the content to be ripped off far quicker than a great artist trying to achieve the same result. Isn't that unfair?


> Isn't that unfair?

No? Doing something better and faster does not change the fact that it is the same thing.

If an artist is allowed to copy styles, then other people should be allowed to do so as well.

Whether or not you have wasted years of your life learning these skills should have nothing to do with that.


The reason we are creating Artificial Intelligence is so that we can get results in seconds rather than days, months, or years.


Github co-pilot is therefore too a great tenured professor at Texas A&M.


AI for code-gen is not nearly as developed or impressive as image-gen.

So, no, I wouldn't say that...yet.


This Hollie Mengert’s style is (nice, but) not at all original. There are thousands of cartoons that look exactly like this. You could never even tell that something like this is specifically “her” style.

But, even if she did have a distinctive style, there is nothing illegal or unethical about learning that style and producing your own similar artwork, whether you call it “in the style of”, or not.


> not at all original.

> But, even if she did have a distinctive style

This is some rough criticism of an artist who's dedicated their profession and life to art.

> is (nice, but)

This made it all OK!


It’s not rough. Her art is nice. And I’ve also seen a lot of other work that looks like it. Years dedicated to the profession or not.

Maybe she’s closer to the origin of the style than most?

Anyway I spent all my life writing code and I’m not upset when someone uses patterns I came up with, especially if they credit me.


I feed your code in to CoPilot, I've now got my super app. All those years you've spent coding, learning, refactoring, now done by AI in mere seconds.

Why should I credit you? I don't need to credit you; the AI produced the product. You'd feel pretty annoyed, right?


If you knew anything about software development and had actually used copilot you'd know this is a specious argument.

I have roughly a hundred git repos. Do I care if somebody based an AI autocomplete off my code? Not even a little bit. I've never given a shit if somebody steals bits and pieces of my code, because that's like lifting a few paragraphs out of a book somebody wrote; who cares, if the book I'm producing is fundamentally different?


I do. No, I am not a software developer; I'm a system administrator. I do write code, and I would be pissed if some AI came along, took it, and regurgitated it for someone else.

I take pride in what I write. I don't write it for someone to come and steal it. Sure, if I post my code online I can expect it to be copied and pasted, but not for it to be used to train an AI.


Not at all. I love the idea of Copilot and similar systems!

https://youtu.be/gAjR4_CbPpQ

I have a tiny issue with those systems if/when they output certain memorized content. But that’s a type of bug, which can be avoided.

Also I didn’t say I want credit when someone copies my coding style. But if I was being credited, as this woman is being implicitly credited, I would have even less than no reason to complain.


I don't know this specific artist, but the fact that someone dedicated their profession and life to art doesn't mean they're necessarily good.


Or they could dedicate years and be good (or great!) and still not unique.

In fact, the better they get the less likely they are to be unique as more people will imitate their style.


I didn't make the claim that someone would be good just because of the amount of time spent; the claim I made is that it's harsh criticism to make when they have.

For example, if you just started strength training a few weeks ago and someone said "you're not very strong", that's not harsh criticism because you just started. But if you've been strength training for 10-20 years and someone said "you're not very strong", that criticism hits different, right?

Does it make sense what I am saying now?

This artist will probably read these HN comments, and I'm struck by how cruel they are to someone who's just out there creating wonderful content and didn't ask for any of this. So many of you here do not care about how she feels.

I also call into question these HN commenters' competence at critiquing art.


It’s not an art critique.

If she reads it, hopefully she understands the lens of the language. Her style is good, and evidently assiduous.

Everything said remains relevant.


Ah, yes, that clarifies it, thank you. I agree.


Does whether they're good or not actually have any bearing on the legality or morality of copying their work?


>This Hollie Mengert’s style is (nice, but) not at all original. There are thousands of cartoons that look exactly like this. You could never even tell that anything like this is similar to “her” style.

Good point. As the article itself says, her style is explicitly based on Disney. This isn't like, say, Cubism, a style (intentionally) very different from the contemporary norm, and nothing like anything that had come before.


It's original enough that Disney pays her for contracting work.


They pay her because she does high quality work in the style of other Disney art. Not because her work is original.


Disney employs full-time artists that can do that anytime. They pay her specifically for comissions.


In the style of Disney.


It's interesting to think about all the controversy surrounding machine-generated images in contrast with a scenario where the same images were generated/drawn/created directly by a human.

In this specific scenario, what would the artist think if 1000 people started drawing in her style and released those images?


It's ok to have different standards for humans and computers. Even if you think that a machine learning model is conceptually doing the same thing as a human artist, just a trillion times faster and infinitely replicable, there's no reason we can't say that it's ok for humans to do this, but not for computers. Computers are not people. It's not unfair or unethical to put an artificial limitation on an artificial object.


Seems like putting a limitation on what someone can do with their own computer on their own time with their own money.


Yes, sort of like disallowing someone to DDoS a server using their own computer on their own time with their own money.

This is not about one person playing around in private, it's about thousands of people (potentially millions) instantaneously generating art expressly intended to copy someone's specific style and publicly releasing the results.


If you're ddosing someone's server you necessarily aren't playing around in private.


If people were only playing around with Stable Diffusion in private then this article wouldn't exist and we wouldn't be having this conversation.


Saying "here's an image I generated" is exactly the use case of reddit. DDoSing reddit is not the use case of reddit.

And "private" has more meanings than "secret/only personally known". Posting something you generated on Reddit for fun is private use.


Tough to argue with that, all I can say is that your usage of the word "private" does not agree with either the dictionary definition or common usage.


Look up the term with relation to copyright law. See also: personal use, fair use.

e.g. (pdf) https://digitalcommons.sacredheart.edu/cgi/viewcontent.cgi?r...


That link does not define the term, and I haven't been able to find a legal definition of "private use". The paper seems to be using the normal definition, though.

> It sounds controversial that a defense of private use exists at all; after all, one usually buys a book for her private use. This use may mean that one can make photocopies of a legally possessed book, in order to read it, for example, not only in the office, but also at home. One may also loan the book to a friend.

I would be pretty surprised to find a legal definition that says if you put something online just for fun it doesn't count as copyright infringement.

I don't know why we're talking about this though, I'm not a big fan of copyright but that's not what this is about. It's not "you put my drawing on Reddit without my permission", it's "you publicly released a tool on Reddit that allows anyone to effortlessly create infinite variations of my work in my name, please don't do that". I don't care whether or not it's legal, I think it's immoral to create a tool that could not exist without ingesting someone's life's work and then ignore them when they ask you not to do that.


if you release the artwork you generate, you aren't playing around in private either.


There are plenty of those limitations already, some even having to do with things like IP and copyright.


I don't see any evidence that things being generated with SD or similar somehow remove those same limitations around IP and copyright. I am just as much hoping Disney looks the other way when I make fanart of their IP regardless of how it is made.


>It's ok to have different standards for humans and computers.

Is it? The computer is a tool doing it on behalf of a human, so different rules for computers end up being different rules for using different tools. Should the efficiency of a tool be a factor in the limits we put on a human? Given that humans can use the same tool with different levels of efficiency, this also opens up the question of whether different levels of skill with a tool warrant different rules.


I can move on the street on my own feet, or I can move at 240 km/h with the use of a tool called a sports car. I think limits on the capacity of our tools are often the whole difference between what is legal and what is illegal.


There are a number of differences. First is that road rules apply to public roads. Private roads have far fewer rules. You might not be able to endanger a child or have duels to the death, but there generally aren't speed limits at all regardless if tools are involved.

Then there is the matter of why the laws exist. Generally they prevent harms, and if the limit is on a tool, it is only because humans can't do the harm without a tool. A limit on noise generally only limits the use of tools, but if a human were able to break the limit without a tool, the limit would be applied to them too. A law might not have been created yet simply because something isn't yet a threat.

In this case, humans can already copy artwork, so there are already limits in place. The tool doesn't really allow for anything new that couldn't already be done with money. If I could afford it, I could hire thousands of artists to create work in a similar art style, up to the limits of the existing law. Such expenditures are rare, but are they something that needs to be limited and hasn't been, due to being too rare? If not, then why does a tool lowering the cost change the limit?


> Should the efficiency of a tool be a factor in the limits we put on a human?

Yes, for example hitting someone with a real sword is punished more harshly than hitting them with a plastic one. Tweeting out blatant lies to your 100 million followers is worse than tweeting out blatant lies to 2 followers. Shining a laser pointer at a plane is only illegal if it's strong enough to blind the pilot.


>Yes, for example hitting someone with a real sword is punished more harshly than hitting them with a plastic one.

Intent will be a factor. If you thought you had a real sword and intended to kill someone, your penalty will be the same even if the sword was plastic. Look at federal stings where people attempt to commit a horrible act but are given inert tools to do it with and fail. There are some edge cases, like the difference between attempted murder and murder, and there are laws specifically around guns that make using them to murder a worse crime, but in general the tool used doesn't matter and in the few cases it does, I question if those laws should exist and think they were put in place for reasons other than just stopping murder.

>Shining a laser pointer at a plane is only illegal if it's strong enough to blind the pilot.

Is it? Fewer resources will be spent to catch someone using a laser too weak to be noticed, but if caught, would it really be legal?


The question is should we have different standards for computers and human doing the exact same thing - not different things. Your examples all have different outcomes based on the tool.


Do you think training a human artist on a body of work has the same outcome as training Stable Diffusion on that same body of work?


Not substantially different enough to criminalize the latter.


I don't think it's criminal, I just think it's wrong.


Computers do it more efficiently, yes, but you're not giving an argument as to why being more efficient means there should be different rules.


Sometimes society established norms when doing a thing was very rare and expensive, and honestly barely needed any control because it was so unusual.

If that thing becomes a lot cheaper and easier, and it starts being done a lot more - perhaps we realise we actually need different rules even though the difference is purely quantitative.

One guy juggling chainsaws on main street - how entertaining! No need to ban that. Dozens of chainsaw jugglers on every street, 24/7, the issues are easy to imagine.

To say that "Oh, you were OK with one chainsaw-juggler therefore you would be OK with 100,000" is IMHO a line of argument that obscures more than it clarifies.


>It's interesting to think about all the controversy surrounding machine-generated images in contrast with a scenario where the same images were generated/drawn/created directly by a human.

It actually isn't interesting. Or, I should rather say, it's only interesting for singularitarians who anthropomorphize algorithms as part of their idiotic (pseudo)religion.


You’re missing the big qualifier: what if those other artists started drawing in her style AND marketed it as her style?


There's nothing atypical about that in the graphic design world.

An artist does not own "their" style, and can even be one of thousands of people drawing in that same style.

Art styles are not something people can own.

Imagine trying to own a genre of music...


You’re ignoring what I said. It’s one thing to draw in a similar style, it’s another to market it as a style using someone else’s name and portfolio to get clout.


It fascinates me how this AI generated art looks good on first glance, but turns into absolute nightmare fuel the longer you look at it.

Contorted faces, impossible limbs, mushed hands, surreal cityscape.

They look like something a madman pretending to be sane would draw, glimpses of his twisted mind showing through cracks in the facade.

https://i.imgur.com/EXQlxtF.png

https://i.imgur.com/F1IzpR0.png

https://i.imgur.com/2KvkNnu.png


Looks like you’re not using the stock 512 px resolution. Artifacts are more likely since the network has only seen 512 px images during training.
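If it helps anyone cleaning up their own input or training images, here's a minimal sketch (assuming Pillow is available; `to_stock_resolution` is just a name I made up) of center-cropping and resizing an image to that stock 512x512 resolution:

```python
from PIL import Image

def to_stock_resolution(img: Image.Image, size: int = 512) -> Image.Image:
    """Center-crop to the largest square, then resize to the
    stock size x size resolution the model was trained on."""
    w, h = img.size
    side = min(w, h)              # largest centered square
    left = (w - side) // 2
    top = (h - side) // 2
    square = img.crop((left, top, left + side, top + side))
    return square.resize((size, size), Image.LANCZOS)
```

For example, an 800x600 image gets cropped to its central 600x600 square and then downscaled to 512x512, so nothing is stretched.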


Those are cropped directly from images in the article. I think they're stock res.


Very dreamlike to me, like how a dream makes sense to you usually, but if you focus at your face or hand or clock, it becomes odd and distorted.


In sophomore year of college, I came across a Japanese illustrator who sold sticker packs for messenger apps. I liked their art style a lot, to the point where I themed my entire computer setup (wallpaper/terminal/editor) around their work. Some friends and I worked on a project for an information retrieval class and their art continued to be a centerpiece for the website's theme, and we even jokingly snuck an image into our final paper.

As much as I loved their style, it seemed they rarely put out new content. A few weeks ago I trained an SD embedding on their work-- it was the coolest thing ever. I thought back to the class project, how I "stole" one of their artworks to use as the favicon. "Nobody outside this class will see this", I thought. But a pixel art anime girl wearing headphones was perfect branding for the app. "If I ever decide to publish this project, maybe I'll commission the artist for an official logo", I thought at the time. Now I wonder if I'd just use the SD embedding...


Curious who the illustrator is?



And after all those words you still haven't credited the illustrator.


I'm sure with a bit of scrolling you can find it. Just a few more words.


It really does feel like the headline for 1970 - 2070 is going to be "Technology desperately tries, but fails, to make the world a better place".

What technology has _really_ improved what it is like to be a human being, around other human beings, and to lead a fulfilling life?

Every "disrupted" industry feels like it just has been replaced by a less-human, more unequal, and more dystopian version of itself over the past 50 years.

Those who can survive tech jobs seem to have turned out to be the least equipped people to properly navigate us towards utopia.


This is just negativity bias. All the good stuff is forgotten because you don't have to think about it. Go to 1969 and you'll miss a ton of stuff.

I'd say the fact that I can get any book I want and be reading it within 5 minutes without even having to get out of bed is amazing. That tops it for me.


Group video chat. Allowed my family, spread across the world, to all say goodbye to my ailing grandfather before he passed.

And, on a happier note, allows me to stay in touch with dear friends on the other side of the globe. Emails and co. just are not the same.


> It really does feel like the headline for 1970 - 2070 is going to be "Technology desperately tries, but fails, to make the world a better place".

Well, for one, most of the accumulated knowledge of the human race is now available to anyone with a cheap phone (quite common even in the poorest countries), thus making it possible for anyone who cares to spend the time to become educated (not necessarily having a piece of paper that says you're educated -- we still need to work on that -- but actually educated)

That wasn't the case in 1970. Or even 1990. I think that has unquestionably made the world a better place.


Compare middle class life in 1970 to 2022 and consider how the current level of convenience and prosperity in all things from retail services to affordable furniture, etc was enabled by technology. Technology is more than just devices that consumers directly use in their hands.

I would agree that smartphone-enabled social media has had some severe downsides for society in terms of productivity, attention, and opportunity cost, but that is just a subset of tech.


My Roomba is one of the few pieces of technology I own with no downsides to my life. I often think of it as an example of technology done right.


There's been lots of great technological advancements in household stuff. Cordless tools come to mind as another.

I think it's when you put a screen on anything that it starts the downhill slide.


I am trying to replace most electric tools with hand-cranked cordless tools and I am especially excited to find a barely used 40 y/o tool where I know that it will last me easily another 40 years.

Helps me to stay in shape, to have a resilient toolbox and to lower my energy bill. I also dislike the noises of many electric tools.

My battery-powered drill stays with me though, as it is just too helpful.


The Bell System 2101A is a fun bit of history too; our original copper phone/internet infrastructure was built with those.



Lol, well I’ve got the old dumb one that you just push a button on to clean. His boundaries are set by my little fence pod things that run on D batteries which I have to push the button on too.


What is a fulfilling life? I would say most technology I'm using has given me a more fulfilling life.


Technology is a miraculous thing if you don't spend all day reading and debating internet comments.


The immediate counter example that comes to mind is the MRI but there are countless other advances in medical technology since the 70s.


I don't really have a stance on the moral or ethical points, but some of the results in the illustrations included here are amazing. If you mixed them up and asked me to identify which were originals and which were AI generated, I would fail miserably. That is amazing in my book.


AI is still pure magic to me. It's the first time something makes me feel like I'm part of the "before" era.

I thought it would be cool, like someone seeing the space race. Instead, I find it terrifying, like the nuclear race.


There's already a lot of discussion on the legal/moral arguments here so I'd like to comment on something more concrete.

As I understand it, an illustration for a magazine like the New York Times might net anywhere from $100 to $1000 and require 8 hours of work. An illustrator working for someone like the New York Times or Magic: The Gathering would likely consider this the pinnacle of a stable job. Many, including my comic books teacher (Kikuo Johnson), spent years moonlighting a service job before making it and publishing. With the advent of generative AI art, it seems immoral from a fiduciary responsibility point of view that an art director doesn't train an AI model on their illustrators' art before laying them off.

I have no doubt that generative AI will continue to push forward irrespective of the legal arguments being made. I'm fearful of the frictional unemployment that comes. Having come from art school (and luckily working in tech), my illustration peers are creative, but such creativity doesn't necessarily translate into creative use of tooling, business savviness, or marketing. All I can say is that I empathize with a lot of the fear and hope for the best.


> With the advent of generative AI art, it seems immoral from a fiduciary responsibility point of a view that an art director doesn't train an AI model on their illustrator's art before laying them off.

If they do that, the quality of illustrations they'll get will be vastly worse (as can be seen from the comparisons in the article). If they were willing to spend $1000 on an illustration in the first place, I doubt they'd accept that quality drop.


As someone who's collected roughly 40k images over the past decade, I'm tempted to say a lot of the images in the article look pretty good. That being said, I think the quality will only continue to improve. Moreover, for an art director who can easily generate a couple hundred images, it shouldn't be difficult to pick out 2-3 good-looking ones from a large batch.


Superficially, they look good. That's what's new about 2022 AI image generation, and it's legitimately impressive. But as I've seen more of these images (and especially used the tools myself) and taken a closer look, I've noticed that they lack the deeper content that all human art has.

Hollie Mengert points this out in the article:

> “I feel like AI can kind of mimic brush textures and rendering, and pick up on some colors and shapes, but that’s not necessarily what makes you really hireable as an illustrator or designer. If you think about it, the rendering, brushstrokes, and colors are the most surface-level area of art. I think what people will ultimately connect to in art is a lovable, relatable character. And I’m seeing AI struggling with that.”


Where do you get $100-1000 in cost for a major publication? The artists I know easily charge in that range, and that's for private commissions that they retain the rights over.

I could not imagine that the custom work which helps sell publications, *especially MTG or similar*, only nets $1000 per piece. Certainly they have some sort of royalties contract, at the very least.


This is pretty anecdotal (6 years back) but I've had one professor tell me they were paid $800 for a NYT illustration and $400 for a smaller publication. I might be off with the specific numbers but these are the general ballpark numbers I remember. Nothing about MTG, etc that I can give concrete numbers for.


It’s quite a dystopian technology for sure, but maybe it isn’t all that bad. The artists just (cough) have to pivot from delivering images to delivering style sheets for generative models…


It is worth asking: what legal rights of the author may have been violated here?

Glancing at https://cyber.harvard.edu/property/library/moralprimer.html the following ending sentence seems particularly applicable:

If a person uses the identity of an author, or the works of the author, for her own benefit without the author's permission, then she may have violated the author's right of publicity or may be guilty of misappropriation of the author's work.

And the previous line is nearly applicable:

If authorship of a work is attributed to an author against her will, or misattributed, the author may have a state action for defamation against the person responsible for the attribution.

Of course "moral rights" are particularly weak in the USA. I'm sure that there would be a much better case in the EU.

Of course this gets directly to the question of what happens when laws conflict with technology. People in technology generally think that technology should win. People who benefit from the laws think that the laws should win. Both popular opinion and real world results generally wind up somewhere between.


So, if we remove her name and call this style #27b-6, there’s no issue.

She does not own the style. She owns her name and can get fussy about how it’s used, but you can’t stop someone from learning your style.


The relevant part is probably

> ...works of the author, for her own benefit without the author's permission

which did happen, in the training and fine-tuning process.


Training on copyrighted material is legal. As it should be, IMO.


Is that established law (in what jurisdiction?), or just your opinion?


I am pretty sure that it is not established law, but I am pretty sure that that is how it will work out. US provisions for fair use make training models likely OK, and the EU is carving out exemptions for it. See https://valohai.com/blog/copyright-laws-and-machine-learning... for more.

The question of whether the output of the model itself counts as a derivative work, though, is rather more complex. In the case of Github Copilot it has proven very adept at spitting out large chunks of clearly copyrighted code with no warning that it has done so. And lawsuits are being filed over this.

But in the case of the visual artwork, I'm pretty sure that it is going to be ruled not derivative. Because while it is imitative, you cannot produce anything that anyone can say is a copy of X.

But as ML continues to operate, we'll get cases that are ever closer to the arbitrary line we are trying to maintain about what is and is not a copyright violation. And I'm sure that any criteria that the courts try to put down is not going to age well.


In the music industry, even the tiniest sound sample used in a work entitles the original creator to compensation. It might come to pass that using any portion of an author's work in your training data will confer certain rights to the author over the AI generated product. You'll have to perpetually keep records of your training data for commercially available work, lest you be sued. A whole bureaucracy will evolve, a Getty Curated Training Data, public domain training data sets. Basically the same sorts of issues that we've had over the past 30 years, except replacing "internet" with "AI."

And if the past is any guide, the forces of capital will prevail commercially, but after aborted attempts to rein them in with lawsuits, hobbyists and kids on social media will be mostly ignored by rights holders.


That would be silly.

It’s about outputs, not inputs. If your output reproduces copyrighted material, you must compensate.

One day a robot is going to be walking around the world. Will it have to pay someone every time it glimpses a video / book / logo / car design / etc?


> One day a robot is going to be walking around the world. Will it have to pay someone every time it glimpses a video / book / logo / car design / etc?

Looking at all of the people happily deciding to interpret copyright as widely as possible is kinda horrifying in this context: https://www.youtube.com/watch?v=IFe9wiDfb0E

> Your stored mind contains sections from 124,564 copyrighted works. In order to continue remembering these copyrighted works, a licensing fee of $18,000 per month is required.

> Would you like to continue remembering these works?

> [you have insufficient funds to pay this licencing fee]

> Thank you. Please stand by.

> [Copyrighted works are being deleted]

> Welcome to Life. Do you wish to continue?


Yes exactly :)

Somehow I trust the law is going to work out ok though, despite all the hot takes from people who don’t really understand ML or copyright.

It’s scary to see so many people not understanding the distinction between an input and an output.


In the EU it’s established law. In the US it’s basically true, but it will play out in the courts a little in the future.


We don’t know that yet.


Things algorithmically generated from a copyrighted work constitute derived work.

Obviously if the thing is as complex as an AI model it might be hard to prove that a copyrighted work was among the inputs.


>Things algorithmically generated from a copyrighted work constitute derived work.

This is a very strong statement that is not, at least to me, obviously true. I am curious what your argument is - learning from a copyrighted work does not automatically make a work derivative.

These models do not store copies of images they learned from, or attempt to replicate these images. They learn about constituent parts and assemble them based on the prompt, which is not conceptually all that different from how humans do the same thing.

There are obviously moral questions and legal questions around AI art, and I expect that we will see more, but I'm not sure that this statement is accurate.


This is certainly the legal question of the hour, and it may allow courts to sidestep the equivalence problem to apply different rules to humans than to AIs. History is replete with courts deciding that the rights of some minority don't count because they are deemed different (or less) in some way. Might as well get started with AIs right away. Might as well make sure that when AI is eventually demonstrably sentient there are already tons of established laws and billions of dollars that says they are not.

It seems to me that you cannot take someone's art and plug it into an empty model: we have to start with a model that has been trained on a huge corpus of art, and only then can you show it some pictures and ask it to imitate. This is no different than a human. But my opinion does not matter at all. All it requires is that a jury says "This is an algorithm, therefore it is derivative."


It's a derived work but also transformative and thus almost certainly fair use.


The legal right that's been violated is copyright: copies of her work were taken and fed into the ML model. This was done without the picture owner's approval. Moral right may be weak in the USA, but copyright ain't. Most likely making the owners whole is going to be an expensive proposition at some point.


Copyright law is the right to control a set of specific kinds of actions. A complete list of which ones is at https://www.copyright.gov/what-is-copyright/. "Inputing the work to a computer program" is not in the list of actions. Whether or not the result of running that program may violate copyright is a question of fact based on what the program does.

As for what it does, my non-lawyerly opinion is that it is no different than a human artist looking at paintings then imitating the style. Which is very much legal.

That said, a case can be made for derivative works. I don't think it is a very good case, but a case can be made for it.


The legal right that's been violated is copyright: copies of her work were taken and fed into the ML model...

You've stated this very confidently but I see no legal standing for it. Generally copyright infringement concerns itself with the reproduction aspect. Feeding a bunch of images into a model which generates a set of mathematical weights is at worst transformative in nature.

Now whether or not the images produced by this model constitute copyright infringement is a different matter.


> The legal right that's been violated is copyright: copies of her work were taken and fed into the ML model.

If copies of her work are fed into my brain via my eyes, and I then make an artwork in a style similar to hers (but not close to any particular piece she's created), there is no violation of copyright.

What's different?


The difference is that brains don't get copies of files as input, as you wrote: "via my eyes". Looking is not copying. ML models do get actual copies of files as input.


Training on copyrighted material is legal, and rightly so.


Are you sure? Has this been tested in court? As you write: "rightly so", may I ask if you are a judge, lawyer? Has this decision of yours been appealed, gone to a higher court? Does the word "training" as applied to a computing system not simply mean "copying"? Who decides this?


Copyright is still about outputs instead of inputs, just like it is with human learners.

Training isn’t copying. Its an input. It’s akin to “seeing” or “reading”.

Any judge needs to think in terms of outputs. If a system outputs something that violates copyright, there’s a problem.

But attempting to regulate inputs can’t work. For instance, it will be impractical or impossible when we have agents moving around in the real or virtual world: how are they supposed to know when to turn their sensors off so they don’t see copyrighted material?


As always with generative AI, the legality is far less interesting than the morality.


The problem with discussing morality is that we each bring our own moral systems to bear, then use emotionally charged words like "right" and "wrong". This leads to people at first disagreeing, and then arguing past each other. Doubly so because you feel like you have made a real point when you agree with yourself, while the other person feels like you haven't. And vice versa.

As a result I only want to discuss morality with people who either largely agree with me, or who are able to take a step back from their own moral system to discuss what someone else from their moral system might think.

By contrast laws and human behavior are external and hence easier for people to agree on. Yes, they are dissatisfying because none of us entirely agree with the law, and laws are deliberately vague in certain things. But I find that discussions of them tend to actually work out better.


You can't copyright a style. And this is automation, which put lots of people out of work a while back eg https://en.wikipedia.org/wiki/Luddite amongst other protests (people like Hollie, too). But Disney right now is a soulless factory as much as any other and they'll be onto this tech like a shot, and Hollie... I don't know. I wish her well.


Why would I want to spend years developing my own art style if I know that it can be copied and used by one of these AI engines in a few hours? Are we going to end up with less human created artwork?


Artistic expression is, I would say, an inherent feature of the human brain. It is also a consequence of the available tools. Even if I grant that people are simply going to stop drawing and painting (which I find highly unlikely), what would happen in that instance is that new techniques will arise to harness these new tools in novel ways. "Okay, I can get the computer to more or less generate anything I want. How can I compose and combine this tool into giving me something unique and interesting?"


You're missing the point of artistic expression which for a lot of us is purely an individual pursuit.

It's like if a band wants to make music purely for the sake of making money - it's just a foolish endeavor.


this whole argument is bullshit

artists are no more a protected class than programmers

learn to use the automation to further your own goals/career

or move out of the way

At some point, some non-artist, non-director, non-screenwriter will make a brilliant movie using these ML-assisted technologies. After a few iterations of this success, most people will shut up and learn to harness the benefits.


My dad once told me about when he was a kid, some Christmas specials would only be played a few times on CBC right before Christmas. If you missed them, you had to wait a year. He said that became a tradition.

When he went to share his tradition with us, he was a bit bothered by the idea of "get it on VHS!" He actually protested the idea. "I didn't want you watching it at any time. It should be a tradition." And indeed it became a tradition for my brothers and me.

My kids are 3 and 5 and "Nightmare Before Christmas" was a huge hit last Halloween. My kids wanted to watch it 3 or 4 times that Halloween week. It's available for streaming so that was easy. But then they watched it in November. And January. And a bunch in the spring and summer. ...And they never asked for it this year.

There's a quality bestowed through scarcity. When you have it all the time, on-demand, there is no scarcity. When you can generate art instantly, in any style, of anything, I think it stops being exciting. For example, I don't think the concept of "ahahaha look, it's the Avengers but as Muppets!" or "It's me, but as a Simpson's character!" will have any amusement value by next year.

Maybe I should be asking Gene Roddenberry but I'll ask all of you: do you think there's something lost by eliminating scarcity?


> If you missed them, you had to wait a year.

This is literally the intro section to Chuck Klosterman's Nineties - about the number of people who watched Seinfeld and those that missed it, just missed it.

> There's a quality bestowed through scarcity.

There's a depth to scarcity, but your kids are going to watch way more things growing up than you did and you have no idea what that means for them.

I get that you aren't able to pass on that "value the thing you hold, not the two you can get later" sort of commitment to a single thing.

But that's valuing the only true scarce thing left in my life at least - my time and attention.

I remember feeling the "post-scarcity" thing going to a library for the first time in the US.

I used to buy books and read them, before I came to the US. Each book was a hard choice on whether to spend money on it or buy something else.

I now decide to read through a book based on whether I want to spend the time, not the money.

Unless I'm going to be immortal, there's no fixing that scarcity (and I get how young people/kids don't get that part at all - their life is endless from where they are).

Maybe do a little better so that my 70s aren't spent in a bed, but in a park outdoors reading.


I really appreciate your response because it argues with my point and yet resonates with me. I think there's truth to what you're saying. Perhaps that "tradition" I'm seeking is, in a way, obsolete, and my kids are faced with an entirely different (and maybe better) problem: what to spend their time on?

I'll add one thing this made me think of: when I was a kid, my family computer and Game Boy had a fixed number of games. Getting a new game was a big deal. I took it seriously and I scraped the fun out of every game thoroughly. Once I got to college and got some money, I ended up with the "I have ten thousand Steam games and I don't feel motivated to stick with any of them" problem.

Not totally sure it's related, but I thought of that.


My younger sibling and I are quite far apart age wise and we grew up on other sides of this divide. When I grew up, families would buy VHS tapes for kids or record shows and kids were known to watch the same shows over and over again. Everyone in my cohort has memories of watching that one movie or that one show so many times that their parents were utterly sick of it (and of messing up a recording of the last episode of their favorite show.) My sibling grew up in the age of rental DVDs and Youtube and the idea of running out of content to them was laughable.

But as children we were both limited by our mental development, time, and attention spans. My sibling just watched the same Youtube video over and over just like I watched the same VHS tape over and over again. As much as material conditions change, humans tend to stay the same.


> … "I ended up with the "I have ten thousand Steam games and I don't feel motivated to stick with any of them" problem."

For my own personal instance of that exact problem, I've chosen to install a limited number of games at a time which fit into my current set of interests in gaming, and choose from among those when I feel like playing a game, until each has been thoroughly explored and enjoyed to its fullest before uninstalling that one or two specific fully-explored game(s) to replace with another. Works for me … YMMV.


> do you think there's something lost by eliminating scarcity?

Certainly whatever these models can output just from text plus a bit of work/skill on the prompt plus a bit of selection will not be very valuable.

But it will see applications that don't need something very valuable, and where otherwise a piece of art would not be affordable. I run a daily word puzzle game, and every daily puzzle comes with an image that helps tie the theme together. This is a low value piece of art that's nearly free for me to make, which wouldn't exist otherwise.

But the other thing is that people will make scarce and valuable things using generated art. Right now we're seeing tons of art where the direct output of a model is posted after less than an hour of work on prompts and selection by a person.

But when more people start combining all of these tools, spending dozens of hours on an image that's made out of components that are generated using AI tools, and other components that are manually edited or created, I think we're going to see a lot of brand new art and brand new skills.

There's always going to be people who pour their creativity, skills, and time into something. I can't wait to see what people can make when they do that while incorporating all these new tools that are being created into their process.


I hope we can distinguish it from other normal AI artworks, but I fear I don't have this ability :(


Others answered you in many interesting ways, so I'll just try to address this:

> Maybe I should be asking Gene Roddenberry but I'll ask all of you: do you think there's something lost by eliminating scarcity?

I'd like to think Gene would tell you, there's always going to be a scarcity of something - the adventure is in chasing it, but it's much more enjoyable when you aren't coerced into it by the need to feed and shelter yourself and your close ones. Basically, scarcity of food, healthcare, housing and opportunities is bad. Scarcity of unique experiences and relationships is enjoyable, and it's always going to be there.

I empathize with how you feel about your kids, but I think this is less about scarcity per se, and more about your kids not wanting to participate in your tradition. It's possible they still might - after all, these kinds of traditions aren't really about a movie, but about the time spent together. Perhaps when they grow up, they'll voluntarily abstain from rewatching that movie on their own, outside Halloween.

Personal and much more childish example: in my more naive years, I established a tradition, and roped a few friends into it, of watching "V for Vendetta" on the 5th of November. That was way back when the "oh fuck, the Internet is here" meme was funny, and not the stuff of nightmares. It was a completely random and voluntary tradition that lasted a couple years before naturally dissolving. My point being, the availability of the movie had little to do with it - it was all about the voluntary choice of a group of people to do something together.


Absolutely. But where scarcity has been abused for domination and power over others, as in food or oil or money scarcity, it's not a good thing and whatever we can do to bring those systems down is, in my book, good.


Perhaps, finally, the persistent overemphasis on expensive yet empty production value will give way to content composed of moral catch-22s and richly described personalities enduring such situations: what is traditionally considered story-first filmmaking. No gloss is necessary when the story is strong.


Your story reminds me of a newspaper we have in France "La bougie du sapeur", it's only published on February 29th.

https://en.wikipedia.org/wiki/La_Bougie_du_Sapeur


I have some friends whose young children are addicted to watching videos of other kids opening presents on YouTube.

I've been trying to articulate why I reacted negatively to that concept, and I think you nailed it.


I have that exact problem here. I want my kids to have latitude and freedom with what they watch and do. But their child brains need adult moderation. They literally want every day to be Christmas, and they do not understand how that might "burn out" their synapses on things like the act of waiting, being excited, and eventually getting new gifts.


https://en.wikipedia.org/wiki/The_Work_of_Art_in_the_Age_of_...

Valuable material in the disciplines of aesthetics and philosophy of technology on this.


In addition to oversaturation, on-demand access to things can also lead to a sort of paralysis: the fact that I have access to shows on demand all the time means I almost never actually watch them. After all I can always do something else (read a book, browse HN) and watch that thing some other time. Whereas I will watch live TV (Jeopardy, sports), because I have to turn on the TV at a particular time to see it.

Your line of thought "rhymes" in an interesting way with Matt Levine's column today about how illiquidity sometimes has value [0]:

> Another, funnier sort of financial innovation is about subtracting liquidity. If you can buy and sell something whenever you want at a clearly observable market price, that is efficient, sure, but it can also be annoying. Consider the following financial product:

> 1. You give me the password to your brokerage account.

> 2. I change it.

> 3. You can’t look at your brokerage account for one year, because you don’t have the password.

> 4. At the end of the year, I give you back your password and you pay me $5.

> Is this a good product? For me, sure, I got $5 for like one minute of work.[1] For you, I would argue, it’s also pretty good. For one thing, you avoid the stress of looking at your brokerage account all the time and worrying when it goes down. For another thing, you avoid the popular temptation of bad market timing: You can’t panic and sell stocks after they fall, or get greedy and buy more after they rise, because I have your password.

[0]: https://news.bloomberglaw.com/banking-law/matt-levines-money...


I'm sure there was something nice about hunting down a big animal once a year and having a nice feast by the fire next to the cave.


> do you think there's something lost by eliminating scarcity?

What a silly question. How does an extension of an already bottomless internet of content that you couldn’t consume over the course of hundreds of lifetimes come anywhere remotely close to ending food insecurity?


Certainly something is lost by removing scarcity... at the same time, once that scarcity is gone, it can't be restored short of civilizational collapse. There's no way to put the streaming genie back in the bottle, nor the Stable Diffusion genie.


Let's not conflate ethical inputs for full-out "undo" panic.

I don't see many people saying that we need to "un-release" anything.

What I do see is a want for ethical considerations for vulnerable parties be part of the discussion. I don't think consent is too large a barrier, especially since the music-focused variant of these tools currently has the developers walking on egg shells to avoid aggravating people with deeper pockets.

Instead what I see is the borderline criminalization of the people who were forced into the role of gatekeeping to be able to support themselves in the span of a few months because someone said "We can!" and apparently didn't watch Jurassic Park.


Yeah, I think you're 100% right. There is no "should we do this?" question here. Like every technology ever invented, it's now here.


Just because it's here doesn't mean it can't be modified going forward, and that we have to surrender to any and all ability to regulate anything. Fatalism can be attractive if you don't want to think about regulation, though, I suppose.


Regulation can't trump individual morals, and is driven by collective ethics. Deepfakes still exist and are actively being made and developed, even though many platforms have regulated them. Ignoring the difficulty of passing a hotly debated premise, regulating would only limit the actions of those who align with the regulation - as the article demonstrates, even if Google are cautious to release their DreamBooth model and specifics, it doesn't take much for someone to replicate it, ignoring (or being ignorant of) such concerns.

This is obviously something that we're going to have to collectively figure out - the technology is here and the technology is still being developed. Either we adapt our thinking or consider it taboo. Anything else (such as restricting usage to a "trusted subset") is just delaying the inevitable.


It's a bit like saying we can't stop music piracy, now that Napster exists.

Napster was a peer-to-peer file sharing application. It originally launched on June 1, 1999, with an emphasis on digital audio file distribution. Audio songs shared on the service were typically encoded in the MP3 format. It was founded by Shawn Fanning, Sean Parker, and Hugo Sáez Contreras. As the software became popular, the company ran into legal difficulties over copyright infringement. It ceased operations in 2001 after losing a wave of lawsuits and filed for bankruptcy in June 2002.

Use of the output of systems like Copilot or Stable Diffusion becomes a violation of copyright. The weight tensors are illegal to possess, just like it's illegal to possess leaked Intel source code.

If you use the art in your product, on your website, etc., you risk legal action.

The companies that train these systems can't distribute them without risking legal action. So they won't do it. It's expensive to train these models.

It will always exist in the black-market underground, but the civilized world makes it illegal.

That's where this is going, I hope. Best case scenario.


Read the thread. Hell, read THIS thread. It’s one thing to steal somebody’s lifework and point towards ongoing legal cases that will make it ok because the practice will benefit big companies. It’s another thing to be an asshole about it. If it was “wow love her style”, sure, still can’t pay her bills with that but it’s actually “mid style, fuck you also go away”. If ML people continue to treat the sole source of their models like this, they shouldn’t be surprised to be treated in a similar fashion.


Except there was nothing stolen. The artist stills own all her work.


Just to imagine how some good might come out of this for Hollie. Lots of famous artists' original works are very valuable, even though they have a lot of imitators. No matter how many hours of GPU time you have, you can't produce an original work by Hollie Mengert. Maybe the AI content firehose could act to enhance her reputation?

Maybe artists will start training their own AI replacements. At least then they'll control it in some sense, and perhaps they could rent it out to people.


She already has a great reputation, since she does work for Disney. That's basically the commercial-artist endgame.


I think one of the things that really isn't being picked up about this is that it isn't just about copying. Hollie Mengert isn't only objecting to her work being stolen. She's objecting to having her name associated with it, and she objects because she doesn't think it's good. And she's absolutely right. People really aren't engaging much with that as a problem for this tool.

I'd put a simple test to you: sit and look for 1 minute at each of the images she created and the machine created. There's a very simple way of telling who did what. The longer you look at Hollie's images, the more you notice. The longer you look at the generated images, the less they make sense. The perfect example of this is the image of the woman in the middle of the party - there is so much more going on in that single image than in all of the generated images together. The focus and composition, and the interaction of the different characters, are features basically missing from most of the generated images.

I do wonder if a large blind spot of the engineers who are working on this is that they don't actually understand art and therefore don't understand what they're missing.


Does anyone else feel painfully unsure of their opinion on all of this? I honestly don't recall the last major thing I've felt this completely uncertain about. All my opinions generally lean in one direction at least a little bit.

On one hand, I think it might be ridiculous for an artist to get to "own" a "style" of art. In the first example on this page, none of the art looks plagiarized. It looks like what every artist has done: been inspired by or borrowed ideas from other sources.

But on the other hand, if left unchecked, this will further harm our creative industries. We're going to be starving out our artists because robots can generate art _far_ more easily than they can. If this continues, it disincentivizes anyone from trying the already very uphill battle of making a living by creating art. One might say, "capitalism, baby! we don't need those artists, because we have AI and look at what it can do in seconds!" But I think that even if AI can "discover" new art styles and trends, there's something lost by humans not doing it.

I don't think AI will be able to replace human creativity for discovering new paradigms as fast as it will replace human application of existing paradigms. And by doing the latter really well with AI, we're killing our ability to do the former. We'll end up with a sterile art trajectory.

I guess my uncertainty is: something about this _feels wrong_ and yet I cannot point to any one moral/ethical thing that feels wrong about it.


Artists don't generally try to OWN styles or prevent others from using them. They are the result of years of training, adapting, etc. They put their own spin on it. It's effectively a brand of that artist, and it's generally beholden to a sort of "honor" code that you'll likely get called out for breaching if you're flagrantly trying to pass it off as your own.

The core issue illustrating this is when people use an artist name in a prompt. If these models did not exist, if you wanted something in that style, you would likely be reaching out to that individual, or asking someone else to try to emulate someone. In that instance, the emulation is generally accountable. In these instances, there is no accountability towards the algorithm, as it's not making creative choices, to say nothing of moral or ethical ones. That was done by the individuals with venture-capital backing, using research loopholes to fund the legally questionable scraping of this data in the first place, which in some instances, violates the EULAs of the sites they were scraped from.

At the end of the day, these artists, styles, etc. would not exist without the artists who had no say in their "democratizing art".


Oh, agreed. This is Brave New World territory.

My suggestion is to accept it as a thing that will be here and tune our expectations appropriately. Because if it is made illegal, it will be one of those things that's illegal-but-omnipresent, like sharing music on BitTorrent... The Western copyright regime doesn't blanket the world, and the advantages of these tools are so big that places it doesn't reach will just use them. The fact that this story is about a Nigerian engineer in Canada using software developed in San Francisco running on some computers in Northern Virginia to ape the artistic style of an artist from LA, none of these parties having ever met each other, indicates how empty the bottle is the genie used to live in.


> We're going to be starving out our artists because robots can generate art _far_ more easily than they can. If this continues, it disincentivizes anyone from trying the already very uphill battle of making a living by creating art. One might say, "capitalism, baby! we don't need those artists, because we have AI and look at what it can do in seconds!" But I think that even if AI can "discover" new art styles and trends, there's something lost by humans not doing it.

This has already happened.

People used to get paid to recreate paintings but now we have the technology to do it without a painter.

You used to need a musician whenever you wanted live music but now we just play a recording.

Basically, what you are describing is how technology has worked forever. I am not sure why people have chosen this particular instance of technology lessening capitalistic opportunities for artists to get their knickers in a twist over.

The same thing that has happened in the past will happen again. Artists will adapt and learn to use new tools, less people will make money via art, and people will keep making art anyways because the reason most people do it isn’t money.


Because it's based on LLMs and LLMs will possibly do this to all knowledge work in our lifetimes.


> I don't think AI will be able to replace human creativity for discovering new paradigms as fast as it will replace human application of existing paradigms. And by doing the latter really well with AI, we're killing our ability to do the former. We'll end up with a sterile art trajectory.

This may actually end up making the few artists creative enough to create bold new art styles even more valuable, if they can basically not release their art and hide it behind a model.

Though I guess anyone with access to that model's output could then just generate a few samples and train on those, so maybe not.


It will change the landscape of the art market, but it won't destroy it. Digital art will be less valuable, canvases and sculptures will become more valuable.


The cost of materials and transportation and time using the expensive CNC machine will be the major costs of sculpture. Generating the same quality 3D models is at the very furthest 18 months away. And animating and rigging the models and giving them auto-generated RL policies will surely come very quickly next.


If you were going to make a 3D model sculpture you would probably want to just 3D print it instead of using a CNC. In any case sculptures aren't as simple as cutting a 3d form out of a hunk of metal; the variety of materials and techniques is arguably much more interesting than the shape at the end. And the physical nature of it by definition resists the infinite generative shit that AI throws on the internet; there isn't space in the world to store tons and tons of nonsense sculptures, and the cost and time of making them is nontrivial as well.


Reading all of that, the biggest issue was just the art style naming.

One of the key features of Stable Diffusion is adding "in the style of <artist name>" to the prompt. This case just involves a contemporary/living artist, and the tooling actually lets an individual train Stable Diffusion so that anyone can add that style to their own Stable Diffusion instance, instead of waiting for Stability.ai to release another dataset.

He has since renamed the style, but he should just say "inspired by <artist name>"

It is so similar to what someone inspired by a particular artist would do that I can't make a separate standard.

Basically the pushback comes from the level of discipline that was once required in the past (2 months ago) compared to now. That level of discipline is no longer required.


I foresee almost all graphic artists very quickly retreating into closed, heavily moderated pay-per-view/use communities. Or they will simply starve. This so-called A"I" producing a ton of derivative art will mess up a lot of industries.


I mean, the alternative is arguably that artists become indentured servants to scraping/training models that benefit the people who are doing the scraping/training, at the arguable expense of the artist.


If they stay in a closed pay-per-view environment, I have a feeling most of them will definitely starve then.

Technology has messed up industries before. Everyone's not going to die.


We are observing a live example in the OP right now. If you are an artist working in some specific style, then even a small dataset (just 30 images in the OP) will render your future works much less desirable. Of course big companies have compliance rules, reputations and so on, so they won't use "gray" art in production. But there are definitely not enough bigcorp contracts for every artist, so smaller, less known ones would be competing basically with themselves. Imagine someone choosing between paying X thousands of dollars for original art, or generating art 80% as good in quality for "free".

PS: after posting my comment above I've realised that my idea won't work. Art will become public sooner or later: just scan the physical medium, or rip the DRM off the digital one, and you will get the dataset for the NN generator. Well, it will be a huge mess. We will see how it turns out.


I believe Yizahi is talking about pornographic furry art, which is (from what I understand) a very lucrative market for artists (regardless of their personal disgust with the subject).


It managed that from just 32 example illustrations?

Irrespective of everything else here, I find that alone hugely impressive.

ed - for 'hugely impressive' also insert 'slightly terrifying'.


The 'rules' of making a picture are codified in the large pre-trained model, the 32 examples are just guidance on top.


So the artist has been trained in the ways of the brush, but was only exposed to 32 examples of another's work to copy it specifically henceforth?

I'm still impressed/terrified.


Artist using Disney trademarks without their permission has style taken without their permission


She specifically said those images were commissioned by Disney and she doesn't own the copyright on them.


I get the feeling the ruling on Andy Warhol Foundation for the Visual Arts, Inc. v. Goldsmith will point how the Supreme Court will see this. I'm starting to think that the entity who owns the data used for training should own the output.


One way to consider copyright: when the concept gained traction in the 16th and 17th centuries, the doctrine existed to protect the author's ability to get someone to finance the printing of his work. That is, reproduction was costly and required an investment by a printer. The printer faced more economic harm than the author if he spent $20,000 typesetting and printing a thousand copies of something only to find a competitor beat him to the market. To be clear, the copyright attaches to the author, but it is only valuable because it enables the author to induce a printer to publish his work.

Now, as the costs of publishing have dropped to zero, the concept is beginning to make less sense.


I think it's interesting that for the past 50+ years we have been having an ongoing philosophical debate about the ethics of genetic manipulation. The warning stories are abound in sci-fi, and though we have the tools to do so, we use extreme restraint in their research.

Yet we haven't had such discussions in computer science about the use of AI. While other fields wrestle with the ethics of doing things, we have had no such discussions. (Due to the nature of our education we are blind to the demands of ethics.) We have made the tools so easy to use, and so accessible, that even if we should discuss the implications of said tools the cat is already out of the bag.

Lately, I've been thinking about that poem "First they came..." by Martin Niemöller and wondering if now is the time to heed its warning.

    First they came for the socialists, and I did not speak out—
         Because I was not a socialist.

    Then they came for the trade unionists, and I did not speak out—
         Because I was not a trade unionist.

    Then they came for the Jews, and I did not speak out—
         Because I was not a Jew.

    Then they came for me—and there was no one left to speak for me.


There are a lot of science fiction stories warning about AI, though they are about actual AI and not machine learning. Personally I think it is literally impossible to make AI from conventional computer hardware. Anything you could ever do as a computer scientist can only ever result in more sophisticated, but not sentient, algorithms.


True, we do have warnings about human level AI... but we never really cover the Machine Learning is gonna take all y'all's jobs.


I've been musing for several years that there doesn't seem to be an organized "philosophy of computer science".


I agree, we need some kind of organized thoughts on computing. Heck just a straight up ethics course would be a huge improvement. I didn't realize until after I graduated that most of the hard sciences require undergrad ethics, and if you do any graduate work you will take more ethics courses.


I think people are being too extreme in their thinking around AI art.

Yes, it's legal to learn from and imitate art styles.

Sure, AI doing the same thing is the same thing.

But if I actually copy someone's IP (e.g. characters), I am infringing. You can do that with AI. And if anything, it's easier to accidentally infringe due to how the AI works. So I think AI will end up coming with more required diligence.

This doesn't even touch on how art will still be driven by taste and skill. No matter how much tech types wish art wasn't actually a skill (is it out of envy? idk. But there's a strong HN voice acting like art skill isn't actual skill).


I feel for her, but there’s nothing inherently wrong with this and there is no going back.

These are original derivatives. In the same way a human is inspired by other artists and may mimic their style, machines can too; they're just way more efficient. What are they going to do? It feels unfair to punish the machine simply for being more efficient than humans.

Instead, artists need to find a novel way to generate revenue by leveraging their uniqueness. Perhaps using Stable Diffusion themselves and turning their creations into NFTs to prove their authenticity.


The journo just can't help itself and leads with hostile language right from the start. Hopefully it will soon be made obsolete.


It's simple: if you don't respect other's intellectual property, expect that no one respects yours, and if you give away for free what others sell for a living just because you have another source of income, expect the latter to be taken away from you too.

Truly it's a sad and ugly state of affairs these days.


So with these generative models running rampant, what's to even motivate aspiring artists to develop and hone their craft, if their years of work can be copied so easily?

Maybe it doesn't practically matter, because some art-style generative model can be developed and fed into the diffusion model so it can generate art in new styles.


> So with these generative models running rampant, what's to even motivate aspiring artists to develop and hone their craft, if their years of work can be copied so easily?

…because they like to make art?

The idea of making a living as an artist has only been possible for a very small fraction of human history, yet we have always had art.

We are in the process of reverting to the norm. The world will never again be like it was in the post industrialization and pre information era.


In this case the AI guy made public which images he used.

However, if someone wants, they can simply never publicize what the training set was, copy other artists and always plausibly deny they used any unlicensed input data, were they to get into legal trouble.

I don't really see a way around this tbh.


Can styles be copyrighted? So for example, if a human artist makes a novel work in the style of another artist, is there any legal grounds for claiming copyright?

Assume that the specific characters, names, etc aren’t being used. Think cartoon in the style of Disney, but not Mickey Mouse.


I feel for her; this is a new frontier, and we don't have the legal framework or the cultural norms to handle it appropriately yet. As a developer, I can see doing something like this because "it's cool" and not considering the impact on other people.


Maybe I'm oversimplifying but the question is: "does an artist _own_ a style?"

To me the answer is no.


I agree with your answer, but not that your question is the question. Some that I think are closer:

Does an artist own their name? Yes. I can’t publish a work of art and say “authored by savant_penguin”.

Can I sell art with product name “in the style of savant_penguin”? I think this varies by jurisdiction; it’s not going to be legal in the EU I think. It might well be in the US.

Edit: fleshed out more thoroughly by someone else already: https://news.ycombinator.com/item?id=33423857


Won't recordings of music put live performers out of business?

...I say this facetiously, but I think the comparison is fair. At this point it is so cheap and easy and useful to do that it feels impossible to stop.

$2 and an internet connection. When you are at that point, the game is lost.


Kind of off topic, but it would be a fascinating experiment if you could take all the art that influenced a real artist, train an AI on it, and compare to the artist's actual output.

Do humans have original ideas or are we synthesizers of what we take in?


I think the only disputable thing is using a person's name in the model name.

If the model's name had been Funny Cartoon Creator instead of X Y, then there wouldn't be any reason for outrage and vitriolic articles.


Call me Luddite but this has gone too far, please stop it already.


How can it be stopped? Seriously, is there any way to put the open source genie back into the bottle?


Training data should absolutely be under control of the creator / owner of the IP, and AI trained with stolen IP should itself be considered stolen. The lack of clarity here is only because of the newness of the technology, but this will be as clear to our future selves as "warez" are now known to be stolen work.


How completely unsurprising - a person who grew up in a country most famous for chop yo dolla gives two fucks about the context of achieving his personal desires at the expense of those actually willing to do the work. It’s his own goddamn narrative. I hope Disney bankrupts him as a consequence to his ignorant and selfish actions.


> Aside from the IP issues, it’s absolutely going to be used by bad actors: models fine-tuned on images of exes, co-workers, and, of course, popular targets of online harassment campaigns. Combining those with any of the emerging NSFW models trained on large corpuses of porn is a disturbing inevitability.

I'm wondering when the destruction of the traditional internet and trust networks, à la Neal Stephenson's "Fall; or, Dodge in Hell", is brought about. In the book (spoilers, I suppose), an engineer develops a set of tools to cheaply and easily generate an internet hoax; the example the engineer uses to demonstrate the technology is a fake nuking of a town in, if I remember correctly, Ohio. Through sockpuppeting and a few choice hirings of actors (and just a teensy bit of image and video manipulation), he's able to convince the world for about a day that the USA has lost a remote town to a nuclear strike, and in fact it remains a conspiracy theory for decades after whether the town actually was nuked or not.

The point that I'm thinking about now though is the engineer develops these tools a bit further, then, after (with consent) targeting them wholesale on a single person, releases them to the world, which basically completely destroys the modern internet as misinformation and information become perfectly interchangeable.

A lot of reviews leave out how important this plot point was for the development of the various technologies that feed the main plot of the novel, that being a mind-upload sci-fi: namely, technology around anonymous distributed-ledger identification based on traits only an AI can accurately tie to a person (I imagined it as, like, millions of subtle characteristics, such as how someone used their mouse, the pauses between typing certain characters, etc.).

Long story short, it becomes impossible to trust anything viewed directly on the internet without extensive editor (in the publication sense of the word) support.


So? It’s not a copy of her work, just her style. I see no problems here.


1. I like this style of work. I cannot make this style of work. I will pay this artist to produce this style of work.

2. I like this style of work. I cannot make this style of work. I will not pay the artist. I will instead take copyrighted material, and process it into a result using a machine that is not able to ethically refuse.

2.1. I like this style of work. I cannot make this style of work. I will not pay the artist. I will use this model someone else made to produce this style of work.


Calling it "her" style is part of the problem. "The style she uses" would be more appropriate.

ofc that takes longer to say so it will never happen.


I think the reason this is creating so much angst is that it's not really just about the future of digital art as a career. It doesn't take much imagination to see this as the vanguard of what will be coming for all jobs that involve manipulating bits on a computer, which is a huge class of jobs these days, including most of the jobs done by participants on this site.

The rapid development of AI may mean the near-completion of the hollowing out of the middle class. On one side the capitalists who own the machines and data that can do all the knowledge work, on the other the mass of workers trying to scrape by in menial service jobs. What will remain in the middle will be the craft trades that require dealing with complex real world physical problems and thus are still extremely hard to automate (plumbers, electricians, etc.).


Before this article, I had no idea that a Hollie Mengert existed.

In a different light, this could be a new market of people the artist can cater to. A flesh & blood original could hold sentimental value, with the AI being an ok sneak preview.


Thing I don't get is that people post on the public internet and then somehow expect that information to be protected from uses they don't want. If you don't want your data misused, don't post it on the internet.


Well, this is a weird argument. There are lots of things that are posted on the internet that are illegal to re-post elsewhere without attribution/payment. That's, like... all of internet copyright law.


Where did she post the images she created?


There'll be AIs that generate new styles eventually.


Interestingly, she doesn't mind the fact a model was created so much as the commandeering of her name and identity associated to it. Literally everything it produces is associated to her ... whether it's beautiful pictures or CSAM.

In this respect I'm curious what right people have in general to object if someone falsely associates your name with something. It's not even an AI question really. If I create a different kind of morally objectionable software under the name "Elon Musk" and represent that it embodies his style and personality, can he object to that? If I run a brothel under his name? At what point are natural names effectively implicit trademarks where the person inherits some kind of legal right to control or object about how it's used?


Not a single mention of Greg Rutkowski in this thread?


Is it just me, or does her style look like Disney's style?

Yes I do see that she has drawn Disney characters.


I'm an AI researcher working on generative modeling, so I want to note my bias upfront.

But I'm also a musician/artist and so I find some of these conversations odd. The problem with them I see is that they are oversimplified. To get better at drawing I often copy other works. Or I'll play a piece exactly as intended. Then I get more advanced and learn a style of someone I admire and appreciate. Then after that comes my own flair.

So I ask, what is different between me doing it and a machine? The specific images shown in this article show that Hollie is doing the same process as me and the machine. Her work is in fact derivative. There's absolutely nothing wrong with that either. They say a good artist copies but a great artist steals. I don't think these generators are great artists, but they sure are good ones.

I learn by looking at a lot of art and styles. I learn by copying. I am able to be more efficient than the machine because I can understand nuance and theory, but the machine is able to do the former much better than I can. It can practice its drawings a hundred or thousand times an hour.

Now, there is highly unethical stuff actually going on that I don't want the conversation getting distracted from. Just today a Twitter account posted a demo of their work and actively demonstrated that one can remove watermarks with their tool[0]. This is bold and borderline illegal (promoting theft). There are also people presenting AI-generated work as human digital paintings (we need to be honest about the tools we use). Presenting work in ways it was not actually created is unethical. But there are other generative ethical concerns too.

Now there are concerns about photographer's/artist's rights. If I take someone else's work and post it as my own, that is straight up theft. Even celebrities can't post photos of themselves that were taken by others[1]. This gets muddled if I make minor changes but it's been held up in court that the intention matters and it needs to be clear that the changes were new in an artistic manner and not a means of circumventing ownership rights. These are some of the bigger issues we're running into.

A problem with these generative models is interpretation. How do you know if the image you produced actually exists in the wild or if it is new and unique? There have been papers showing that there are privacy concerns[2] and that you can pull ground-truth images out of the generator. I'd argue that this question is easier to answer the more explicit the density your model calculates. Meaning that this is very hard for implicit density models (such as GANs), moderately difficult for approximate density models (such as diffusion and VAEs), but not too bad when pulling from explicit density models (such as autoregressive or flow-based models).

This is a concern that is implicit in articles such as this, but they fail to actually quantify the problem: "How do we meaningfully differentiate generated images from those made by real people?" I'm a strong advocate for the study and research of explicit density models, but a good chunk of the community is against me (they aren't currently anywhere near as powerful, but there's nothing theoretically in the way; I'd argue it's that few people are researching and understanding these models, and there is a higher barrier to entry). So I'd argue that the training methods aren't the major concern, but what is actually produced. While the generators learn in a similar fashion to me, it is clear that I'd get in trouble if I were passing off a fake Picasso as a legitimate one. But it is also fine for me to paint something in that same style as long as I'm honest about it.
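The explicit-density audit idea can be sketched concretely. The following is a toy illustration only, not any real model: a tiny autoregressive factorization over four binary pixels, with made-up conditional probabilities, showing how an exact log-likelihood lets you score any image and flag suspiciously likely generations for a memorization check.

```python
import math

# Toy autoregressive "model" over 4-pixel binary images (hypothetical
# numbers, for illustration only). An explicit-density model exposes an
# exact log p(x) = sum_i log p(x_i | x_<i), so any image can be scored.
COND_P1 = [0.5, 0.7, 0.6, 0.8]  # p(x_i = 1); conditioning elided for brevity

def log_likelihood(image):
    """Exact log-probability of an image under the toy factorization."""
    return sum(
        math.log(p if pixel == 1 else 1.0 - p)
        for pixel, p in zip(image, COND_P1)
    )

def flag_possible_memorization(samples, threshold):
    """Return samples whose exact likelihood exceeds a chosen threshold.

    With implicit-density models (e.g. GANs) this check is unavailable,
    which is the argument for preferring explicit-density families when
    auditing a generator for regurgitated training data.
    """
    return [s for s in samples if log_likelihood(s) > threshold]
```

For example, `log_likelihood([1, 1, 1, 1])` works out to log 0.5 + log 0.7 + log 0.6 + log 0.8 ≈ -1.78; generations scoring far above the bulk of the model's samples would then warrant a nearest-neighbor comparison against the training set.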

The nuance here really matters and I think we need to not lose sight of that. This is a complex topic and I would like to hear other views. But I'm not interested in mic drops or unmovable positions. I don't think anyone has the right answer here and to solve it we must get a lot of different view points. So do you agree or disagree with me? I especially want to hear from the latter.

[0] https://twitter.com/ai_fast_track/status/1587475575479959559

[1] https://collenip.com/taylor-swift-entitled-say-photographers...

[2] https://arxiv.org/abs/2107.06018


>But I'm also a musician/artist and so I find some of these conversations odd. The problem with them I see is that they are oversimplified. To get better at drawing I often copy other works. Or I'll play a piece exactly as intended. Then I get more advanced and learn a style of someone I admire and appreciate. Then after that comes my own flair.

>So I ask, what is different between me doing it and a machine?

You are a human. If you practice art as a hobby you can feel pleasure doing it, or you can get informal value out of the practice (there is social value in showing and sharing hobbies and works with friends). One could try to formalize that value and make a profession out of it, get livelihood selling it.

When all that "machinery" to (learn to) produce artistic works was sitting inside human skulls and was difficult to train, the benefits fell to the humans.

When a machine can automate it easily and cheaply ... the benefits go to the owner of the machine.

Now, I don't personally know if the genie can be put back into bottle with any legal framework that wouldn't be monstrous in some other way. However, ethically it is quite clear to me there is a possibility the artists / illustrators are going to get a very bad deal out of this, which could be a moral wrong. This would be a reason to think up the legal and conceptual framework that tries to make it not ... as wrong as it could be.

It could be that we end up with human art as a prestige good (which it already is). That wouldn't be nice, because the power-law dynamics of popularity already benefit very few prestige artists. So it could get worse. But could we end up with a Wall-E world where there is no reason for anyone to learn to draw well? When a kid asks "draw me a rabbit", they won't ask any of the humans around; they'll ask the machine. The machine can produce a much prettier rabbit, immediately and tailored to their taste.


> One could try to formalize that value and make a profession out of it, get livelihood selling it.

I really think it is bad to frame this as being about profits. I mean, in my case I am doing it purely for the pursuit of pleasure. I'd argue that these models allow more people to do so, as they lower the barrier to creating quality work. They can also be used as a great tool for practice, as you can generate ideas and then copy and/or edit them. They are also a great tool for quick iteration as you explore what certain ideas and concepts might look like. But I do not believe they are anywhere near ready to be a replacement for humans. Especially since they are highly limited in their creativity (something not discussed).

Also, I want to add that these methods allow for new types of art that didn't exist before. There are artists working and exploring this path. Questioning how these tools can be used to modify things or create things to be modified.

> When it is a machine that can easily and cheaply automate ... the benefits are due to the owner of machine.

In what way? If you are not paying for the system and it is freely handed over, why is it not "you" who is benefiting? I would understand this comment if the benefit was behind a paywall (e.g. Dall-E) but it isn't (e.g. Stable Diffusion).

> Now, I don't personally know if the genie can be put back into bottle with any legal framework that wouldn't be monstrous in some other way. However, ethically it is quite clear to me there is a possibility the artists / illustrators are going to get a very bad deal out of this, which could be a moral wrong. This would be a reason to think up the legal and conceptual framework that tries to make it not ... as wrong as it could be.

I guess part of the issue I have with this is that it sounds a lot like the arguments made when digital art itself was beginning. How do we differentiate the "I hate it because it's new" from the "I hate it because it is unethical?" This is not so obvious to me to be honest, because one can think the former and say the latter. I am not going to shy away from the fact that transitionary periods can be rough, but I'm not convinced it is going to kill artists' livelihoods. Especially since there is a lot of effort that is still needed to produce high quality images.

I think this might be a point where people working on these machines (like me) vs. the people who aren't (maybe you? idk) have different biases. All day I see a ton of crap come out of these. But if you just paid attention to articles like this or Twitter you'd think they are far more powerful than they are. These selected images are being created by expert artists too, who deeply understand aesthetics and the prompt engineering required to make high quality work. Maybe we'll get there, and that makes the point moot, but I'd argue that we're still pretty far away from that. I don't think this is going to kill off professional artists by any measure (especially because this exclusively affects the digital media domain and no other form) but it may make the barrier to entry slightly higher (it also might help artists become more creative, as you can quickly explore ideas).

> The machine can produce a much prettier rabbit, immediately and tailored to their taste.

Actually I'd argue the opposite. While this may be true for your average person making a drawing I still have significant doubts that the machines will be able to create better results than professional artists within the next 5-10 years (plenty would bet against me though, so that's fair). I also think there's issues with the diversity of these images and that it can't be resolved simply by adding more data (the "scaling" paradigm). I think they will rather reinforce that certain things look a specific way. Especially since these models do not understand a lot of basic concepts that we humans take for granted (a fundamental problem in AI: causal understanding). I don't think these issues are insurmountable, but they are a lot harder than many give credit for.

But I do want to make it clear that while we disagree I respect your position and still do think you bring up some good points. But I do think we also have fundamentally different vantage points. Which probably makes it a good thing that we're actually discussing this together and not in our respective bubbles.


Software is next.


Is AI the problem here, or is this story a symptom of a bigger problem?

This is a somewhat long comment, but just thought I would share my thoughts.

The concern is that someone who has worked hard to create art wants to leverage it to obtain security in their life, and is at risk of losing that leverage to new AI technology. Well, first of all, plenty of people work hard their whole lives for low compensation because their skills are not unique, and often don't have the time/capital/energy to build new skills even if they wanted to.

I question the way we live our lives. There is a lot of suffering in the world because we are focused on taking from each other (in some cases using violence, in other cases laws). In the first world most people are expected to spend their childhood preparing to work away their adult life earning minimum compensation enriching someone else, and in the third world most people are in extreme poverty, which the first world is happy to exploit for labor, resource extraction, brain drain, strategic military advantage, etc.

Artificial intelligence is about to make a lot of things that required difficult and unique training trivial. Making many more people redundant, and no longer unique/needed even if we do end up creating copyright laws enforcing that nobody can use someone's work to train the AI. Do we really think that we won't become redundant just because it doesn't train on our art/code? So like the artist here, we should all be wondering what is to come.

I think the reason people are so worried is because we have it in our heads that the way to conduct trade, and ourselves, is by extracting value from one another. Instead of a culture of giving gifts without guarantee of reward, fostering relationships and caring for and understanding each other, we are trying to take from each other to ensure our needs/wants are met. The ultimate way to achieve security under this way of life is to create a dependence: become the owner of the assets/capital that can be used to take value/labor from others who want/depend on what we have.

We act like we have only a few choices when it comes to how to live: religion, capitalism, communism, socialism, and so on. But we have the choice to try to understand the people we meet in life, and reach out to those we don't know. Knowing each other's needs/wants, we can help each other out. We have the ability to be generous when we can, and to foster relationships, so that in our time of need we might be helped by someone who cares (because we're there for them too). It's just that we listen to the advertisements telling us that if we want to be happy, we should buy more and more things (possibly to the detriment of our own health/well-being); we listen to leaders who would have us go to war with each other; we listen to society telling us to focus on making more money, exchange labor for wealth, and use the wealth to obtain security/happiness.

What if instead we believed the most important thing was to live with balance with each other and with nature, and to communicate with each other to see how we can help. We do after all live in this world together, and when there is an imbalance we see the effects like poverty, stress/anxiety, addiction, theft, violence, wars, exploitation, hate, not to mention the obscene amount of time we spend working (40+hr/wk for 40+ years) to enrich someone that doesn't care about us.

Are we not living in a backwards way? Most people are dependent on a few, by culture, or by force (think of how this force is created - if you don't obey those with power/authority, someone else who does will threaten you with violence or revoke privileges). If someone breaks a law, or threatens our livelihood, we'll just take away their livelihood (or what little they had to begin with) - how well has this worked out?

With advanced technology to communicate more easily than ever before, the ability to provide what we need/want with much less labor/skill than before, and the wealth of knowledge available, it's time to realize that by giving to each other and helping one another, we will all be better off. Well, maybe not the extremely wealthy and powerful people - but even they don't get to live in peace, because someone is always trying to take their spot - think about all the wars fought because one authority is threatened by another, or all the companies trying to hold their grip in the face of more desperate, innovative companies.

I'm not saying doing something different is easy, or straightforward, but it might be better. I think we don't need to ask anyone's permission to do it, we just start doing it within our own networks - building up each other, making new relationships, looking out for one another.

If you want your security to come from laws and by taking instead of giving, how long that will last, just how secure is that anyways?


So she is claiming the flat corporate style, but with more saturation, as her own? That's ballsy. It's a style for sure, but one based on looking oversimplified and generic, IMO. No surprise it's easy to emulate, and a common averaging result. An actually distinct style, unless massively popular, would be less likely to have this issue. That said, the greater question is no less valid: what elements of style do we own, and how will ownership manifest moving forward, when many new works use AI images as a starting point, which are in turn based on other artists' work, some of it itself AI-generated? Will it become a feedback loop sooner or later, and what will that look like?


DALL-E and Stable Diffusion are very good at replicating the distinctive styles of famous artists like Van Gogh or Seurat or anyone else you can think of, as well as other living artists: https://www.technologyreview.com/2022/09/16/1059598/this-art...

This is not just about styles that you personally don't like. It will work for any style with a lot of available existing work, popularity is irrelevant.


Go take a look at her portfolio: https://holliemengert.com/

It's clear to me that she has a very distinctive style of her own. It's not "flat corporate style" at all.


That was not my takeaway:

>… the rendering, brushstrokes, and colors are the most surface-level area of art. I think what people will ultimately connect to in art is a lovable, relatable character. And I’m seeing AI struggling with that.”

So she doesn’t want her name associated with what are, in her eyes, bland illustrations that copy the style she has used for some big, well-known clients.


Homeboy in the article specifically trains AI on her work to get specific results that return when her name is used because he wants to be able to recreate her style.

Fans of her work alarmed by the closeness point out to her what's happened.

Qull as appointed gatekeeper of style for the internet: 'she has no discernable style'.


Systems like Copilot and Dall-E and so on turn their training data into anonymous common property. Your work becomes my work.

This may appeal to naive people (students, hippies, etc.), for whom socialist/communist ideas are attractive, but it's poison in the real world because it eliminates the reward system that motivates most creative work. People work hard for credit or respect, if they're not working for money.

Ask yourself, why does the MIT License (https://opensource.org/licenses/MIT) contain the following text?

  Copyright <YEAR> <COPYRIGHT HOLDER>

  The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

These systems are a mechanism that can regurgitate (digest, remix, emit) without attribution all of the world's open code and all of the world's art.

With these systems, you're giving everyone the ability to plagiarize everything, effortlessly and unknowingly. No skill, no effort, no time required. No awareness of the sources of the derivative work.

My work is now your work. Everyone and his 10-year old brother can "write" my code (and derivatives), without ever knowing I wrote it, without ever knowing I existed. Everyone can use my hard work, regurgitated anonymously, stripped of all credit, stripped of all attribution, stripped of all identity and ancestry and citation.

It's a new kind of use not known (or imagined?) when the copyright laws were written.

Training must be opt in, not opt out.

Every artist, every creative individual, must EXPLICITLY OPT IN to having their hard work regurgitated anonymously by Copilot or Dall-E or whatever.

If you want to donate your code or your painting or your music so it can easily be "written" or "painted", in whole or in part, by everyone else, without attribution, then go ahead and opt in. Most people aren't so totally selfless.

But if an author or artist does not EXPLICITLY OPT IN, you can't use their creative work to train these systems.

All these code/art washing systems, that absorb and mix and regurgitate the hard work of creative people must be strictly opt in.

I say this as a person who writes deep-learning parallel linear algebra kernels professionally.

We've crossed a line here.


For generative art, trademarking your name might help prevent people from using it in prompts, but for general copyright, where does the line stand between someone casually publishing every color in the rainbow, every note combination, every letter in the alphabet, and claiming anyone else is infringing on their copyright?

If someone copies your thesis, abstract, or poem word for word, that is a clear violation of your IP, but we are all remixing words that everyone uses, colors, brush strokes, API terms, programming language keywords, and notes. Copyright law has the fair use doctrine, and transformative use is explicitly permitted to allow iteration. There is some level of granularity that is essential to creativity - otherwise one entity could copyright all possible combinations and prevent any creativity from happening legally. If AI goes below that threshold, all of humanity has a chance to iterate far faster and find new spaces and fill new needs for everyone. Humans have long been able to draw in the style of Picasso or Monet. A program doing it is not infringement, just much faster iteration.


> A program doing it is not infringement, just much faster iteration.

"Much faster" is absolutely relevant, morally and legally. Visiting a website a bunch of times is not illegal, programmatically DDoSing it is. Having a private conversation with someone and writing down what they said afterwards is not illegal, but recording the conversation and perfectly reproducing it without their permission often is. Shouting at someone in public is generally ok, having a drone follow them around anytime they're in public playing a recording of whatever you shouted is probably not ok.

Computers are not people. Just because it's ok for a person to do something, doesn't mean it's ok to have a computer do the same thing a billion times per second.


DDoSing is bad not because of the speed but because it overwhelms the infrastructure a product is designed for. Doing it to your own computer by making it crunch AI models until it runs out of memory is perfectly legal and iterative. Printing pages of a book in seconds vs dedicating lives of people to hand draw each letter in the monastery is iteration.

Computers are not people. Computers are iteration tools people use to free up the precious lifetime they have and bring more value to the world. If you are a human who trains 10,000 hours to invest like Paul Graham, draw like Thomas Kinkade, play the piano, or operate as a top brain surgeon, you have spent a fraction of your life to do this fast and reap the rewards. But that fraction of your life carries a tremendous cost to society. Many people paid with their time and money to feed you, teach you, and house you during that time and during your upbringing, which allowed you to dedicate those 10k hours to this task. Now all of that work can be used by you to do exponentially more with your precious life. Instead of spending days or years making a portrait, you'd spend seconds. Now you can find higher purpose and solve much bigger problems - instead of asking for 100 to hand-draw a portrait for a few hundred people in your lifetime, you could create one for every teen who needs a boost in self-esteem, raising their confidence and ability to cope with challenges at massive scale.

More importantly, there is a huge scarcity of people trained to fulfill each niche need, and that scarcity forms a bottleneck on society's capacity. Imagine if, instead of airplanes, we counted on a few trained supermen to carry by hand the people who needed to cross distances fast. How many people would die before they saw the world or were taken to a doctor? The world can't survive on superheroes or super-trained people. The world can do more with the time and lives of the people in it.


I disagree but this is a reasonable perspective. One specific point:

> instead of asking for 100 to hand draw a portrait for a few hundred people in your lifetime, you could create one for every teen who needs a boost in their self esteem and raise their confidence and ability to cope with challenges in their life at massive scale.

This is a misunderstanding that reminds me of those startups that were like "we realized people love getting hand-written cards, so we built a product to learn your handwriting and generate them for you!" No, the effort is the point. For people who like those cards it's not about the aesthetics of handwritten text, it's about knowing somebody dedicated some of their limited time to you personally. A depressed person is not going to be cheered up by an auto-generated portrait, even if it's indistinguishable from one a human artist spent 12 hours on. You can't "scale" human connection like this, unless you hide the fact that robots are involved.

I'm not saying all technology is bad. I think robo-surgeons would be great if they can save more lives, even if they put human surgeons out of work. In this particular domain, right now it seems like these tools have the potential to discourage future generations of artists, which would be self-defeating because the models are not AI and will stagnate without additional training data. I don't think they should be banned, but I think we should take human artists' concerns seriously, not co-opt someone's artistic identity if they ask us not to, and try to make sure we think about the unintended consequences of a powerful new tool.


People who love to walk will continue to walk even when there are bikes, cars, airplanes, self driving tech, and teleportation etc, available to them. AI art does not discourage artists any more than restaurants discourage home cooks who enjoy cooking. In all of those scenarios the tech caters to people who are in need of a task done and not in desire to spend their life on the craft.

There will always be unintended consequences and some will be severe. But what is happening now is pent up demand that finally found an outlet - like a bunch of high pressure mountain water that found a hole through a cave wall and into the ocean - it is gushing.

Instead of blind fear of change, I try to see the value previously unseen, and it is tremendous. As the creator in the article said, she does not see her true art in the Stable Diffusion creations: the eyes that convey each character's personality, the poses that show confidence, inquiry, or passion. Instead she sees images that mimic her style of drawing.

I do see how someone might choose not to become an artist for a living because AI art becomes so ubiquitous that they could never make a living with a paintbrush. BUT, with 80-90% of desired art generated by AI, there will be huge demand for skilled artists who can take a generated image the rest of the way to the desired result. I trust that human ambition and taste always expand past superhuman capabilities. The artists of tomorrow will have much different brushes than those of yesterday, and will be far better and more productive. There will likely be 3D art in the real world and universes to explore in the virtual one. I'm more concerned that we are running out of space to store our contraptions. Data storage manufacturers will be thriving.


That is a fake argument. It has been proven wrong on every one of these articles, yet pro-stealing people like yourself keep posting it. You can't copyright such works, only an 'installation' of those works. If you want to talk about copyright, maybe educate yourself on copyright. It's Title 17. https://www.copyright.gov/title17/


Which argument, which articles - be specific. AI content is not copyrightable at this stage. Drawing styles are not copyrightable. Name calling and labeling are ad-hominem attacks however and that has no place on HN.


Edit: First and foremost, I'm sorry you felt attacked. That is never OK. I need to step back from posting anything until I can be a decent human being. I regret not doing that before I posted.

----- People claiming you can copyright a canvas that is just a color (when you can't; you can only copyright the art installation/display). People claiming that you can copyright the alphabet (when you can't). It's just frustrating that HN wants to keep having this discussion with ZERO basis in actual copyright law, with people making factually inaccurate claims as if they understand it. I had the same issue on a criminal law post: someone posted completely inaccurate information regarding Title 18 statutes, and HN blocked me from responding in a timely manner while people were reading the post. This just isn't a forum for informed discussion, I guess, but for people's gut feelings and what they THINK copyright law is. Opinions are great, and needed, but established law is a real thing and should be part of a discussion that at its core is about copyright.


Look at the inputs and outputs of generative AI art - specifically the ones shown in the post and elsewhere. The disconnect here is not our insights on copyright law (at least not for generative art; maybe for Copilot, code, and GPT-3, but not for the art).

Hollie is a character artist. A character is a combination of appearance, emotional radiance, mood, and behavior that have to fit together to create a unique persona and trigger an emotional response from viewers. Stable Diffusion does not take or copy her characters - it mimics her drawing style in colors and hues (it learns which pixels tend to be proximate to which other pixels, and tries to apply that proximity to similar colors and line patterns in the future). When you create a Stable Diffusion image, you set parameters for how much randomness to apply, how much variety, and what else to mix in. So any content it's trained on becomes a soup of colors, lines, hues, and saturation. If Hollie draws with desaturated colors and favors pinks and greens, an image produced with her name will probably have those traits, but mixed into a soup with whatever objects the human behind the prompt asked for.

Code generation is different, as code has to appear in some working form, and that happens with a lot less variance, so generators there would likely remix a LOT more of the original. Images have much wider variance unless you ask for as many specific keywords as possible (e.g. "Mona Lisa by Leonardo da Vinci"), in which case you'd probably fall under trademark law as well (which is also why Microsoft added trademarked images to their CDs: to make prosecuting infringers easier).
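The "how much randomness to apply" parameter the comment describes can be illustrated with a toy numpy sketch of the forward-noising idea behind img2img "strength". This is a deliberately simplified illustration, not Stable Diffusion's actual noise schedule or latent-space math; `add_noise` and the strength values are invented for the demo.

```python
import numpy as np

def add_noise(image, strength, rng):
    """Toy forward-diffusion step: blend the source with Gaussian noise.

    strength=0 keeps the source untouched; strength=1 replaces it
    entirely with noise, so nothing of the original survives for the
    denoiser to reconstruct.
    """
    noise = rng.standard_normal(image.shape)
    return np.sqrt(1.0 - strength) * image + np.sqrt(strength) * noise

rng = np.random.default_rng(0)
image = rng.uniform(-1.0, 1.0, size=(64, 64))  # stand-in for a real picture

# The higher the strength, the less the noised latent correlates with the
# source -- which is why generation is a statistical remix, not a copy.
for s in (0.1, 0.5, 0.9):
    noised = add_noise(image, s, np.random.default_rng(1))
    corr = np.corrcoef(image.ravel(), noised.ravel())[0, 1]
    print(f"strength={s:.1f}  correlation with source={corr:.2f}")
```

At high strength the starting point is mostly noise, which is the intuition behind outputs inheriting statistics (palette, stroke texture) from the training set rather than pixels from any one image.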


I disagree that creative individuals have to do anything explicit here: copyright law is pretty clear that the burden of proof of right is on the copier, not the copied. I expect most artists won't be sending invoices for licensing fees just yet, but corps will surely bleed dry anyone who produces unlicensed derivative works that generate any income.


At the moment there's no legal protection for style in and of itself. Additionally, there may be (and should be) if this style-capture actually displaces the artists it's aping. But I don't see that happening. IMHO it's a tempest in a teacup.

Why? Because AI-generated "art" is a soupy mess, and real-life human artists can speak and understand colloquial language, work quickly, and develop new styles based on new direction.

But then again maybe we're looking at the death of a widespread industry like when gigantic industrial looms came on the scene, but I highly doubt it.

Then again, last of all, I do see a future where AIs generate full feature length photo-real movies in minutes based on prompts and cheaply.


Exactly - any coder or artist should learn from scratch, without any exposure whatsoever to existing code, works of art, or even ideas. Anything else is outright STEALING!

Excuse me while I make an apple pie from scratch.


Have you maybe possibly been exposed to the concept of an argumentative straw man? Feeding actual works of art into an approximation machine, and then expecting the output of said machine not to be owned by the author of the art, is making a big assumption, I think. There is the word "copy" in "copyright", and the model definitely got a copy of the original at the source. No matter the dilution, copyright is being breached, as I understand it.


That's a reductio ad absurdum at best. While you have a point, schools and even museums are generally compensated for providing this training data to the public - to look at it in an ML way.


Sure, I also had to pay for the books I studied from, but Dr. Tanenbaum is yet to knock at my door to assert copyright on all the code I have written.


Out of curiosity, how would you feel if someone fed your HN comment history into a ML model, then used that to respond on every HN topic and conversation under the username "othergpderetta"?


My HN history is public, so I wouldn't have a problem with training a model with it. I would have a problem with the model attempting to pass as my self of course.


Interesting argument, but it'd be outputting plausible-sounding word-salad garbage.


So exactly like my comments!!


This is a perfect example because, depending on the apples you're using, growing them may have required a license and adherence to licensing requirements.

https://mnhardy.umn.edu/apples/licensing https://provarmanagement.com/cosmic-crisp/


>Excuse me when I make an apple pie from scratch.

You’ll have to invent the universe first.


I expect a big bang anytime now!


As the artist in the article points out, the artwork in the model doesn't belong to her and by current legal standards she has no authority to give permission; of course the corporate owners do have authority, and I'm not even sure you need new laws to enforce the copyright complaint.

I was complaining about all of this when the derivation was based on "the internet" and everyone was being ripped off at once. All the AI-generated art out there is doing the same thing.

Of course most of this is being used to create derivations of trendy pop art, so are we really losing anything? Was there ever any hope for artistic capitalism as something that communicates in meaningful ways beyond the most local of scale?


You can get Copilot to regurgitate copyrighted code verbatim, but I haven't seen Stable Diffusion recreating copyrighted works yet, which is quite an important difference.


Wasn't it regularly spitting out whole watermarks?


AI acts more like an alien than a copier in that case. If you tell Stable Diffusion to make you clip art of people in a conference room, you will probably see humans who don't have noses or fingers, or have four arms, with some unreadable text that looks like a watermark going across them. The model has been trained on millions of clip-art images and assumes the image should have whatever statistically applies to most images with the same keywords. You can't make it copy an image, even if you tried. You can't even get it to fix the face or hands without additional processing with differently trained models. It sees as an alien would.


Was the image underneath watermarked, or did it just reproduce the watermark style over an unrelated image?


NFTs to the rescue.


Serious question: What is the difference between a human intelligence looking at a work and using concepts from it in their own, compared to an artificial intelligence doing it?

If the copyright violation occurred via the AI's inputs looking at the work, how is that different from an image of the work landing on a human's retinas?


This is so clearly a copyright violation that I am hoping for a big lawsuit soon so the courts can confirm it.



