The Three Artists Behind a Landmark Lawsuit Against AI Art Generators (buzzfeednews.com)
9 points by gorbachev on Jan 23, 2023 | 19 comments



" In an interview with Forbes, Midjourney CEO David Holz said that the dataset that the company trained its AI model on was “just a big scrape of the internet” and the company hadn’t sought consent from artists who owned the copyright to their work. “There isn’t really a way to get a hundred million images and know where they’re coming from,” Holz said."

So you just go ahead and steal copyrighted images without asking or attribution?

This attitude is far beyond the pale and I, for one, hope that the artists succeed with their lawsuit.


I think it might help if you understand how AI art works. They're not scraping images, they're scraping rules. Completely different thing.

https://i.redd.it/2f00l6vsso6a1.jpg
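To make the "scraping rules, not images" point concrete, here is a deliberately toy sketch (not any real diffusion architecture, and the "images" and statistics are made up for illustration): the training loop looks at each image once, nudges a small set of learned weights, and retains no copy of any input.

```python
import random

random.seed(0)

# Stand-in "images": 100 lists of 16 pixel intensities in [0, 1].
images = [[random.random() for _ in range(16)] for _ in range(100)]

# The entire learned state: four numbers, far too small to
# store any of the 100 inputs it was trained on.
weights = [0.0] * 4

def train_step(weights, image, lr=0.01):
    # Nudge each weight toward a crude statistic of the image
    # (here just its mean), analogous to a gradient update,
    # and return the new weights. The image itself is discarded.
    mean = sum(image) / len(image)
    return [w + lr * (mean - w) for w in weights]

for img in images:
    weights = train_step(weights, img)

print(len(weights))                  # 4 numbers of learned state...
print(sum(len(i) for i in images))   # ...derived from 1600 pixel values
```

The model file that ships is only `weights`; whether "statistics distilled from copyrighted inputs" counts as a copy is exactly what the lawsuit is about.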


Does that really matter if you can easily generate something that comes pretty close (at least in a superficial way) to the artist's work by just entering the artist's name with your description?


That's a really good question, and it's key to the issue at hand I think.

Any competent artist can create a lookalike style of another artist at will. It's just a matter of looking at the paint type, brushes, materials, strokes, composition, etc., and copying that.

In fact, here's Vincent Van Gogh copying other artists of his time - https://www.dailyartmagazine.com/van-gogh-copy/ .

How does this differ from a machine doing the output? Apart from speed and flexibility?


Van Gogh's studies of existing works of the time still maintain his 'style', his brush strokes, use of color, rendering of form. No matter how many times he copies another's work he's still Van Gogh and his work expresses and reflects who he 'is'. There is no existing model from which we can accurately imitate the human creative process, so it begs the question, what is the nature of the imagery generated?

Think of it like your handwriting and signature style. Even if it may appear similar to others', it's still uniquely identifiable as yours. If someone copies your writing style and signature to forge documents, it wouldn't become OK just because they forge new documents; in any case it would be immoral and illegal without your express consent.

That is the nature of algorithmically generated imagery, complex forgery, but instead of the forgery being of an individual, it's modeled from billions of images scraped from the internet. This is entirely different from the human creative process.


Go type in any of those artists and/or the names of their works in to both Google Images and Stable Diffusion.

Notice a difference?

The image that SD generates is going to be so far from similar, and so far from something anyone would choose instead of the plaintiffs' works, that it's comical.

Rather, people are using SD to create music groove synths, remove people from the backgrounds of images, turn their dogs into Dali paintings, and many other non-infringing uses (Sony v. Universal).

At some point VCRs, Google Images, Google Books and Stable Diffusion all make indiscriminate copies of many works. A factor that weighs heavily in determining fair use is how useful the tool is for non-infringing cases: in favor for Google Books, against for TorrentFire-things. Since SD is basically incapable of recreating any of the plaintiffs' works, and entirely capable of many things that have nothing to do with the plaintiffs' entire industry, this weighs heavily in favor of fair use.

The fact that the artists' names come back is because the name is associated with a visual style. For Sarah Andersen you can get better results for her style by using "quirky, black and white cartoon, striped sweater, big eyes"… and then get back grotesque cartoons and sift through trash, inpainting for hours… and then end up with an entirely original cartoon-looking thing that is noninfringing, because styles cannot be copyrighted. Neither can ideas! (Baker v. Selden)

Someone uploading a "Sarah Andersen SD Tuned Model" to HuggingFace could be committing trademark infringement or a common-law publicity violation, but that's already protected against and has nothing to do with making copies for the sake of training an algorithm.

I'm not sure it is currently illegal, and it should never be considered wrong, to have "Sarah Andersen is known for her quirky black and white cartoon characters with big eyes and black and white striped sweaters that she puts on daily calendars and sells on Amazon" be knowledge in the public domain, as this is a fact about the world.


You don't ask permission to look at an image that is publicly on display. AFAIK there is no law saying that machine learning bots are forbidden to look at an image that is publicly on display.


Does this apply to GPT too? They also scraped massive amounts of copyrighted text. The situation with art is super weird because the difference seems to be how threatened people feel, not whether they actually have a compelling argument. That's why there was a whole kerfuffle about Copilot: devs felt threatened.

When you pick people who aren’t in the situation where their livelihood depends on them holding a specific opinion on AI, like authors, the copyright issue suddenly doesn’t matter.


You might want to learn how the models work before making inflammatory comments.

To simplify the entire concept: Imagine you were an alien. You decide to work as an artist on earth. So you go on this thing the Humans call 'the internet', view millions of images & works of art, then start creating images & art pieces of your own based on the underlying rules you've gleaned from those millions of images you've viewed, without in any way keeping or saving a single copy of any of those images or artworks.

Would you call the above copyright infringement?

Or are you now claiming it's 'copyright infringement' if anything or anyone apart from a human being views an image?


Perhaps we are in new legal territory, where existing copyright laws and legal theory don't cover the issues completely?

I wonder how things will go if (or when) similar AI technology enters the realm of production industries, e.g. manufacturing or chemical. One can imagine that data on actual production plants and processes is obtained using drone cameras, pictures, schematics, textbooks, etc. This data is then analyzed and synthesized into improved production flows, using the same method, which basically claims (it seems to me): we did not use any of your actual production knowledge, we just let the AI learn the principles. None of the original IP is in the AI working data. No patents are violated, etc.

If this technology is sold to the manufacturing companies - which can choose to opt in or out by buying or not buying - there probably isn't a problem. If this technology is used to establish competing manufacturing companies, where the economics can be significantly improved, and thus drive the existing companies out of business, then what?

I think in the latter case, the AI company would be obliterated using lawsuits, without any thought given to the merit or legality of the AI tech. Or at least the AI company would be bought out with a "you can't say no" offer, and thus effectively shut down.

Now that I think of it, this seems to boil down to some philosophical issues where these thought experiments are needed and have value in order to work out what the issues really are. Perhaps an issue here is that there isn't much value placed on artists and works of art in general (with some economically strong exceptions like movies), but there is very strong protection of other industries - industries with vested interests, strong economies and monetary resources to defend against attackers.


Your analogy with company processes doesn't quite work, because patents work differently from copyright:

With copyright, you are allowed to use the original work if you transform it enough (basically, you can take inspiration from it, but not copy it exactly).

With patents, you infringe a patent when you use the work described in it without the license, even if it's just as an inspiration and if you transform the method. It also doesn't matter if you rediscovered the content of the patent independently from the inventor, even if you can prove it.

So in this case, the original manufacturing company can defend its IP even if its work has been learned then rediscovered by an AI.


Great points. The one thing I would respectfully disagree with is the idea that this stuff can be shut down by buyout, legal action or whatever. The fact that the technology and process is now 'out of the bottle' so to speak, means that it won't go away. Progress is progress, and whether it's open source or foreign countries as actors, means that love it or hate it, we're going to have to live with the consequences of our actions.

It's going to be fascinating to see how we deal with androids which integrate advanced AI and therefore perfectly mimic human beings. The mind boggles.


These are thought experiments. So other sensible viewpoints are only a contribution. I am myself also trying to work out where AI leaves us. Not just in this case, but because we will be hit by AI in the computer science field - hardware, software, everywhere - and potentially many places in general society.

I do not immediately see why strong market actors would not be able to close an attacker down in this way. I buy the argument that tech can be 'out of the bottle'. But this does not necessarily mean that it is feasible to use for market competition. It is often more likely that the tech is absorbed by existing strong market players. But as happens right now, you can influence markets, professions and people's behaviour by making free (no pay) market tests on a large enough scale. Which is interesting in itself.

I did business studies at university and we did a lot of work on market forces. The textbooks were full of examples of market situations where competitors attacked and were successfully fended off using legal action - both warranted and unwarranted - and by other means. And wiping out a competitor by buying it out is completely normal. I do not have much data on this, but I would think that a lot of acquihires are actually done as market defence, with the basic intention to close down competitive tech. Maybe the acquired tech, knowledge and brains are used sometimes, and then probably incorporated into existing tech, but I think this is not always the main purpose of the acquihire. It is probably counterintuitive to think like this for tech people (I am one myself). But business people think in market dynamics (and money) first. The tech can often be a surprisingly long way down the list of priorities - where we think it should be much higher up.

I have myself worked in companies that did a merger, where the merging entity was spun off by a parent company - in all cases a tech department that was no longer wanted by the parent company - and in the aftermath of the merger, the business entity was in reality obliterated by the new company, much more than it was merged or absorbed. Made a lot of people in the merged entity unhappy and rightfully so.


> Perhaps we are in new legal territory, where existing copyright laws and legal theory don't cover the issues completely?

Not only is there probably no law, there is no precedent in human history to define the morality of these tools. Until now it was universally accepted that people could see public art and learn from it. But here it's not a person, it's an algorithm.


Inflammatory comments should be welcome. New technologies, when not scrutinized, become tomorrows problems.

Just because the model does not save any imagery directly does not change the fact that the model cannot exist without copyrighted imagery. These diffusion models are being made by humans, not aliens, and should be judged as such.


> > " In an interview with Forbes, Midjourney CEO David Holz said that the dataset that the company trained its AI model on was “just a big scrape of the internet” and the company hadn’t sought consent from artists who owned the copyright to their work. “There isn’t really a way to get a hundred million images and know where they’re coming from,” Holz said."

> So you just go ahead and steal copyrighted images without asking or attribution?

It's never "stealing". It's copyright infringement, and only if it isn't fair use.

> This attitude is far beyond the pale and I, for one, hope that the artists succeed with their lawsuit.

The fact that the model is based on scanning a hundred million images in order to function, that fewer images would mean it does not function as well, and that it is practically impossible to ask for a hundred million permissive uses, is all highly relevant to judgements related to fair use.

And of course that there are many non-infringing uses factors in heavily.

In the case of Sony Corp. of America v. Universal City Studios, Inc. (1984), also known as the "Betamax case," the Supreme Court found that the sale of home video recording equipment (such as VCRs) was not copyright infringement, because the equipment had "commercially significant noninfringing uses." This principle, known as the "Betamax defense," was relied on by Google in the later case of Authors Guild, Inc. v. Google, Inc. (2015).

In the Google Books case, the court found that Google's scanning of millions of books to create a searchable database was a fair use of the copyrighted works under copyright law. The court cited the Betamax case in finding that Google's use of the books had a transformative purpose, and that the creation of a searchable database had significant public benefits, including the ability to find and identify books, and the ability to create new forms of research and scholarship. The court also found that Google's use did not harm the market for the original books, as it did not serve as a substitute for the original books and that the use of the copyrighted material was minimal.

Similarly, in the case of Stable Diffusion, it could be argued that the tool is being used for a transformative purpose, such as creating new and unique images (for inpainting new background imagery in place of removed people from vacation photos, or generating spectrographs of "70s disco rock", or converting a photo of a dog into a painting of a dog in the style of Dali), and that it is not harming the market for the original works. Additionally, the fact that Stable Diffusion is free for the public to use may further support the argument that it is fair use, as it does not compete with the original images and does not generate commercial gain for the defendants.


>> so it begs the question, what is the nature of the imagery generated?

Yep, exactly. That's the question we're all going to have to deal with as AI grows in popularity and utility.


I think the current state of things is the most sane approach. Copyright on the inputs doesn't matter, but copyright on the output does. If you use the tool to create a work that a court would consider copying, congrats, it's infringement. If not, you're good. Not sure? You're now in the same boat as other artists who can't remember if their idea is original or not.


If art or whatever can be reproduced in its original form as cheaply as possible,

then the next question becomes: is it still worth what it was worth before? Given that in economics, we know yesterday's price is not today's price.

At a certain point, if we can't distinguish works by AI vs. humans, then whatever gave those works their value has diminished. However, I will also argue that truly impressive works by humans will become more valuable over time than generic AI-generated works.

There's a reason we read books that are hundreds of years old.



