Artists sue developers of Midjourney and Stable Diffusion, claiming copyright infringement

Originally published at: Artists sue developers of Midjourney and Stable Diffusion, claiming copyright infringement | Boing Boing




I know nothing about training AI, but wouldn’t it be cool if Midjourney/Stable Diffusion were trained using only stuff in the public domain, creating a distorted window into the past?


With all this in mind I wonder what would happen if someone asked Midjourney to create art in the style of Stable Diffusion.


It’s robots all the way down maaaan




This will be an interesting one to follow. A question I have (and I welcome comments on it): what is the difference between what the technologists are doing in the training versus an artist creating an artwork “in the style of” some other artist (which wouldn’t be copyright infringement)?


Alas, that looks like it’s a challenge based on copyright as well. If it had been a challenge based on Getty Images’ terms of service I could see it faring better. There’s no copyright infringement inherent in generating the model (so the Butterick suit is probably doomed), but depending on what terms the images were made available online under - e.g. not allowing them to be downloaded anywhere but the browser’s cache for viewing by a human - there could have been infringement in obtaining the images for analysis.


That’s an interesting question. I think it’s the difference between “I taught myself to paint like this other person” and “I took lessons from this other person and then refused to pay them for what they taught me.”

At least that’s how I see it.

That’s not a given yet. It’s a question that will need to be settled by a court. For example, in music, courts have ruled that sampling to produce a new product is infringement. AI art might very well go in that direction, too. They are not merely emulating your style, but literally sampling it to produce new versions of your work.


I think it’s apples and oranges; the software is just a tool that could be used to create images that potentially infringe on someone’s copyright. Whether it’s an artist using a digital painting program, or an end user feeding a prompt into an image generator, it’s a person creating an image that’s potentially substantially similar enough to someone else’s copyrighted work to be infringing. And from what I understand, except for some over-represented stock photo backgrounds that some versions of the models memorized in detail, it takes effort to coax a specific result out of the generator, so the generator might be no more at risk than a digital painting program in terms of facilitating copyright infringement.

The training is also, I think, completely different from what an artist would be doing. I’ve heard it described as “reverse image recognition”: instead of producing a description for an image, it produces an image from a description. An artist wouldn’t need a description of the artwork’s style or the name of its artist in order to mimic it, and s/he would only need a few examples of it. The model generator needs far more input to “learn” how each word should influence the output.


The issue with the music comparison is that samples actually are the original music. The AI generated art contains none of the art used to train the model. It contains things that were made to look similar, but not the actual art itself.

I think that might be the key issue in the case.

If they want to expand copyright to include “looked at it and made something similar” then that’s going to be a nuclear bomb in soooo many areas.

From what I can see they are looking to expand copyright to cover “training an AI”, which it currently doesn’t.

My suspicion is that this lawsuit will fail, but it will be a spark for legislators to start looking at the issue, because a lot of rights-holders are huge corporations.


So are we going to see Getty watermarks on images because the AI was trained with them present?


It literally contains the art, and sometimes folks have found artists’ actual signatures in these AI works. Again, I think the courts are going to have to weigh in on this, but to say it’s not a “copyright violation” is premature, and there are definitely grounds on which it might well be considered a violation. And the more big companies start weighing in with lawsuits, the more the courts and government will be pressured to make sure it IS considered a copyright violation.


It’s like a ship of Theseus, except the new ship neither contains any parts of the original ship nor follows the original design exactly. It’s like a shipbuilder with a decent memory building, at best, a loose replica from memory.


That’s the thing though; they’re not sampling it or producing new versions of anyone’s works. To put it in text terms (where I’m less likely to make a fool of myself in terminology), on a basic level it’s like taking an author’s bibliography and generating a frequency list of the words they use and the turns of phrase they like the most, then using that to tune the generated text so it sounds more like what that author would have written. But generating that frequency list doesn’t infringe the author’s copyright, even though it includes every word they’ve ever written (with the number of times they used it, to boot!)… the actual works were not copied, just individual uncopyrightable words.
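The frequency-list analogy above can be sketched in a few lines of Python. This is a toy illustration only (real text or image models do far more than count words); the “bibliography” here is made-up example text, and the point is just that the derived data structure holds individual word counts, not the original sentences:

```python
from collections import Counter
import re

def word_frequencies(texts):
    """Build a frequency list of the words used across an author's texts.

    Only individual words and their counts are stored -- none of the
    original sentences survive in the result, so the source works are
    not reproduced.
    """
    counter = Counter()
    for text in texts:
        counter.update(re.findall(r"[a-z']+", text.lower()))
    return counter

# Toy "bibliography": two short snippets standing in for an author's works
bibliography = [
    "It was the best of times, it was the worst of times.",
    "It was a bright cold day in April.",
]

freqs = word_frequencies(bibliography)
print(freqs.most_common(3))  # "it" and "was" top this tiny corpus
```

A generator tuned with such a list could be nudged toward the author’s vocabulary, yet the list itself contains no copyable passage of the original.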

Shutterstock, allegedly. I haven’t seen any examples myself, but I’ve heard that they sometimes show up mostly complete.

I’ve certainly heard of things that look like signatures showing up - and had them show up myself - but I’ve yet to see an example of a real person’s signature showing up. The model has been trained that some images include signature-like squigglies, so sometimes they are generated.


They literally are. The model is trained on their works. That’s called sampling in computer terms. EDIT: striking out the signature thing, I may well have been wrong about that.

Maybe the courts will rule it’s NOT a violation. Maybe. But I absolutely could see lawyers using music sampling as an example to help judges understand the technical aspects here. If the courts don’t weigh in, governments will, because this will soon be about big bucks, and where there’s money, the corporate vultures will circle.


Technically no, it’s not sampling because it doesn’t copy and paste the original works. It memorizes how to recreate the style or the object in an original work, which arguably would not be copyright infringement due to the idea/expression dichotomy.

Here’s a deeper explanation:


Arg. This argument again.

No, they haven’t found art with signatures. They have found art generated by a model that was trained on art with signatures, so it knows it has to put a scribble down at the bottom because that makes the output art meet the criteria better.

It literally does NOT contain the art. It doesn’t cut and paste. At all. That’s not how it works.

I’ve never seen an actual copied signature, and neither have you.

Interestingly, I think that will be the key to any copyright challenges like this.


The output for the prompt “the developers of midjourney running away from a process server” on Midjourney

@beschizza how many fingers did you prompt?



I don’t think the courts will… if action is demanded loudly enough, it will probably have to be via legislation.