Nightshade: a new tool artists can use to "poison" AI models that scrape their online work

Ditto comparing horses to cars or solar panels to the way plants harness energy.

One may perform some version of a task provided by the other, but we certainly wouldn’t want to live in a world where the law or the culture at large didn’t see any fundamental distinctions between the two.

6 Likes

Given the enormous size of the training data sets, I wonder how many poisoned images you’d have to feed into the system before it started breaking down. I could see artists protecting their specific style by doing this, but even then, it’d only work if their work hadn’t already been scraped.

Adversarial attacks on “AI” image recognition, by way of subtle, non-obvious (to humans) changes to pixel values, have a long history. They’ve just never had much of a practical application until now.

“…networks are prone to error through attacks that confuse the model into making wrong predictions when small changes are introduced to the training dataset. The small perturbations are designed to have a significant effect on the model’s performance even when the change is not visible to us.”
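For the curious, the simplest version of that trick is a one-line gradient step. Here’s a minimal sketch in PyTorch of the classic fast gradient sign method (FGSM), assuming any pretrained torchvision classifier; Nightshade’s actual optimization is more elaborate and targets text-to-image training rather than a classifier’s predictions:

```python
import torch
import torch.nn.functional as F
import torchvision.models as models

# Minimal FGSM sketch: nudge every pixel by a tiny epsilon in the
# direction that most increases the loss. The change is invisible to a
# human but can flip the model's prediction.
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT).eval()

def fgsm_perturb(image, label, epsilon=2 / 255):
    """image: (1, 3, H, W) tensor in [0, 1]; label: (1,) class index.
    (ImageNet normalization omitted for brevity.)"""
    image = image.clone().requires_grad_(True)
    loss = F.cross_entropy(model(image), label)
    loss.backward()
    # One signed-gradient step, then clamp back to the valid pixel range.
    return (image + epsilon * image.grad.sign()).clamp(0, 1).detach()
```

The point is how little it takes: an epsilon of 2/255 per pixel is below what most people can perceive.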

7 Likes

The other interesting thing is that – really – the “AI” has probably consumed more work than I or any other individual human ever will. Humans don’t need data to generate art, they need experience. It’s fundamentally different, even with the “argument” people trot out about neurons.

12 Likes

Schitts Creek Yes GIF by CBC

But… but… but… Elon’s cyborbmonkeys are teh F4T4r3!!! why do you want to hurt the cyborbmonkeys and not let them do the creatives… WHY??? /s

These models are training ON THEIR WORK, without permission, so YES… hypothetical “AIs” that don’t exist do not have more rights than living human beings who do…


Seth Meyers Idk GIF by Late Night with Seth Meyers

It’s like some people around here have never read any Cory Doctorow, or at the very least, thought his works were guidebooks…

But of course, the currently existing AI and “neural implants” aren’t remotely like what we see in sci-fi…

10 Likes

Exactly. What if one could intercept sat images from space before transmission back to earth and change the locations of hundreds of tanks and missile launchers to extra trees, cows and farm tractors?

(Maybe already been done, just saying)

Could NS protect said imagery?

1 Like

I think in theory, this might work. It can even be tested out in PyTorch, but the problem is that there is so much secrecy around how the big LLMs are trained (possibly even for this very reason) that we don’t know whether the big models will fall for the same traps that a PyTorch model does. And even if they do, it would be hard to empirically test how well it works.
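For anyone who wants to try, the toy version of that PyTorch experiment is straightforward: train the same small model on clean data and on data with a fraction of labels flipped, then compare. This is a hypothetical sketch using the crudest form of poisoning (label flipping), nothing like Nightshade’s perturbations, but it shows the shape of the test:

```python
import torch
from torch import nn

def train_and_eval(x, y_clean, poison_frac=0.0, epochs=200):
    # "Poison" the training set by flipping a fraction of its labels.
    y = y_clean.clone()
    k = int(poison_frac * len(y))
    y[:k] = 1 - y[:k]
    model = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 2))
    opt = torch.optim.Adam(model.parameters(), lr=1e-2)
    for _ in range(epochs):
        opt.zero_grad()
        nn.functional.cross_entropy(model(x), y).backward()
        opt.step()
    # Score against the *clean* labels to measure the damage.
    return (model(x).argmax(dim=1) == y_clean).float().mean().item()

# Two linearly separable blobs; compare clean vs. 30%-poisoned training.
x = torch.randn(512, 2)
y = (x[:, 0] > 0).long()
print(train_and_eval(x, y, poison_frac=0.0))
print(train_and_eval(x, y, poison_frac=0.3))
```

Whether results on a toy MLP transfer to a closed, billion-image diffusion pipeline is exactly the open question.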

2 Likes

The full paper is here if you want the details, but the tl;dr is that while a complete data set is large, some specific prompt concepts are represented by few enough images to be susceptible to a small number of poisoned training images.

While not exactly what they’re going for, I’d imagine using a technique like this could be quite effective at frustrating efforts to train LoRA models to match a specific target.

This doesn’t seem feasible to implement at scale, however. If future training pipelines include tools that can check and correct this kind of adversarial input, you’d have an ongoing cost to keep any public-facing media updated with the latest protection algorithms. It’s possible we could see content server plugins to do this kind of alteration on demand, but it seems more computationally expensive than, say, image compression.
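For what it’s worth, a cheap baseline version of that “check and correct” tool already exists in the adversarial ML literature: round-trip every incoming image through a lossy codec, which blunts high-frequency perturbations. A hypothetical sketch with Pillow (the function name and quality setting are illustrative, and poisons like Nightshade are reportedly designed to survive exactly this sort of transform):

```python
from io import BytesIO
from PIL import Image

def jpeg_purify(path, quality=75):
    # Round-trip through lossy JPEG; the re-encode discards some of the
    # high-frequency detail that pixel-level perturbations live in, at
    # the cost of also degrading clean images slightly.
    buf = BytesIO()
    Image.open(path).convert("RGB").save(buf, format="JPEG", quality=quality)
    buf.seek(0)
    return Image.open(buf)
```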

5 Likes

It’s worse. It’s like they’ve never rented a copy of Windows, or “bought” an app or a book on Kindle.

“…see themselves as temporarily embarrassed millionaires” needs to be updated to something like “see themselves as corporations and not the product.”

3 Likes

Assuming the image analysis was being done with a neural network, yeah (and I suspect a lot is, these days) - and something like NS would be the tool used to transform targets into… something else.

1 Like

I wonder if, as with LLMs, even the occasional “hallucination” could potentially make an image-generating tool useless. (It seems like image generators are more likely to be used with a human in the loop to notice poisoned output, whereas things like ChatGPT are being used without any human eyes on the process.)

1 Like

Tv Show Comedy GIF by HULU

5 Likes

The way current open genAI models work, it’s relatively simple (if not cheap) to remix one with unwanted concepts filtered out or additional training checkpoints included. The hard part would be detecting which concepts were poisoned, but if someone noticed, that specific data could be replaced without having to re-train the entire model. Presumably the closed-source models have a similar process.

This style of attack is only really effective if all available copies of images pertaining to a specific concept are poisoned, and even then only until someone trains a model to account for it. Even if an artist rigorously “re-nightshades” their images whenever a new protection algorithm is released, any prior saved images will still be vulnerable.
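One plausible way to tackle that “hard part” of detecting which concepts were poisoned: check whether each image’s embedding actually agrees with its caption. A hypothetical sketch using the openly available CLIP checkpoint on Hugging Face (the function name and threshold logic are illustrative); whether this would catch a real Nightshade poison is an open question, since the attack is built to fool feature extractors much like this one:

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def alignment_score(image: Image.Image, caption: str) -> float:
    # Cosine similarity between the image and its caption in CLIP space.
    # A "dog" image poisoned to look like "cat" in feature space should
    # score noticeably lower against "a photo of a dog" -- in theory.
    inputs = processor(text=[caption], images=image, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs)
    img = out.image_embeds / out.image_embeds.norm(dim=-1, keepdim=True)
    txt = out.text_embeds / out.text_embeds.norm(dim=-1, keepdim=True)
    return float((img * txt).sum())  # roughly [-1, 1]; low = suspicious
```

Images scoring below some tuned threshold would get flagged for human review rather than dropped outright.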

4 Likes

What Techbro fanboys say, approvingly: “Techbro entrepreneurs are building neuralink interfaces between AIs and the human brain.”

What I hear:

Just look up the entirety of the Ghost in the Shell universe. But specifically in this case The Laughing Man.

7 Likes

… now they are also brain surgeons :brain:

4 Likes

Ah, but don’t you see, any sufficiently advanced curve fitting algorithm is indistinguishable from magic!

3 Likes

Objections to LLMs aren’t moral panic.

I can see that if we do not intervene soon, the future will be flooded with low-effort content spam.

We have been here so often it’s tiring. Tech bros are once again so enamored with a hyped technology that they refuse to see the negative externalities.

9 Likes

Oddly enough, no, I’m not the least bit concerned about what LLMs “deserve.” They are not in any way sentient. LLMs are no more a concern to be protected than any other algorithm, you may as well be concerned about what Dijkstra’s algorithm deserves when it’s used to calculate a route.

6 Likes

I fall in that category; most of my art is not already digitized, so this tool could be useful to me as a preemptive measure.

Theft of intellectual property is NOT okay. Just because we’re not rich doesn’t make it acceptable to steal from artists and writers.

9 Likes

Careful, Brian can hear us.

1 Like
