A new tool presented by researchers at the University of Chicago attempts to prevent AI models from siphoning off art and using it for training without permission by "poisoning" the image data.
Known as "Nightshade," the tool modifies digital image data in ways that are invisible to the human eye, but which are claimed to cause all sorts of disturbances for generative learning models such as "DALL-E," "Midjourney," and "Stable Diffusion" to be tampered with.
The underlying technique, known as data poisoning, is designed to cause "unexpected behavior in machine learning models during training." The team of researchers at the University of Chicago claims to have shown that such poisoning attacks are "surprisingly" successful.
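To get a feel for the general idea (this is a toy illustration, not Nightshade itself), the sketch below mixes a fraction of deliberately mislabeled samples into the training set of a simple classifier and watches its test accuracy degrade; the dataset and model are arbitrary stand-ins.

```python
# Toy illustration of data poisoning in general (NOT Nightshade's method):
# a modest fraction of corrupted training labels visibly degrades the model.
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

def accuracy_with_poison(fraction):
    """Flip the labels of `fraction` of the training set, retrain, report test accuracy."""
    y_poisoned = y_train.copy()
    n = int(fraction * len(y_poisoned))
    idx = np.random.default_rng(0).choice(len(y_poisoned), size=n, replace=False)
    y_poisoned[idx] = 1 - y_poisoned[idx]  # mislabel the chosen samples
    model = LogisticRegression(max_iter=1000).fit(X_train, y_poisoned)
    return model.score(X_test, y_test)

for frac in (0.0, 0.1, 0.3):
    print(f"poisoned fraction {frac:.0%}: test accuracy {accuracy_with_poison(frac):.2f}")
```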
According to the researchers, poison samples appear "visually identical" to benign images; Nightshade's poison samples are "optimized for potency," with the claim that fewer than 100 of them are enough to corrupt a Stable Diffusion SDXL prompt.
The details of how this technique works are not entirely clear, but it involves changing pixels in an image in a way that is invisible to the human eye while causing machine learning models to misinterpret the content. The researchers argue that poisoned data is very difficult to remove, since each poisoned image has to be identified and deleted from the training set manually.
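As a rough sketch of what an imperceptible, model-confusing perturbation can look like in practice (again, not Nightshade's actual algorithm), the following code nudges an image's pixels within a small budget so that a pretrained encoder's features drift toward those of an unrelated "target" image. The ResNet-50 feature extractor and the epsilon budget are assumptions chosen for illustration.

```python
# Illustrative sketch only -- NOT Nightshade's published algorithm.
# Pixels are perturbed within a small L-infinity budget so that a
# pretrained encoder "sees" the poisoned image as something else.
import torch
import torchvision.models as models

device = "cuda" if torch.cuda.is_available() else "cpu"

# Pretrained ResNet-50 as a stand-in feature extractor (assumption).
encoder = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()  # keep the 2048-d feature vector
encoder.eval().to(device)

def poison(source, target, eps=8 / 255, steps=200, lr=0.01):
    """Perturb `source` (e.g. a dog photo) so its features move toward
    `target` (e.g. a cat photo), keeping every pixel change within +/- eps.
    Both inputs are float tensors of shape (1, 3, H, W) with values in [0, 1]."""
    source, target = source.to(device), target.to(device)
    with torch.no_grad():
        target_feat = encoder(target)
    delta = torch.zeros_like(source, requires_grad=True)
    opt = torch.optim.Adam([delta], lr=lr)
    for _ in range(steps):
        poisoned = (source + delta).clamp(0, 1)
        loss = torch.nn.functional.mse_loss(encoder(poisoned), target_feat)
        opt.zero_grad()
        loss.backward()
        opt.step()
        with torch.no_grad():
            delta.clamp_(-eps, eps)  # project back into the visual budget
    return (source + delta).clamp(0, 1).detach()
```

A poisoned image produced this way still looks like its source to a human, but a model trained on many such images can learn a skewed association between the prompt text and the perturbed visual features.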
The researchers used Stable Diffusion as a test case and found that as few as 300 poisoned samples are enough to confuse the model into thinking a dog is a cat or a hat is a cake. Or would it be the other way around?
In any case, the impact of poisoned images can extend to related concepts, and a modest number of Nightshade attacks can "destabilize common features in text-to-image generative models, effectively disabling their ability to produce meaningful images."
Nevertheless, the team admits that bringing down a large model is not that easy: thousands of poisoned images would be required. Seen from the angle of potential abuse, that is probably a good thing, since it would take a concerted effort to undermine any large-scale generative model.
So, has your AI image model gone up in smoke? Perhaps, but one might also imagine that the collective intelligence behind a powerful generative AI would need about three picoseconds to register, coordinate against, and render such a measure fully redundant now that the technology is publicly available.
That is about it. Whether this kind of countermeasure really works, and if so, how long it would hold up, may be the more important question.