Update: The creator of the tool has posted a Twitter thread better explaining how it works and reiterating that it does not function as intended with non-AI generated images. They also note that “clearly there’s a lot of room for improvement here”, and that not everything is operating yet in the public-facing tool.
Stable Attribution does appear to be a rocky tool in practice, but it’s still in beta today. It’s not a simple process to discern where exactly an AI has learned to create what it creates, or how much any one image out of billions plays into the creation of another. Perhaps we’ll see this tool materialise into something useful for attribution with time, or the companies behind these AI image generators will find another way to calm artists demanding some token for their work in making it all happen. But maybe that’s just wishful thinking from me on the artists’ behalf.
Original story: AI art tools have enjoyed a meteoric rise in popularity over the past year. Millions now use incredibly impressive tools like DALL-E 2 and Stable Diffusion to generate images seemingly out of nowhere using text prompts. But they’re not coming from nowhere, and Stable Attribution hopes to make everyone more aware of the human-made art that AI art ultimately derives from.
Stable Attribution is an algorithm that lets you sniff out the source images likely used in the creation of AI art. It’s a sort of reverse-engineering algorithm, finding the human-made artwork the AI learned from. It could become something very important for artists in their ongoing feud with AI image generation tools: Stable Attribution could offer a way for artists to claw back some control over the use of their images.
Here’s an example of how it works. I enter a prompt into AI image generator Stable Diffusion: ‘a giant PC roaming through the woods looking for fresh PC parts to consume.’ The AI spits out the following image.
Then I download this image, drop it into Stable Attribution, and it spits out a collection of images that it believes were used in the training of Stable Diffusion and likely referenced in the creation of my prompted image. In this case: a product banner, an image from a vocational school in Spain, a product lifestyle shot, and many more.
If any of the images are yours or belong to someone you know, you can submit a link so the work can be properly credited.
Stable Attribution works by decoding an AI-generated image into the most similar examples it can find in the training data available to it. Some models, such as Stable Diffusion, are trained on publicly available datasets like LAION, which means Stable Attribution can index a copy of that dataset for cross-referencing. However, OpenAI’s training data isn’t publicly available, which halts any such tool for DALL-E 2 in its tracks.
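Stable Attribution hasn’t published its code, but the approach it describes, matching a generated image against an indexed copy of the training data, is in essence a nearest-neighbour similarity search. Below is a rough sketch of that idea in Python; the random vectors stand in for real image embeddings (from a CLIP-style encoder, say) and the URLs are invented, so treat it as an illustration of the concept rather than the tool’s actual method.

```python
import numpy as np

def top_matches(query_vec, dataset_vecs, dataset_urls, k=5):
    """Return the k indexed images whose embeddings sit closest to the query."""
    q = query_vec / np.linalg.norm(query_vec)
    d = dataset_vecs / np.linalg.norm(dataset_vecs, axis=1, keepdims=True)
    sims = d @ q                                  # cosine similarity against every image
    best = np.argsort(-sims)[:k]
    return [(dataset_urls[i], float(sims[i])) for i in best]

# Toy demo: random vectors stand in for real image embeddings.
rng = np.random.default_rng(0)
dataset_vecs = rng.normal(size=(10_000, 512))     # indexed public dataset, e.g. a LAION subset
dataset_urls = [f"https://example.com/img/{i}.jpg" for i in range(10_000)]
query_vec = rng.normal(size=512)                  # embedding of the AI-generated image
print(top_matches(query_vec, dataset_vecs, dataset_urls, k=3))
```

At real scale you’d precompute embeddings for billions of dataset images and use an approximate nearest-neighbour index rather than brute-force comparison, but the principle is the same: find the training images that look most like the thing the AI just made.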
Stable Attribution isn’t a perfect algorithm for attribution just yet, either.
“Version 1 of the Stable Attribution algorithm isn’t perfect, in part because the training process is noisy, and the training data contains a lot of errors and redundancy,” Stable Attribution says.
How similar the output image looks to the images used in training isn’t a perfect way to discern what actually went into generating that particular image, and AI algorithms are only getting more complex in how they go about it.
“But this is not an impossible problem,” Stable Attribution continues.
As AI expert Alex J. Champandard notes, it could be as useful a tool for AI artists as it is for artists who feel their copyright has been infringed upon. If you’re selling art generated by an AI, is there a case for a copyright claim against you? Is the responsibility yours, as the user of an AI tool, or does it lie with the company that created the tool? Is training on a dataset containing copyrighted material unfair and in breach of copyright? These are questions we don’t have clear legal precedent for yet, but you can be sure this debate will rage on for years to come.
It largely comes down to training. AI image tools use masses of data for training, which is the process of teaching an AI to do something; in the case of Stable Diffusion and DALL-E 2, that’s producing images that match the descriptions given to them. There’s nothing inherently wrong with training an AI to do something, unless that something is morally ambiguous or downright evil. In this case, these AIs are being trained for simple, mostly harmless image generation. No issue there, right?
Well, no, not necessarily.
The data used for this training is usually scraped from the web, meaning a URL and description for each image is stored in a database and later fed into the algorithm. These datasets can contain millions of URL and description pairs, and they’re most often sold or given away by third-party dataset-gathering companies.
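To picture what that looks like, a web-scraped dataset along the lines of LAION is essentially an enormous table of image URLs paired with caption text. The rows in this little Python sketch are made up, but the structure is roughly it:

```python
import csv
from io import StringIO

# A web-scraped image dataset boils down to rows of (image URL, caption) pairs,
# usually alongside extra metadata. These example rows are invented for illustration.
raw = """url,caption
https://example.com/forest.jpg,A misty forest at dawn
https://example.com/pc-build.jpg,Custom gaming PC with RGB lighting
https://example.com/workshop.jpg,Students assembling computers in a classroom
"""

pairs = [(row["url"], row["caption"]) for row in csv.DictReader(StringIO(raw))]
for url, caption in pairs:
    print(f"{caption!r} -> {url}")
```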
The issue that artists are levelling at these AI art tools is that these datasets are often filled with copyrighted imagery. Scraping is largely indiscriminate; a dataset might contain some art you uploaded to a forum in 2008, or some art you entered into a competition online once. It could contain offensive images that have to be culled. A dataset might even have a picture of you in it, and you’ll have to ask the dataset company to remove it if you don’t like the idea of an AI being trained on your likeness.
It’s pretty messy. The law hasn’t caught up with these AI tools, yet we’re already seeing multiple lawsuits filed against Stability AI, OpenAI, and other AI businesses like them for copyright infringement. Artists feel their art styles are being copied by an AI and used to prop up companies now worth billions of dollars. Meanwhile, they never gave consent for their images to be used, regardless of those images’ copyright standing, and rarely receive a penny for their contribution.
We have seen attempts to make AI art tools fairer to artists, however. Shutterstock now offers AI art in partnership with OpenAI, but will pay a fee to artists whose work was used in its creation. That’s a completely different strategy to Getty Images, which has filed a lawsuit against Stability AI over the use of images from its library for training purposes. Stability AI also plans to let artists opt out of future versions of Stable Diffusion, yet you can’t help but feel that should have been an option in the first place. The company really has no claim to any of these images in any legal sense, even if their use in training is a somewhat grey area, so why should the onus be on individuals to tell it to lay off?
These court cases won’t deliver any answers for some time, and even then, one ruling is unlikely to be the end of the matter. Legislation has a lot of catching up to do on AI, and image generation is only a small part of the debate.