SWITCHED ON
The daily technology series nobody asked for but everyone needed
Who Owns the Brushstroke: AI Art, Creativity, and the Copyright Crisis
AI image generators trained on billions of human artworks are producing images in seconds. The artists whose work trained them would like a word.
A human artist spends years developing a style. An AI model trains on that style, millions of examples at once, and can reproduce its aesthetic in four seconds flat. Whether that is innovation, theft, or something copyright law has simply never had to think about before is the question currently eating the creative industries alive.
Yesterday we tackled AI regulation: the EU AI Act, the US patchwork of executive orders and voluntary commitments, China's content-control-as-governance approach, and the structural problem that a borderless technology and a fragmented regulatory landscape create when you try to actually govern anything. Today we are staying with AI but moving from policy to something more immediately felt by a much wider range of people: what AI is doing to creative work. Specifically, what happens when systems trained on the accumulated creative output of human civilisation can reproduce, remix, and generate in the style of any artist, writer, or musician who has ever put their work online, without asking, without paying, and without the legal framework having a clear answer about whether any of this is permissible. The artists are furious. The AI companies are arguing fair use. The courts are working through it. Nobody is entirely right. Let's go.
01 — How AI Image Generation Actually Works
To understand the copyright dispute, you need to understand what AI image generators actually do when they train, because the legal and ethical questions hinge on the technical process in ways that are often misrepresented in both directions.
Diffusion models, the architecture underlying Midjourney, Stable Diffusion, DALL-E, and similar systems, are trained on enormous datasets of image-text pairs scraped from the internet. LAION-5B, one of the most widely used training datasets, contains roughly 5.85 billion image-text pairs. Across this dataset the model learns the statistical relationships between text descriptions and visual features: what "impressionist painting" looks like, what "hyperrealistic portrait" looks like, what "in the style of Greg Rutkowski" looks like. Rutkowski is a fantasy artist whose name was among the most frequently used style prompts in the early Midjourney era, without his knowledge or consent.
The model does not store the training images. It stores the patterns learned from them — weights and parameters that encode the statistical regularities of the visual world as represented in the training data. When you prompt the model, it generates a new image by iteratively denoising random noise guided by those learned patterns and your text description. The output is not a collage of training images. It is a novel image generated from learned representations. Whether those learned representations constitute a derivative work of the training data, and whether their creation required a licence, is the question that courts are currently being asked to answer.
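For readers who think better in code, here is a deliberately tiny sketch of the "start from noise, denoise step by step" loop described above. It is not how any production system is implemented: a real diffusion model uses a trained neural network to predict the noise, while this toy uses a hypothetical lookup table (LEARNED_PATTERNS) as a stand-in for learned weights. Every name here is invented for illustration.

```python
import random

# Hypothetical stand-in for learned weights: one "average pattern" per prompt.
# A real model encodes such regularities in neural-network parameters instead.
LEARNED_PATTERNS = {
    "sunset": [0.9, 0.5, 0.2, 0.1],
    "forest": [0.1, 0.7, 0.2, 0.3],
}


def predict_noise(x, prompt):
    """Toy denoiser: treat the gap between x and the learned pattern as noise."""
    pattern = LEARNED_PATTERNS[prompt]
    return [xi - pi for xi, pi in zip(x, pattern)]


def generate(prompt, steps=50, step_size=0.2, seed=0):
    """Start from random noise and remove a little predicted noise each step."""
    rng = random.Random(seed)
    x = [rng.gauss(0.0, 1.0) for _ in LEARNED_PATTERNS[prompt]]
    for _ in range(steps):
        noise = predict_noise(x, prompt)
        x = [xi - step_size * ni for xi, ni in zip(x, noise)]
    return x


image = generate("sunset")
print([round(v, 3) for v in image])
```

The thing to notice is what the generator holds: a compact summary keyed to a prompt, not a folder of training images. The toy is far simpler than the real thing (it is deterministic and always lands on the same pattern), which is exactly why the legal question of what those learned summaries are remains open.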
The training process is simultaneously clearly inspired by human creative work and clearly not copying it in any traditional sense. This is precisely why existing copyright law, written for a world of direct reproduction, struggles to give a clean answer.
02 — The Lawsuits
The legal challenge to AI image generation arrived quickly and from multiple directions. In January 2023, a group of artists including Sarah Andersen, Kelly McKernan, and Karla Ortiz filed a class action lawsuit against Stability AI, Midjourney, and DeviantArt, alleging that their copyrighted artwork had been used without consent to train AI systems that now compete directly with them commercially. Getty Images filed a separate lawsuit against Stability AI, alleging that more than 12 million images from its licensed photo library had been scraped and used for training without authorisation.
The artists' lawsuit has proceeded slowly through the US court system, with initial rulings narrowing some claims while allowing others to proceed. The core question — whether training on copyrighted images without a licence constitutes copyright infringement — has not yet received a definitive ruling from a US appellate court. The outcome will be significant not just for image generation but for every AI system trained on copyrighted text, code, music, and other creative works.
In parallel, the New York Times sued OpenAI and Microsoft in December 2023, alleging that its journalism had been used to train large language models without compensation and that those models could reproduce Times articles with a high degree of fidelity. The lawsuit is proceeding through the courts and has produced some of the most pointed exchanges in the broader AI copyright debate, particularly around the question of whether AI outputs can constitute copyright infringement even if the training process does not.
On the output side, the US Copyright Office has been clarifying its position through a series of guidance documents and decisions. The consistent principle: copyright requires human authorship. AI-generated works, where the human contribution is limited to a text prompt, do not qualify for copyright protection. Works in which a human uses AI as a tool and contributes substantial creative input may be eligible, with protection extending only to the human-authored elements. The line between "using AI as a tool" and "AI generating the work" is, inevitably, contested and context-dependent.
03 — The Fair Use Argument and Its Limits
The AI companies' primary legal defence is fair use, the doctrine in US copyright law that permits use of copyrighted material without a licence for purposes such as criticism, commentary, news reporting, teaching, and research, and that gives considerable weight to whether the new use is transformative. The argument runs roughly as follows: training on copyrighted images is transformative because the model learns general visual representations rather than reproducing specific works; it is analogous to a human artist studying the works of others to develop their own style; and the outputs do not substitute for the original works in any market.
The analogy to human artists studying others' work is rhetorically effective and legally imprecise. A human artist who studies a thousand paintings will rarely be able to reproduce any of them with fidelity, because human memory and human hands introduce transformation at every stage. An AI model trained on those same paintings can, under some conditions, reproduce them with considerable accuracy. Getty's lawsuit points to Stable Diffusion outputs that closely resembled Getty watermarked images, watermarks included. The transformativeness of the learning process does not automatically make all outputs transformative.
The market substitution question is where the fair use argument is most vulnerable. If an AI image generator can produce work in the style of a specific living artist on demand, at zero marginal cost, that is a direct commercial substitute for commissioning that artist. The artist loses work. The fact that the output is not technically a copy of any specific artwork does not change the economic impact. Fair use doctrine considers market harm, and the market harm to illustrators, concept artists, stock photographers, and other commercial visual artists from AI image generation is documented and substantial.
04 — What the Creative Industries Are Actually Doing
Artists and writers have not simply waited for courts to sort it out. The response has been varied, creative, and in some cases technically inventive.
Organised resistance has included open letters, industry coalitions, and legislative advocacy. The Writers Guild of America's 2023 strike included AI provisions that resulted in negotiated protections against AI replacing writers or being used to generate scripts without disclosure. SAG-AFTRA, the actors' union, negotiated similar protections around the use of AI to replicate actors' voices and likenesses. These are real gains within specific industries that have strong unions and collective bargaining infrastructure. They do not extend to the much larger population of freelance illustrators, photographers, and visual artists who lack equivalent organisational structures.
On the technical side, tools like Glaze and Nightshade, both developed by researchers at the University of Chicago, attempt to add imperceptible perturbations to images that disrupt AI training without affecting how the images appear to human viewers. Glaze attempts to mask an artist's style so that models trained on their work learn incorrect style associations. Nightshade attempts to poison training data so that models trained on protected images produce corrupted outputs. These tools are imperfect, they are locked in an arms race with the model builders they target, and they require individual artists to actively apply them to every image they publish. They are nonetheless a genuine technical response to a problem that legal systems have been slow to address.
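To make "imperceptible perturbation" concrete, here is a minimal sketch of the constraint such tools work within: every pixel may move, but only by a tiny bounded amount. This is emphatically not the Glaze or Nightshade algorithm. Those tools compute carefully optimised changes, whereas this toy uses plain random noise, which would not meaningfully disrupt training. The function names, the 0-255 pixel list, and the epsilon budget are all assumptions made for illustration.

```python
import random


def perturb(pixels, epsilon=2, seed=0):
    """Shift each 0-255 pixel value by at most epsilon, clamped to the valid range."""
    rng = random.Random(seed)
    out = []
    for p in pixels:
        delta = rng.randint(-epsilon, epsilon)  # random here; optimised in real tools
        out.append(min(255, max(0, p + delta)))
    return out


def max_change(original, modified):
    """Largest per-pixel difference between two images of equal length."""
    return max(abs(a - b) for a, b in zip(original, modified))


original = [0, 12, 128, 200, 255, 64, 90, 33]  # a made-up 8-pixel greyscale "image"
cloaked = perturb(original)
print(max_change(original, cloaked))
```

The design point is the budget: keep epsilon small enough that a human sees the same picture, then spend that budget where it most confuses a model. The second half is the hard part, and it is what the real tools are attempting.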
The creative industries are not facing a disruption comparable to what the internet did to music. They are facing something larger — a technology that does not just distribute creative work differently but generates it, at scale, trained on the accumulated creative labour of human history, with the economic benefits flowing almost entirely to the companies that built the models.
05 — The Harder Question
Underneath the copyright litigation and the fair use arguments and the poisoning tools is a question that legal frameworks are particularly poorly equipped to answer: what do we owe to the people whose creative work made these systems possible?
Copyright is a property framework. It asks whether a specific use of a specific work requires a licence. It does not ask whether the cumulative effect of training on millions of works, each individually perhaps too small to matter, produces an obligation to the creative community whose collective output enabled the capability. It does not ask about the difference between a technology that amplifies human creativity and one that substitutes for it. It does not ask what a creative economy looks like when the marginal cost of generating competent visual art, competent prose, and competent music approaches zero.
Some AI companies are beginning to offer licensing deals and opt-out registries as partial answers to the training data question. Adobe's Firefly model was trained on licensed stock images and Adobe's own content library, and is positioned explicitly as a commercially safe option. Getty has its own AI image generator trained on its licensed library. These are genuine attempts at a more sustainable model — one that compensates the creators whose work enables the capability. Whether they represent the future of ethical AI training or a temporary competitive differentiator before the courts decide that the alternative is permissible anyway depends, ultimately, on legal outcomes that have not yet arrived.
Tomorrow we are taking the AI conversation in a direction that is simultaneously more intimate and more alarming than copyright disputes — AI in hiring, lending, and criminal justice. The systems making decisions about whether you get a job, a loan, or bail are already running algorithms. Whether those algorithms are fair, transparent, or accountable is a different question entirely. See you then.
Switched On is a daily technology series covering AI, social media, data privacy, and the digital forces reshaping modern life — with no corporate spin, no false comfort, and absolutely no mercy for buzzwords.



