2025-10-28

By Laurel Kilgour, Research Manager

A little after 6 am on Saturday, October 4th, Reddit user Proof_Worldliness297 told the world how bummed out they were that copyright “killed” their favorite app—OpenAI’s new “video generation model” Sora 2—barely a day after it had soared to the top spot on Apple’s app store. “When we use a neural network with characters, we call it fun,” Proof lamented, “and they call it copyright infringement.”

OpenAI’s modest steps to prevent some infringing uses of Sora 2 may have frustrated users, [1] but to founder Sam Altman and head of policy Chris Lehane, such reactions were no doubt welcome. Turning popular opinion against legal protections for artists and creators is no easy task. Still, for OpenAI and other generative artificial intelligence companies, it is a key part of a multi-pronged campaign to persuade policymakers that copyright must be weakened. Otherwise, they say, US-based companies cannot compete. OpenAI has told the Trump administration that developers in China will get “unfettered access” to copyrighted data to train their models, so unless American companies get limitless access too, “the race for AI is effectively over.”

So, what exactly did OpenAI do in pursuit of its goal? And how will this play out in the copyright infringement cases that seem all but inevitable?

Immediately upon release, it was clear that Sora 2, which generates videos based on user prompts, was not only trained upon but also reproduces copyrighted content. Some outputs appear to be unaltered, including a Disney castle, other major studio movie openers, and well-known pop songs. Other times, it plops familiar characters into new contexts. A classic Sonic the Hedgehog video game is re-set in San Francisco. A live action Super Mario faces a dragon. The South Park kids take their first Waymo ride. Family Guy is restyled as “analog horror.” Lord of the Rings is reimagined with Gen Z slang. There’s even a “Jewish R2D2 wedding.” Famous celebrities—both living and long-dead—star in weird skits. And one features an uncanny Sam Altman saying “I hope Nintendo doesn’t sue us”—while grinning at a field full of Pokémon. [2]

OpenAI’s hardball tactics to attack copyright protections started even before Sora 2’s launch. According to the Wall Street Journal, OpenAI “began alerting” studios and talent agencies that copyright holders would have to “opt out of having their work appear” in videos generated with Sora 2. But, facing backlash, OpenAI eased off quickly. First, it pulled back on generating recognizable public figures (likely because there’s a separate body of law—called the right of publicity—that governs the use of a real person’s likeness). Then, just three days after launch, Altman wrote a blog post indicating that OpenAI would switch to an opt-in system—while maintaining that “a lot of rightsholders… are very excited for this new kind of ‘interactive fan fiction.’”

It’s too little, too late to dodge litigation. Even if Sora 2 stops generating copyrighted content without permission—which Altman himself acknowledged OpenAI cannot guarantee—courts will likely be asked to sort out whatever damage has already been wrought by OpenAI’s unauthorized copying, reproduction, and distribution. It’s unlikely that OpenAI will be able to seriously contest that it directly infringes—or at least induces users to infringe—copyrighted material.[3] Copyright clearly protects creative expression in the form of cartoon characters. As a result, litigation outcomes will instead hinge upon OpenAI’s affirmative defense of “fair use.”

Under copyright law, the fair use defense prevents “rigid application” of the Copyright Act “when, on occasion, it would stifle the very creativity” the law aims to foster. When determining whether this defense applies, courts consider four factors: 1) the “purpose and character of the use,” 2) the “nature of the copyrighted work,” 3) “the amount and substantiality of the portion used in relation to the copyrighted work as a whole,” and 4) “the effect of the use upon the potential market for or value of the copyrighted work.” Notably, each copying or use of a work may be treated separately. Thus, if OpenAI obtained any copyrighted material through piracy, it could—like Anthropic—be on the hook for that act, even if it is exculpated for every step taken afterward.

Although each case will turn on the application of the four fair use factors to the specific facts involved, here is a quick-and-dirty hot take: On the first factor, so far judges have typically found that training generative AI models is transformative—meaning that it “adds something new, with a further purpose or different character”—thus satisfying factor one, even for commercial use.[4] OpenAI’s generation of identical characters might be a closer call, but this factor could nonetheless tip in its favor. The second factor, by contrast, likely favors creators, because things like cartoons, movies, and songs are considered far more creatively expressive than, say, factual news.

The third factor is more of a toss-up. Judges have found that copying entire books did not use an undue amount of the work for the purpose of training LLMs—at least when the amount of identical output maxed out at 50 words of the entirely-copied books, even after “adversarial” prompting designed to maximize exact reproductions.[5] Here, there is an open question as to whether OpenAI’s generation of identical characters might be viewed as more extensive reproduction than just recreating a small snippet of a long book.

Ironically, OpenAI may also point to its initial “opt-out” policy in support of its fair use defense for infringing videos. Legal commentators quickly protested “that’s not how copyright works” when OpenAI announced its opt-out policy. They probably are—and certainly should be—right. But OpenAI was likely cribbing from an old Google playbook. In 2004, when Google scanned millions of books to make them searchable online, Google adopted an opt-out policy for authors and publishers as part of a failed attempt to stave off class action litigation. Google’s legal position was never fully tested before settlement, but some friendly scholars proposed ways that an opt-out program could support a new fair use defense (even as others pointed out that fostering a “world with a large and ever-changing list of opt-out programs” would overwhelm creators). This is likely not a strong argument, but uphill odds often do not deter powerful players with money to burn on novel theories.

In any event, the fair use factors are not weighted equally. The “most important” factor is the last one: will the use harm the market for—and value of—the original creative work? Put differently, will it ruin the incentive for creators to keep creating?

Two judges in the same federal district recently took very different approaches to this question, even as they both ultimately ruled that plaintiffs failed to bring adequate claims against generative AI defendants. First, Judge William Alsup opined that training large language models poses no more harm to the market for copyrighted books than “training schoolchildren to write well” and eventually produce competing works. Such protectionism, he claimed, was inconsistent with the goals of the Copyright Act: namely, promoting the progress of science and the arts, without diminishing the incentive to create.

Just two days later, Judge Vince Chhabria shot back—in a separate case concerning similar allegations—that Alsup’s “inapt analogy” was “not remotely like using books to create a product that a single individual could employ to generate countless competing works with a miniscule fraction of the time and creativity it would otherwise take.” Certainly, it was “not a basis for blowing off” a thorough analysis of the fourth fair use factor, or whether the alleged infringement deprives a creator of their ability to earn money from their copyrighted work. Chhabria could easily imagine a situation where a user generates highly transformative outputs that are “similar enough (in subject matter or genre) [to the original] that they will compete with the originals and thereby indirectly substitute for them.” In that instance, a defendant with a transformative technology “nonetheless loses on fair use because allowing people to engage in that kind of use would have too great an effect on the market for the original work.” The plaintiffs in Chhabria’s case failed to support that argument well enough to survive summary judgment. But the studios and creators OpenAI just provoked might be able to develop a stronger record.

Several major studios already have other pending litigation where they can learn from the mistakes of the plaintiffs in Chhabria’s case. Disney and Universal, and then Warner Brothers, recently sued Midjourney—which, like OpenAI, offers AI image and video generation services.

[Image: How Disney expresses its feelings about copyright infringement.]

In those cases, the plaintiffs noted that Midjourney could block the generation of copyrighted content if it wanted to—as evidenced by the existence of infringement-blocking by other AI services, and Midjourney’s own blocking of other content, such as porn—but apparently chose not to do so. Midjourney’s answer, meanwhile, echoed a now-common refrain for generative AI defendants: “copyright must give way to fair use, which safeguards countervailing public interests in the free flow of ideas and information.” Midjourney also refused to accept any blame for what its users do, arguing that there are “legitimate, noninfringing grounds to create images incorporating characters from popular culture… including non-commercial fan art, experimentation and ideation, and social commentary and criticism. Plaintiffs seek to stifle them all.”

To be fair, existing copyright law is far from perfect—not least because it has strayed far from the limited duration originally envisaged by Thomas Jefferson (19 years, equivalent to one generation), and instead turbocharges corporate monopolies and creates artistic aristocracies by extending the term decades past the life of any human creator.

But lawmakers and judges should not give tech oligarchs yet another free pass to trample on the rights of others at their whims. Amidst robust debate at the Copyright Office and beyond about the copyrightability of works generated via AI, any reform should focus on how best to incentivize human creativity, not just because the resulting output is desirable to consumers, but because there is societal value—especially in a democracy—in encouraging people to “think different” and giving those who do a fair shot to support themselves through their creative endeavors.

[1] Entirely coincidentally, this happened to be the user’s only Reddit post or comment in the account’s three-year existence. Definitely not sus! In any event, some other users—with more cromulent account histories—complained about Sora 2’s new guardrails as well.

[2] Generation of manga, anime, and Nintendo characters was particularly prolific, leading Altman to write “we are struck by how deep the connection between users and Japanese content is!” Such mass reproduction was especially brazen given that Japan had already made efforts to meet the AI industry’s reform desires halfway by amending their laws in 2019 to add an “AI-friendly copyright exception for data training.” But even an “AI-friendly” country has limits, as Japan’s minister of state for IP and AI made clear.

[3] OpenAI may argue that it is just providing a tool and not inducing users to infringe. Given that employees publicly spread infringing videos they made with Sora 2, this may be a difficult argument to win. More importantly, OpenAI would likely fail to get a lawsuit dismissed entirely based on that distinction, because OpenAI likely directly infringed when its own employees created videos at OpenAI’s direction, and may have also infringed when uploading content if any content was torrented from piracy sites, and possibly during its model training process as well. As discussed in this post, a fair use defense will be the most hotly disputed issue in litigation.

[4] Thus, OpenAI’s controversial transition from nonprofit to for-profit status seems unlikely to tip the balance on factor one either way.

[5] Different LLMs vary in their ability to prevent identical output. Law professor Mark Lemley and his computer scientist co-authors found that in Meta’s LLAMA model “the first Harry Potter is so memorized that, using a seed prompt consisting of just the first few tokens of the first chapter, we can deterministically generate the entire book near-verbatim.”