Judge calls out OpenAI’s “straw man” argument in New York Times copyright suit

05,04,25
OpenAI loses bid to dismiss NYT claim that ChatGPT contributes to users’ infringement.

After The New York Times sued OpenAI in December 2023—alleging that ChatGPT outputs violate copyrights by regurgitating news articles—the ChatGPT maker tried and failed to argue that the claims were time-barred.

According to OpenAI, the NYT should have known that ChatGPT was being trained on its articles and raised its lawsuit in 2020, partly because of the newspaper's own reporting. To support this, OpenAI pointed to a single November 2020 article, where the NYT reported that OpenAI was analyzing a trillion words on the Internet. But on Friday, US district judge Sidney Stein disagreed, denying OpenAI's motion to dismiss the NYT's copyright claims, a motion that rested partly on one NYT journalist's reporting.

In his opinion, Stein confirmed that it's OpenAI's burden to prove that the NYT knew that ChatGPT would potentially violate its copyrights two years prior to its release in November 2022. And so far, OpenAI has not met that burden.

"The fact that one of The Times’s reporters discussed OpenAI’s" AI training fails to make it clear that the newspaper knew that ChatGPT outputs years later could possibly regurgitate the NYT's reporting, Stein explained.

And OpenAI's other argument—that it was "common knowledge" that ChatGPT was trained on NYT articles in 2020 based on other reporting—also failed for similar reasons.

"OpenAI fails to explain why the articles, even if their existence had been known to plaintiffs at the time of their publishing, are sufficient to put plaintiffs on notice of the particular infringing conduct by defendants that provides the basis for plaintiffs’ claims," Stein wrote, which "involve the specific copying of plaintiffs’ works by OpenAI."

Essentially, the judge agreed with the NYT that OpenAI has not yet provided any evidence that the newspaper knew how ChatGPT would perform until the product was out in the wild. Therefore, he denied OpenAI's motion to dismiss those claims as time-barred, while denouncing as a "straw man" an OpenAI argument that the NYT, "as a 'sophisticated publisher,' had a duty 'to take prompt action after being put on notice of what it now claims to be alleged infringement.'"

OpenAI may still be able to prove through discovery that the NYT knew that ChatGPT would have infringing outputs in 2020, Stein said. But at this early stage, dismissal is not appropriate, the judge concluded. The same logic follows in a related case from The Daily News, Stein ruled.

Davida Brook, co-lead counsel for the NYT, suggested in a statement to Ars that the NYT counts Friday's ruling as a win.

"We appreciate Judge Stein's careful consideration of these issues," Brook said. "As the opinion indicates, all of our copyright claims will continue against Microsoft and OpenAI for their widespread theft of millions of The Times’s works, and we look forward to continuing to pursue them."

OpenAI may contribute to ChatGPT users’ infringement

The New York Times is also arguing that OpenAI contributes to ChatGPT users' infringement of its articles, and OpenAI lost its bid to dismiss that claim, too.

The NYT argued that by training AI models on NYT works and training ChatGPT to deliver certain outputs, without the NYT's consent, OpenAI should be liable for users who manipulate ChatGPT to regurgitate content in order to skirt the NYT's paywalls.

To win this claim, the NYT argues that all that's required is a showing that OpenAI had reason to know that ChatGPT could be used this way, due to media reports and the NYT contacting OpenAI directly. But OpenAI thinks a heightened standard should apply, under which the NYT must prove that OpenAI had "actual knowledge" of or "willful blindness" to the alleged infringement.

At this stage, Stein said that the NYT has "plausibly" alleged contributory infringement, using more than 100 pages of examples of ChatGPT outputs, along with media reports that ChatGPT could regurgitate portions of paywalled news articles, to show that OpenAI "possessed constructive, if not actual, knowledge of end-user infringement." Perhaps more troubling to OpenAI, the judge noted that "The Times even informed defendants 'that their tools infringed its copyrighted works,' supporting the inference that defendants possessed actual knowledge of infringement by end users."

"Taken as true, these facts give rise to a plausible inference that defendants at a minimum had reason to investigate and uncover end-user infringement," Stein wrote.

To Stein, the fact that OpenAI maintains an "ongoing relationship" with users by providing outputs that respond to users' prompts also supports contributory infringement claims, despite OpenAI's argument that ChatGPT's "substantial noninfringing uses" are exonerative.

OpenAI defeated some claims

For OpenAI, Stein's ruling likely disappoints, although Stein did dismiss some of the NYT's claims.

Likely upsetting to news publishers, the dismissed claims included a "free-riding" claim that ChatGPT unfairly profits off time-sensitive "hot news" items, including the NYT's Wirecutter posts. Stein explained that news publishers failed to plausibly allege non-attribution (which is key to a free-riding claim) because, for example, ChatGPT cites the NYT when sharing information from Wirecutter posts. Those claims are pre-empted by the Copyright Act anyway, Stein wrote, granting OpenAI's motion to dismiss.

Stein also dismissed a claim from the NYT regarding alleged removal of copyright management information (CMI), which Stein said cannot be proven simply because ChatGPT reproduces excerpts of NYT articles without CMI.

The Digital Millennium Copyright Act (DMCA) requires news publishers to show that ChatGPT's outputs are “close to identical” to the original work, Stein said, and allowing publishers' claims based on excerpts "would risk boundless DMCA liability"—including for any use of block quotes without CMI.

Asked for comment on the ruling, an OpenAI spokesperson declined to go into any specifics, instead repeating OpenAI's long-held argument that AI training on copyrighted works is fair use. (Last month, OpenAI warned Donald Trump that the US would lose the AI race to China if courts ruled against that argument.)

"ChatGPT helps enhance human creativity, advance scientific discovery and medical research, and enable hundreds of millions of people to improve their daily lives," OpenAI's spokesperson said. "Our models empower innovation, and are trained on publicly available data and grounded in fair use."

Source: https://arstechnica.com/tech-policy/2025/04/judge-doesnt-buy-openai-argument-nyts-own-reporting-weakens-copyright-suit/
