The New York Times Sues Microsoft, OpenAI for Copyright Infringement

Paul Thurrott
Dec 27, 2023

Excerpt from The Times legal filing

The New York Times has sued Microsoft and OpenAI for copyright infringement and is seeking “billions of dollars in statutory and actual damages.” The issue? The companies trained their AI on “millions” of New York Times articles that require paid subscription access.

“Defendants’ unlawful use of The Times’s work to create artificial intelligence products that compete with it threatens The Times’s ability to provide … trustworthy information, news analysis, and commentary,” a 69-page legal filing with a U.S. District Court in New York City reads. “Defendants’ generative artificial intelligence (GenAI) tools rely on large-language models (LLMs) that were built by copying and using millions of The Times’s copyrighted news articles, in-depth investigations, opinion pieces, reviews, how-to guides, and more. While Defendants engaged in widescale copying from many sources, they gave Times content particular emphasis when building their LLMs—revealing a preference that recognizes the value of those works.”

With this filing, The Times became the first major media organization in the United States to stand up to the Big Tech firms that are now steamrolling publishers’ business models to collect valuable, accurate data that they repackage and sell to their own customers. This has been happening for years (Google infamously scrapes entire articles from publishers big and small to populate its search results), but the data consumption needs of AI threaten to exponentially expand that theft.

The Times also revealed in its filing that this suit came about after negotiations with Microsoft and OpenAI broke down. The publication says it approached the AI giants with its intellectual property rights concerns in April to see if they could reach an “amicable resolution” that included both financial compensation and “technological guardrails” that would prevent future theft. But Microsoft and OpenAI “refused to recognize” The Times’s Constitutional copyright protections and chose instead to “generate output that recites Times content verbatim, closely summarizes it, and mimics its expressive style” and to do so without permission. The firms told The Times that using its copyrighted content to train their AI was “fair use.”

“Microsoft’s deployment of Times-trained LLMs throughout its product line helped boost its market capitalization by a trillion dollars in the past year alone,” the filing explains. “And OpenAI’s release of ChatGPT has driven its valuation to as high as $90 billion. The Defendants’ GenAI business interests are deeply intertwined, with Microsoft recently highlighting that its use of OpenAI’s ‘best-in-class frontier models’ has generated customers—including ‘leading AI startups’—for Microsoft’s Azure AI product.”

Based on the similar battles that Google fought around the world for years and the publisher partnerships that have arisen in their wake, I’m guessing that Microsoft, OpenAI, and other AI makers will need to compensate The New York Times and other publishers accordingly. I also suspect that this disagreement is in essence a stalling tactic that the firms are using to ensure that their offerings can advance as quickly as possible before regulators and lawmakers crack down on this illegal practice.

The Times filing is worth reading for all kinds of reasons, but among other things, it provides an undistorted history of OpenAI, its strategies and offerings, and its strange relationship with Microsoft. It also explains how LLMs and the resulting chatbots work, and how the various ChatGPT generations grew in size over the years. But the most compelling part, perhaps, is a set of examples that shows how the output from ChatGPT and Microsoft Copilot doesn’t just mimic The Times’s content but rather spits it back almost verbatim.

I certainly have my issues with The New York Times and the quality of its work. But as the publication correctly asserts, there is nothing “transformative” about what OpenAI and Microsoft are doing. Unless of course you consider copyright infringement at scale to be transformative.


