We Need to Discuss OpenAI’s Response to the New York Times (Premium)

OpenAI today responded to The New York Times' copyright infringement lawsuit, but I feel that it bungled a chance to set the record straight, in part by admitting to the crime at the center of the paper's allegations.

"While we disagree with the claims in The New York Times lawsuit, we view it as an opportunity to clarify our business, our intent, and how we build our technology," a new OpenAI blog post notes. "Our "position can be summed up in four points: We collaborate with news organizations and are creating new opportunities; training is fair use, but we provide an opt-out because it’s the right thing to do; 'regurgitation' is a rare bug that we are working to drive to zero; [and] The New York Times is not telling the full story."

Um.

By explaining that "regurgitation"---what The New York Times called the "verbatim" recitation of its content as demonstrated by multiple examples---is real, OpenAI has explicitly admitted to the publication's central allegation: ChatGPT literally can and does republish copyrighted content, as it charged. And while this bug may or may not be rare---here, one might say that it is OpenAI "not telling the full story"---it is apparently enough of an issue that the company is "working to drive to zero," meaning they are working to eliminate it. But the NYT complaint is about what OpenAI does now, not what it may or may not do in the future. And right now, OpenAI just admitted that the publication is correct.

Anyway. OpenAI says that it has met with "dozens" of news organizations to "explore opportunities, discuss their concerns, and provide solutions." And while it doesn't specifically mention paying to access copyrighted content, it does point to its partnerships with just four organizations---the Associated Press, Axel Springer, the American Journalism Project, and NYU---as "a glimpse into [its] approach." So OpenAI has agreed to license content from those four sources, an act that involves a one-time fee or ongoing payments. But none of the parties has ever revealed the financial terms of these deals. Yes, I looked.

OpenAI then claims that "training AI models using publicly available internet materials is fair use," an emphatic shift from its earlier, pre-lawsuit position that "[OpenAI] believes that the training of AI models qualifies as a fair use." But now, it has expanded on that claim, asserting that AI training is somehow "fair to creators, necessary for innovators, and critical for US competitiveness" without ever trying to explain why. It doesn't matter: The only legal issue in all of that is the fair use bit, and as I noted earlier, that legal standard is in fact not clear-cut at all.

(Fair use is like pornography in that you always know it when you see it. For example, a movie critic playing clips from a movie while reviewing it is an example of fair use; playing the entire movie, discussing it afterward, and not paying its owners is not fair use. That's obvious. But the tricky bit is finding the line wher...


Thurrott © 2024 Thurrott LLC