OpenAI Built Text Watermarking Solution to Detect AI-Generated Content, But May Not Release It


OpenAI has developed a text watermarking solution to detect text created by ChatGPT, but the company isn’t ready to release it yet. According to a report from the Wall Street Journal, the tool has been ready for about a year and could help surface ChatGPT-generated text in essays or research papers with a high level of accuracy. However, the report says the company has been internally debating the potential impact of its anti-cheating tool for about two years.

The WSJ report details that OpenAI’s text watermarking technology works by slightly changing how ChatGPT selects the words or word fragments that come next in a sentence. These changes leave a trace in the generated text that only OpenAI’s tool can detect. According to internal documents seen by the WSJ, these watermarks are 99.9% effective in detecting ChatGPT-generated content when enough text has been created with it, though they could still be erased with the use of translation tools.
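OpenAI hasn’t disclosed how its watermark actually works, but the description above matches “green-list” watermarking schemes from published research: the model is nudged to prefer a pseudo-randomly chosen subset of the vocabulary at each step, and a detector counts how often the text lands in that subset. The sketch below is purely illustrative of that general idea, not OpenAI’s method; all function names and parameters are hypothetical.

```python
import hashlib
import random

def green_list(prev_token: str, vocab: list[str], fraction: float = 0.5) -> set[str]:
    # Hypothetical helper: deterministically partition the vocabulary using a
    # PRNG seeded from the previous token, so generator and detector agree.
    seed = int(hashlib.sha256(prev_token.encode()).hexdigest(), 16) % (2**32)
    rng = random.Random(seed)
    return set(rng.sample(vocab, int(len(vocab) * fraction)))

def detect(tokens: list[str], vocab: list[str], fraction: float = 0.5) -> float:
    # Score = fraction of tokens that fall in the "green" subset seeded by
    # their predecessor. Unwatermarked text should score near `fraction`;
    # text from a model biased toward green tokens scores much higher.
    hits = sum(
        1 for prev, tok in zip(tokens, tokens[1:])
        if tok in green_list(prev, vocab, fraction)
    )
    return hits / max(len(tokens) - 1, 1)
```

This also illustrates why the WSJ notes that enough text is needed for 99.9% accuracy: the detector is a statistical test, and short passages don’t give it enough tokens to separate a biased score from chance. It likewise shows why translation defeats it: the detector only recognizes traces in the original token sequence.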

OpenAI confirmed in a statement the existence of this text watermarking solution, adding that it’s still being evaluated for efficiency and externalities. “The text watermarking method we’re developing is technically promising but has important risks we’re weighing while we research alternatives,” the company’s spokesperson told the WSJ. “We believe the deliberate approach we’ve taken is necessary given the complexities involved and its likely impact on the broader ecosystem beyond OpenAI.”

Following the publication of the WSJ report, OpenAI also published an update to a blog post from May detailing its content provenance solutions, spotted by TechCrunch. In the update, the company explained that there are different ways to circumvent its watermarking technology via “global tampering.” These methods include “using translation systems, rewording with another generative model, or asking the model to insert a special character in between every word and then deleting that character.”

Last but not least, the company noted that the use of its anti-cheating tool could be discriminatory toward certain groups of ChatGPT users. “For example, it could stigmatize use of AI as a useful writing tool for non-native English speakers,” the company emphasized.

While OpenAI continues to evaluate risks related to its text watermarking technology, the company also continues to work on other tools to detect AI-generated images and videos. The DALL·E maker is currently prioritizing this work because it considers audiovisual content to present “higher levels of risk.”


Thurrott