OpenAI could watermark the text ChatGPT generates, but hasn't

Except that it doesn’t reliably “easily identify low-effort uses”, and once introduced it will immediately induce low-effort circumventions. If you turn up the detector’s gain, it starts generating false positives, which cause real harm to real people, as opposed to merely making some SEO optimizer’s job marginally more difficult.
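The gain/false-positive tradeoff can be sketched with a toy example (all numbers here are made up for illustration, not from any real detector): human and AI score distributions overlap, so any threshold trades false positives against false negatives.

```python
# Hypothetical detector scores (higher = "more likely AI"). The two
# populations overlap, which is what makes thresholding lossy.
human_scores = [0.12, 0.25, 0.31, 0.44, 0.52, 0.61, 0.70]  # human-written
ai_scores    = [0.38, 0.47, 0.55, 0.63, 0.72, 0.81, 0.90]  # AI-generated

def rates(threshold):
    """Return (false_positive_rate, false_negative_rate) at this threshold."""
    fp = sum(s >= threshold for s in human_scores) / len(human_scores)
    fn = sum(s < threshold for s in ai_scores) / len(ai_scores)
    return fp, fn

# Turning up the "gain" (lowering the threshold) catches more AI text but
# starts flagging humans; raising it spares humans but misses more AI output.
for t in (0.3, 0.5, 0.7):
    fp, fn = rates(t)
    print(f"threshold={t}: false positives {fp:.0%}, false negatives {fn:.0%}")
```

With these toy numbers, no threshold gets both error rates to zero; that is the structural problem, independent of how good the scoring model is.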

Language is simply too malleable a medium for these kinds of countermeasures to work. Ever.

OpenAI made the right decision by not deluding the public into believing they could work. Other companies have chosen another route, and the results have been entirely as expected.

Here is but one experiment on the topic:

> No public AI text detector we tested scored better than random chance. Results were very unstable, with small changes to input text flipping detections in both directions. LLMs also failed to reliably detect LLM output in our tests.

Now, these are external tools, not internal watermarking, but the lesson is the same: this isn’t a solvable problem, and it never will be.
