Jan 9, 2024

AI Content Detectors Are Not Reliable

A follow-up note on why AI text detectors can fail, from perplexity and burstiness limits to false positives and bias against non-native writers.

AI content detectors are not reliable.

Some time ago I wrote a post on how to detect AI-written texts. I concluded that we should approach AI detectors with a degree of skepticism and an understanding of their limitations. Many texts, due to their formal style and repetitive nature, may never be reliably classified as human or AI-generated.

This issue has real-world implications. Many students are still unfairly accused of using ChatGPT to write their assignments, while others cheat and are never caught.

A real-world example

A Reddit user shared a story about how a history teacher accused him of using ChatGPT to write his essay. The accusation was based on the output of an AI content detector. Sadly, even when the student showed his Google Docs revision history to prove his innocence, the teacher still did not believe him.

Why AI content detection tools are not always reliable

Let's take a step back here. AI text detectors often rely on "perplexity" and "burstiness" metrics.

Perplexity measures how predictable a text is to a language model. Text that closely matches the model's training data tends to have low perplexity.
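To make the idea concrete, here is a minimal sketch of how perplexity can be computed once a language model has assigned a probability to each token. The function name and the probability lists are assumptions made for illustration; they are not output from any real model or detector.

```python
import math

def perplexity(token_probs):
    """Perplexity from the probability a model assigned to each token."""
    if not token_probs:
        raise ValueError("need at least one token probability")
    # Average negative log-likelihood per token.
    avg_nll = -sum(math.log(p) for p in token_probs) / len(token_probs)
    # Exponentiating gives the perplexity; lower means more predictable.
    return math.exp(avg_nll)

# Hypothetical per-token probabilities (made up, not real model output).
predictable = [0.9, 0.8, 0.9, 0.85]
surprising = [0.2, 0.05, 0.1, 0.3]

print(perplexity(predictable))  # close to 1: the model expected these tokens
print(perplexity(surprising))   # much larger: the model was often surprised
```

A detector built on this idea would treat low perplexity as a hint of AI authorship, which is exactly why formal, formulaic human writing can be misclassified.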

Burstiness measures the variation in sentence length and structure within a text. Human writing tends to show more variability, while AI-generated text can be more regular.
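There is no single agreed formula for burstiness. One simple proxy, assumed here purely for illustration, is the coefficient of variation of sentence lengths (standard deviation divided by the mean). The sentence splitter below is deliberately naive, and the function names are hypothetical.

```python
import re
import statistics

def sentence_lengths(text):
    # Naive split on ., ! or ? followed by whitespace (an assumption;
    # a real tool would likely use a proper sentence tokenizer).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    return [len(s.split()) for s in sentences]

def burstiness(text):
    """Coefficient of variation of sentence lengths, in words."""
    lengths = sentence_lengths(text)
    if len(lengths) < 2:
        return 0.0  # not enough sentences to measure variation
    return statistics.pstdev(lengths) / statistics.mean(lengths)

regular = "The cat sat down. The dog ran off. The bird flew up."
varied = ("No. The storm rolled in over the hills before anyone "
          "had noticed the sky change. We ran.")

print(burstiness(regular))  # every sentence has the same length
print(burstiness(varied))   # sentence lengths differ a lot
```

Under this proxy, a human who writes in short, even sentences scores just as "regular" as a machine, which hints at why the metric alone is a weak signal.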

Where detectors fail

False positives: Some studies report false positive rates as high as 50%, where human-written text is incorrectly classified as AI-generated.
False negatives: Some sources show that AI-generated text can be indistinguishable from human writing 20-30% of the time simply by optimizing the prompts.

Every reported result depends on the benchmark used and on the type of content being generated. A 2% error rate on one dataset does not mean that the AI detector is 98% accurate in general.
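The dependence on the benchmark can be shown with a small sketch that computes false positive and false negative rates from labeled examples. The two "benchmarks" below are tiny, made-up lists used only to demonstrate the arithmetic; they are not results from any real detector or study.

```python
def error_rates(labels, predictions):
    """Return (false_positive_rate, false_negative_rate).

    True means "AI-generated", False means "human-written".
    """
    pairs = list(zip(labels, predictions))
    false_pos = sum(1 for y, p in pairs if not y and p)
    false_neg = sum(1 for y, p in pairs if y and not p)
    humans = sum(1 for y in labels if not y)
    ais = sum(1 for y in labels if y)
    fpr = false_pos / humans if humans else 0.0
    fnr = false_neg / ais if ais else 0.0
    return fpr, fnr

# Hypothetical data: four human texts followed by four AI texts.
labels = [False] * 4 + [True] * 4

# Made-up detector verdicts on two imaginary benchmarks.
preds_benchmark_a = [False, False, False, False, True, True, True, True]
preds_benchmark_b = [True, True, False, False, True, True, True, False]

print(error_rates(labels, preds_benchmark_a))  # (0.0, 0.0)
print(error_rates(labels, preds_benchmark_b))  # (0.5, 0.25)
```

The same detector can look flawless on one set of texts and wrong half the time on another, so a single headline number says little without knowing what it was measured on.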

Moreover, these tools tend to be strongly biased against non-native writers. Past research highlights that AI detectors often falsely flag text by non-native English speakers as AI-generated, simply because of its distinctive linguistic patterns. This could lead to unfair outcomes if detectors are relied upon.

We have to understand that while these detectors use advanced algorithms, they are not perfect. They might work well on some documents and fail on others.

Original post: LinkedIn
