How AI Detection Actually Works (Plain English Explanation)
AI detectors aren't magic. Here's a plain English explanation of what they actually measure, why they sometimes get it wrong, and what their scores really mean.
AI detection tools analyse text and tell you whether it was likely written by a human or by AI. But how do they actually work? And why do they sometimes get it wrong?
Here's a plain English explanation — no computer science degree required.
The basic idea
AI-generated text has statistical properties that are different from human-written text. Detectors measure these properties and calculate a probability.
Think of it like handwriting analysis, but for writing style. AI has a "handwriting" — consistent patterns in how it constructs sentences, chooses words, and structures arguments. Detectors look for this handwriting.
What do AI detectors actually measure?
Sentence uniformity
Humans write sentences of wildly varying lengths. A short punchy statement followed by a long, winding explanation followed by a fragment. AI tends to write sentences of more consistent length — typically 15-25 words, paragraph after paragraph.
Detectors measure the variation in sentence lengths across your text. Low variation = more likely AI.
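To make this concrete, here's a toy Python sketch of the measurement. A real detector uses proper sentence tokenisation and calibrated thresholds, but the core idea — low spread in sentence lengths suggests AI — is the same:

```python
import re
import statistics

def sentence_length_variation(text: str) -> float:
    """Standard deviation of sentence lengths, in words.

    Naive split on ., !, ? — real detectors use proper tokenisers.
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(s.split()) for s in sentences]
    if len(lengths) < 2:
        return 0.0
    return statistics.stdev(lengths)

human = "Short. Then a much longer, winding sentence that keeps going for a while. Tiny."
ai = "The topic is important today. It affects many areas of life. We should consider it."

# The human-like sample varies wildly (1, 12, 1 words); the AI-like one barely varies.
print(sentence_length_variation(human) > sentence_length_variation(ai))  # → True
```

The sample texts and the bare standard-deviation metric are illustrative; production tools combine this signal with many others rather than thresholding it directly.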
Vocabulary predictability
Given the start of a sentence, how predictable is the next word? AI models generate text by predicting the most likely next word (or one of the most likely). This makes AI text more predictable at the word level than human writing.
Humans use unexpected word choices, slang, personal expressions, and unusual combinations. AI sticks to the statistically safest option.
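A tiny bigram model can show what "predictable at the word level" means. Real detectors use large language models to estimate per-word probability (often reported as perplexity); this toy stand-in just checks how often each word is the single most likely continuation of the previous one, based on counts from a reference corpus:

```python
from collections import Counter, defaultdict

def predictability(text: str, corpus: str) -> float:
    """Fraction of words in `text` that are the most likely next word
    after their predecessor, per bigram counts from `corpus`.
    Higher = more predictable = more AI-like. A toy stand-in for the
    language-model probability estimates real detectors use.
    """
    bigrams = defaultdict(Counter)
    words = corpus.lower().split()
    for a, b in zip(words, words[1:]):
        bigrams[a][b] += 1

    tokens = text.lower().split()
    hits = total = 0
    for a, b in zip(tokens, tokens[1:]):
        if a in bigrams:
            total += 1
            if bigrams[a].most_common(1)[0][0] == b:
                hits += 1
    return hits / total if total else 0.0

corpus = "the cat sat on the mat and the cat slept"
print(predictability("the cat", corpus))  # → 1.0 (matches the corpus's top choice)
```

An unexpected word choice ("the mat" where the corpus strongly favours "the cat") would score 0.0 here — that's the human fingerprint this measure is looking for.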
Structural patterns
AI text follows predictable structural patterns:
- Introduction that states the topic
- Body paragraphs with topic sentences
- Transition words at paragraph boundaries
- Conclusion that summarises the main points
Human writing is more varied. We digress, circle back, start with a story, end abruptly, or structure our arguments in unexpected ways.
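One crude way to quantify the structural pattern is to check how many paragraphs open with a stock transition word. The word list below is illustrative — real detectors learn these patterns from data rather than hard-coding them:

```python
# Illustrative list of stock transitions; real detectors learn these from data.
TRANSITIONS = ("however", "furthermore", "moreover", "in conclusion", "additionally")

def transition_ratio(text: str) -> float:
    """Fraction of paragraphs opening with a stock transition word —
    a crude proxy for the formulaic structure described above."""
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    if not paragraphs:
        return 0.0
    openers = sum(p.lower().startswith(TRANSITIONS) for p in paragraphs)
    return openers / len(paragraphs)
```

A piece that starts with a story, digresses, and ends abruptly scores near zero; formulaic text with a transition at every paragraph boundary scores near one.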
Phrase frequency
Certain phrases appear much more often in AI text than in human text. "It is important to note", "in today's rapidly evolving landscape", "delve into", "multifaceted" — these are statistical fingerprints of AI generation.
Detectors maintain databases of these phrases and score text based on how many appear.
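In its simplest form, that database lookup is just substring matching. The phrase list here is a tiny illustrative sample — real tools maintain much larger, data-derived lists and weight phrases by how strongly they skew towards AI text:

```python
# Illustrative sample; real detectors use much larger, data-derived phrase lists.
AI_PHRASES = [
    "it is important to note",
    "rapidly evolving landscape",
    "delve into",
    "multifaceted",
]

def phrase_hits(text: str) -> list[str]:
    """Return the AI-typical phrases that appear in `text`."""
    lower = text.lower()
    return [p for p in AI_PHRASES if p in lower]

sample = "It is important to note that we must delve into this multifaceted issue."
print(phrase_hits(sample))  # → ['it is important to note', 'delve into', 'multifaceted']
```

Matching returns the phrases themselves, not just a count — which is exactly what makes passage-level explanations (covered below) possible.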
What are the two types of AI detection?
Statistical analysis
This approach measures the mathematical properties of text — sentence lengths, vocabulary diversity, phrase patterns. It's fast, cheap, and doesn't require AI to run. It catches the most obvious AI text but can miss well-edited output.
AI-based analysis
This approach uses an AI model (like Claude or GPT) to read the text and assess whether it was AI-generated. The idea is that AI is good at recognising its own kind of writing. This catches subtler patterns that statistical analysis misses, and can identify which specific passages look AI-generated and explain why.
The most effective detection combines both approaches — statistical analysis for the obvious patterns, AI analysis for the subtle ones.
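A minimal sketch of combining the two signals, assuming each has already been normalised to a 0-to-1 likelihood. The equal weighting is an assumption for illustration; real tools tune their weights and use richer combination rules than a simple average:

```python
def combined_score(statistical: float, model_based: float,
                   weight: float = 0.5) -> float:
    """Blend a statistical score and an AI-model score (both 0..1) into
    one likelihood. The 50/50 default is an assumption; real tools tune
    weights and use richer combination rules."""
    return weight * statistical + (1 - weight) * model_based
```

For example, obvious statistical patterns (0.8) paired with a more equivocal model read (0.4) would blend to 0.6 — a "worth a closer look" score rather than a verdict.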
Why do AI detectors get it wrong?
False positives (flagging human text as AI)
This happens when human writing shares statistical properties with AI text. Formal academic writing, ESL writing, and heavily edited writing all have lower variation and more predictable structure — the same features detectors look for in AI text.
False negatives (missing AI text)
This happens when AI text has been edited to remove its statistical fingerprints. Adding personal anecdotes, varying sentence lengths, and removing formulaic phrases can make AI text look human to a detector.
As AI models improve, their output naturally becomes harder to detect because it more closely mimics human writing variation.
What do AI detection scores actually mean?
When a detector says "85% AI likelihood", it means: "Based on the statistical patterns in this text, there is an 85% probability that it was generated by AI."
It does NOT mean:
- "85% of this text was written by AI"
- "We are 85% certain this student cheated"
- "This text is 85% similar to known AI output"
It's a probability based on pattern matching, not a measurement of authorship.
Why do flagged passages matter more than scores?
A score of 78% tells you very little. But seeing that specific passages were flagged — "This sentence uses formulaic hedging language typical of AI output" — tells you something actionable.
If the flagged passages are just standard academic writing, the score is probably a false positive. If they're suspiciously polished with no personal voice and generic examples, there's something worth investigating.
This is why Is It AI? shows flagged passages with explanations rather than just a score. Knowing why text was flagged is more useful than knowing how much was flagged.
The bottom line
AI detection isn't magic — it's pattern matching. It works well for unedited AI text and poorly for edited, mixed, or formally written text. Scores are probabilities, not verdicts. And flagged passages with explanations are more useful than percentages.
Try Is It AI? free — see the patterns in any text, explained in plain English.