
AI for Normal People Part 5: When to Trust AI Answers

How to tell when ChatGPT and other AI tools are reliable — and the simple checks that catch their hallucinations, with examples from 2026.

AI is brilliant at some things and confidently, fluently wrong about others. The tricky part — and the reason smart people keep getting caught out — is that the bad answers sound exactly like the good ones. Same crisp tone. Same neat structure. Same air of quiet authority. So the real skill isn't using ChatGPT. It's knowing when to trust it and when to double-check.

This is Part 5 in the AI for Normal People series. Earlier parts covered what AI actually is, how to use ChatGPT, whether it's safe, and how to spot AI-generated content. This one is about you using AI — and not getting burned by it.

Why AI sounds so sure when it's wrong

It's not lying — it just doesn't know what it doesn't know

Large language models are extremely good at one specific thing: predicting what word comes next, given everything that came before. That's it. They are not databases. They are not search engines. They are pattern-matchers trained on a vast amount of text, generating the statistically plausible continuation.

That works beautifully when the right continuation is well-represented in the training data — common facts, well-trodden explanations, standard advice. It falls apart at the edges: specific numbers, named people in narrow fields, anything that happened after the model finished training, anything where the surface form looks familiar but the underlying details are subtly different from what the model has seen before.

The result is what people call hallucinations: outputs that sound right, look right, are structured right, and are wrong. The model isn't trying to deceive you. It just generated a fluent continuation, the same way it always does, and there was no internal alarm to say this bit is made up.

The three places AI goes wrong

If you know what they look like, you can spot them in real time

1. Made-up specifics. Numbers, dates, statute names, study citations, quotations, URLs. The fingerprint here is precision — a vague summary is usually fine, but the moment the AI says 'a 2023 Harvard study found that 73% of users…', your scepticism should kick in. The study may not exist; the percentage may be invented; the year may be off. The fluency masks all of it.

2. Stale knowledge. Every model has a training cut-off — a date after which it knows nothing. If you ask about a product launched last week, a person who became famous last month, or a law that changed last year, the model will either say it doesn't know or make something plausible up. Both responses look the same on the page. ChatGPT's web-search mode and Perplexity reduce this risk by fetching live results, but if web search isn't running, you're back in the cut-off zone.

3. Confident analysis of incomplete information. You ask 'should I sell my house now?' and you get a thoughtful, structured answer that sounds wise — but the AI doesn't know your tax position, the local market, or your mortgage. It will still answer confidently because that's what its training rewarded. The structure of the answer is a tell: it looks like advice, but the inputs were never enough to actually advise you.

Five quick verification checks

Most of these take under 30 seconds

Specific numbers and dates → spot-check one.

If the AI gives you a list of stats, open a new tab and verify the first one. If it's wrong, assume the rest are too — and re-prompt with 'cite sources I can verify'.

Direct quotes → search the exact words.

Paste 8–10 words from the quote into Google in quotation marks. If nothing comes back, the quote is fabricated.

URLs → just click them.

AI invents URLs that follow the right pattern but go nowhere. A dead link is a tell that the rest of the section may have been confabulated too.

Date-sensitive claims → check the model's cut-off.

Most models will tell you their training cut-off if you ask. If your question concerns events after that date, the answer is at best a guess unless web search is involved.

Legal, medical, financial → ask a human.

AI is fine for explaining concepts (what is capital gains tax?, what's a beta blocker?). It is not fine for personal advice. Use it to prepare questions for a professional, not to replace one.

Where AI is reliably good (and you don't need to verify)

The whole point is to save time — not to second-guess everything

The five checks above can read as if AI is too risky to use casually. It isn't. There is a large class of tasks where modern models perform reliably well, and demanding citations and cross-checks for these is just friction.

Rephrasing and summarising your own input. Paste in an email you wrote, ask for a clearer or shorter version. The AI isn't adding facts; it's reshaping what's already there. Low risk.

Format conversions. Markdown to HTML, bullet points to paragraphs, a paragraph into a table. The structure is the task; the content is just being reorganised.

Brainstorming options. Ask for ten possible names for a project, ten ways to phrase a difficult message, ten dishes you could make with what's in the fridge. The point is to surface ideas, not to find the One True Answer — your judgement filters them afterwards.

Explaining well-known concepts in plain English. What's a mortgage offset account? What does mitochondrial DNA actually do? Why is RAM different from storage? Stable subject matter, well-represented in training data. Reliable.

First drafts of anything. You're going to edit it anyway. The AI giving you a flawed 300-word starting point that you then rewrite is faster than staring at a blank page. Errors get caught in the editing pass.

Coding for established APIs and patterns. AI is very good at standard code — common framework patterns, regex, SQL, shell one-liners. Verify by running the code, not by checking it line-by-line against documentation. If it works, it works.
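To make 'verify by running it' concrete: suppose the AI hands you a regex for pulling ISO-format dates out of text. (A made-up example — the pattern and sample text below are illustrative assumptions, not something from this post.) Instead of reading the pattern character by character, run it against input where you already know the right answer:

```python
import re

# Hypothetical AI-generated regex: match ISO dates like 2026-04-05.
pattern = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")

# Verify by execution: feed it a string where you know what should match.
sample = "Meeting moved from 2026-04-05 to 2026-04-12; invoice #2026-1 unchanged."
matches = pattern.findall(sample)

# Both real dates should match; the invoice number should not.
assert matches == ["2026-04-05", "2026-04-12"], matches
print("regex behaves as expected on known inputs")
```

If the assertion fails, you've caught the bug in seconds — far faster than comparing the pattern against a regex reference by eye.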

Two real-world examples

Example 1: The confidently wrong tax answer

Someone asks ChatGPT: 'What's the current ISA allowance for the 2026/27 tax year in the UK?' The AI replies with a clear, confident figure and a one-sentence explanation. The figure may be correct, may be last year's number, or may be entirely invented, depending on how recent the model's training is and whether web search is on. Whichever it is, this is a high-stakes specific fact — with tax consequences if you act on it.

The verification is one search: type 'ISA allowance 2026/27' into the gov.uk search. The official page is two clicks away. Total time: 20 seconds. Trust the AI for the concept of how ISAs work; trust gov.uk for the number.

Example 2: The genuinely useful summary

You paste a 600-word internal email into ChatGPT and ask for a three-bullet summary for someone who didn't have time to read it. The AI returns the bullets. There's nothing here to verify — the inputs were yours, the outputs are a compressed reshape of those inputs. Read the bullets, send if they capture the gist, edit if they don't. Total verification time: zero.

The difference between the two examples is whether the AI is generating facts or reorganising your own inputs. Generating facts → verify. Reorganising → just use it.

One simple decision rule

If you remember nothing else from this post, remember this

Ask one question before relying on an AI answer: is it generating facts, or reorganising material I gave it? Generating facts — numbers, names, dates, citations — verify one thing before you act on the answer. Reorganising your own input — summarising, rephrasing, reformatting — just use it.

Frequently asked questions

Do newer models hallucinate less?
Generally yes. Each generation of GPT, Claude, and Gemini has cut hallucination rates compared to the previous one, and tool-using modes (web search, code execution, retrieval) reduce them further still by grounding answers in actual sources. But none of the models are at zero. A 'much rarer than before' hallucination is still a hallucination, and you can't tell which output is the rare one without checking.
Should I just turn web search on and forget about all this?
Web search helps a lot for date-sensitive questions and reduces fabricated citations, because the AI is now summarising real pages it just fetched. But it introduces its own failure mode — the AI can summarise a low-quality web page and present that summary with the same confident tone. Garbage in, confident summary out. Web search is a strong upgrade, not a magic fix.
What about coding — do I need to verify everything the AI writes?
Verify by execution, not by reading. Run the code, run the tests, see if it does what you asked. AI is very good at producing syntactically correct, plausible-looking code that has a subtle bug. A failing test catches that instantly; line-by-line code review often doesn't. If you don't have tests, write a quick one before trusting the output.
Can I ask ChatGPT to fact-check its own answers?
Limited usefulness. The model isn't checking against an external truth — it's just generating another plausible response. It will sometimes catch its own errors (especially obvious ones), and asking 'are you sure?' often produces a more cautious second pass. But trusting an AI to verify itself is fundamentally a closed loop. Real verification means leaving the chat window and checking somewhere else.
Is there a way to know when the model is uncertain?
Some models will explicitly hedge ('I'm not sure, but…', 'as of my training data…') and those are honest signals — pay attention to them. But many will state uncertain things confidently because that's the trained behaviour. Prompting the model to 'rate your confidence in this answer 1–10 and explain' helps a little; pairing the AI's answer with one quick external check helps much more.
How does this all link back to Part 4 of this series?
Part 4 (How to Spot AI Content) was about detecting AI-generated text that someone else has produced and is showing you. This post (Part 5) is the inverse: you are the one using the AI, and you want to avoid passing along its mistakes. Both posts ultimately hinge on the same skill — noticing the difference between fluency and accuracy. Once you can do that, AI tools become much more useful and much less dangerous.

Where the AI for Normal People series goes from here

The first five parts cover the foundations: what AI is, how to use it, whether it's safe, how to spot AI content, and how to verify AI output. From here the series gets more practical — specific workflows, comparison of tools, and the everyday use cases that actually save time.

If you missed any of the earlier parts: Part 1 — What Is AI, Actually? · Part 2 — Getting Started with ChatGPT · Part 3 — Is AI Safe? An Honest, Non-Scary Guide · Part 4 — How to Spot AI Content.

More AI explainers, no hype

Practical guides for using AI day-to-day without falling for it.

Browse AI for Normal People