1.5 Hallucinations: the predictable failure mode
Overview and links for this section of the guide.
On this page
- What “hallucination” means (in practice)
- Why hallucinations happen
- A practical taxonomy (types of hallucinations)
- Hallucination risk by task type
- How to detect hallucinations early
- How to prevent hallucinations (workflow & prompt tactics)
- The verification ladder (what counts as evidence)
- A hallucination recovery loop
- Using tools to reduce hallucinations
- Structured output as an anti-hallucination tactic
- Copy-paste templates
- Where to go next
What “hallucination” means (in practice)
In the LLM world, hallucination doesn’t mean “random nonsense.” It usually means the model produced something that sounds plausible but is not grounded in reality or in the context you provided.
In vibe coding, hallucinations show up as:
- APIs, settings, or commands that don’t exist.
- Code that compiles in theory but fails in your environment.
- Confident explanations for behavior that isn’t actually happening.
- “It should work” fixes that miss the real root cause.
Hallucination is a predictable failure mode of next-token generation. Your goal is not to “avoid it forever.” Your goal is to build a loop where hallucinations are caught quickly and cheaply.
Why hallucinations happen
Hallucinations are the natural result of how LLMs work. Common causes include:
1) No built-in truth-checking
By default, the model is generating a plausible continuation of text. Unless it has tool access and you ask it to verify, it cannot “check the world.”
2) Ambiguity + pressure to be helpful
If you ask a vague question (“Why is my API failing?”) the model often fills in missing specifics with plausible assumptions. The output can look confident because confident text is common in tutorials and docs.
3) Missing or polluted context
If the model can’t see the relevant file, error output, or constraints, it will guess. If the thread contains conflicting old instructions, it may pick the wrong ones.
4) Variance settings and long outputs
Higher randomness and longer completions increase the chance of subtle errors. Even at low randomness, a model can hallucinate if the prompt doesn’t constrain it.
Most hallucinations aren’t obvious. They’re small: a wrong flag name, a missing edge case, a subtle off-by-one, a misremembered library behavior.
A practical taxonomy (types of hallucinations)
Not all hallucinations are equal. Classifying them helps you choose the right fix.
1) Factual hallucinations (invented facts)
- Example: “AI Studio has a button called X” when it doesn’t.
- Example: “This library exposes foo()” when it doesn’t.
- Fix: verify in docs/UI, or run a tiny reproduction. Prefer sources or tools.
2) API hallucinations (nonexistent methods/flags)
- Example: suggesting a CLI flag that isn’t supported.
- Example: using an SDK call that changed or was removed.
- Fix: consult current docs, run --help, or inspect installed package versions.
3) Context hallucinations (misreading your inputs)
- Example: it claims a function exists in your file, but it doesn’t.
- Example: it references a variable name from an earlier draft.
- Fix: tighten the working set; provide the exact snippet; restart with a clean state summary if needed.
4) Reasoning hallucinations (plausible logic, wrong conclusion)
- Example: it explains a bug with a nice story that doesn’t match the logs.
- Example: it proposes a fix that “should work” but fails a test.
- Fix: force intermediate artifacts (tests, minimal reproduction, hypothesis list) and verify step-by-step.
5) Tool hallucinations (claiming actions or results)
- Example: “I ran the tests and they pass” when no tool run occurred.
- Example: it invents file contents.
- Fix: require the model to quote tool outputs verbatim, or you run the commands and paste results back.
6) Format hallucinations (structure drifts)
- Example: invalid JSON, missing fields, wrong enum values.
- Fix: tighten schema constraints, lower randomness, and validate outputs programmatically.
“The model hallucinated” is not actionable. “This is an API hallucination; we need to check docs and installed versions” is actionable.
Hallucination risk by task type
Some tasks are intrinsically higher risk than others. Use this to choose your approach.
Lower risk (still verify)
- Generating scaffolding/boilerplate inside a known framework.
- Refactoring with strong tests and clear constraints.
- Writing docs that you’ll cross-check against code.
Medium risk (add constraints + checks)
- Designing APIs or schemas from scratch.
- Debugging from partial logs.
- Performance advice without profiling data.
High risk (treat as hypotheses)
- Security advice without threat model + code context.
- Anything involving money, auth, permissions, or data deletion.
- Exact “what button to click” guidance in a rapidly changing UI.
- Claims about external facts without sources.
LLMs are especially likely to hallucinate UI steps and labels. Prefer concept-driven guidance and verify by exploring the UI directly.
How to detect hallucinations early
Early detection saves time. Use these heuristics:
- Over-specific confidence: exact flags, exact UI labels, exact error messages—without evidence.
- Missing uncertainty: it doesn’t ask clarifying questions even though your prompt is ambiguous.
- No verification path: it proposes a fix but doesn’t specify what to run to confirm.
- Hand-wavy steps: “just configure X” without telling you where/how or what success looks like.
- Version blindness: it doesn’t ask which version of a library/tool you’re using.
“What would convince you this is correct?” is a high-leverage question. It forces the model to propose a verification plan instead of a story.
How to prevent hallucinations (workflow & prompt tactics)
Prevention is mostly about changing the workflow so the model has fewer opportunities to guess.
1) Write a tighter spec (even for tiny tasks)
- Inputs: what data comes in?
- Outputs: what shape comes out?
- Constraints: what must not change?
- Acceptance criteria: how will you verify it works?
Ambiguity creates hallucinations. Specs remove ambiguity.
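For a sense of scale, a spec for a small task can be four lines. The file names and fields below are invented for illustration:
Inputs: a CSV export with columns email and signup_date (ISO 8601).
Outputs: a JSON array of objects with email and days_since_signup.
Constraints: Python 3.11, no new dependencies, do not touch the existing import script.
Acceptance criteria: running the script on sample.csv prints valid JSON, and the two example rows in the prompt convert correctly.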
2) Force small steps (diffs, not rewrites)
Hallucinations compound with size. Small diffs are easier to review and easier to validate.
3) Require clarifying questions and explicit assumptions
Tell the model that guessing is not allowed:
- “If anything is unclear, ask questions before writing code.”
- “List assumptions; if you can’t verify, mark it as unknown.”
4) Use examples as mini-tests
Examples turn a fuzzy request into something measurable. They reduce hallucinations by anchoring output behavior.
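One low-effort way to do this is to replay the examples from your prompt as executable checks against whatever the model returns. A minimal sketch, with an invented slugify task standing in for your request (the stub implementation only exists to keep the sketch runnable):
# Hypothetical task: you asked the model to write `slugify`.
def slugify(title: str) -> str:
    # Stand-in for the model-generated implementation.
    return "-".join(title.lower().split())

# The examples you gave in the prompt, replayed as checks.
EXAMPLES = {
    "Hello World": "hello-world",
    "  Spaces  everywhere ": "spaces-everywhere",
}

for raw, expected in EXAMPLES.items():
    actual = slugify(raw)
    assert actual == expected, f"slugify({raw!r}) = {actual!r}, expected {expected!r}"
print("all examples pass")
If an example fails, you have concrete evidence to paste back instead of a vague “it doesn’t work.”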
5) Use conservative generation settings for correctness
For diffs, debugging, and structured output: lower randomness, fewer candidates. Use higher diversity only when you’re exploring options.
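As a sketch of what “conservative settings” means in practice, assuming your SDK exposes the usual knobs (the generate helper below is hypothetical; parameter names vary between providers):
# Hypothetical helper standing in for your model SDK; only the knobs matter here.
def generate(prompt, *, temperature, candidate_count, max_output_tokens):
    # Placeholder for the real SDK call.
    return "<model output>"

# Correctness-critical work (diffs, debugging, structured output): keep it conservative.
patch = generate(
    "Produce a minimal diff that fixes the failing test.",
    temperature=0.1,        # low randomness
    candidate_count=1,      # one careful answer, not a spread of guesses
    max_output_tokens=800,  # short, reviewable output
)

# Exploration (naming, design options): loosen the settings deliberately.
ideas = generate(
    "Suggest three ways to structure this module.",
    temperature=0.9,
    candidate_count=3,
    max_output_tokens=800,
)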
Clear constraints, small diffs, explicit verification. “Prompt tricks” are secondary.
The verification ladder (what counts as evidence)
When the model makes a claim, decide what evidence you need. A good default ladder:
- Executable evidence: tests pass, the app runs, the bug reproduces before the fix and stops after, a script prints the expected output.
- Observable evidence: logs, traces, metrics, or screenshots that match the claim.
- Code evidence: the referenced function exists; the diff matches the described behavior.
- Documentation evidence: official docs or release notes that confirm the API/flag/behavior.
- Model evidence: explanations and reasoning (helpful, but the weakest evidence).
Explanations are valuable for understanding, but they don’t prove correctness. Always climb at least one rung above the model’s own explanation.
A hallucination recovery loop
When you suspect a hallucination, use this loop to recover fast:
- Freeze the scope: restate the goal and constraints in one paragraph.
- Ask for hypotheses: “List 3 plausible causes ranked by likelihood.”
- Ask for tests: “For each hypothesis, what single check would confirm or falsify it?”
- Run one check: you or tools produce real evidence.
- Apply the smallest diff: fix only what the evidence supports.
- Verify again: run tests, reproduce the scenario, lock the behavior.
Require the model to label each hypothesis with a confidence level and what evidence would change its mind.
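It can help to track hypotheses as data rather than prose, so each one maps to exactly one check. A minimal sketch, where the fields mirror the loop above and the example bug is invented for illustration:
from dataclasses import dataclass

@dataclass
class Hypothesis:
    description: str        # the model's proposed cause
    confidence: str         # "low" | "medium" | "high", as labeled by the model
    check: str              # the single command or test that confirms/falsifies it
    would_change_mind: str  # the evidence that would rule this hypothesis out

hypotheses = [
    Hypothesis(
        description="The request fails because the client retries with a stale token.",
        confidence="medium",
        check="re-run the failing request with debug logging and inspect the auth header",
        would_change_mind="logs show a fresh token on every retry",
    ),
    Hypothesis(
        description="The failure is a server-side limit, not a client bug.",
        confidence="low",
        check="call the same endpoint with curl and compare behavior",
        would_change_mind="curl succeeds with the same payload",
    ),
]

# Work the list top-down: run exactly one check, record the evidence, then decide.
for h in sorted(hypotheses, key=lambda h: ["high", "medium", "low"].index(h.confidence)):
    print(f"[{h.confidence}] {h.description}\n  check: {h.check}")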
Using tools to reduce hallucinations
Tools turn “plausible” into “provable.” Even simple tools help:
- Search: verify whether a function/flag/file actually exists.
- Build/run: compile or run a minimal reproduction.
- Tests: lock correctness and prevent regressions.
- Linters/formatters: catch syntactic and style issues early.
A safe pattern
- Model proposes a plan + minimal diff.
- You run verification commands (or allow tool runs); see the sketch after this list.
- You paste exact outputs back.
- Model refines based on evidence.
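A minimal sketch of the “you run it, then paste exact outputs back” step. The test command is only an example; substitute whatever your project uses:
import subprocess

# Run the verification command yourself, with a timeout, and capture everything.
result = subprocess.run(
    ["pytest", "-x", "tests/test_checkout.py"],  # example command; use your own
    capture_output=True,
    text=True,
    timeout=120,
)

# Paste this block back to the model verbatim, instead of summarizing it.
evidence = (
    f"$ pytest -x tests/test_checkout.py\n"
    f"exit code: {result.returncode}\n"
    f"--- stdout ---\n{result.stdout}\n"
    f"--- stderr ---\n{result.stderr}"
)
print(evidence)
Verbatim output matters: summaries are where hallucinated details sneak back in.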
Give tools least privilege. Add budgets/timeouts. Require approval for destructive actions. “Tools reduce hallucinations” only works if tool use is safe and observable.
Structured output as an anti-hallucination tactic
Hallucinations often show up as “format drift” and hidden assumptions. Structured output forces precision:
- Schemas define what is allowed.
- Validation catches deviations immediately.
- Explicit fields can separate facts from guesses.
Example: force uncertainty into the structure
{
  "claim": "string",
  "confidence": "low|medium|high",
  "evidence_needed": ["string"],
  "assumptions": ["string"],
  "next_check": "string"
}
This makes it harder for the model to “hide” uncertainty in a confident paragraph.
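A minimal sketch of enforcing that structure on the way in, using only the standard library (swap in a schema-validation library if you already use one; the claim in the example is invented):
import json

ALLOWED_CONFIDENCE = {"low", "medium", "high"}
REQUIRED_FIELDS = {"claim", "confidence", "evidence_needed", "assumptions", "next_check"}

def parse_claim(raw: str) -> dict:
    # Reject anything that doesn't match the schema instead of trusting the prose.
    data = json.loads(raw)  # raises on invalid JSON: the first line of defense
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["confidence"] not in ALLOWED_CONFIDENCE:
        raise ValueError(f"invalid confidence: {data['confidence']!r}")
    return data

# Example: a low-confidence claim gets routed to verification, not accepted as fact.
claim = parse_claim(json.dumps({
    "claim": "The CLI supports a retry-all flag.",
    "confidence": "low",
    "evidence_needed": ["CLI --help output"],
    "assumptions": ["installed version matches the docs the model remembers"],
    "next_check": "run the CLI with --help and search for the flag",
}))
if claim["confidence"] != "high":
    print("verify first:", claim["next_check"])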
Structured output becomes critical when you build real apps. It’s how you stop “parsing vibes” and start enforcing contracts.
Copy-paste templates
Template: force assumptions and unknowns
Before answering:
1) List assumptions you are making.
2) List what you are uncertain about.
3) Ask the minimum set of clarifying questions.
Do not invent APIs, UI labels, or file contents.
Template: hypothesis-driven debugging
Given this error + context:
1) Provide 3 hypotheses ranked by likelihood.
2) For each hypothesis, provide one fast check to confirm/falsify it.
3) Propose the smallest diff to fix the most likely confirmed cause.
4) Tell me exactly what to run to verify.
Template: diff-only implementation
Implement this as a minimal diff.
Constraints:
- Do not rewrite unrelated code.
- If you need more context, ask for specific files/functions.
Deliverables:
1) Patch/diff
2) Verification steps
Template: “answer only from sources” mode
Answer using ONLY the information in the provided sources.
If the sources do not contain the answer, say "not enough information."
Do not guess or fill gaps.
Make these templates part of your “house rules” so you don’t have to retype them every session.