10.3 Using checklists inside prompts

Why checklists work with LLMs

LLMs are great at generating content, but they can skip steps, forget constraints, or over-focus on one part of a request. Checklists solve this by making steps explicit and making “compliance” observable.

A checklist turns “do a good job” into “do these 7 things.”

Checklists reduce omission errors

Most prompt failures are omission failures: missing edge cases, missing tests, missing error handling, missing constraints. Checklists systematically catch these.

What makes a checklist good

Good checklists are:

  • short: 5–12 items is a good default,
  • actionable: each item describes an action or an artifact,
  • ordered: the order matches how you work (plan → implement → verify),
  • scoped: focused on the task (don’t include irrelevant rules),
  • verifiable: “add a test” is verifiable; “be elegant” is not.

Make the model check itself

Ask the model to echo the checklist at the end of its reply, marking each item as satisfied or not and explaining how it was satisfied. This turns “follow instructions” into a visible contract.
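
As a sketch, this self-check contract can be appended to any task prompt programmatically. Both the template wording and the `with_self_check` helper below are illustrative, not a fixed API:

```python
# Illustrative self-check contract; adapt the wording and the marker
# convention ("[x]" / "[ ]") to your own prompt style.

SELF_CHECK_TEMPLATE = """\
Before you answer, verify your work against this checklist.
At the end of your reply, echo each item as:
  [x] <item> - <one line of evidence>
or
  [ ] <item> - <why it was skipped>

Checklist:
{items}
"""

def with_self_check(task_prompt: str, items: list[str]) -> str:
    """Append a self-check contract to a task prompt."""
    numbered = "\n".join(f"{i}) {item}" for i, item in enumerate(items, 1))
    return task_prompt + "\n\n" + SELF_CHECK_TEMPLATE.format(items=numbered)
```

The point of the explicit marker convention is that a human (or a script) can scan the reply for unmarked items at a glance.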

Where to use checklists (high leverage spots)

  • Before coding: assumptions, plan, files in scope, acceptance criteria.
  • During coding: “diff-only,” “no new deps,” “preserve behavior,” “add tests.”
  • After coding: verification commands, mapping to acceptance criteria, risk review.
  • Safety: no secrets, safe logging, refusal-aware UX patterns.
  • Structured output: validate JSON schema, handle invalid outputs.
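
The structured-output item can be a small, dependency-free gate in code. A minimal sketch, assuming the model was asked for a JSON object with string fields `status` and `summary` (both key names are illustrative):

```python
import json

# Illustrative schema: adapt the required keys and types to your task.
REQUIRED_KEYS = {"status": str, "summary": str}

def parse_or_none(raw: str):
    """Parse model output; return None on any schema violation."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for key, expected_type in REQUIRED_KEYS.items():
        if not isinstance(data.get(key), expected_type):
            return None
    return data
```

Returning `None` (rather than raising) keeps “invalid output” as an ordinary branch your caller must handle, which is exactly what the checklist item asks for.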

Reusable checklist templates

Template: safe code change checklist

Checklist:
1) Restate goal and acceptance criteria.
2) List assumptions and questions (stop if critical unknowns).
3) Propose a plan (3–6 steps) and wait for confirmation.
4) Implement only step 1, as diff-only changes.
5) Keep scope limited to the allowed files.
6) Do not add dependencies unless allowed.
7) Add/update tests for new behavior.
8) Provide verification commands.
9) Briefly note risks and what to watch in review.

Template: debugging checklist

Debugging checklist:
1) Confirm reproduction steps and expected vs actual.
2) List 3–5 ranked hypotheses.
3) For each hypothesis, propose a test that would confirm it and one that would rule it out.
4) Recommend the cheapest next test to run.
5) After evidence, propose the smallest fix + regression test.

Template: LLM-call reliability checklist

LLM call checklist:
1) Set timeouts.
2) Add capped retries with backoff + jitter for retryable errors.
3) Validate structured outputs (schema).
4) Handle safety blocks/refusals as a normal outcome.
5) Log request id, latency, outcome category (no raw secrets).
6) Consider caching/dedup for repeated inputs.
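
Items 1 and 2 can be sketched with the standard library alone. The `call_with_retries` helper below is illustrative; which exceptions count as retryable, and how the per-request timeout is enforced, depend on your client:

```python
import random
import time

# Illustrative: the retryable set depends on your LLM client library.
RETRYABLE = (TimeoutError, ConnectionError)

def call_with_retries(call, max_attempts: int = 3, base_delay: float = 0.5):
    """Run a zero-argument `call` with capped retries, backoff, and jitter.

    The client wrapped by `call` should enforce its own per-request
    timeout (item 1); this helper covers only item 2's retry policy.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return call()
        except RETRYABLE:
            if attempt == max_attempts:
                raise
            # Full jitter: sleep a random amount in [0, base * 2^(attempt-1)].
            time.sleep(random.uniform(0, base_delay * 2 ** (attempt - 1)))
```

Capping attempts and re-raising on the final failure keeps the error visible to the caller instead of silently swallowing it.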

Common pitfalls (and fixes)

Pitfall: checklists are too long

Fix: cut the list down to the items that prevent your most common failures. Long checklists get ignored.

Pitfall: “checkbox theater”

Fix: require evidence for each checklist item (“tests added in file X,” “run command Y”).
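
The evidence requirement can also be checked mechanically before a human reads the reply. A minimal sketch, assuming a convention where a satisfied item appears on a line starting with `[x]` followed by the item text and its evidence:

```python
def unmet_items(reply: str, items: list[str]) -> list[str]:
    """Return checklist items the reply never marks as satisfied.

    Assumes satisfied items appear on lines like:
      [x] Add a test - added in tests/test_login.py
    """
    satisfied = []
    for line in reply.splitlines():
        line = line.strip()
        if line.lower().startswith("[x]"):
            satisfied.append(line[3:].strip().lower())
    return [
        item for item in items
        if not any(item.lower() in s for s in satisfied)
    ]
```

A non-empty result is a cheap signal to reject the reply or ask the model to address the missing items, rather than trusting the checkboxes on faith.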

Pitfall: checklist order doesn’t match workflow

Fix: put planning and assumptions first, verification last. Order is what makes checklists usable.

Where to go next