8. A Gentle Intro to Debugging With AI

Overview and links for this section of the guide.

What this section is for

Debugging is where vibe coding becomes either a superpower or a time sink. The model is great at suggesting fixes—but only if you feed it the right evidence and hold it to a disciplined loop.

This section teaches you how to:

  • paste errors so the model can actually diagnose them,
  • debug systematically (hypotheses + tests),
  • recover when the model goes off the rails,
  • add logging/observability without leaking secrets,
  • fix bugs without creating regressions.

The goal

Turn debugging into a short, repeatable procedure. When it’s procedural, it stays fast—even when problems are hard.

The debugging mindset that actually works with AI

Debugging with an LLM is not “ask for a fix and hope.” It’s closer to running a diagnostic assistant:

  • Evidence-driven: errors, logs, failing tests, reproduction steps.
  • Hypothesis-based: propose causes, then run checks to confirm/deny.
  • Small diffs: the smallest change that proves a hypothesis.
  • Regression discipline: lock the fix with a test.

Don’t argue with the model

If the model is wrong, don’t debate. Bring evidence: “Here is the failing output. Here is the code path. Here is what we expected.”

The core loop: reproduce → isolate → fix → lock

  1. Reproduce: make the failure happen reliably with a single command or test.
  2. Isolate: shrink scope until you can point at a specific module, line, or input case.
  3. Fix: implement the smallest change that resolves the failure.
  4. Lock: add/keep a regression test so it can’t come back silently.

This loop is the same whether you’re debugging AI outputs, parsers, web servers, or build pipelines.
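
To make the loop concrete, here is a minimal sketch in Python, assuming pytest; the parse_price function and its bug are hypothetical. The point is that “reproduce” and “lock” can be the same artifact: a test that fails before the fix and passes after.

    # test_parse_price_regression.py
    # Reproduce: encode the exact failing input and the expected output as a test.
    # Lock: keep this test in the suite so the bug can't come back silently.
    from decimal import Decimal

    from pricing import parse_price  # hypothetical module under debug


    def test_parse_price_handles_thousands_separator():
        # The reported failure: "1,299.00" was parsed as Decimal("1").
        assert parse_price("1,299.00") == Decimal("1299.00")


    def test_parse_price_simple_case_unchanged():
        # Regression guard: the fix must not break the already-working case.
        assert parse_price("19.99") == Decimal("19.99")

Run it with pytest: it should fail before the fix and pass after, which is exactly what “lock” means.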

What to give the model (debugging inputs)

Debugging quality is mostly about what you paste. High-value inputs:

  • the exact command you ran,
  • full error output (stdout/stderr),
  • what you expected vs what happened,
  • runtime/version info,
  • the smallest relevant code slice (files/functions),
  • any tests (especially the failing one).

Section 8.1 gives a reusable template.
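
Until you get there, here is a rough sketch of gathering most of these inputs in one shot. It assumes Python 3 on any platform; collect_debug_evidence.py is a hypothetical helper, not part of the guide’s template.

    # collect_debug_evidence.py -- bundle the exact command, its full output,
    # and runtime info into one paste-able block of evidence.
    import platform
    import subprocess
    import sys


    def main() -> None:
        cmd = sys.argv[1:]  # e.g. python collect_debug_evidence.py pytest -x tests/
        if not cmd:
            sys.exit("usage: python collect_debug_evidence.py <command> [args...]")
        result = subprocess.run(cmd, capture_output=True, text=True)

        print("Command:", " ".join(cmd))
        print("Exit code:", result.returncode)
        print("Runtime:", platform.platform(), "| Python", sys.version.split()[0])
        print("--- stdout ---")
        print(result.stdout)
        print("--- stderr ---")
        print(result.stderr)


    if __name__ == "__main__":
        main()

Paste that output together with the smallest relevant code slice and the failing test, and the model has real evidence to work from.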

What to demand from the model (debugging outputs)

Don’t accept “try random changes.” Demand structured output:

  • Hypotheses: 3–5 plausible causes ranked by likelihood.
  • Tests: for each hypothesis, 1–2 checks that confirm/deny it.
  • Smallest fix: minimal diff-only changes.
  • Regression protection: a test that fails before the fix and passes after.

Section 8.2 covers the loop in detail.
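
One way to hold the model to that structure is to spell it out as a shape it has to fill in. Here is a minimal sketch in Python; the field names are assumptions for illustration, not a standard the guide prescribes.

    # debug_report.py -- a shape you can ask the model to fill in, so its answer
    # is hypotheses + checks + a minimal fix rather than "try random changes".
    from dataclasses import dataclass, field


    @dataclass
    class Hypothesis:
        cause: str          # e.g. "locale-dependent decimal parsing"
        likelihood: str     # "high" / "medium" / "low"
        checks: list[str] = field(default_factory=list)  # 1-2 checks that confirm/deny it


    @dataclass
    class DebugReport:
        hypotheses: list[Hypothesis]  # 3-5 causes, ranked by likelihood
        smallest_fix: str             # minimal, diff-only description of the change
        regression_test: str          # a test that fails before the fix and passes after

When the model skips the checks and jumps straight to a rewrite, an empty field makes that obvious.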

Section 8 map (8.1–8.5)

Where to go next