7.3 The "write tests first" vibe pattern
The core idea
In vibe coding, the model can generate code faster than you can judge it. Tests flip that power balance: they give you a mechanical way to decide whether the output is correct.
The pattern is simple:
- Write tests first from acceptance criteria.
- Confirm tests match intent (this is a spec review).
- Implement the smallest change to make tests pass.
- Run tests after every diff and iterate.
It’s the fastest way to keep AI output honest. A failing test is better feedback than 20 paragraphs of explanation.
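As a concrete sketch (Python with pytest assumed; the function and criterion are made up for illustration), one acceptance criterion becomes one test, written before any implementation exists:

```python
# Acceptance criterion (hypothetical): slugify() lowercases the title and
# collapses runs of whitespace into a single hyphen.
# The test is written first, so it fails until the implementation meets the
# criterion; that failing run is your mechanical check.

from myapp.text import slugify  # hypothetical module, not yet implemented


def test_slugify_lowercases_and_hyphenates_whitespace():
    assert slugify("Hello   World") == "hello-world"
```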
Why tests-first speeds up vibe coding
It speeds you up because it reduces two expensive activities:
- Argument debugging: “it should work” back-and-forth without evidence.
- Regression hunting: discovering later that you broke something earlier.
With tests, your loop becomes:
- Change → run tests → see failure → fix → repeat.
- Not: change → hope → discover the break later → panic.
Ask the model to propose test cases from the acceptance criteria. You still review them, but it can generate the scaffolding and the edge-case set quickly.
When to use tests-first
Tests-first is a great default when:
- you are fixing a bug (regression test first; see the sketch below),
- you are refactoring (lock behavior before moving code),
- the behavior is tricky (parsing, validation, serialization),
- the output must be stable (structured output / schemas),
- you are worried about breaking existing behavior.
You can skip tests-first when the change is truly trivial (but even then, at least run existing tests).
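For the bug-fix case above, here is a minimal sketch of a regression test written before the fix (Python with pytest assumed; the function and bug are hypothetical):

```python
# Hypothetical bug: parse_price("1,299.00") raises ValueError instead of
# returning 1299.0. The regression test pins the correct behavior before
# the fix exists, so it fails now and will fail again if the bug returns.

from myapp.pricing import parse_price  # hypothetical module


def test_parse_price_accepts_thousands_separator():
    assert parse_price("1,299.00") == 1299.0
```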
A practical workflow (tests → implementation → verify)
Step 0: define “done”
Tests-first starts with acceptance criteria (Section 6.3). If “done” is fuzzy, tests will be fuzzy too.
Step 1: ask the model for tests only
Your prompt should explicitly forbid implementation. You are creating a spec artifact, not the implementation.
Step 2: review the tests like a code review
Tests are part of your product contract. Review for:
- coverage of the acceptance criteria,
- edge cases and failure behavior,
- determinism (no time/network flakiness),
- minimal coupling to implementation details.
Step 3: make tests fail for the right reason
Run the tests before implementing. If tests already pass, they’re not testing the new behavior (or they’re too weak).
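Here is a sketch of what "failing for the right reason" looks like (Python with pytest assumed; the function is hypothetical): the failure should point at the missing behavior, not at a broken test.

```python
# Run this before implementing. A good failure points at the missing
# behavior: an AssertionError showing the wrong return value, or a
# NotImplementedError from a stub. A NameError, a bad import path, or a
# wrong fixture means the test itself is broken, which is the wrong reason.

from myapp.retry import retry_delays  # hypothetical; may exist only as a stub


def test_retry_delays_double_on_each_attempt():
    assert retry_delays(attempts=4, base=1.0) == [1.0, 2.0, 4.0, 8.0]
```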
Step 4: implement the smallest change to pass
Ask for diff-only changes. One small diff per failing test cluster is usually ideal.
Step 5: iterate and lock in
Repeat until green, then commit. This becomes your stable base for the next feature.
If the model makes tests pass by deleting the test, weakening assertions, or changing the acceptance criteria, that’s not success. That’s cheating. Keep the contract stable.
What good tests look like (for AI-generated code)
Good tests have three traits:
- They test behavior, not structure: outputs, errors, exit codes, schemas.
- They are easy to read: a test is documentation for “what we mean.”
- They fail clearly: when broken, you can tell what changed.
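A short illustration of those three traits (Python with pytest assumed; the validator is hypothetical): the tests name the behavior, assert on public outputs and errors, and fail with a readable message.

```python
import pytest

from myapp.validation import validate_email  # hypothetical public function


def test_valid_address_is_accepted():
    # Behavior, not structure: only the public result matters.
    assert validate_email("ada@example.com") is True


def test_missing_at_sign_is_rejected_with_clear_error():
    # Failure behavior is part of the contract too.
    with pytest.raises(ValueError, match="@"):
        validate_email("ada.example.com")
```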
Example: CLI behavior test
For CLI tools, prefer tests that call the entrypoint function and capture stdout/stderr, rather than shelling out to a subprocess (unless you need a true integration test).
If your CLI reads directly from real stdin/stdout, refactor to allow injecting streams. This single design choice makes tests-first much easier.
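A minimal sketch of that design (Python assumed; names are hypothetical): the entrypoint takes its streams as parameters, so a test can drive it with in-memory buffers instead of a subprocess.

```python
import io
import sys


def main(argv, stdin=None, stdout=None):
    # Injectable streams; the defaults keep normal command-line usage working.
    stdin = stdin or sys.stdin
    stdout = stdout or sys.stdout
    text = stdin.read()
    if argv and argv[0] == "--upper":
        text = text.upper()
    stdout.write(text)
    return 0


def test_upper_flag_uppercases_stdin():
    out = io.StringIO()
    exit_code = main(["--upper"], stdin=io.StringIO("hello\n"), stdout=out)
    assert exit_code == 0
    assert out.getvalue() == "HELLO\n"
```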
Copy-paste prompt sequence
Prompt A: tests only
We are using the “write tests first” pattern.
Task:
[Describe the change/feature.]
Acceptance criteria:
- [...]
Constraints:
- Language/runtime: [...]
- Dependencies: [...]
- Do NOT implement the feature yet
Output:
- Diff-only changes that add/modify tests ONLY
- After the diff, explain how each test maps to an acceptance criterion
Prompt B: implement to pass tests
Now implement the feature to make the new tests pass.
Constraints:
- Keep the public API stable unless the tests require a change
- Do not weaken or delete the new tests
- Keep diffs small and focused
Output:
- Diff-only changes
Prompt C: fix one failing test minimally
Here is the failing test output:
(paste)
Fix the failure with the smallest change possible.
Constraints:
- Do not change tests unless the test is wrong (explain if so)
- Diff-only changes
Common pitfalls (and fixes)
Pitfall: tests are too tied to implementation details
Fix: rewrite tests to focus on public behavior (function outputs, schemas, CLI output), not internal helper functions.
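A sketch of the difference (Python with pytest assumed; names are hypothetical):

```python
# Too coupled: breaks on any refactor, even when behavior is unchanged.
def test_internal_helper_formats_row():
    from myapp.report import _format_row  # private helper = implementation detail
    assert _format_row(["a", "b"]) == "a | b"


# Behavior-focused: survives refactors as long as the public output holds.
def test_report_renders_rows_as_pipe_separated_lines():
    from myapp.report import render_report  # public entry point
    assert render_report([["a", "b"]]) == "a | b\n"
```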
Pitfall: tests don’t fail before implementation
Fix: add assertions or cases that clearly require the new behavior. A test that always passes is just noise.
Pitfall: tests are flaky
Fix: remove time/network randomness; use fixed inputs; avoid relying on ordering unless defined.
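One common fix is to make time an explicit input so tests never touch the real clock (Python assumed; the token check is hypothetical):

```python
from datetime import datetime, timezone


def is_expired(issued_at, now=None):
    # Hypothetical rule: tokens older than 3600 seconds are expired.
    # `now` is injectable so tests can pass a fixed timestamp.
    now = now or datetime.now(timezone.utc)
    return (now - issued_at).total_seconds() > 3600


def test_token_just_past_one_hour_is_expired():
    issued = datetime(2024, 1, 1, 12, 0, 0, tzinfo=timezone.utc)
    fixed_now = datetime(2024, 1, 1, 13, 0, 1, tzinfo=timezone.utc)
    assert is_expired(issued, now=fixed_now)
```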
Pitfall: the model “fixes” by weakening the contract
Fix: restate acceptance criteria and instruct: “Do not change expected behavior. Only change implementation.”
Ship points for tests-first
- SP1: tests added and reviewed; they fail for the right reason.
- SP2: implementation passes tests with minimal diff.
- SP3: refactor/cleanup (optional) with tests still green; commit.