
9. Model Selection Strategy (Fast vs Smart vs Cheap)


What this section is for

Model selection is not about picking “the best model.” It’s about picking the right model for the job you’re doing right now—and switching deliberately when the job changes.

This section gives you practical decision rules for:

  • prototyping fast,
  • handling complex refactors safely,
  • switching models mid-project without chaos,
  • controlling determinism for repeatable builds,
  • building cost intuition so you don’t get surprised.

Model names change; tradeoffs don’t

Even as product names shift, you will always be choosing between speed, reasoning depth, context capacity, and cost.

The three dimensions: fast vs smart vs cheap

Think in dimensions instead of brands:

  • Fast: low latency, cheap enough for many iterations, good for scaffolding and routine coding.
  • Smart: stronger reasoning, better at multi-step planning and complex tradeoffs, often slower and more expensive.
  • Cheap: cost-efficient for high-volume or batch tasks, may require tighter constraints or more verification.

In practice, you rarely get all three at once. Your workflow should assume you’ll switch between “lanes.”
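
To make these axes concrete, here is a minimal Python sketch of a model profile. The model names and numbers are illustrative placeholders, not real pricing or benchmarks:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ModelProfile:
    name: str                  # illustrative identifier, not a real model name
    latency: str               # "low" | "medium" | "high"
    reasoning: str             # "routine" | "strong"
    cost_per_1k_tokens: float  # made-up number, for intuition only

# Three lanes along the three dimensions. Note no profile wins on all axes.
FAST = ModelProfile("fast-model", latency="low", reasoning="routine", cost_per_1k_tokens=0.001)
SMART = ModelProfile("smart-model", latency="high", reasoning="strong", cost_per_1k_tokens=0.02)
CHEAP = ModelProfile("cheap-model", latency="medium", reasoning="routine", cost_per_1k_tokens=0.0002)
```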

Model routing: the practical strategy

A pragmatic strategy that works in real teams:

  • Default lane (fast): most iterations, small features, routine refactors, generating boilerplate.
  • Design lane (smart): architecture choices, tricky refactors, debugging ambiguous failures, writing a plan you’ll actually trust.
  • Scale lane (cheap): batch processing, large document sets, repeated tasks where you can constrain outputs heavily.

You don’t “choose once.” You route tasks.
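
A minimal routing sketch, assuming the simple task categories below. The category names and the mapping are illustrative; tune them to your team’s actual workload:

```python
def route_task(task_type: str) -> str:
    """Map a task category to a lane; unknown tasks default to the fast lane."""
    design_lane = {"architecture", "tricky_refactor", "ambiguous_debugging", "planning"}
    scale_lane = {"batch_processing", "bulk_documents", "repeated_extraction"}
    if task_type in design_lane:
        return "smart"
    if task_type in scale_lane:
        return "cheap"
    return "fast"  # default lane: most day-to-day work lands here

print(route_task("tricky_refactor"))  # -> "smart"
print(route_task("small_feature"))    # -> "fast"
```

The default branch is the important design choice: routing only escalates when a task clearly needs it, so most iterations stay in the fast lane.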

Use smart models for plans, fast models for typing

Many teams get better results by asking a smart model for a plan and risk analysis, then using a fast model to implement steps with diff-only changes.
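
A sketch of that split, assuming a hypothetical call_model helper standing in for whatever client your provider exposes. The two-stage flow is the point, not the API shape:

```python
def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError("wire up your provider's client here")

def plan_then_implement(task: str, code_context: str) -> str:
    # 1) Smart lane: a numbered plan plus risk notes, explicitly no code.
    plan = call_model(
        "smart-model",
        f"Write a numbered implementation plan with risks. Do not write code.\n\nTask: {task}",
    )
    # 2) Fast lane: implement the plan, constrained to diff-only output.
    return call_model(
        "fast-model",
        f"Implement this plan. Output a unified diff only.\n\nPlan:\n{plan}\n\nCode:\n{code_context}",
    )
```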

Common failure modes (and what to change)

Failure: output is plausible but messy

  • Often means: constraints and acceptance criteria are weak.
  • Try first: tighten prompt structure (Section 10) and add tests (Section 7.3).
  • Then: switch to a smarter model if you need deeper reasoning/refactoring.

Failure: you’re stuck in a loop (multiple failed attempts)

  • Often means: the model can’t reason through the failure with current context.
  • Try: provide better evidence (Section 8.1) and ask for hypotheses/tests (Section 8.2).
  • Then: escalate to a smarter model for diagnosis only, not a full rewrite (see the sketch below).
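
A sketch of that loop-breaking pattern. Here call_model and passes_tests are hypothetical stand-ins for your provider client and test harness, and the attempt threshold is arbitrary:

```python
def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError("wire up your provider's client here")

def passes_tests(patch: str) -> bool:
    raise NotImplementedError("wire up your test harness here")

MAX_FAST_ATTEMPTS = 3  # illustrative threshold

def fix_with_escalation(failure_evidence: str) -> str:
    # Stay in the fast lane for the first few attempts.
    for _ in range(MAX_FAST_ATTEMPTS):
        patch = call_model("fast-model", f"Fix this failure. Output a diff only.\n\n{failure_evidence}")
        if passes_tests(patch):
            return patch
    # Escalate for diagnosis only: hypotheses and tests, not a rewrite.
    diagnosis = call_model(
        "smart-model",
        "List the most likely root causes and a discriminating test for each. "
        f"Do not rewrite the code.\n\n{failure_evidence}",
    )
    # Hand the diagnosis back to the fast lane to implement the fix.
    return call_model("fast-model", f"Apply the most likely fix as a diff only.\n\nDiagnosis:\n{diagnosis}")
```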

Failure: cost or latency is exploding

  • Often means: prompts are too large, you’re repeating context, or retries are out of control.
  • Try: context trimming (Section 11) and caching/dedup (see Part V+); a minimal dedup cache is sketched after this list.
  • Then: route high-volume tasks to a cheaper model with tighter schemas.
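
For the caching/dedup piece, a minimal in-process sketch: identical (model, prompt) pairs return the cached response instead of paying for a second call. call_model is again a hypothetical stand-in, and real setups usually persist the cache and normalize prompts before hashing:

```python
import hashlib

def call_model(model: str, prompt: str) -> str:
    raise NotImplementedError("wire up your provider's client here")

_cache: dict[str, str] = {}

def cached_call(model: str, prompt: str) -> str:
    # Key on both model and prompt so lanes never share cached answers.
    key = hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()
    if key not in _cache:
        _cache[key] = call_model(model, prompt)
    return _cache[key]
```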

Section 9 map (9.1–9.5)

Where to go next