
42. Fine-Tuning vs Prompting vs Retrieval (Decision Framework)


The Triangle of Power

You have three levers to improve quality, applied in this order (sketched in code after the list):

  1. Prompt Engineering: Fast, cheap, iterative. (Do this first).
  2. RAG (Context): Adding missing information. (Do this for knowledge).
  3. Fine-Tuning: Changing the model's behavior/style. (Do this last).
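
A minimal sketch of that decision order, using an invented choose_lever helper (every name and flag here is illustrative, not from any library):

```python
def choose_lever(prompt_iterations_tried: int,
                 missing_private_knowledge: bool,
                 style_or_format_still_failing: bool) -> str:
    """Illustrative decision order: prompt first, RAG for knowledge, fine-tune last."""
    if prompt_iterations_tried < 3:
        return "prompt engineering: iterate on instructions and few-shot examples"
    if missing_private_knowledge:
        return "RAG: retrieve the missing documents and put them in the context"
    if style_or_format_still_failing:
        return "fine-tuning: train on examples of the desired style/format"
    return "ship it: the current prompt is good enough"


print(choose_lever(prompt_iterations_tried=5,
                   missing_private_knowledge=True,
                   style_or_format_still_failing=False))
# -> "RAG: retrieve the missing documents and put them in the context"
```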

When to Fine-Tune

Fine-tune ONLY when:

  • You need a specific format that prompting can't enforce (e.g., a weird legacy XML schema).
  • You need a specific tone/voice (e.g., "speak like a pirate" or "speak like our brand guidelines") that would consume too many tokens to spell out in every prompt (see the training-data sketch after this list).
  • You need to reduce latency/cost (distilling a Pro model's behavior into a Flash model).
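
If you do fine-tune for format or voice, the work is mostly in the training data. A rough sketch of what that data looks like, assuming an OpenAI-style chat fine-tuning JSONL layout; the "Acme" brand, the XML schema, and the example pairs are invented for illustration, so check your provider's docs for the exact schema:

```python
import json

# Illustrative examples: teach the target format/voice by demonstration,
# instead of stuffing the rules into every prompt. (Hypothetical brand-voice data.)
examples = [
    {
        "messages": [
            {"role": "system", "content": "Answer in the Acme brand voice."},
            {"role": "user", "content": "Is the Model X waterproof?"},
            {"role": "assistant",
             "content": "<answer verdict=\"no\"><note>Splash-resistant only.</note></answer>"},
        ]
    },
    # ...hundreds more pairs demonstrating the same schema and tone...
]

# One JSON object per line, the usual layout for chat fine-tuning uploads.
with open("tune_format.jsonl", "w") as f:
    for ex in examples:
        f.write(json.dumps(ex) + "\n")
```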

Do NOT fine-tune for knowledge. Fine-tuning changes behavior and style, not what the model reliably knows: a model fine-tuned on your docs will imitate their tone while still hallucinating facts. Use RAG for facts.
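
A minimal sketch of the alternative, assuming hypothetical search_docs and llm stand-ins for your retriever and model client: fetch the facts at request time and put them in the prompt, instead of baking them into the weights.

```python
# Sketch of "use RAG for facts": retrieve relevant snippets per request
# and inject them into the context. `search_docs` and `llm` are hypothetical
# stand-ins for your own retriever and model client.
def answer_with_rag(question: str, search_docs, llm) -> str:
    snippets = search_docs(question, top_k=3)   # retrieval step
    context = "\n\n".join(snippets)
    prompt = (
        "Answer using ONLY the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )
    return llm(prompt)                          # generation step
```

Updating the facts is then a re-index of your documents, not a re-train of the model.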
