31. Practical Defenses You'll Actually Implement

On this page

What this section is for
Defense philosophy: deterministic controls around probabilistic systems
Defense layers (input → retrieval → model → tools → logs)
How to implement defenses without slowing velocity
Section 31 map (31.1–31.5)
Where to start

What this section is for

Section 31 is a practical playbook: the defenses you will actually implement in real AI features.

These defenses are not “LLM magic.” They are normal engineering controls:

input validation,
schema enforcement,
permissions checks,
least privilege,
secrets hygiene,
testing and red-teaming.

The goal is to build a system that stays safe even when the model is confused, manipulated, or wrong.

Security without killing speed

Most defenses here are small, repeatable patterns. Once you build them once (validators, allowlists, logging policy), you reuse them across features.

Defense philosophy: deterministic controls around probabilistic systems

LLMs are probabilistic. Your defenses should be deterministic.

That means:

don’t rely on the model to refuse, enforce refusal in code when needed;
don’t rely on the model to sanitize, sanitize and validate deterministically;
don’t rely on the model to obey permissions, enforce permissions before retrieval/tool calls;
don’t rely on the model to keep secrets, keep secrets out of prompts and logs.

Defense layers (input → retrieval → model → tools → logs)

Think in layers so you don’t bet everything on one control:

Input layer: sanitize, cap size, allowlist permitted actions.
Retrieval layer: permissions filtering, corpus allowlists, authority weighting.
Model layer: structured output, strict instructions, “sources are untrusted.”
Tool layer: least privilege, schema validation, approvals, budgets.
Logging layer: safe logging, redaction, retention, auditability.

Defense in depth is not optional

Any single defense can fail: models ignore prompts, filters miss patterns, humans misconfigure access. Layers are how you make failures non-catastrophic.

How to implement defenses without slowing velocity

Practical approach:

Start with a safe default template: structured outputs + validators + not-found behavior.
Add a small allowlist layer: what tasks are allowed; what tools are allowed.
Implement secrets hygiene early: prevent “oops we logged the key.”
Write a red-team corpus: 25–100 adversarial cases you rerun on every change.
Automate gates: schema, citations, permissions, budgets (Part IX style).

31. Practical Defenses You'll Actually Implement

What this section is for

Defense philosophy: deterministic controls around probabilistic systems

Defense layers (input → retrieval → model → tools → logs)

How to implement defenses without slowing velocity

Section 31 map (31.1–31.5)

Where to start

31. Practical Defenses You'll Actually Implement sub-sections