15.5 Validating outputs and surfacing errors to users
On this page
- Goal: validation that improves reliability and UX
- Where validation belongs (boundary rule)
- What to validate (syntax vs schema vs semantics)
- Return shapes for errors (CLI and web)
- User-facing messages (safe and actionable)
- Debug modes (how to investigate without leaking data)
- Validation checklist
- Where to go next
Goal: validation that improves reliability and UX
Validation is not just “catch errors.” It’s how you make your AI feature predictable and safe.
Your goals are:
- parseable outputs for your code,
- clear outcome categories for your UI,
- safe logs and safe user-facing messages,
- fast recovery paths when outputs wobble.
Without validation, your app will crash or behave unpredictably when outputs are malformed. With validation, you can route to “retry,” “blocked,” or “invalid_output” states deterministically.
Where validation belongs (boundary rule)
Validation belongs at the LLM boundary (your wrapper module):
- call model
- parse output
- validate output
- return a typed result or a categorized error
Callers should never re-implement parsing/validation. They should handle outcomes.
This is how you avoid “every handler has slightly different parsing logic,” which becomes a correctness nightmare.
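The boundary described above can be sketched as a single wrapper function. This is a minimal illustration, not a definitive implementation: the names (`run_llm_task`, `Outcome`, the injected `call_model`, the `"bullets"` field) are all hypothetical, and the schema check is deliberately tiny.

```python
# Sketch of an LLM-boundary wrapper: call model, parse, validate,
# return a categorized outcome. All names here are hypothetical.
import json
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Outcome:
    status: str                      # "ok", "invalid_output", "blocked", ...
    result: Optional[dict] = None
    message: Optional[str] = None


def validate_schema(data: Any) -> list[str]:
    """Example shape check: require a non-empty 'bullets' list of strings."""
    if not isinstance(data, dict):
        return ["top-level value must be an object"]
    bullets = data.get("bullets")
    if not isinstance(bullets, list) or not bullets:
        return ["'bullets' must be a non-empty list"]
    if not all(isinstance(b, str) and b.strip() for b in bullets):
        return ["every bullet must be a non-empty string"]
    return []


def run_llm_task(prompt: str, call_model: Callable[[str], str]) -> Outcome:
    """The only place in the codebase that parses/validates model output."""
    raw = call_model(prompt)         # provider call lives behind this boundary
    try:
        data = json.loads(raw)       # syntax validation
    except json.JSONDecodeError:
        return Outcome(status="invalid_output", message="model did not return JSON")
    errors = validate_schema(data)   # schema validation
    if errors:
        return Outcome(status="invalid_output", message="; ".join(errors))
    return Outcome(status="ok", result=data)
```

Callers receive an `Outcome` and branch on `status`; none of them ever see raw model text.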
What to validate (syntax vs schema vs semantics)
There are three levels:
1) Syntax validation (JSON parse)
Is it valid JSON? If not, it’s invalid_output (optionally repair once).
2) Schema validation (shape + types)
Does it match required fields, types, enums, bounds? If not, it’s invalid_output.
3) Semantic validation (meaning)
This is harder. You can’t fully solve “truthfulness” with schema. But you can still enforce some semantic rules:
- non-empty bullets,
- bounded length,
- no duplicate bullets,
- no forbidden keys,
- required “caveats” presence.
For deeper semantic correctness (grounding, citations), you’ll use RAG and eval harnesses later in the guide.
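The semantic rules listed above are plain code, not schema keywords. A sketch, assuming hypothetical field names (`"bullets"`, `"caveats"`) and an arbitrary 200-character bound:

```python
# Semantic checks that run after schema validation passes.
# Field names and limits are illustrative assumptions.
def semantic_errors(data: dict) -> list[str]:
    errors = []
    bullets = data.get("bullets", [])
    if any(not b.strip() for b in bullets):
        errors.append("empty bullet")
    if any(len(b) > 200 for b in bullets):
        errors.append("bullet exceeds 200 characters")
    if len(set(bullets)) != len(bullets):
        errors.append("duplicate bullets")
    forbidden = set(data) - {"bullets", "caveats"}
    if forbidden:
        errors.append(f"forbidden keys: {sorted(forbidden)}")
    if not data.get("caveats"):
        errors.append("missing required 'caveats'")
    return errors
```

Any non-empty error list maps to the same `invalid_output` category as a schema failure.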
Return shapes for errors (CLI and web)
Your UI needs a stable way to handle outcomes.
CLI pattern
- stdout: only the success output (valid JSON)
- stderr: human-readable error message
- exit code: maps to category
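The CLI pattern above can be sketched as one emit function. The specific exit-code numbers are an assumption; what matters is that each category gets a stable code and stdout stays machine-readable.

```python
# CLI outcome handling: success JSON on stdout only, human-readable
# errors on stderr, exit code per category. Code numbers are illustrative.
import json
import sys

EXIT_CODES = {
    "ok": 0,
    "unknown": 1,
    "invalid_output": 2,
    "blocked": 3,
    "timeout": 4,
    "rate_limit": 5,
}


def emit(status: str, result=None, message=None) -> int:
    if status == "ok":
        print(json.dumps(result))                        # stdout: valid JSON only
    else:
        print(f"error ({status}): {message}", file=sys.stderr)
    return EXIT_CODES.get(status, 1)

# typical usage at the end of main():
# sys.exit(emit(outcome.status, outcome.result, outcome.message))
```

Because stdout carries nothing but the success JSON, the command composes cleanly with pipes and scripts.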
Web/API pattern
Return an outcome envelope:
```
{
  "status": "ok" | "validation_error" | "blocked" | "timeout" | "rate_limit" | "invalid_output" | "unknown",
  "result": object | null,
  "message": string | null,
  "request_id": string
}
```
This makes frontend logic simple: branch on status, render result or message.
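On the server side, one helper can produce this envelope for every response path. A minimal sketch; generating `request_id` with `uuid4` is an assumption (use whatever id already flows through your request context).

```python
# Single constructor for the outcome envelope, so every response
# path emits the same shape. uuid4 fallback is an illustrative choice.
import uuid


def envelope(status: str, result=None, message=None, request_id=None) -> dict:
    return {
        "status": status,
        "result": result if status == "ok" else None,  # never ship partial results
        "message": message,
        "request_id": request_id or str(uuid.uuid4()),
    }
```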
Provider errors can leak internals and are often confusing. Map them to your taxonomy and show a clear, safe message.
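That mapping can be as simple as inspecting the exception before it leaves the boundary. The exception names below are stand-ins for whatever your provider SDK actually raises; adjust the matching to your SDK's real exception classes.

```python
# Map raw provider exceptions to the outcome taxonomy.
# Name-based matching is a placeholder; match your SDK's real classes.
def categorize(exc: Exception) -> str:
    name = type(exc).__name__
    if "Timeout" in name:
        return "timeout"
    if "RateLimit" in name:
        return "rate_limit"
    if "ContentFilter" in name or "Blocked" in name:
        return "blocked"
    return "unknown"
```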
User-facing messages (safe and actionable)
Good user messages have:
- high-level category: timeout vs rate limit vs blocked
- next step: “try again,” “reduce input length,” “rephrase”
- request id: for support/debugging
Avoid revealing prompt text or user content in error messages.
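A small lookup table covers all three properties: category, next step, and request id, with no prompt or user content in sight. The wording below is illustrative.

```python
# Per-category user messages: what happened + next step + request id.
# Never interpolate prompt text or user content here.
USER_MESSAGES = {
    "timeout": "The request took too long. Please try again.",
    "rate_limit": "We're handling a lot of requests right now. Please retry in a minute.",
    "blocked": "This request couldn't be processed. Try rephrasing it.",
    "invalid_output": "We couldn't produce a result this time. Please try again.",
}


def user_message(status: str, request_id: str) -> str:
    base = USER_MESSAGES.get(status, "Something went wrong. Please try again.")
    return f"{base} (request id: {request_id})"
```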
Debug modes (how to investigate without leaking data)
Sometimes you need to debug schema failures. Do it safely:
- log prompt/schema versions and validation errors
- in dev only, optionally store the raw output for inspection
- in prod, require explicit “debug mode” with access control and retention limits
Remember: model outputs can contain user data.
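A safe-by-default logging helper makes these rules hard to violate accidentally: metadata is always logged, raw output only behind an explicit opt-in. The environment flag and field names are assumptions for illustration.

```python
# Safe-by-default logging for validation failures: versions and errors
# always, raw output only behind an explicit debug flag (dev/controlled).
import logging
import os

log = logging.getLogger("llm_boundary")


def log_validation_failure(request_id, prompt_version, schema_version, errors, raw_output):
    log.warning(
        "validation failed request_id=%s prompt=%s schema=%s errors=%s",
        request_id, prompt_version, schema_version, errors,
    )
    if os.environ.get("LLM_DEBUG_RAW") == "1":   # explicit opt-in only
        log.debug("raw output for %s: %s", request_id, raw_output)
```

In production, gate the flag behind access control and a retention limit, since the raw output may contain user data.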
Validation checklist
- Do we parse JSON with a real parser (not regex)?
- Do we validate against a strict schema (required fields, bounds, enums)?
- Do we categorize invalid output and handle it gracefully?
- Do we cap repair retries?
- Do we return stable outcome envelopes to the UI?
- Do logs include request id, prompt version, schema version?
- Do we avoid logging raw content by default?