Part X — Security, Privacy, and Prompt Injection Defense
Overview and links for this section of the guide.
What this part is for
Part X is about building AI features that are safe to ship.
If you’re vibe coding, you can generate functionality quickly. Security and privacy are what keep that velocity from turning into:
- data leaks,
- prompt injection incidents,
- unsafe tool execution,
- compliance failures,
- and expensive “we shipped a demo to production” postmortems.
This part gives you a practical threat model and defenses you can implement without becoming a full-time security team.
You cannot prompt your way out of security problems. Secure systems rely on architecture, permissions, validation, and observability. Prompts help, but they are not the boundary.
Security mindset for LLM features
LLM features change the threat landscape because they:
- accept untrusted natural language input,
- produce untrusted natural language output,
- can be tricked into following instructions (prompt injection),
- often integrate with tools/APIs (turning “text” into “actions”),
- often ingest untrusted documents (RAG: documents as attackers).
The safe posture: treat the model as a powerful but fallible component, and design the system so failures are bounded and non-catastrophic.
If the model output can cause side effects (writes, deletes, money movement, user data access), it must go through explicit, enforceable controls that are not “trust the model.”
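To make "explicit, enforceable controls" concrete, here is a minimal Python sketch of a dispatch layer that sits between model output and your tools. The tool names, handlers, and approval flow are illustrative assumptions, not a real framework API; the point is that the allowlist and the approval check live in your code, not in the prompt.

```python
# Enforcement layer between model output and tool execution.
# Tool names, handlers, and the audit print are illustrative placeholders.

READ_ONLY_TOOLS = {
    "search_docs": lambda args: f"searched for {args.get('query', '')!r}",
    "get_order_status": lambda args: {"order_id": args.get("order_id"), "status": "shipped"},
}
SIDE_EFFECT_TOOLS = {
    "refund_order": lambda args: {"refunded_order": args.get("order_id")},
}

def dispatch_tool_call(name: str, args: dict, approved_by: str | None = None):
    """Enforce least privilege and approvals outside the model."""
    if name in READ_ONLY_TOOLS:
        return READ_ONLY_TOOLS[name](args)
    if name in SIDE_EFFECT_TOOLS:
        if approved_by is None:
            # Fail closed: a side-effecting call needs an explicit human decision.
            raise PermissionError(f"{name!r} requires human approval before it runs")
        result = SIDE_EFFECT_TOOLS[name](args)
        print(f"AUDIT: {name} called with {args}, approved by {approved_by}")
        return result
    # Anything the model invents that is not on an allowlist is rejected.
    raise ValueError(f"unknown or disallowed tool: {name!r}")

# Example: reads pass, writes without approval fail closed.
dispatch_tool_call("get_order_status", {"order_id": "A123"})
# dispatch_tool_call("refund_order", {"order_id": "A123"})  # raises PermissionError
```

Rejecting unknown tool names outright is the key design choice: the model can propose actions, but only your dispatch layer decides what actually runs.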
The main threat families (what you’re defending against)
- Data exfiltration: leaking secrets, PII, or proprietary data via outputs, logs, or tool calls.
- Prompt injection: user input or retrieved documents instruct the model to ignore rules or reveal data.
- Tool misuse: unsafe function calls, over-privileged tools, runaway loops, or “confused deputy” behavior (see the budget sketch after this list).
- Supply chain issues: insecure dependencies, generated code risks, and build pipeline exposure.
- Governance gaps: missing policies for what can enter prompts, what gets logged, and how data is retained.
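For the tool-misuse family, one cheap, enforceable control is a per-request budget that bounds how many tool calls (and how many side-effecting calls) a single request may trigger. This is a sketch under assumptions: the `ToolBudget` class and its limits are hypothetical, not a library API.

```python
# Per-request tool budget that bounds runaway agent loops.
# The class and the limit values are illustrative.

from dataclasses import dataclass

@dataclass
class ToolBudget:
    max_calls: int = 10    # hard cap on tool calls per user request
    max_writes: int = 1    # side-effecting calls allowed per request
    calls: int = 0
    writes: int = 0

    def charge(self, is_write: bool) -> None:
        """Call before every tool invocation; raise once the budget is spent."""
        self.calls += 1
        if is_write:
            self.writes += 1
        if self.calls > self.max_calls or self.writes > self.max_writes:
            # Stop the loop instead of letting the model keep acting.
            raise RuntimeError("tool budget exhausted; escalate to a human")

# Example: a second write in the same request is refused.
budget = ToolBudget()
budget.charge(is_write=True)
# budget.charge(is_write=True)  # raises RuntimeError
```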
Core security principles (LLM edition)
- Least privilege: tools and data access should be narrowly scoped.
- Fail closed: when uncertain, abstain or ask for clarification; don’t guess.
- Don’t trust inputs: user input and retrieved documents are untrusted.
- Don’t trust outputs: validate and sanitize before acting on model output (see the schema sketch after this list).
- Secrets stay out of prompts and logs: treat prompt text as a potential leak vector.
- Defense in depth: combine controls across retrieval, prompt composition, tool design, and logging.
- Auditability: record what the system saw and did (safely) so incidents are debuggable.
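As an illustration of “don’t trust outputs” and “fail closed,” the sketch below validates model output against a strict schema before anything acts on it. It assumes Pydantic v2 for validation; the `RefundDecision` fields and limits are hypothetical stand-ins for whatever your feature actually returns.

```python
# Validate model output against a strict schema before acting on it.
# Assumes Pydantic v2; the RefundDecision fields and caps are illustrative.

from typing import Literal
from pydantic import BaseModel, Field, ValidationError

class RefundDecision(BaseModel):
    action: Literal["refund", "deny", "escalate"]
    order_id: str = Field(min_length=1, max_length=64)
    amount_cents: int = Field(ge=0, le=50_000)   # business cap lives in code, not the prompt
    reason: str = Field(max_length=500)

def parse_model_output(raw: str) -> RefundDecision:
    try:
        return RefundDecision.model_validate_json(raw)
    except ValidationError:
        # Fail closed: malformed or out-of-policy output never reaches a tool.
        return RefundDecision(action="escalate", order_id="unparsed",
                              amount_cents=0, reason="output failed schema validation")

# Example: an over-limit refund is rejected and routed to escalation.
decision = parse_model_output('{"action": "refund", "order_id": "A123", '
                              '"amount_cents": 999999, "reason": "customer request"}')
print(decision.action)  # "escalate"
```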
What you’ll be able to do after Part X
- Threat model an AI feature realistically (what attackers want, how they attempt it).
- Identify and prioritize data exfiltration risks (secrets, PII, proprietary data).
- Understand indirect prompt injection (documents as attackers) and design safe RAG behavior.
- Design safe tool calling (least privilege, allowlists, budgets, approvals).
- Implement practical defenses: input allowlists, schema enforcement, output filtering, secrets hygiene, and red-teaming workflows (a logging-redaction sketch follows this list).
- Build a “security loop” that fits vibe coding: small diffs, tests/evals, and repeatable guardrails.
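As a small example of secrets hygiene and safe auditability, the sketch below redacts obvious credential patterns before prompts and responses are logged. The regexes are illustrative only; a real deployment would pair this with a dedicated secret scanner and a policy for what may enter prompts in the first place.

```python
# Redact obvious credential patterns before prompts/responses are logged.
# The regexes are illustrative, not a complete secret scanner.

import re

REDACTION_PATTERNS = [
    (re.compile(r"sk-[A-Za-z0-9]{20,}"), "[REDACTED_API_KEY]"),       # OpenAI-style keys
    (re.compile(r"AKIA[0-9A-Z]{16}"), "[REDACTED_AWS_KEY_ID]"),       # AWS access key IDs
    (re.compile(r"(?i)\b(password|secret|token)\b\s*[:=]\s*\S+"), r"\1=[REDACTED]"),
]

def redact(text: str) -> str:
    for pattern, replacement in REDACTION_PATTERNS:
        text = pattern.sub(replacement, text)
    return text

def log_llm_call(prompt: str, response: str) -> None:
    # Only redacted text is persisted; raw prompts never leave the request scope.
    print({"prompt": redact(prompt), "response": redact(response)})

# Example: the credential never reaches the log line.
log_llm_call("My token: ghp_abc123, please summarize.", "Sure, here is a summary.")
```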
Part X map (Sections 30–31)