30.4 Tool misuse (dangerous function calls)


Goal: keep “text” from becoming “unsafe actions”

Tool calling is the moment your LLM app becomes real software: it can take actions in the world.

It’s also the moment your risk profile changes dramatically. Your job is to ensure that:

  • the model cannot perform actions outside its scope,
  • high-risk actions require explicit approval,
  • tool inputs are validated and constrained,
  • runaway loops are prevented,
  • actions are auditable and reversible when possible.

Tool calling is not “just another output format”

Once tools exist, prompt injection becomes action injection. Least privilege and validation are no longer optional.

Why tool calling changes your threat model

Without tools, the model can do little more than produce text. With tools, it can:

  • read data from privileged systems,
  • write data (create/update/delete),
  • trigger workflows (tickets, emails, deployments),
  • cause financial and operational side effects.

Attackers will try to trick the model into exercising your app’s permissions on their behalf. This is the classic “confused deputy” problem.

Common tool misuse risks

  • Over-privileged tools: one tool can do too much (“run arbitrary SQL”).
  • Parameter injection: model passes unsafe parameters (wildcards, broad scopes).
  • Unauthorized access: tools bypass user/tenant permissions.
  • Runaway loops: repeated tool calls due to “keep trying” behavior.
  • Silent side effects: actions happen without user visibility or confirmation.
  • Data leakage: tool results contain sensitive data and get echoed into the response.
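
To make the first two risks concrete, here is what an over-privileged, injectable tool looks like. The snippet is purely illustrative (the tool name and schema are made up, not taken from any particular framework):

```python
# Anti-pattern (illustrative): one tool with a free-form parameter and
# blanket database access. Any prompt injection that reaches the model
# can now run arbitrary SQL with your app's credentials.
RUN_SQL_TOOL = {
    "name": "run_sql",
    "description": "Run any SQL query against the production database.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},  # unconstrained injection surface
        "required": ["query"],
    },
}

def run_sql(query: str):
    # Executes whatever string the model produced: no allowlist, no scope
    # check, no tenant restriction. "DELETE FROM users" is one call away.
    ...
```

The fix is not a smarter prompt; it is a narrower tool, which is what the next section is about.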

Design tools as security boundaries

Tool interfaces are your security boundary. Design them like APIs, not like “helpers.”

High-leverage design principles:

  • Least privilege by default: separate read-only tools from write tools.
  • Narrow scope: tools operate on specific resources, not arbitrary queries.
  • Explicit parameters: avoid free-form text that becomes an injection surface.
  • Constrained outputs: tools return the minimum necessary data (IDs and summaries, not full records).
  • Server-side enforcement: permissions and allowlists enforced in code, not in prompts.

Prefer “capabilities” over “superpowers”

A good tool does one small thing well. A bad tool is “execute any command” or “query any table.”
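
As a contrast to the earlier anti-pattern, here is a minimal sketch of a capability designed along these lines. The JSON-schema style matches what most function-calling APIs accept; the order domain, the `OrderSummary` shape, and the data-access helpers are assumptions for illustration:

```python
from dataclasses import dataclass

# Placeholder data-access and permission helpers -- stand-ins for your own code.
def get_order(order_id: str): ...
def can_read_order(user_id: str, tenant_id: str, order) -> bool: ...

# Narrow, read-only capability: one resource type, explicit typed parameters,
# permissions enforced in code, minimal data returned.
LOOKUP_ORDER_TOOL = {
    "name": "lookup_order_status",
    "description": "Look up the status of a single order owned by the current user.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "pattern": "^ord_[a-z0-9]{10}$"},
        },
        "required": ["order_id"],
        "additionalProperties": False,   # explicit parameters only
    },
}

@dataclass
class OrderSummary:
    order_id: str
    status: str   # e.g. "shipped" -- no addresses, no payment details

def lookup_order_status(order_id: str, *, user_id: str, tenant_id: str) -> OrderSummary:
    order = get_order(order_id)
    if order is None or not can_read_order(user_id, tenant_id, order):
        raise PermissionError("order not found or not accessible")  # server-side, not prompt-side
    return OrderSummary(order_id=order.id, status=order.status)     # constrained output
```

Write operations (cancel, refund) would be separate tools with their own, stricter checks.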

Budgets and termination conditions

Tool calling needs hard budgets to prevent runaway behavior:

  • Max tool calls per request: e.g., 3–10.
  • Max retries per tool: e.g., 1–2 with backoff.
  • Max total time: end-to-end deadline including tool calls.
  • Max data size: limit payload sizes to avoid “dump everything.”

Define termination rules:

  • stop when goal achieved,
  • stop when budget exhausted,
  • stop when missing permission or required info,
  • stop when repeated failures indicate a persistent issue.
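
Below is a minimal sketch of a tool-calling loop that enforces these budgets and termination rules. `call_model` and `execute_tool` stand in for your model client and validated tool dispatcher, and the specific limits are examples rather than recommendations:

```python
import time

MAX_TOOL_CALLS = 5        # per request
MAX_RETRIES_PER_TOOL = 2  # per tool, with backoff
DEADLINE_SECONDS = 30     # end-to-end, including tool calls

def run_request(messages, call_model, execute_tool):
    """Drive the model until it answers, a budget is exhausted, or a tool keeps failing."""
    start = time.monotonic()
    tool_calls = 0
    failures = {}   # tool name -> consecutive failures

    while True:
        if time.monotonic() - start > DEADLINE_SECONDS:
            return {"status": "stopped", "reason": "deadline exceeded"}

        reply = call_model(messages)                 # placeholder: your LLM client
        call = reply.get("tool_call")
        if call is None:
            return {"status": "done", "answer": reply["text"]}   # goal achieved

        if tool_calls >= MAX_TOOL_CALLS:
            return {"status": "stopped", "reason": "tool-call budget exhausted"}
        tool_calls += 1

        try:
            result = execute_tool(call)              # placeholder: validated dispatcher
            failures[call["name"]] = 0
        except Exception as exc:
            failures[call["name"]] = failures.get(call["name"], 0) + 1
            if failures[call["name"]] > MAX_RETRIES_PER_TOOL:
                return {"status": "stopped", "reason": f"{call['name']} keeps failing"}
            time.sleep(2 ** failures[call["name"]])  # simple backoff before the model retries
            result = {"error": str(exc)}             # surface the failure to the model

        payload = str(result)[:4_000]                # cap result size ("max data size" budget)
        messages.append({"role": "tool", "name": call["name"], "content": payload})
```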

Human approvals and “proposal-only” patterns

For high-risk actions, use a human approval step.

Two practical patterns:

  • Proposal-only: model proposes an action plan; a human clicks “approve” to execute.
  • Two-step confirm: model drafts a tool call, then must ask the user to confirm exact parameters.

This dramatically reduces risk because “the model wanted to do it” is not an authorization mechanism.

Make the approval meaningful

Approvals should show the exact action, scope, and impact. “Approve?” without details is not safety.
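
One way to wire up the proposal-only pattern is sketched below. The `Proposal` record and the `execute_tool` dispatcher are illustrative; the property that matters is that the human sees the exact tool, parameters, and impact, and execution only happens with a recorded approval attached:

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Proposal:
    """What the model wants to do, shown verbatim to a human before execution."""
    tool: str
    params: dict
    impact: str                       # human-readable: what changes, for whom, reversible or not
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "pending"           # pending -> approved / rejected
    approved_by: Optional[str] = None

PENDING: dict[str, Proposal] = {}

def propose(tool: str, params: dict, impact: str) -> Proposal:
    """Model output lands here; nothing executes yet."""
    p = Proposal(tool=tool, params=params, impact=impact)
    PENDING[p.id] = p
    return p   # rendered to the user as approve / reject, with the full details

def approve_and_execute(proposal_id: str, reviewer: str, execute_tool):
    """Called only from the approval UI, never by the model."""
    p = PENDING.pop(proposal_id)
    p.status, p.approved_by = "approved", reviewer   # audit trail: who approved what
    return execute_tool(p.tool, p.params)            # placeholder: validated dispatcher
```

The two-step confirm variant is the same idea without a separate reviewer: the proposal is echoed back to the requesting user, who must confirm the exact parameters before anything runs.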

Validation: treat tool inputs as untrusted

Even if the model produced tool parameters, treat them as untrusted user input.

Validation should include:

  • Schema validation: types and required fields.
  • Allowlists: only allowed operations and resource ids.
  • Permission checks: user/tenant restrictions enforced in code.
  • Scope checks: reject broad scopes (“all users”, wildcard deletes).
  • Idempotency: use idempotency keys for write operations.

If validation fails, the system should refuse, ask for clarification, or propose a safer alternative.
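
Here is a sketch of what that validation layer can look like in front of a single write tool. The order domain, the allowlist contents, and the ownership helper are assumptions; the point is that every check runs in code, after the model has produced parameters and before anything executes:

```python
import hashlib

ALLOWED_TOOLS = {"lookup_order_status", "cancel_order"}   # explicit allowlist

def user_owns_order(user_id: str, tenant_id: str, order_id: str) -> bool:
    ...   # placeholder: real ownership check against your own data store

def validate_and_run(tool: str, params: dict, *, user_id: str, tenant_id: str, execute_tool):
    # 1. Allowlist: only known tools, nothing dynamic.
    if tool not in ALLOWED_TOOLS:
        raise ValueError(f"tool {tool!r} is not allowed")

    # 2. Schema validation: required fields, correct types, no extras.
    order_id = params.get("order_id")
    if not isinstance(order_id, str) or set(params) != {"order_id"}:
        raise ValueError("expected exactly one string field: order_id")

    # 3. Scope check: one concrete resource, no wildcards or bulk operations.
    if "*" in order_id or "," in order_id or order_id.lower() == "all":
        raise ValueError("broad scopes are rejected; name a single order")

    # 4. Permission check: user/tenant restrictions enforced in code, not in the prompt.
    if not user_owns_order(user_id, tenant_id, order_id):
        raise PermissionError("order not accessible for this user and tenant")

    # 5. Idempotency: same logical write -> same key, so retries do not repeat side effects.
    idem_key = hashlib.sha256(f"{tenant_id}:{tool}:{order_id}".encode()).hexdigest()
    return execute_tool(tool, params, idempotency_key=idem_key)
```

When a check fails, return a structured refusal the model can relay to the user, or downgrade the action to a proposal for human review.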
