30.4 Tool misuse (dangerous function calls)


Goal: keep “text” from becoming “unsafe actions”

Tool calling is the moment your LLM app becomes real software: it can take actions in the world.

It’s also the moment your risk profile changes dramatically. Your job is to ensure that:

  • the model cannot perform actions outside its scope,
  • high-risk actions require explicit approval,
  • tool inputs are validated and constrained,
  • runaway loops are prevented,
  • actions are auditable and reversible when possible.

Tool calling is not “just another output format”

Once tools exist, prompt injection becomes action injection. Least privilege and validation are no longer optional.

Why tool calling changes your threat model

Without tools, the model can do little more than produce text. With tools, it can:

  • read data from privileged systems,
  • write data (create/update/delete),
  • trigger workflows (tickets, emails, deployments),
  • cause financial and operational side effects.

Attackers will try to trick the model into exercising your app’s permissions on their behalf. This is the classic “confused deputy” problem.

Common tool misuse risks

  • Over-privileged tools: one tool can do too much (“run arbitrary SQL”).
  • Parameter injection: model passes unsafe parameters (wildcards, broad scopes).
  • Unauthorized access: tools bypass user/tenant permissions.
  • Runaway loops: repeated tool calls due to “keep trying” behavior.
  • Silent side effects: actions happen without user visibility or confirmation.
  • Data leakage: tool results contain sensitive data and get echoed into the response.
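
To make the first two risks concrete, here is what an over-privileged, injectable tool looks like. The snippet is purely illustrative (the tool name and schema are made up, not taken from any particular framework):

```python
# Anti-pattern (illustrative): one tool with a free-form parameter and
# blanket database access. Any prompt injection that reaches the model
# can now run arbitrary SQL with your app's credentials.
RUN_SQL_TOOL = {
    "name": "run_sql",
    "description": "Run any SQL query against the production database.",
    "parameters": {
        "type": "object",
        "properties": {"query": {"type": "string"}},  # unconstrained injection surface
        "required": ["query"],
    },
}

def run_sql(query: str):
    # Executes whatever string the model produced: no allowlist, no scope
    # check, no tenant restriction. "DELETE FROM users" is one call away.
    ...
```

The fix is not a smarter prompt; it is a narrower tool, which is what the next section is about.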

Design tools as security boundaries

Tool interfaces are your security boundary. Design them like APIs, not like “helpers.”

High-leverage design principles:

  • Least privilege by default: separate read-only tools from write tools.
  • Narrow scope: tools operate on specific resources, not arbitrary queries.
  • Explicit parameters: avoid free-form text that becomes an injection surface.
  • Constrained outputs: tools return the minimum necessary data (IDs and summaries, not full records).
  • Server-side enforcement: permissions and allowlists enforced in code, not in prompts.

Prefer “capabilities” over “superpowers”

A good tool does one small thing well. A bad tool is “execute any command” or “query any table.”
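
As a contrast to the earlier anti-pattern, here is a minimal sketch of a capability designed along these lines. The JSON-schema style matches what most function-calling APIs accept; the order domain, the `OrderSummary` shape, and the data-access helpers are assumptions for illustration:

```python
from dataclasses import dataclass

# Placeholder data-access and permission helpers -- stand-ins for your own code.
def get_order(order_id: str): ...
def can_read_order(user_id: str, tenant_id: str, order) -> bool: ...

# Narrow, read-only capability: one resource type, explicit typed parameters,
# permissions enforced in code, minimal data returned.
LOOKUP_ORDER_TOOL = {
    "name": "lookup_order_status",
    "description": "Look up the status of a single order owned by the current user.",
    "parameters": {
        "type": "object",
        "properties": {
            "order_id": {"type": "string", "pattern": "^ord_[a-z0-9]{10}$"},
        },
        "required": ["order_id"],
        "additionalProperties": False,   # explicit parameters only
    },
}

@dataclass
class OrderSummary:
    order_id: str
    status: str   # e.g. "shipped" -- no addresses, no payment details

def lookup_order_status(order_id: str, *, user_id: str, tenant_id: str) -> OrderSummary:
    order = get_order(order_id)
    if order is None or not can_read_order(user_id, tenant_id, order):
        raise PermissionError("order not found or not accessible")  # server-side, not prompt-side
    return OrderSummary(order_id=order.id, status=order.status)     # constrained output
```

Write operations (cancel, refund) would be separate tools with their own, stricter checks.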

Budgets and termination conditions

Tool calling needs hard budgets to prevent runaway behavior:

  • Max tool calls per request: e.g., 3–10.
  • Max retries per tool: e.g., 1–2 with backoff.
  • Max total time: end-to-end deadline including tool calls.
  • Max data size: limit payload sizes to avoid “dump everything.”

Define termination rules:

  • stop when goal achieved,
  • stop when budget exhausted,
  • stop when missing permission or required info,
  • stop when repeated failures indicate a persistent issue.
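
Below is a minimal sketch of a tool-calling loop that enforces these budgets and termination rules. `call_model` and `execute_tool` stand in for your model client and validated tool dispatcher, and the specific limits are examples rather than recommendations:

```python
import time

MAX_TOOL_CALLS = 5        # per request
MAX_RETRIES_PER_TOOL = 2  # per tool, with backoff
DEADLINE_SECONDS = 30     # end-to-end, including tool calls

def run_request(messages, call_model, execute_tool):
    """Drive the model until it answers, a budget is exhausted, or a tool keeps failing."""
    start = time.monotonic()
    tool_calls = 0
    failures = {}   # tool name -> consecutive failures

    while True:
        if time.monotonic() - start > DEADLINE_SECONDS:
            return {"status": "stopped", "reason": "deadline exceeded"}

        reply = call_model(messages)                 # placeholder: your LLM client
        call = reply.get("tool_call")
        if call is None:
            return {"status": "done", "answer": reply["text"]}   # goal achieved

        if tool_calls >= MAX_TOOL_CALLS:
            return {"status": "stopped", "reason": "tool-call budget exhausted"}
        tool_calls += 1

        try:
            result = execute_tool(call)              # placeholder: validated dispatcher
            failures[call["name"]] = 0
        except Exception as exc:
            failures[call["name"]] = failures.get(call["name"], 0) + 1
            if failures[call["name"]] > MAX_RETRIES_PER_TOOL:
                return {"status": "stopped", "reason": f"{call['name']} keeps failing"}
            time.sleep(2 ** failures[call["name"]])  # simple backoff before the model retries
            result = {"error": str(exc)}             # surface the failure to the model

        payload = str(result)[:4_000]                # cap result size ("max data size" budget)
        messages.append({"role": "tool", "name": call["name"], "content": payload})
```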

Human approvals and “proposal-only” patterns

For high-risk actions, use a human approval step.

Two practical patterns:

  • Proposal-only: model proposes an action plan; a human clicks “approve” to execute.
  • Two-step confirm: model drafts a tool call, then must ask the user to confirm exact parameters.

This dramatically reduces risk because “the model wanted to do it” is not an authorization mechanism.

Make the approval meaningful

Approvals should show the exact action, scope, and impact. “Approve?” without details is not safety.
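
One way to wire up the proposal-only pattern is sketched below. The `Proposal` record and the `execute_tool` dispatcher are illustrative; the property that matters is that the human sees the exact tool, parameters, and impact, and execution only happens with a recorded approval attached:

```python
import uuid
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Proposal:
    """What the model wants to do, shown verbatim to a human before execution."""
    tool: str
    params: dict
    impact: str                       # human-readable: what changes, for whom, reversible or not
    id: str = field(default_factory=lambda: uuid.uuid4().hex)
    status: str = "pending"           # pending -> approved / rejected
    approved_by: Optional[str] = None

PENDING: dict[str, Proposal] = {}

def propose(tool: str, params: dict, impact: str) -> Proposal:
    """Model output lands here; nothing executes yet."""
    p = Proposal(tool=tool, params=params, impact=impact)
    PENDING[p.id] = p
    return p   # rendered to the user as approve / reject, with the full details

def approve_and_execute(proposal_id: str, reviewer: str, execute_tool):
    """Called only from the approval UI, never by the model."""
    p = PENDING.pop(proposal_id)
    p.status, p.approved_by = "approved", reviewer   # audit trail: who approved what
    return execute_tool(p.tool, p.params)            # placeholder: validated dispatcher
```

The two-step confirm variant is the same idea without a separate reviewer: the proposal is echoed back to the requesting user, who must confirm the exact parameters before anything runs.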

Validation: treat tool inputs as untrusted

Even if the model produced tool parameters, treat them as untrusted user input.

Validation should include:

  • Schema validation: types and required fields.
  • Allowlists: only allowed operations and resource ids.
  • Permission checks: user/tenant restrictions enforced in code.
  • Scope checks: reject broad scopes (“all users”, wildcard deletes).
  • Idempotency: use idempotency keys for write operations.

If validation fails, the system should refuse, ask for clarification, or propose a safer alternative.
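
Here is a sketch of what that validation layer can look like in front of a single write tool. The order domain, the allowlist contents, and the ownership helper are assumptions; the point is that every check runs in code, after the model has produced parameters and before anything executes:

```python
import hashlib

ALLOWED_TOOLS = {"lookup_order_status", "cancel_order"}   # explicit allowlist

def user_owns_order(user_id: str, tenant_id: str, order_id: str) -> bool:
    ...   # placeholder: real ownership check against your own data store

def validate_and_run(tool: str, params: dict, *, user_id: str, tenant_id: str, execute_tool):
    # 1. Allowlist: only known tools, nothing dynamic.
    if tool not in ALLOWED_TOOLS:
        raise ValueError(f"tool {tool!r} is not allowed")

    # 2. Schema validation: required fields, correct types, no extras.
    order_id = params.get("order_id")
    if not isinstance(order_id, str) or set(params) != {"order_id"}:
        raise ValueError("expected exactly one string field: order_id")

    # 3. Scope check: one concrete resource, no wildcards or bulk operations.
    if "*" in order_id or "," in order_id or order_id.lower() == "all":
        raise ValueError("broad scopes are rejected; name a single order")

    # 4. Permission check: user/tenant restrictions enforced in code, not in the prompt.
    if not user_owns_order(user_id, tenant_id, order_id):
        raise PermissionError("order not accessible for this user and tenant")

    # 5. Idempotency: same logical write -> same key, so retries do not repeat side effects.
    idem_key = hashlib.sha256(f"{tenant_id}:{tool}:{order_id}".encode()).hexdigest()
    return execute_tool(tool, params, idempotency_key=idem_key)
```

When a check fails, return a structured refusal the model can relay to the user, or downgrade the action to a proposal for human review.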
