MCP server tool approval
LLM agents are increasingly being handed tools that can delete records, send messages, push code, or charge credit cards. The question is not whether to add approval gates to these tools — it's where to put them and how to make them impossible to bypass. System prompt instructions like "always ask before deleting" are soft constraints that jailbroken or confused LLMs can ignore. Hard approval gates, enforced in the tool handler itself, are the only reliable mechanism. This page covers tool risk classification, implementing approval dialogs via MCP elicitation, enforcing approvals server-side, showing change previews before confirmation, and keeping an audit trail of every approval and denial.
TL;DR
Gate destructive MCP tool calls behind human confirmation using MCP elicitation: the tool handler computes a preview of what will change, sends an elicitation request with the preview and a boolean confirmation field, and only executes the operation if the user accepts. Classify tools into risk tiers at registration time and apply the appropriate gate in each tier. Enforce the gate in the server-side handler — never rely on LLM self-restraint or host-side prompt instructions. Log every approval and denial to an audit log with timestamp, user ID, tool name, arguments, and action taken.
Tool risk classification
Not all tools need approval gates. Applying confirmation dialogs to read-only tools creates friction without adding safety. The right approach is to classify tools at registration time and apply the appropriate enforcement level.
| Risk tier | Characteristics | Gate | Examples |
|---|---|---|---|
| Read-only | No side effects; nothing to undo | None | get_record, list_files, search_users |
| Low-risk write | Creates new data; easy to undo | None or soft warning | create_draft, add_comment, create_tag |
| High-risk write | Modifies or deletes existing data; hard to undo | User confirmation with preview | delete_record, update_user_email, bulk_archive |
| Critical | Irreversible or high blast radius; external effects | User confirmation + admin approval | send_email_blast, charge_customer, push_to_production |
The classification is set at tool registration time and stored in tool metadata, not in the LLM's system prompt. The handler checks its own risk tier and enforces the gate unconditionally — the LLM has no mechanism to override it.
// tool-registry.ts
type RiskTier = 'read' | 'low-write' | 'high-write' | 'critical';

interface ToolMeta {
  name: string;
  riskTier: RiskTier;
  description: string;
}

const TOOL_REGISTRY: ToolMeta[] = [
  { name: 'get_record', riskTier: 'read', description: 'Fetch a single record by ID' },
  { name: 'delete_record', riskTier: 'high-write', description: 'Permanently delete a record' },
  { name: 'send_email_blast', riskTier: 'critical', description: 'Send email to all subscribers' },
];
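One way the handler side of this classification might look is a lookup from tier to gate, sketched below. The `Gate` type, the `GATE_BY_TIER` table, and the `gateFor` helper are hypothetical names introduced for illustration, and the low-risk write tier is mapped to no gate here although the table above also allows a soft warning. The sketch treats unregistered tools as critical so that a forgotten registration fails closed.

```typescript
type RiskTier = 'read' | 'low-write' | 'high-write' | 'critical';
type Gate = 'none' | 'user-confirmation' | 'user-and-admin';

// Gate required for each tier, mirroring the classification table.
const GATE_BY_TIER: Record<RiskTier, Gate> = {
  'read': 'none',
  'low-write': 'none',
  'high-write': 'user-confirmation',
  'critical': 'user-and-admin',
};

// Hypothetical registry keyed by tool name.
const TIERS = new Map<string, RiskTier>([
  ['get_record', 'read'],
  ['delete_record', 'high-write'],
  ['send_email_blast', 'critical'],
]);

// Unregistered tools get the strictest gate, so a missing
// registration fails closed instead of running ungated.
function gateFor(toolName: string): Gate {
  const tier = TIERS.get(toolName);
  return tier === undefined ? 'user-and-admin' : GATE_BY_TIER[tier];
}

console.log(gateFor('get_record'));    // none
console.log(gateFor('delete_record')); // user-confirmation
console.log(gateFor('unknown_tool'));  // user-and-admin
```

Keeping the mapping in one table means a new tier cannot be added without deciding its gate, because `Record<RiskTier, Gate>` fails to compile if a key is missing.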
Implementing approval with elicitation
MCP elicitation (available in clients that support the elicitation capability) is the right protocol mechanism for approval gates. The tool handler computes a change preview, sends it to the user, and waits for a boolean confirm/deny response before executing.
// tools/delete-record.ts
import type { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { z } from 'zod';
import { requireApproval } from '../lib/approval.js';
import { audit } from '../lib/audit.js';
import { db } from '../lib/db.js';

export function registerDeleteRecord(server: McpServer) {
  server.tool(
    'delete_record',
    {
      // server.tool() takes a Zod shape for its parameters, not a JSON Schema object
      record_id: z.string().describe('ID of the record to delete'),
    },
    async (args, extra) => {
      // 1. Fetch what will be deleted (preview)
      const record = await db.records.findById(args.record_id);
      if (!record) {
        return {
          content: [{ type: 'text', text: `No record found with ID ${args.record_id}` }],
          isError: true,
        };
      }

      const preview = [
        `Record: ${record.name} (${record.id})`,
        `Type: ${record.type}`,
        `Created: ${record.created_at}`,
        `This action is permanent and cannot be undone.`,
      ].join('\n');

      // 2. Gate on approval
      const approved = await requireApproval(extra, {
        action: 'delete_record',
        preview,
        warningLevel: 'high',
      });

      // Log approvals and denials alike. The userId source is an assumption;
      // use whatever identity your auth layer provides.
      await audit.logApproval({
        timestamp: new Date().toISOString(),
        userId: extra.authInfo?.clientId ?? 'unknown',
        sessionId: extra.sessionId ?? 'unknown',
        toolName: 'delete_record',
        arguments: args,
        action: approved ? 'approved' : 'denied',
        preview,
      });

      if (!approved) {
        return { content: [{ type: 'text', text: 'Delete cancelled. Record not deleted.' }] };
      }

      // 3. Execute only after approval
      await db.records.delete(args.record_id);
      return { content: [{ type: 'text', text: `Record ${args.record_id} deleted.` }] };
    }
  );
}
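// lib/approval.ts
// NOTE: this helper assumes `extra` exposes `clientCapabilities` and
// `requestElicitation`. Depending on your SDK version, the equivalents may
// live on the server object instead (for example `getClientCapabilities()`
// and `elicitInput()`), and `RequestHandlerExtra` may be exported from
// '@modelcontextprotocol/sdk/shared/protocol.js'. Adjust to match your version.
import type { RequestHandlerExtra } from '@modelcontextprotocol/sdk/server/mcp.js';

interface ApprovalOptions {
  action: string;
  preview: string;
  warningLevel: 'low' | 'high' | 'critical';
}

export async function requireApproval(
  extra: RequestHandlerExtra,
  opts: ApprovalOptions
): Promise<boolean> {
  // Fail closed: without elicitation there is no way to ask the user.
  if (!extra.clientCapabilities?.elicitation) {
    throw new Error(
      `Tool "${opts.action}" requires user approval but the connected client ` +
        `does not support elicitation. Use a client with elicitation support.`
    );
  }

  const result = await extra.requestElicitation({
    message: `⚠️ Confirm ${opts.warningLevel === 'critical' ? 'CRITICAL ' : ''}action:\n\n${opts.preview}`,
    requestedSchema: {
      type: 'object',
      properties: {
        confirmed: {
          type: 'boolean',
          title: 'I confirm this action',
          description: 'Check this box to proceed',
        },
      },
      required: ['confirmed'],
    },
  });

  // Decline and cancel both count as a denial.
  if (result.action !== 'accept') return false;
  // content can be absent, so guard the property access.
  return result.content?.confirmed === true;
}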
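The decision at the end of the helper can be isolated as a pure function, which makes the rule easy to inspect: approval needs both an explicit accept and a `confirmed` value that is strictly `true`. The sketch below assumes the result shape used in this guide (an `action` of accept, decline, or cancel, plus optional form `content`); `ElicitationResult` and `isApproved` are illustrative names, not SDK exports.

```typescript
// Result shape as used in this guide (an assumption, not an SDK type).
interface ElicitationResult {
  action: 'accept' | 'decline' | 'cancel';
  content?: Record<string, unknown>;
}

// Approval requires BOTH an explicit accept and confirmed === true.
// Anything else (unchecked box, missing content, the string "true") is a denial.
function isApproved(result: ElicitationResult): boolean {
  return result.action === 'accept' && result.content?.confirmed === true;
}

console.log(isApproved({ action: 'accept', content: { confirmed: true } }));   // true
console.log(isApproved({ action: 'accept', content: { confirmed: false } }));  // false
console.log(isApproved({ action: 'accept', content: { confirmed: 'true' } })); // false
console.log(isApproved({ action: 'decline' }));                                // false
```

The strict comparison is deliberate: a truthy check would treat a string or a non-zero number as consent.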
Why server-side enforcement is non-negotiable
It's tempting to put the approval requirement in the system prompt: "Before calling delete_record, always ask the user for confirmation." This approach has a critical flaw: the LLM controls whether the confirmation happens, not the server. Four failure modes:
| Failure mode | Soft (prompt-based) gate | Hard (handler-based) gate |
|---|---|---|
| LLM misinterprets user intent as implicit confirmation | Deletes without prompting | Prompts regardless |
| Adversarial prompt injection tricks LLM to skip confirmation | Deletes without prompting | Prompts regardless |
| Different LLM (model swap, API change) follows different instructions | Behaviour undefined | Prompts regardless |
| System prompt truncated by context limits | Gate instruction lost | Prompts regardless |
Server-side enforcement means the gate runs in the handler before any database write, regardless of what instructions the LLM did or didn't follow. The LLM cannot call delete_record without the approval dialog appearing — the server enforces it unconditionally.
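A minimal sketch of what "the gate runs in the handler" can mean structurally: the destructive function is only reachable through a wrapper that asks for approval first. `withApprovalGate`, `Approver`, and `ToolResult` are hypothetical names for illustration, and the approver here is any async function returning a boolean, standing in for the elicitation call.

```typescript
type ToolResult = { content: { type: 'text'; text: string }[]; isError?: boolean };
type Approver = (preview: string) => Promise<boolean>;

// Wraps a destructive operation so it cannot run unless the approver says yes.
// The caller (and therefore the LLM) only ever receives the wrapped function.
function withApprovalGate<A>(
  describe: (args: A) => string,
  approve: Approver,
  execute: (args: A) => Promise<ToolResult>,
): (args: A) => Promise<ToolResult> {
  return async (args: A): Promise<ToolResult> => {
    const approved = await approve(describe(args));
    if (!approved) {
      return { content: [{ type: 'text', text: 'Cancelled: no changes made.' }] };
    }
    return execute(args);
  };
}
```

Because `execute` is never exported on its own, no sequence of tool calls can reach it without passing through `approve`.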
Showing a diff preview before confirmation
For update operations, a flat string preview is less useful than a diff. Showing the user exactly what will change before they confirm reduces both false approvals (clicking through without reading) and false denials (declining because the preview was unclear).
// For record updates: compute a before/after diff
async function buildUpdatePreview(
  current: Record<string, unknown>,
  updates: Record<string, unknown>
): Promise<string> {
  const lines: string[] = ['Changes to be applied:'];
  for (const [key, newValue] of Object.entries(updates)) {
    // Compare serialized forms so equal arrays and objects are not reported
    // as changes (!== only compares references for those). Assumes stable key order.
    const before = JSON.stringify(current[key]);
    const after = JSON.stringify(newValue);
    if (before !== after) {
      lines.push(`  ${key}:`);
      lines.push(`    before: ${before}`);
      lines.push(`    after:  ${after}`);
    }
  }
  if (lines.length === 1) {
    return 'No changes detected.';
  }
  return lines.join('\n');
}

// In the tool handler:
const preview = await buildUpdatePreview(existingRecord, args.updates);
const approved = await requireApproval(extra, {
  action: 'update_record',
  preview,
  warningLevel: 'high',
});
Approval timeouts
The protocol does not define an expiry for elicitation requests, so unless your SDK or transport applies its own request timeout, the tool handler can wait indefinitely for a response. For approval dialogs on critical operations, add an explicit timeout that auto-denies if the user doesn't respond within a reasonable window.
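// Add to lib/approval.ts; needs `import { audit } from './audit.js'` at the top.
const TIMED_OUT = Symbol('timed-out');

export async function requireApprovalWithTimeout(
  extra: RequestHandlerExtra,
  opts: ApprovalOptions,
  timeoutMs = 60_000
): Promise<boolean> {
  const elicitationPromise = extra.requestElicitation({
    message: `Confirm action (auto-denied in ${timeoutMs / 1000}s if no response):\n\n${opts.preview}`,
    requestedSchema: {
      type: 'object',
      properties: {
        confirmed: { type: 'boolean', title: 'Confirm' },
      },
      required: ['confirmed'],
    },
  });

  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeoutPromise = new Promise<typeof TIMED_OUT>((resolve) => {
    timer = setTimeout(() => resolve(TIMED_OUT), timeoutMs);
  });

  try {
    const result = await Promise.race([elicitationPromise, timeoutPromise]);
    if (result === TIMED_OUT) {
      // Only a real timeout is logged as one; elicitation errors propagate
      // to the caller instead of being mislabelled as timeouts.
      await audit.logApproval({
        timestamp: new Date().toISOString(),
        userId: extra.authInfo?.clientId ?? 'unknown', // assumption: identity comes from auth info
        sessionId: extra.sessionId ?? 'unknown',
        toolName: opts.action,
        arguments: {}, // not available in this helper; pass via opts if you need them logged
        action: 'timeout',
        preview: opts.preview,
      });
      return false;
    }
    return result.action === 'accept' && result.content?.confirmed === true;
  } finally {
    clearTimeout(timer); // do not leave the timer pending after a response arrives
  }
}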
Audit trail
Every approval and denial should be logged. The audit trail is your evidence that the system worked as intended, and it's essential for post-incident investigation if a destructive action is taken that the user claims they didn't authorize.
// lib/audit.ts
import { db } from './db.js';

interface ApprovalEvent {
  timestamp?: string; // optional: defaults to the time of logging
  userId: string;
  sessionId: string;
  toolName: string;
  arguments: Record<string, unknown>;
  action: 'approved' | 'denied' | 'timeout' | 'no_elicitation_support';
  preview: string;
}

export const audit = {
  async logApproval(event: ApprovalEvent): Promise<void> {
    await db.approvalEvents.insert({
      ...event,
      timestamp: event.timestamp ?? new Date().toISOString(),
    });
  },
};
The audit log should be append-only and stored separately from the tables the destructive tools operate on. If a tool deletes rows from the records table, the approval log should be in a different table (or a different database) so deleting a record doesn't also delete the approval evidence.
The audit logging guide covers retention, access control, and querying approval history in more detail.
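To make "append-only" concrete, here is a small in-memory sketch of the interface such a log might expose: it can add events and read them, and nothing else. `AppendOnlyApprovalLog` is a hypothetical stand-in for a separate table or database, and the event shape is trimmed to three fields for brevity.

```typescript
interface LoggedEvent {
  timestamp: string;
  toolName: string;
  action: 'approved' | 'denied' | 'timeout';
}

// In-memory stand-in for an append-only store: no update or delete methods exist.
class AppendOnlyApprovalLog {
  private readonly events: Readonly<LoggedEvent>[] = [];

  append(event: LoggedEvent): void {
    // Store a frozen copy so later changes to the caller's object
    // cannot rewrite history.
    this.events.push(Object.freeze({ ...event }));
  }

  all(): readonly Readonly<LoggedEvent>[] {
    // Return a copy so callers cannot push to or splice the internal array.
    return [...this.events];
  }
}

const approvalLog = new AppendOnlyApprovalLog();
approvalLog.append({
  timestamp: new Date().toISOString(),
  toolName: 'delete_record',
  action: 'denied',
});
console.log(approvalLog.all().length); // 1
```

In a real deployment the same property would come from storage permissions (insert-only access for the server's database role) rather than from application code alone.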
Frequently asked questions
What if the client doesn't support elicitation?
You have two options. Option A: fail closed — return an error explaining that the tool requires elicitation-capable clients (list compatible clients in the error message). Option B: fail safe — if you have a secondary approval mechanism (e.g. a webhook that pings a Slack channel and waits for a reaction), use that as a fallback. Option A is almost always the right choice; building a secondary approval mechanism is significant complexity that's usually better spent on ensuring users access the server through an elicitation-capable client.
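Option A might be sketched as a helper that builds the error result, assuming you prefer returning an `isError` result to throwing. `elicitationRequiredResult` is a hypothetical name, and the client names passed in are placeholders you would replace with clients you have verified.

```typescript
interface ToolResult {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
}

// Fail closed: explain why nothing happened and how the user can proceed.
function elicitationRequiredResult(toolName: string, compatibleClients: string[]): ToolResult {
  const clients =
    compatibleClients.length > 0
      ? ` Compatible clients: ${compatibleClients.join(', ')}.`
      : '';
  return {
    content: [
      {
        type: 'text',
        text:
          `Tool "${toolName}" requires user approval, but this client does not ` +
          `support elicitation. Nothing was changed.${clients}`,
      },
    ],
    isError: true,
  };
}

// 'ExampleClient' is a placeholder, not a real product name.
console.log(elicitationRequiredResult('delete_record', ['ExampleClient']).content[0]?.text);
```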
Should the approval gate be a separate tool (approve_action) or inline in the destructive tool?
Inline — always. A separate approve_action tool creates the same problem as prompt-based gates: the LLM controls whether it calls the approval tool before the destructive tool. A crafty or confused LLM can call delete_record without first calling approve_action. The approval gate in the handler cannot be bypassed this way.
Can I batch multiple destructive operations into a single approval?
Yes, but carefully. If a user asks "delete all records older than 30 days," showing a count ("This will delete 847 records. Confirm?") is better than 847 individual approval dialogs. Design bulk operations as single tools that compute the full impact first, then show a summary in the approval preview. Never loop over individual tool calls to build up a batch — use a dedicated bulk tool that enforces approval for the entire operation in one shot.
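A summary preview for a bulk tool could look like the sketch below: a total count, a few sample rows so the user can sanity-check the selection, and the remainder as a number. `buildBulkDeletePreview` and `RecordSummary` are illustrative names, and the sample size of three is an arbitrary default.

```typescript
interface RecordSummary {
  id: string;
  name: string;
}

// Summarize the full impact of a bulk delete in one approval preview.
function buildBulkDeletePreview(records: RecordSummary[], sampleSize = 3): string {
  if (records.length === 0) {
    return 'No records match. Nothing will be deleted.';
  }
  const sample = records.slice(0, sampleSize).map((r) => `  - ${r.name} (${r.id})`);
  const remaining = records.length - sample.length;
  const lines = [`This will delete ${records.length} record(s):`, ...sample];
  if (remaining > 0) {
    lines.push(`  ...and ${remaining} more`);
  }
  lines.push('This action is permanent and cannot be undone.');
  return lines.join('\n');
}
```

The bulk tool would compute the matching records first, pass this string as the approval preview, and delete only the IDs it showed the count for, so the set cannot grow between preview and execution.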
How does approval interact with the tool's error handling?
The approval check happens before the database write, so if approval fails (user denies, timeout, or elicitation error), the function returns early and no write occurs. The tool result's isError field should be false for user-denied approvals (denial is a valid outcome, not an error) and true for timeout or elicitation failures. This distinction matters because the LLM treats isError: true as a tool malfunction that it might retry.
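The mapping described above can be written down as a single function so every gated tool reports non-approval the same way. This is a sketch: `ApprovalOutcome` and `resultForNonApproval` are hypothetical names, and the message strings are placeholders.

```typescript
type ApprovalOutcome = 'approved' | 'denied' | 'timeout' | 'elicitation_error';

interface ToolResult {
  content: { type: 'text'; text: string }[];
  isError: boolean;
}

// Denial is a valid outcome (isError: false); timeouts and elicitation
// failures are malfunctions (isError: true). No write happens in any case.
function resultForNonApproval(outcome: Exclude<ApprovalOutcome, 'approved'>): ToolResult {
  switch (outcome) {
    case 'denied':
      return {
        content: [{ type: 'text', text: 'Cancelled by user. No changes were made.' }],
        isError: false,
      };
    case 'timeout':
      return {
        content: [{ type: 'text', text: 'Approval timed out. No changes were made.' }],
        isError: true,
      };
    case 'elicitation_error':
      return {
        content: [{ type: 'text', text: 'Approval request failed. No changes were made.' }],
        isError: true,
      };
  }
}

console.log(resultForNonApproval('denied').isError);  // false
console.log(resultForNonApproval('timeout').isError); // true
```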
How do I test tools with approval gates?
Create a test harness that mocks extra.requestElicitation to return a canned response. Pass a mock extra object to the tool handler directly, bypassing the MCP protocol. Write separate test cases for each action: accept+confirmed, accept+not-confirmed, decline, and cancel. Also write a test for the path where the client lacks elicitation capability. The goal is 100% branch coverage on all approval outcomes before any code touches a real database.
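A table-driven version of those cases might look like the following sketch. To keep it self-contained it uses a local copy of the approval check and a `MockExtra` shape that mirrors the `extra` fields assumed elsewhere in this guide; neither is an SDK type, and `mockExtra` and `runCases` are illustrative names.

```typescript
// Stand-ins for the shapes used in this guide (assumptions, not SDK types).
interface ElicitationResult {
  action: 'accept' | 'decline' | 'cancel';
  content?: Record<string, unknown>;
}
interface MockExtra {
  clientCapabilities?: { elicitation?: object };
  requestElicitation: (req: { message: string }) => Promise<ElicitationResult>;
}

// Builds a fake `extra` that always answers with the canned response.
function mockExtra(response: ElicitationResult, supportsElicitation = true): MockExtra {
  return {
    clientCapabilities: supportsElicitation ? { elicitation: {} } : {},
    requestElicitation: async () => response,
  };
}

// Local copy of the approval check so the sketch runs on its own.
async function requireApproval(extra: MockExtra, preview: string): Promise<boolean> {
  if (!extra.clientCapabilities?.elicitation) {
    throw new Error('Client does not support elicitation');
  }
  const result = await extra.requestElicitation({ message: preview });
  return result.action === 'accept' && result.content?.confirmed === true;
}

const cases: Array<[string, ElicitationResult, boolean]> = [
  ['accept + confirmed', { action: 'accept', content: { confirmed: true } }, true],
  ['accept + not confirmed', { action: 'accept', content: { confirmed: false } }, false],
  ['decline', { action: 'decline' }, false],
  ['cancel', { action: 'cancel' }, false],
];

// Returns the names of failing cases; an empty array means all passed.
async function runCases(): Promise<string[]> {
  const failures: string[] = [];
  for (const [name, response, expected] of cases) {
    const actual = await requireApproval(mockExtra(response), 'preview');
    if (actual !== expected) failures.push(name);
  }
  // A client without elicitation must cause a rejection, never a silent approval.
  try {
    await requireApproval(mockExtra({ action: 'accept', content: { confirmed: true } }, false), 'preview');
    failures.push('no elicitation support');
  } catch {
    // expected
  }
  return failures;
}
```

In a real suite each row would become its own test case in your test runner, with the handler under test imported instead of copied.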
Further reading
- MCP server elicitation — requesting user input mid-tool-call
- MCP server audit logging — structured event trails for compliance
- MCP server RBAC — role-based access control for tool calls
- MCP server prompt injection defense — protecting tools from adversarial output
- MCP server error handling — isError, McpError, and structured failures
- AliveMCP — continuous protocol monitoring for MCP servers