
Human-in-the-Loop for MCP Servers — approval gates, confirmation patterns, and safe agentic workflows

Fully autonomous agents are brittle at the edges where actions become irreversible: deleting customer records, sending emails to thousands of recipients, executing financial transactions, or pushing to production. MCP servers are where that irreversibility lives — the tool handler is the last code that runs before the action lands. Human-in-the-loop patterns interrupt the tool call sequence at that boundary and require explicit approval before proceeding. This guide covers synchronous confirmation (block the tool call, return a pending result), asynchronous approval workflows (dispatch and resume), tiered escalation based on action risk, and the rollback obligations that make approval gates meaningful rather than theatrical.

TL;DR

Store pending approvals in a database table with an expiry. When a tool call requires approval, insert a row, return a { status: "pending_approval", approval_id } result, and let the agent poll a check_approval_status tool. The approver approves or denies in a UI or Slack integration that writes to the same table. Build three tiers: auto-approve (low risk), human-approve (medium risk), deny-and-alert (high risk or expired). Wire AliveMCP to your approval server — an approval service that is down silently blocks every agentic action without any alert.

Why tool annotations alone are not enough

MCP tool annotations (readOnlyHint: false, destructiveHint: true) signal intent to MCP clients such as Claude Desktop: they let the host application decide whether to show a confirmation dialog before calling your tool. This is the right first layer. The gap is that annotation-based approval lives entirely in the client, and the annotations are hints rather than guarantees. If the agent calls the tool programmatically, or the client does not implement a confirmation dialog, the annotation has no enforcement effect.

Server-side approval gates enforce at the tool handler boundary regardless of client. The agent cannot execute the action without an approval token — not because the client stopped it but because the server refused to run it.

Approach	Enforcement location	Bypassable by agent?	Works offline?
SDK annotations only	MCP client / host app	Yes — agent can call directly	Yes — no server round-trip
Server-side approval gate	MCP server handler	No — server rejects without token	No — requires approval service
Both combined	Client + server	No	No

The approval database schema

Approval state lives in a table that both the MCP server and the approval UI read and write:

-- approvals table
CREATE TABLE pending_approvals (
  id          UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  tool_name   TEXT NOT NULL,
  parameters  JSONB NOT NULL,
  risk_tier   TEXT NOT NULL CHECK (risk_tier IN ('low', 'medium', 'high')),
  status      TEXT NOT NULL DEFAULT 'pending'
              CHECK (status IN ('pending', 'approved', 'denied', 'expired', 'executed')),
  requested_by TEXT,          -- agent session / user context
  decided_by  TEXT,           -- approver identity
  decision_at TIMESTAMPTZ,
  expires_at  TIMESTAMPTZ NOT NULL,
  created_at  TIMESTAMPTZ NOT NULL DEFAULT NOW()
);

CREATE INDEX ON pending_approvals (status, expires_at);

The expires_at column is critical. An approval that never expires accumulates indefinitely. Set it to your workflow's natural timeout — typically 4–24 hours for human approvals, 30 seconds for synchronous confirmation dialogs.
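One way to keep those timeouts in a single place is a small helper that derives expires_at from the kind of approval. This is a minimal sketch: the names ApprovalKind, expiresAtFor, and isExpired are hypothetical, and the durations are just the lower bounds mentioned above.

```typescript
// Hypothetical helper: names and durations are illustrative, not part of any SDK.
type ApprovalKind = 'sync_dialog' | 'human_review';

const TTL_MS: Record<ApprovalKind, number> = {
  sync_dialog: 30 * 1000,           // 30 seconds for synchronous confirmation dialogs
  human_review: 4 * 60 * 60 * 1000, // 4 hours, the low end of the 4 to 24 hour range
};

// Compute the value to store in expires_at when the approval row is created
function expiresAtFor(kind: ApprovalKind, now: Date = new Date()): Date {
  return new Date(now.getTime() + TTL_MS[kind]);
}

// An approval is expired once the current time reaches expires_at
function isExpired(expiresAt: Date, now: Date = new Date()): boolean {
  return expiresAt.getTime() <= now.getTime();
}
```

Passing `now` explicitly keeps the helper deterministic in tests. In production the database clock (NOW()) is usually the safer source of truth for the comparison itself.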

The approval middleware pattern

Wrap destructive tool handlers with an approval middleware that intercepts calls before execution:

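// approval-middleware.ts
import { db } from './db.js';
import { classifyRisk, RiskTier } from './risk-classifier.js';
// Assumed module paths: notifyApprovers is the Slack notifier shown later in this guide;
// alertOps stands for whatever paging or alerting hook your team already uses.
import { notifyApprovers } from './notify-approvers.js';
import { alertOps } from './alert-ops.js';

export type ApprovalResult =
  | { proceed: true }
  | { proceed: false; approvalId: string; status: 'pending' | 'denied' | 'expired' };

export async function requireApproval(
  toolName: string,
  parameters: Record<string, unknown>,
  requestedBy: string
): Promise<ApprovalResult> {
  // Keep the approval token out of the risk check and the stored parameters
  const { __approval_id, ...toolParams } = parameters;
  const approvalId = typeof __approval_id === 'string' ? __approval_id : undefined;
  const risk = classifyRisk(toolName, toolParams);

  // Auto-approve low-risk actions
  if (risk === RiskTier.Low) {
    return { proceed: true };
  }

  // Auto-deny high-risk actions (require explicit policy exception)
  if (risk === RiskTier.High) {
    await alertOps(toolName, toolParams, requestedBy);
    return { proceed: false, approvalId: 'blocked', status: 'denied' };
  }

  // Redeem an existing approval (agent calling with prior approval)
  if (approvalId) {
    // One atomic UPDATE: the token is consumed only if it is approved, unexpired,
    // and was issued for this exact tool and parameter set. Two concurrent calls
    // cannot both redeem it, and it cannot be replayed against a different call.
    const redeemed = await db.query(
      `UPDATE pending_approvals SET status = 'executed'
       WHERE id = $1 AND status = 'approved' AND expires_at > NOW()
         AND tool_name = $2 AND parameters = $3::jsonb
       RETURNING id`,
      [approvalId, toolName, JSON.stringify(toolParams)]
    );
    if (redeemed.rows.length > 0) return { proceed: true };

    // Not redeemable: look up the row to report why
    const existing = await db.query(
      `SELECT status, expires_at FROM pending_approvals WHERE id = $1`,
      [approvalId]
    );
    const row = existing.rows[0];
    if (!row) return { proceed: false, approvalId, status: 'denied' };
    if (row.status === 'pending' || row.status === 'approved') {
      if (row.expires_at <= new Date()) {
        return { proceed: false, approvalId, status: 'expired' };
      }
      if (row.status === 'pending') {
        return { proceed: false, approvalId, status: 'pending' };
      }
    }
    if (row.status === 'expired') return { proceed: false, approvalId, status: 'expired' };
    // Denied, already executed, or approved for a different tool or parameter set
    return { proceed: false, approvalId, status: 'denied' };
  }

  // Create a new approval request
  const expiresAt = new Date(Date.now() + 4 * 60 * 60 * 1000); // 4 hours
  const result = await db.query(
    `INSERT INTO pending_approvals (tool_name, parameters, risk_tier, requested_by, expires_at)
     VALUES ($1, $2::jsonb, $3, $4, $5) RETURNING id`,
    [toolName, JSON.stringify(toolParams), risk, requestedBy, expiresAt]
  );
  const newId = result.rows[0].id;

  await notifyApprovers(newId, toolName, toolParams);
  return { proceed: false, approvalId: newId, status: 'pending' };
}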

In each destructive tool handler, call requireApproval before executing. If proceed is false, return a structured result the agent can act on:

// delete_customer_records tool handler
server.tool('delete_customer_records', {
  customer_ids: z.array(z.string()).min(1).max(1000),
  // Reject malformed IDs before they reach the UUID column
  __approval_id: z.string().uuid().optional()
}, async ({ customer_ids, __approval_id }, extra) => {
  const approval = await requireApproval(
    'delete_customer_records',
    { customer_ids, __approval_id },
    extra.sessionId ?? 'unknown' // sessionId is only set on session-aware transports
  );

  if (!approval.proceed) {
    const pending = approval.status === 'pending';
    return {
      // Flag denied and expired approvals as tool errors so the agent stops polling
      isError: !pending,
      content: [{
        type: 'text',
        text: JSON.stringify({
          status: pending ? 'pending_approval' : approval.status,
          approval_id: approval.approvalId,
          message: pending
            ? `Approval requested for deleting ${customer_ids.length} customer records. Call this tool again with __approval_id when approved.`
            : `Approval ${approval.status}. Cannot proceed.`,
          check_status_with: 'check_approval_status'
        })
      }]
    };
  }

  // Execute the destructive action
  await customerDb.deleteMany(customer_ids);
  return { content: [{ type: 'text', text: `Deleted ${customer_ids.length} customer records.` }] };
});

The check_approval_status tool

The agent needs a way to poll for approval without re-triggering the destructive action:

server.tool('check_approval_status', {
  approval_id: z.string().uuid() // a non-UUID value would make the query below throw
}, async ({ approval_id }) => {
  const result = await db.query(
    `SELECT tool_name, parameters, status, decision_at, expires_at,
            decided_by, risk_tier
     FROM pending_approvals WHERE id = $1`,
    [approval_id]
  );

  if (!result.rows.length) {
    return {
      content: [{ type: 'text', text: JSON.stringify({ error: 'Approval not found' }) }]
    };
  }

  const row = result.rows[0];
  const now = new Date();

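  // Auto-expire stale approvals, including approved ones that were never redeemed
  if ((row.status === 'pending' || row.status === 'approved') && row.expires_at <= now) {
    await db.query(
      `UPDATE pending_approvals SET status = 'expired'
       WHERE id = $1 AND status IN ('pending', 'approved')`,
      [approval_id]
    );
    row.status = 'expired';
  }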

  return {
    content: [{
      type: 'text',
      text: JSON.stringify({
        approval_id,
        status: row.status,
        tool_name: row.tool_name,
        risk_tier: row.risk_tier,
        expires_at: row.expires_at,
        decided_by: row.decided_by ?? null,
        decision_at: row.decision_at ?? null,
        next_action: row.status === 'approved'
          ? `Call ${row.tool_name} again with __approval_id: "${approval_id}"`
          : row.status === 'pending'
          ? 'Wait and poll again, or ask the user to approve'
          : 'Cannot proceed — request a new approval if needed'
      })
    }]
  };
});

Risk classification

The classifyRisk function maps tool names and parameter shapes to risk tiers. Keep it explicit — don't use heuristics that can be fooled by parameter values:

// risk-classifier.ts
export enum RiskTier {
  Low = 'low',       // auto-approve: reads, searches, non-destructive writes
  Medium = 'medium', // require human approval
  High = 'high'      // deny by default, require policy exception
}

const RISK_MAP: Record<string, RiskTier> = {
  // Reads: always low
  search_customers: RiskTier.Low,
  get_customer: RiskTier.Low,
  list_invoices: RiskTier.Low,

  // Non-destructive writes: low
  create_draft_email: RiskTier.Low,
  add_note: RiskTier.Low,

  // Writes that affect data: medium
  update_customer: RiskTier.Medium,
  send_email: RiskTier.Medium,
  create_invoice: RiskTier.Medium,
  cancel_subscription: RiskTier.Medium,
  // Medium so the approval flow can run; the bulk check below escalates
  // calls with more than 100 IDs to high
  delete_customer_records: RiskTier.Medium,

  // Bulk or irreversible: high
  bulk_email_send: RiskTier.High,
  refund_all_invoices: RiskTier.High,
};

export function classifyRisk(
  toolName: string,
  parameters: Record<string, unknown>
): RiskTier {
  // Unknown tools default to medium so they still require a human
  const base = RISK_MAP[toolName] ?? RiskTier.Medium;

  // Escalate based on parameter shape: any array-valued parameter with more
  // than 100 entries is treated as a bulk operation, whatever its name
  const isBulk = Object.values(parameters).some(
    (value) => Array.isArray(value) && value.length > 100
  );
  if (isBulk) {
    return RiskTier.High;
  }

  return base;
}

Approver notifications and the approval UI

Human approvers need to be notified promptly. A Slack integration is the lowest-friction path: send a message with the tool name, parameter summary, risk tier, expiry time, and Approve / Deny buttons that call your approval API:

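// notify-approvers.ts
import { WebClient } from '@slack/web-api';

// SLACK_BOT_TOKEN is an assumed environment variable name
const slackClient = new WebClient(process.env.SLACK_BOT_TOKEN);

export async function notifyApprovers(
  approvalId: string,
  toolName: string,
  parameters: Record<string, unknown>
): Promise<void> {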
  const paramSummary = JSON.stringify(parameters, null, 2).slice(0, 500);
  await slackClient.chat.postMessage({
    channel: process.env.APPROVAL_SLACK_CHANNEL!,
    text: `Action requires approval`,
    blocks: [
      {
        type: 'section',
        text: {
          type: 'mrkdwn',
          text: `*Approval required:* \`${toolName}\`\n\`\`\`${paramSummary}\`\`\``
        }
      },
      {
        type: 'actions',
        elements: [
          {
            type: 'button',
            text: { type: 'plain_text', text: 'Approve' },
            style: 'primary',
            action_id: 'approve',
            value: approvalId
          },
          {
            type: 'button',
            text: { type: 'plain_text', text: 'Deny' },
            style: 'danger',
            action_id: 'deny',
            value: approvalId
          }
        ]
      }
    ]
  });
}

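// Slack interaction endpoint
// Verify the X-Slack-Signature header before trusting this payload; without that
// check, anyone who can reach the endpoint can approve their own requests.
app.post('/slack/actions', express.urlencoded({ extended: true }), async (req, res) => {
  const payload = JSON.parse(req.body.payload);
  const action = payload.actions[0];
  const approvalId = action.value;
  const decidedBy = payload.user.username;
  const newStatus = action.action_id === 'approve' ? 'approved' : 'denied';

  // Only undecided, unexpired requests can change state
  const result = await db.query(
    `UPDATE pending_approvals
     SET status = $1, decided_by = $2, decision_at = NOW()
     WHERE id = $3 AND status = 'pending' AND expires_at > NOW()`,
    [newStatus, decidedBy, approvalId]
  );

  res.send({
    text: result.rowCount === 1
      ? `Marked as ${newStatus}`
      : 'This request was already decided or has expired'
  });
});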
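Because the interaction endpoint is what flips an approval to approved, it is worth authenticating every request before acting on it. The following is a sketch of Slack's v0 request-signing check using Node's crypto module. The function name verifySlackSignature is hypothetical, and the signing secret is assumed to come from your Slack app's settings.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Sketch of Slack's v0 signature check; the helper name is illustrative.
function verifySlackSignature(
  signingSecret: string,
  timestamp: string, // X-Slack-Request-Timestamp header
  rawBody: string,   // unparsed request body, exactly as received
  signature: string, // X-Slack-Signature header
  nowSeconds: number = Math.floor(Date.now() / 1000)
): boolean {
  // Reject stale or malformed timestamps to limit replay (five-minute window)
  const ts = Number(timestamp);
  if (!Number.isFinite(ts) || Math.abs(nowSeconds - ts) > 60 * 5) return false;

  // Slack signs the string "v0:<timestamp>:<raw body>" with HMAC-SHA256
  const expected =
    'v0=' +
    createHmac('sha256', signingSecret)
      .update(`v0:${timestamp}:${rawBody}`)
      .digest('hex');

  const a = Buffer.from(expected, 'utf8');
  const b = Buffer.from(signature, 'utf8');
  // timingSafeEqual throws on length mismatch, so compare lengths first
  return a.length === b.length && timingSafeEqual(a, b);
}
```

The check needs the raw body, which a body parser normally discards. One option, assuming Express, is to capture it through the `verify` callback that `express.urlencoded` accepts and respond with 401 when the check fails.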

Rollback obligations

Approval gates are only meaningful if the approved actions are reversible when mistakes happen. Before adding an approval gate to a tool, decide on the rollback path:

Action type	Rollback approach	Implementation note
Database delete	Soft-delete with `deleted_at`; hard-delete on schedule	Add 30-day grace period before permanent delete
Email send	No rollback — draft step before approval is essential	Store draft; approval converts draft to send
Billing action	Stripe supports refunds programmatically	Log Stripe charge IDs for post-fact reversal
File/object delete	Move to "trash" bucket with TTL; S3 Versioning	Recovery window = TTL duration
Webhook dispatch	Log payload; replay via re-send endpoint	Idempotency key prevents double-execution on replay
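The soft-delete row in the table can be illustrated with an in-memory stand-in for the customers table. This is only a sketch: CustomerRow, softDelete, restore, and purgeExpired are hypothetical names, and a real implementation would run the equivalent UPDATE and DELETE statements against the database, with the purge on a schedule.

```typescript
// In-memory stand-in for a customers table with a deleted_at column
interface CustomerRow {
  id: string;
  deletedAt: Date | null;
}

const GRACE_PERIOD_MS = 30 * 24 * 60 * 60 * 1000; // 30-day grace period

// Mark rows as deleted without removing them; returns how many rows changed
function softDelete(rows: CustomerRow[], ids: string[], now: Date): number {
  const targets = new Set(ids);
  let changed = 0;
  for (const row of rows) {
    if (targets.has(row.id) && row.deletedAt === null) {
      row.deletedAt = now;
      changed++;
    }
  }
  return changed;
}

// Rollback path: clear deleted_at while the row is still inside the grace period
function restore(rows: CustomerRow[], ids: string[]): number {
  const targets = new Set(ids);
  let changed = 0;
  for (const row of rows) {
    if (targets.has(row.id) && row.deletedAt !== null) {
      row.deletedAt = null;
      changed++;
    }
  }
  return changed;
}

// Scheduled hard delete: keep live rows and rows still inside the grace period
function purgeExpired(rows: CustomerRow[], now: Date): CustomerRow[] {
  return rows.filter(
    (row) =>
      row.deletedAt === null ||
      now.getTime() - row.deletedAt.getTime() < GRACE_PERIOD_MS
  );
}
```

With this shape, the approved tool call only ever performs the soft delete. The irreversible step happens later and separately, which gives a mistaken approval a recovery window.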

Monitoring the approval service itself

The approval workflow introduces a new failure mode: the approval service goes down, and every agentic action silently stalls. The agent calls a destructive tool, gets back a pending_approval response, polls check_approval_status, and waits. Meanwhile no decision ever lands, either because the Slack client failed and the notification asking a human to approve was never delivered, or because the approver's click was acknowledged with an HTTP 200 before the database write that records it failed.

Wire AliveMCP to two endpoints:

/health/approvals — checks that the approvals table is reachable, that the Slack notification client is healthy, and that the pending-approval queue depth is not growing unboundedly.
/health/approvals/stale — alerts when any approval has been in pending status for more than N hours without a notification delivered, indicating the notification path is broken.

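app.get('/health/approvals', async (req, res) => {
  try {
    // count(*) is a bigint, which node-postgres returns as a string; cast it to int
    const stale = await db.query(
      `SELECT count(*)::int AS stale_count FROM pending_approvals
       WHERE status = 'pending' AND created_at < NOW() - INTERVAL '1 hour'`
    );
    const staleCount: number = stale.rows[0].stale_count;

    const slackHealthy = await testSlackConnection();

    const status = slackHealthy && staleCount < 10 ? 'ok' : 'degraded';
    res.status(status === 'ok' ? 200 : 503).json({
      status,
      stale_pending_count: staleCount,
      slack_reachable: slackHealthy
    });
  } catch (err) {
    // An unreachable approvals table must fail the check instead of hanging the request
    res.status(503).json({ status: 'down', error: 'approvals table unreachable' });
  }
});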
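The stale check described for /health/approvals/stale needs a record of whether each notification was actually delivered, which the schema shown earlier does not have. The sketch below assumes a hypothetical notified_at column (notifiedAt here), set only after the Slack message is posted successfully, and shows the selection logic as a pure function over already-fetched rows.

```typescript
interface ApprovalRow {
  id: string;
  status: 'pending' | 'approved' | 'denied' | 'expired' | 'executed';
  createdAt: Date;
  notifiedAt: Date | null; // hypothetical column, not part of the schema in this guide
}

// Return IDs of approvals that are still pending, older than maxAgeMs,
// and have no recorded notification delivery
function findUnnotified(rows: ApprovalRow[], now: Date, maxAgeMs: number): string[] {
  return rows
    .filter(
      (row) =>
        row.status === 'pending' &&
        row.notifiedAt === null &&
        now.getTime() - row.createdAt.getTime() > maxAgeMs
    )
    .map((row) => row.id);
}
```

A health endpoint built on this could return 503 whenever the list is non-empty, since even one undelivered notification means an agent is waiting on a human who was never asked.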

Frequently asked questions

How does the agent know it needs to wait for approval?

The tool returns a structured { status: "pending_approval", approval_id, check_status_with: "check_approval_status" } JSON result. The LLM reads this result and, if following your system prompt's instructions about approval workflows, will call check_approval_status on a polling interval. Include explicit instructions in your system prompt: "If a tool returns pending_approval, call check_approval_status with the approval_id every 60 seconds until status is approved or denied. Do not re-call the original tool until approval is confirmed."

What happens if the agent just re-calls the destructive tool without waiting?

If the __approval_id parameter is omitted, the server creates another approval request (or returns the existing pending one if you track by idempotency key). If it is supplied but the approval is still pending, the server reports the pending status and does nothing else. Either way the agent cannot bypass the gate: the server-side check means the action cannot execute regardless of what the agent does at the LLM level. Add duplicate detection: before inserting a new approval row, check whether a pending row already exists for the same tool_name + parameters within the last hour, and return that existing approval ID instead of creating a new one.
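Duplicate detection needs a stable way to decide that two calls are "the same request". One possible approach, sketched here with hypothetical helper names, is to hash the tool name together with a canonical serialization of the parameters so that key order does not matter.

```typescript
import { createHash } from 'node:crypto';

// Serialize with object keys sorted so {a:1,b:2} and {b:2,a:1} produce the same string
function canonicalJson(value: unknown): string {
  if (Array.isArray(value)) {
    return `[${value.map(canonicalJson).join(',')}]`;
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .filter((key) => obj[key] !== undefined)
      .sort()
      .map((key) => `${JSON.stringify(key)}:${canonicalJson(obj[key])}`);
    return `{${entries.join(',')}}`;
  }
  return JSON.stringify(value ?? null);
}

// Fingerprint of a request: tool name plus parameters, ignoring the approval token
function approvalFingerprint(
  toolName: string,
  parameters: Record<string, unknown>
): string {
  const { __approval_id, ...rest } = parameters;
  return createHash('sha256')
    .update(`${toolName}\n${canonicalJson(rest)}`)
    .digest('hex');
}
```

The fingerprint could be stored in an extra indexed column (an assumption, since the schema in this guide has none) and looked up before each insert. Array order is deliberately significant here, so the same IDs in a different order count as a different request.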

How do I handle approvals for long-running agent sessions that may time out?

The agent session's timeout (typically 5–30 minutes) is usually shorter than the human's approval turnaround (minutes to hours). Store the approval state in the database — the agent can pick up the approval ID from a previous session context if the user resumes the conversation. Include the approval ID in the agent's context window so it can reference it. Design your orchestration layer to resume a paused agent session when an approval transitions from pending to approved — webhooks from your approval API to the orchestrator work well here. See long-running MCP tasks for the complementary async task patterns.
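The resume-on-approval idea can be sketched as a small registry inside the orchestrator: sessions park themselves under an approval ID, and the webhook handler wakes them when a decision arrives. ResumeRegistry and its methods are hypothetical names, and a real orchestrator would persist this mapping instead of holding it in memory.

```typescript
type Decision = 'approved' | 'denied' | 'expired';
type ResumeFn = (decision: Decision) => void;

// Hypothetical in-process registry mapping approval IDs to paused sessions
class ResumeRegistry {
  private waiting = new Map<string, ResumeFn>();

  // Called when an agent session pauses on a pending approval
  park(approvalId: string, resume: ResumeFn): void {
    this.waiting.set(approvalId, resume);
  }

  // Called by the webhook handler when the approval API reports a decision.
  // Returns false for unknown or already-resumed IDs, so duplicate webhook
  // deliveries do not resume a session twice.
  onDecision(approvalId: string, decision: Decision): boolean {
    const resume = this.waiting.get(approvalId);
    if (!resume) return false;
    this.waiting.delete(approvalId);
    resume(decision);
    return true;
  }
}
```

The entry is removed before the callback runs, so a callback that throws cannot leave a session registered for a second resume.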

Can I use the MCP sampling API instead of a custom approval workflow?

The MCP sampling API lets MCP servers request LLM completions from the client — useful for summarization or classification within tool handlers, but not designed for human approval gates. Sampling invokes an LLM, not a human. For human-in-the-loop approval you need a separate approval service with a UI or messaging integration that a real person interacts with. The two complement each other: use sampling for automated risk classification, and the approval service for cases where risk classification returns "escalate to human".