MCP server audit logging
Audit logging records every significant action on your MCP server — which user called which tool, with what arguments, what the result was, and how long it took. This trail is indispensable for security reviews, incident forensics, compliance reporting, and diagnosing unexpected behavior in production. For MCP servers specifically, tool calls are the most important events to capture: they are the interface between an LLM agent and your backend, and they carry real authority (read, write, delete, send).
TL;DR
Wrap every tool handler in middleware that emits a structured JSON log line with: timestamp, actor (the authenticated user/token identity), tool name, args (PII-redacted), outcome (ok or error), durationMs, and requestId for correlation. Redact fields like email, password, token, and ssn before writing. Ship logs to a separate storage location so a compromised server process cannot erase its own trail. Retain for 90 days minimum; 1 year for compliance workloads.
Why MCP servers need audit logs
MCP tool calls are not ordinary HTTP requests. An agent can chain dozens of tool calls in a single session — reading files, querying databases, sending messages, triggering deploys — with minimal human review of each individual step. This autonomy makes audit logs more important, not less:
- Post-incident forensics — when a production record is deleted unexpectedly, audit logs tell you which agent session called which tool, with which arguments, at what time
- Compliance — SOC 2, HIPAA, and ISO 27001 controls require evidence that privileged actions are logged and monitored
- Abuse detection — anomalous tool call volumes, unusual argument patterns, or calls from unexpected IP ranges show up in logs before they show up in user complaints
- Debugging agent behavior — when an agent produces a surprising output, the audit log of its tool calls is the ground truth of what actually happened
What to capture per tool call
Every audit log entry should contain enough information to answer: who did what to what, when, and what happened? The minimum viable field set:
| Field | Type | Purpose |
|---|---|---|
| timestamp | ISO 8601 UTC | When the tool call was received (not completed) |
| requestId | UUID | Correlation ID, taken from an HTTP header or generated; ties related log lines together |
| actor.id | string | Authenticated user ID, API key fingerprint, or token sub claim; never the raw token |
| actor.ip | string | Client IP (trust X-Forwarded-For only behind a known proxy) |
| tool | string | Exact tool name as registered (e.g. delete_file) |
| args | object | Sanitized argument object, with PII fields replaced by [REDACTED] |
| outcome | "ok" or "error" | Whether the tool returned normally or threw |
| error | string or null | Error message when outcome is error (truncate at 500 chars) |
| durationMs | integer | Tool execution time in milliseconds |
| serverVersion | string | Your server's version string; helps correlate behavior changes after deploys |
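The field table can be written down as a type so that every log call is checked at compile time. This is a minimal sketch: the interface names and the `toNdjsonLine` helper are illustrative, not part of the MCP SDK.

```typescript
// Shape of one audit log line, mirroring the field table above.
interface AuditActor {
  id: string; // user ID, API key fingerprint, or token `sub`; never the raw token
  ip: string; // client IP as resolved by your proxy-trust rules
}

interface AuditEntry {
  timestamp: string; // ISO 8601 UTC, taken when the call is received
  requestId: string;
  actor: AuditActor;
  tool: string;
  args: Record<string, unknown>; // already redacted
  outcome: 'ok' | 'error';
  error: string | null;
  durationMs: number;
  serverVersion: string;
}

// Hypothetical helper: serialize one entry as a single NDJSON line.
function toNdjsonLine(entry: AuditEntry): string {
  return JSON.stringify(entry) + '\n';
}

const example: AuditEntry = {
  timestamp: '2025-01-01T12:00:00.000Z',
  requestId: '3f0c2a7e-5b1d-4c8e-9a61-0d2f6b7c8e90',
  actor: { id: 'user-42', ip: '203.0.113.7' },
  tool: 'delete_file',
  args: { path: '/tmp/report.txt' },
  outcome: 'ok',
  error: null,
  durationMs: 12,
  serverVersion: '1.0.0',
};

process.stdout.write(toNdjsonLine(example));
```

Keeping `outcome` as a string-literal union means a typo such as 'okay' fails to compile instead of silently polluting the log.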
Middleware pattern
Rather than adding logging code inside each tool handler, wrap the handler when you register the tool. The MCP SDK does not provide a built-in middleware hook, but a higher-order function that takes a handler and returns an audited handler achieves the same result:
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { randomUUID } from 'node:crypto';
import * as fs from 'node:fs/promises';
import { z } from 'zod';
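function auditLog(entry: object) {
  // Write to stdout as newline-delimited JSON (NDJSON).
  // Your container runtime or systemd captures stdout and ships it to your log sink.
  // This assumes an HTTP transport: with the stdio transport, stdout carries
  // protocol messages, so write audit lines to stderr instead.
  process.stdout.write(JSON.stringify(entry) + '\n');
}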
// Entries must be lowercase because keys are lowercased before the lookup
const PII_KEYS = new Set(['email', 'password', 'token', 'secret', 'ssn', 'phone', 'creditcard']);
function redactArgs(args: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, val] of Object.entries(args)) {
    // Redact by case-insensitive key name match
    if (PII_KEYS.has(key.toLowerCase())) {
      result[key] = '[REDACTED]';
    } else if (typeof val === 'string' && val.length > 500) {
      // Truncate large blobs: likely file content, not useful in logs
      result[key] = val.slice(0, 200) + ' ... [TRUNCATED]';
    } else {
      result[key] = val;
    }
  }
  return result;
}
// Wrap a tool handler so that every call emits exactly one audit log entry.
// `context.actor` is an assumption: your authentication layer must attach it.
function withAudit<TArgs extends object, TResult>(
  toolName: string,
  handler: (args: TArgs, context: any) => Promise<TResult>
): (args: TArgs, context: any) => Promise<TResult> {
  return async (args, context) => {
    const requestId = (context.requestId as string | undefined) ?? randomUUID();
    const actor = context.actor ?? { id: 'anonymous', ip: 'unknown' };
    // Record when the call was received, not when it finished
    const receivedAt = new Date().toISOString();
    const start = Date.now();
    let outcome: 'ok' | 'error' = 'ok';
    let error: string | null = null;
    try {
      const result = await handler(args, context);
      return result;
    } catch (err) {
      outcome = 'error';
      // Truncate both Error messages and non-Error throwables
      error = (err instanceof Error ? err.message : String(err)).slice(0, 500);
      throw err;
    } finally {
      // Runs on success and failure, after the outcome is known
      auditLog({
        timestamp: receivedAt,
        requestId,
        actor,
        tool: toolName,
        args: redactArgs(args as Record<string, unknown>),
        outcome,
        error,
        durationMs: Date.now() - start,
        serverVersion: process.env.SERVER_VERSION ?? 'unknown',
      });
    }
  };
}
// Usage
const server = new McpServer({ name: 'my-server', version: '1.0.0' });
server.tool(
  'delete_file',
  'Permanently delete a file from disk',
  { path: z.string() },
  withAudit('delete_file', async ({ path: filePath }, context) => {
    await fs.unlink(filePath);
    return { content: [{ type: 'text', text: `Deleted: ${filePath}` }] };
  })
);
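The wrapper falls back to an anonymous actor when `context.actor` is missing, so something upstream has to populate it. The sketch below shows one way to build that object while applying the X-Forwarded-For rule from the field table. The `CallerInfo` shape, the `resolveActor` name, and the single-trusted-proxy assumption are all hypothetical; adapt them to whatever your HTTP layer and authentication middleware actually expose.

```typescript
// Hypothetical input: what your HTTP layer knows about the caller.
interface CallerInfo {
  subject?: string;      // e.g. the `sub` claim of an already verified token
  remoteAddress: string; // address of the direct TCP peer
  forwardedFor?: string; // raw X-Forwarded-For header value, if present
}

interface Actor {
  id: string;
  ip: string;
}

function resolveActor(caller: CallerInfo, trustedProxies: ReadonlySet<string>): Actor {
  let ip = caller.remoteAddress;
  // Only believe X-Forwarded-For when the direct peer is a proxy we operate.
  if (caller.forwardedFor && trustedProxies.has(caller.remoteAddress)) {
    // Assumes one trusted proxy that appends the address it saw, so the
    // right-most entry is the client; entries to its left are client-supplied.
    const hops = caller.forwardedFor
      .split(',')
      .map((hop) => hop.trim())
      .filter(Boolean);
    ip = hops[hops.length - 1] ?? ip;
  }
  return { id: caller.subject ?? 'anonymous', ip };
}

const trusted = new Set(['10.0.0.1']);
const actor = resolveActor(
  { subject: 'user-42', remoteAddress: '10.0.0.1', forwardedFor: '6.6.6.6, 203.0.113.7' },
  trusted
);
console.log(actor);
```

Taking the right-most hop rather than the left-most matters: a client can put any value it likes at the start of the header, but cannot control what your own proxy appends.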
PII redaction patterns
Arguments passed to MCP tools often contain user-supplied data. Before writing to the audit log, redact fields that could contain personal information. Key-name matching covers most cases, but pattern matching on values catches data that arrives in generically named fields. The version of redactArgs below replaces the key-only version from the middleware example and reuses its PII_KEYS set:
const EMAIL_RE = /\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b/g;
const CREDIT_CARD_RE = /\b\d{4}[- ]?\d{4}[- ]?\d{4}[- ]?\d{4}\b/g;
const TOKEN_RE = /\b(ghp_|sk-|Bearer |xoxb-)\S+/g;
// Replace PII-shaped substrings inside a free-text value
function redactStringValues(s: string): string {
  return s
    .replace(EMAIL_RE, '[EMAIL]')
    .replace(CREDIT_CARD_RE, '[CARD]')
    .replace(TOKEN_RE, '[TOKEN]');
}
// Extended version: use this in place of the key-only redactArgs, not alongside it
function redactArgs(args: Record<string, unknown>): Record<string, unknown> {
  const result: Record<string, unknown> = {};
  for (const [key, val] of Object.entries(args)) {
    if (PII_KEYS.has(key.toLowerCase())) {
      result[key] = '[REDACTED]';
    } else if (typeof val === 'string') {
      // Redact by value pattern, then keep the earlier truncation of large blobs
      const clean = redactStringValues(val);
      result[key] = clean.length > 500 ? clean.slice(0, 200) + ' ... [TRUNCATED]' : clean;
    } else {
      result[key] = val;
    }
  }
  return result;
}
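Note that redactArgs inspects only top-level keys, so an argument such as `{ user: { email: '...' } }` would pass through unredacted. A recursive variant is sketched below. The name `redactDeep`, the depth limit of 8, and the trimmed-down key and pattern lists are illustrative choices, not part of the original design.

```typescript
// Lowercase key names that are always redacted, at any nesting level.
const SENSITIVE_KEYS = new Set(['email', 'password', 'token', 'secret', 'ssn', 'phone', 'creditcard']);
const EMAIL_PATTERN = /\b[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}\b/g;

function redactDeep(value: unknown, depth = 0): unknown {
  // Stop descending on very deep or cyclic structures instead of recursing forever
  if (depth > 8) return '[TOO_DEEP]';
  if (typeof value === 'string') return value.replace(EMAIL_PATTERN, '[EMAIL]');
  if (Array.isArray(value)) return value.map((item) => redactDeep(item, depth + 1));
  if (value !== null && typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [key, val] of Object.entries(value)) {
      out[key] = SENSITIVE_KEYS.has(key.toLowerCase())
        ? '[REDACTED]'
        : redactDeep(val, depth + 1);
    }
    return out;
  }
  // Numbers, booleans, null, and undefined pass through unchanged
  return value;
}

console.log(
  JSON.stringify(redactDeep({ user: { email: 'a@example.com', note: 'cc a@example.com' }, count: 3 }))
);
```

The function builds a new object rather than mutating its input, so the tool handler still receives the original arguments.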
Never log raw JWT tokens, API keys, or passwords, not even a truncated prefix of them. Log the sub claim from a decoded JWT, or a fingerprint of an API key (for example, the first 8 hex characters of its SHA-256 hash), not the key itself.
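A fingerprint of this kind can be computed with Node's built-in crypto module. This is a minimal sketch: the function name `fingerprint` and the 8-character length are assumptions, and it relies on API keys being long random strings (a short hash prefix of a guessable secret such as a password could be brute-forced).

```typescript
import { createHash } from 'node:crypto';

// Derive a short, stable identifier for a secret without revealing any of it.
function fingerprint(secret: string): string {
  return createHash('sha256').update(secret, 'utf8').digest('hex').slice(0, 8);
}

// The same key always maps to the same fingerprint, so log lines for one
// key can be grouped without the key ever appearing in the log.
const actorId = `key:${fingerprint('sk-example-not-a-real-key')}`;
console.log(actorId);
```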
Protecting the audit trail
An audit log that a compromised process can overwrite has little forensic value. Several measures protect it:
- Write to stdout, not a local file — your container runtime or systemd captures stdout and ships it to a central log store outside the application's reach. The process cannot retroactively modify captured stdout.
- Separate log store — ship to a log aggregation service (Loki, Elasticsearch, CloudWatch Logs) where the MCP server process has append-only credentials. Even if the server is fully compromised, past log entries remain intact.
- Immutable S3/GCS bucket — for compliance workloads, enable object lock on the log bucket so entries cannot be deleted within the retention window.
- Separate process for sensitive writes — a side-car process with append-only disk access can receive log events over a Unix socket and write them, preventing the main process from corrupting its own trail.
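The side-car option in the list above can be sketched with Node's `net` and `fs` modules. Everything here is an assumption made for illustration: the function names, the NDJSON-over-socket framing, and the file paths. Opening a file with the `'a'` flag only makes this process append; real append-only enforcement comes from OS permissions or filesystem attributes, which this sketch does not configure.

```typescript
import * as fs from 'node:fs';
import * as net from 'node:net';

// Open the log in append mode so every write lands at the end of the file.
function openAppendOnly(logPath: string): (line: string) => void {
  const fd = fs.openSync(logPath, 'a');
  return (line) => {
    fs.writeSync(fd, line.endsWith('\n') ? line : line + '\n');
  };
}

// Side-car: accept NDJSON over a Unix socket and append each complete line.
// Assumes socketPath does not already exist; listen() fails otherwise.
function startSidecar(socketPath: string, logPath: string): net.Server {
  const append = openAppendOnly(logPath);
  const server = net.createServer((conn) => {
    let buffer = '';
    conn.setEncoding('utf8');
    conn.on('data', (chunk) => {
      buffer += String(chunk);
      // A chunk may hold several lines or end mid-line, so split on newlines
      let newline = buffer.indexOf('\n');
      while (newline >= 0) {
        append(buffer.slice(0, newline));
        buffer = buffer.slice(newline + 1);
        newline = buffer.indexOf('\n');
      }
    });
  });
  server.listen(socketPath);
  return server;
}
```

The MCP server would then connect with `net.createConnection(socketPath)` and write one JSON line per tool call, holding no file handle of its own on the log.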
Log retention and volume
Audit log volume depends on your call rate. Each log entry is roughly 500 bytes of NDJSON. At 100 tool calls/minute (moderate agent workload) that's 3 MB/hour or ~2 GB/month — manageable for any log store.
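The arithmetic behind those figures is easy to rerun for your own call rate. A small sketch, with `estimateLogBytes` as a hypothetical helper name and a 30-day month assumed:

```typescript
// Estimated uncompressed log volume in bytes for a steady call rate.
function estimateLogBytes(callsPerMinute: number, bytesPerEntry: number, hours: number): number {
  return callsPerMinute * 60 * hours * bytesPerEntry;
}

const perHour = estimateLogBytes(100, 500, 1);        // 3000000 bytes = 3 MB
const perMonth = estimateLogBytes(100, 500, 24 * 30); // 2160000000 bytes, about 2.2 GB
console.log({ perHour, perMonth });
```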
| Workload type | Minimum retention | Recommended |
|---|---|---|
| Indie / hobby project | 30 days | 90 days |
| B2B SaaS / team plan | 90 days | 1 year |
| Healthcare / finance | 1 year (HIPAA) / 7 years (SOX) | 7 years + immutable |
Set a log rotation policy at your aggregation layer. Most log stores support TTL-based deletion that satisfies "retain for N days" without manual cleanup.
Querying audit logs for security review
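If your logs are in a queryable store (e.g. Loki with LogQL, or a SQLite archive), a few queries cover most security reviews. The examples below use SQLite syntax and assume a table named audit_log in which actor.id has been flattened into an actor_id column: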
-- Assumes timestamp is stored as ISO 8601 UTC text (e.g. 2025-01-01T12:00:00.000Z).
-- Cutoffs are built in the same format so the string comparison is correct;
-- datetime('now', ...) returns 'YYYY-MM-DD HH:MM:SS', which sorts before any 'T' timestamp on the same date.
-- Destructive tool calls in the last 24 hours
SELECT timestamp, actor_id, tool, args
FROM audit_log
WHERE outcome = 'ok'
  AND tool IN ('delete_file', 'drop_table', 'send_email')
  AND timestamp > strftime('%Y-%m-%dT%H:%M:%fZ', 'now', '-1 day')
ORDER BY timestamp DESC;
-- High-frequency callers (possible abuse)
SELECT actor_id, COUNT(*) AS call_count
FROM audit_log
WHERE timestamp > strftime('%Y-%m-%dT%H:%M:%fZ', 'now', '-1 hour')
GROUP BY actor_id
HAVING call_count > 500
ORDER BY call_count DESC;
-- Error rate by tool (detect broken tools before users notice)
SELECT tool,
       SUM(CASE WHEN outcome = 'error' THEN 1 ELSE 0 END) * 100.0 / COUNT(*) AS error_pct
FROM audit_log
WHERE timestamp > strftime('%Y-%m-%dT%H:%M:%fZ', 'now', '-1 day')
GROUP BY tool
HAVING error_pct > 5
ORDER BY error_pct DESC;
Pair these queries with alerts: if destructive-tool call volume doubles in an hour, or any actor exceeds 1,000 calls in a minute, page the on-call engineer.
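Both alert conditions reduce to simple counting over recent log entries. The following sketch shows the logic only; the `LoggedCall` shape and both function names are hypothetical, and in practice your log store's alerting rules would usually evaluate this rather than application code.

```typescript
interface LoggedCall {
  timestamp: string; // ISO 8601 UTC
  actorId: string;
  tool: string;
}

// Actors with more than `limit` calls in the window ending at `now`.
function actorsOverLimit(calls: LoggedCall[], now: Date, windowMs: number, limit: number): string[] {
  const end = now.getTime();
  const cutoff = end - windowMs;
  const counts = new Map<string, number>();
  for (const call of calls) {
    const t = Date.parse(call.timestamp);
    if (t > cutoff && t <= end) {
      counts.set(call.actorId, (counts.get(call.actorId) ?? 0) + 1);
    }
  }
  return Array.from(counts.entries())
    .filter(([, count]) => count > limit)
    .map(([actorId]) => actorId);
}

// True when the current hour's volume is at least twice the previous hour's.
// A previous count of zero never triggers, to avoid paging on the first call.
function hasDoubled(previousCount: number, currentCount: number): boolean {
  return previousCount > 0 && currentCount >= 2 * previousCount;
}
```

With the thresholds from the text, the per-actor rule would be `actorsOverLimit(calls, new Date(), 60000, 1000)`.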
Correlating audit logs with uptime events
Your audit logs are most powerful when correlated with uptime events. When AliveMCP detects that your server went down, you can query the audit log for the last tool call executed before the failure — often revealing an unhandled exception, a memory-exhausting argument, or a destructive operation that corrupted internal state.
Store a requestId in every log line and propagate it to your structured application logs so you can reconstruct the full execution trace for any tool call that preceded an outage.
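Given the downtime timestamp reported by your monitor, finding the calls that immediately preceded it is a filter and a sort over the NDJSON log. A minimal sketch, assuming the log text is already in memory; the `TraceEntry` shape and both function names are illustrative:

```typescript
interface TraceEntry {
  timestamp: string; // ISO 8601 UTC
  requestId: string;
  tool: string;
  outcome: 'ok' | 'error';
}

// Parse NDJSON text into entries, skipping blank lines.
function parseNdjson(text: string): TraceEntry[] {
  return text
    .split('\n')
    .filter((line) => line.trim().length > 0)
    .map((line) => JSON.parse(line) as TraceEntry);
}

// The `count` most recent calls received at or before `downAt`, newest first.
function lastCallsBefore(entries: TraceEntry[], downAt: string, count: number): TraceEntry[] {
  const cutoff = Date.parse(downAt);
  return entries
    .filter((entry) => Date.parse(entry.timestamp) <= cutoff)
    .sort((a, b) => Date.parse(b.timestamp) - Date.parse(a.timestamp))
    .slice(0, count);
}
```

The requestId of each returned entry is then the key for pulling the matching lines from your application logs.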