MCP server session lifecycle

An MCP session is not a single HTTP request. It is a stateful connection that persists across multiple tool calls, notifications, and capability negotiations. Understanding how sessions are established, maintained, and cleaned up is the difference between a server that handles ten concurrent users cleanly and one that leaks database connections, accumulates zombie sessions, and crashes under load. The lifecycle has four phases: initialization (the initialize/initialized handshake that establishes capabilities), active operation (tool calls and notifications on a persistent connection), disconnection detection (transport close events and keepalive failures), and cleanup (releasing per-session resources and state). Each phase has specific failure modes that unit tests do not surface; they appear only in production or in integration tests that simulate real transport behavior.

TL;DR

An MCP session starts with initialize → initialized, then enters an active loop of tool calls and notifications on a persistent SSE connection. Store per-session state in a Map<sessionId, SessionContext> keyed by the transport's session ID. Hook transport.onclose to clean up that state — and always pair sessionContextMap.set() with a corresponding sessionContextMap.delete() in the close handler. Implement ping/pong keepalive to detect dead connections before the OS TCP stack notices. Set a session TTL to evict zombie sessions whose close events never fired.

Session establishment: the initialize handshake

Every MCP session begins with a three-message handshake. The client sends an initialize request; the server responds with its capabilities, protocol version, and server info; the client then sends an initialized notification to confirm it is ready. Only after this exchange can the client make tool calls.

// Client sends:
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "initialize",
  "params": {
    "protocolVersion": "2024-11-05",
    "capabilities": { "sampling": {}, "roots": { "listChanged": true } },
    "clientInfo": { "name": "Claude Desktop", "version": "1.6.0" }
  }
}

// Server responds:
{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "protocolVersion": "2024-11-05",
    "capabilities": { "tools": {}, "resources": {} },
    "serverInfo": { "name": "my-server", "version": "1.2.0" }
  }
}

// Client sends (no response expected):
{
  "jsonrpc": "2.0",
  "method": "notifications/initialized"
}

The MCP SDK handles this handshake automatically when you call server.connect(transport). You do not write the initialize handler manually. What you do handle is the post-initialize moment: a hook to extract identity, store session context, and prepare per-session resources.
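The SDK performs protocol version negotiation for you, but the rule is simple enough to sketch. The version list and the function name below are illustrative assumptions, not SDK API: the server echoes the client's requested version when it supports it and otherwise answers with the newest version it does support, leaving the client to disconnect if it cannot work with that.

```typescript
// Protocol versions this hypothetical server understands, newest first.
const SUPPORTED_VERSIONS = ['2025-03-26', '2024-11-05'] as const;

// Returns the version to place in the initialize response.
function negotiateProtocolVersion(
  requested: string,
  supported: readonly string[] = SUPPORTED_VERSIONS,
): string {
  // Echo the client's version when we support it; otherwise offer our newest.
  return supported.includes(requested) ? requested : supported[0];
}
```

A client that receives a version it does not understand is expected to close the connection rather than continue.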

Per-session context storage

For the Streamable HTTP transport, each session has a unique sessionId that the server assigns while handling the initialize request and returns in the Mcp-Session-Id response header. The client sends the same value in the mcp-session-id header on every subsequent request (HTTP header names are case-insensitive). Use it as the key for a module-scope Map that holds per-session state.

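import { randomUUID } from 'node:crypto';
import express from 'express';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import { isInitializeRequest } from '@modelcontextprotocol/sdk/types.js';

interface SessionContext {
  userId: string;
  tenantId: string;
  plan: 'free' | 'pro' | 'enterprise';
  connectedAt: Date;
  lastActivityAt: Date;
}

// Module scope: both maps persist across all sessions
const sessionContextMap = new Map<string, SessionContext>();
// One transport per session, reused by every request that carries its ID
const transports = new Map<string, StreamableHTTPServerTransport>();

const app = express();
app.use(express.json()); // req.body must be parsed before handleRequest

app.post('/mcp', async (req, res) => {
  const sessionId = req.headers['mcp-session-id'] as string | undefined;
  let transport = sessionId ? transports.get(sessionId) : undefined;

  if (!transport) {
    // Only an initialize request without a session ID may open a new session
    if (sessionId) {
      res.status(404).json({ error: 'Unknown session ID' });
      return;
    }
    if (!isInitializeRequest(req.body)) {
      res.status(400).json({ error: 'Expected an initialize request' });
      return;
    }

    const created = new StreamableHTTPServerTransport({
      sessionIdGenerator: () => randomUUID(),
      onsessioninitialized: (newSessionId) => {
        // Called once the initialize request has been assigned a session ID
        const identity = extractIdentityFromRequest(req); // from JWT or API key

        transports.set(newSessionId, created);
        sessionContextMap.set(newSessionId, {
          userId: identity.sub,
          tenantId: identity.tenantId,
          plan: identity.plan,
          connectedAt: new Date(),
          lastActivityAt: new Date(),
        });
      },
    });

    // CRITICAL: clean up on transport close
    created.onclose = () => {
      const closedId = created.sessionId;
      if (closedId) {
        transports.delete(closedId);
        sessionContextMap.delete(closedId);
      }
    };

    await server.connect(created); // `server` is the MCP server instance defined elsewhere
    transport = created;
  }

  await transport.handleRequest(req, res, req.body);
});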

The pairing rule: every sessionContextMap.set(sessionId, ...) must have a corresponding sessionContextMap.delete(sessionId) in the transport.onclose handler. Missing the delete creates a memory leak — each session's context stays in the Map forever, accumulating over time until the server runs out of memory.
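One way to make the pairing hard to forget is to have the registration call hand back its own cleanup function. The following is a minimal sketch, not part of the SDK; SessionRegistry and its method names are hypothetical.

```typescript
// Hypothetical helper: every open() returns the function that undoes it.
class SessionRegistry<T> {
  private readonly contexts = new Map<string, T>();

  // Registers a context and returns a disposer that removes it again.
  open(sessionId: string, context: T): () => void {
    this.contexts.set(sessionId, context);
    return () => {
      // Map.delete is a no-op for missing keys, so calling this twice is safe.
      this.contexts.delete(sessionId);
    };
  }

  get(sessionId: string): T | undefined {
    return this.contexts.get(sessionId);
  }

  get size(): number {
    return this.contexts.size;
  }
}

const exampleRegistry = new SessionRegistry<{ userId: string }>();
const releaseSessionA = exampleRegistry.open('session-a', { userId: 'u1' });
// In a real server the disposer would be assigned to the close handler,
// for example: transport.onclose = releaseSessionA;
releaseSessionA();
```

Because the disposer is the only value open() returns, a code path that registers a session without wiring up cleanup leaves an obviously unused variable behind.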

Accessing session context in tool handlers

Recent versions of the TypeScript SDK pass the session ID to handlers as extra.sessionId, but looking up the context by hand in every handler and helper gets repetitive. A cleaner pattern is to store the session context in an AsyncLocalStorage at request handling time, so every async function in the call tree can access it without parameter threading.

import { AsyncLocalStorage } from 'node:async_hooks';

const sessionStorage = new AsyncLocalStorage<SessionContext>();

// In the request handler, wrap the MCP call in the context store
app.post('/mcp', async (req, res) => {
  const transport = new StreamableHTTPServerTransport({ ... });

  transport.onclose = () => {
    if (transport.sessionId) sessionContextMap.delete(transport.sessionId);
  };

  await server.connect(transport);

  // Retrieve context for this session and run in its AsyncLocalStorage scope
  const sessionId = req.headers['mcp-session-id'] as string | undefined;
  const ctx = sessionId ? sessionContextMap.get(sessionId) : undefined;

  if (ctx) {
    ctx.lastActivityAt = new Date(); // keeps active sessions out of TTL eviction
    await sessionStorage.run(ctx, () => transport.handleRequest(req, res, req.body));
  } else {
    // No context yet, e.g. the initialize request that creates the session
    await transport.handleRequest(req, res, req.body);
  }
});

// Helper for tool handlers
function getSessionContext(): SessionContext | undefined {
  return sessionStorage.getStore();
}

// In a tool handler — no session ID parameter needed
server.tool('my_tool', { ... }, async (args) => {
  const ctx = getSessionContext();
  if (!ctx) throw new Error('No session context — was this called outside a session?');

  // Use ctx.userId, ctx.tenantId, ctx.plan
  const data = await db.query(
    'SELECT * FROM records WHERE tenant_id = $1',
    [ctx.tenantId]
  );
  return { content: [{ type: 'text', text: JSON.stringify(data.rows) }] };
});

Disconnection detection and keepalive

TCP connections can die without either side noticing: a network partition, a load balancer silently dropping the connection, a mobile client switching between WiFi and cell. The transport.onclose handler fires when the SDK detects a clean close, but a hard connection drop may not trigger it until the OS TCP stack gives up retransmitting (on the order of 15 minutes with Linux defaults), and an idle connection with nothing to send may not be noticed at all.

Two mechanisms detect dead connections faster:

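SSE keep-alive comments. Send an SSE comment (a line starting with :) every 30–60 seconds. Writing to a dead connection forces the TCP stack to find out: either the peer answers with a reset, or the unacknowledged data eventually times out, and the resulting error closes the response and fires the close event. Detection is therefore much faster than on an idle connection, though not always instant. Many proxies and load balancers (Cloudflare, AWS ALB) also have connection idle timeouts that these comments reset.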

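// In the SSE connection handler: send a keep-alive comment every 30s
const keepAlive = setInterval(() => {
  if (res.writableEnded || res.destroyed) {
    clearInterval(keepAlive);
    return;
  }
  res.write(': keepalive\n\n');
}, 30_000);

// Write failures surface asynchronously as 'error'/'close' events,
// not as exceptions thrown by res.write()
res.on('error', () => clearInterval(keepAlive));
res.on('close', () => clearInterval(keepAlive));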

MCP ping. The MCP protocol has a ping request that the server can send to the client. If the client does not respond within a timeout, the session is considered dead and can be cleaned up.

// Inactivity sweep every 60 seconds. This loop does not send a ping itself: it assumes
// lastActivityAt is refreshed on every request and on every answered ping.
setInterval(() => {
  for (const [sessionId, ctx] of sessionContextMap.entries()) {
    const staleMs = Date.now() - ctx.lastActivityAt.getTime();
    if (staleMs > 120_000) { // 2 minutes without activity or a ping response
      // Treat this session as a zombie and clean up
      sessionContextMap.delete(sessionId);
    }
  }
}, 60_000);
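The ping itself needs a timeout, because a dead client never answers and the request would otherwise stay pending. Below is a minimal sketch of that piece; pingWithTimeout is a hypothetical helper, and sendPing stands for whatever function issues the MCP ping request in your setup (how that is wired to the SDK is an assumption left to the caller).

```typescript
// Resolves true if the ping is answered in time, false on timeout or failure.
async function pingWithTimeout(
  sendPing: () => Promise<unknown>,
  timeoutMs: number,
): Promise<boolean> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timedOut = new Promise<boolean>((resolve) => {
    timer = setTimeout(() => resolve(false), timeoutMs);
  });
  try {
    return await Promise.race([
      // A rejected ping (for example, a closed transport) counts as "not alive".
      sendPing().then(() => true, () => false),
      timedOut,
    ]);
  } finally {
    // Always clear the timer so it cannot keep the process alive.
    clearTimeout(timer);
  }
}
```

A sweep could call this per session, refresh lastActivityAt when it resolves true, and close the session when it resolves false.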

Zombie session prevention

A zombie session is a session context that remains in sessionContextMap after the transport closed, because the onclose event never fired (hard network drop, OOM kill of the client process, etc.). Zombies accumulate memory and — if they hold database connections — exhaust connection pools.

Prevention mechanism	What it catches	Implementation
`transport.onclose`	Clean disconnects	Always implement — first line of defense
SSE write failure detection	Hard drops on HTTP/SSE	Catch write errors in keepalive loop
Session TTL eviction	All zombie types including OOM kills	Periodic scan of `sessionContextMap` against `lastActivityAt`
Max session count cap	Session table exhaustion before memory OOM	Return HTTP 503 when `sessionContextMap.size >= MAX_SESSIONS`

const SESSION_TTL_MS = 30 * 60 * 1000; // 30 minutes
const MAX_SESSIONS = 500;

// Eviction scan — run every minute
setInterval(() => {
  const now = Date.now();
  for (const [sessionId, ctx] of sessionContextMap.entries()) {
    if (now - ctx.lastActivityAt.getTime() > SESSION_TTL_MS) {
      sessionContextMap.delete(sessionId);
      logger.warn({ sessionId }, 'Evicted zombie session (TTL exceeded)');
    }
  }
}, 60_000);

// Session cap: reject only requests that would open a new session
app.post('/mcp', async (req, res) => {
  const isNewSession = !req.headers['mcp-session-id'];
  if (isNewSession && sessionContextMap.size >= MAX_SESSIONS) {
    res.status(503).json({ error: 'Server at capacity. Try again shortly.' });
    return;
  }
  // ... proceed with session lookup or creation
});
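TTL eviction only works if lastActivityAt is refreshed while a session is in use, and the sweep is easier to trust when it can be unit-tested without waiting 30 minutes. The sketch below shows one way to do both by passing the clock in as a parameter; touchSession and evictExpired are hypothetical names, not SDK functions.

```typescript
interface TrackedSession {
  lastActivityAt: Date;
}

// Refreshes the activity timestamp; intended to run on every request for the session.
function touchSession(
  sessions: Map<string, TrackedSession>,
  sessionId: string,
  now: Date = new Date(),
): boolean {
  const ctx = sessions.get(sessionId);
  if (!ctx) return false;
  ctx.lastActivityAt = now;
  return true;
}

// Removes sessions idle for longer than ttlMs and returns the evicted IDs.
function evictExpired(
  sessions: Map<string, TrackedSession>,
  ttlMs: number,
  now: Date = new Date(),
): string[] {
  const evicted: string[] = [];
  for (const [sessionId, ctx] of sessions) {
    if (now.getTime() - ctx.lastActivityAt.getTime() > ttlMs) {
      sessions.delete(sessionId); // deleting during Map iteration is safe in JavaScript
      evicted.push(sessionId);
    }
  }
  return evicted;
}
```

In production the defaults use the real clock; a test passes fixed Date values instead.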

Reconnection and session resumption

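SSE clients reconnect automatically when the connection drops; this is built into the browser's EventSource API and most SSE client libraries. Unless the client presents its previous session ID and the server still holds that session, the reconnect looks like a brand-new session from the server's perspective: a new initialize request with no prior state.

MCP does not resume in-progress tool work for you. Streamable HTTP can redeliver missed SSE messages after a reconnect (via Last-Event-ID, if the server keeps an event store), but that replays messages; it does not restore handler state. If your tool has in-progress operations when a client reconnects, you cannot automatically resume from where it left off. Design for this by: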

Keeping long-running operations in a durable queue (see MCP server message queue) so they outlive the session and can be polled by a get_job_status tool from any session.
Making tool results idempotent — calling the same tool with the same arguments twice should produce the same result, so a reconnect-and-retry is safe.
Exposing a list_pending_jobs or get_recent_results tool that a reconnected client can call to discover what was in progress before the disconnect.
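The three points above can be sketched together. This is an in-memory stand-in under stated assumptions: JobStore and its methods are hypothetical, a real deployment would back it with a durable queue or database, and the idempotency key would typically be derived from the user and the tool arguments rather than from the session.

```typescript
type JobState = 'pending' | 'done';

interface Job {
  id: string;
  state: JobState;
  result?: string;
}

// In-memory stand-in for a durable job store that outlives any one session.
class JobStore {
  private readonly jobs = new Map<string, Job>();

  // Submitting the same key twice returns the existing job instead of a duplicate.
  submit(idempotencyKey: string): Job {
    const existing = this.jobs.get(idempotencyKey);
    if (existing) return existing;
    const job: Job = { id: idempotencyKey, state: 'pending' };
    this.jobs.set(idempotencyKey, job);
    return job;
  }

  // Called by the worker when the job finishes.
  complete(id: string, result: string): void {
    const job = this.jobs.get(id);
    if (job) {
      job.state = 'done';
      job.result = result;
    }
  }

  // Backs a get_job_status tool callable from any session.
  getStatus(id: string): Job | undefined {
    return this.jobs.get(id);
  }

  // Backs a list_pending_jobs tool for reconnected clients.
  listPending(): Job[] {
    return [...this.jobs.values()].filter((job) => job.state === 'pending');
  }
}
```

A client that reconnects and retries the same call lands on submit() with the same key and gets the original job back, so the retry is harmless.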

For load-balanced deployments, session affinity (sticky routing by the mcp-session-id header) ensures that every request for a session, including a reconnect that reuses the session ID, lands on the backend instance where its per-session data lives. Without sticky routing, a request can land on an instance that has no record of the session; that instance has to reject it (the Streamable HTTP spec uses 404 for unknown session IDs), and the client must start over with a fresh initialize.

Session lifecycle and monitoring

The MCP session lifecycle is exactly what AliveMCP's probe exercises. Each probe run sends a complete initialize → initialized → tools/list sequence, then closes the connection cleanly. This exercises:

The initialize handshake (protocol version negotiation, capability exchange)
The session context creation path (onsessioninitialized callback)
The tool registration path (tools/list response)
The clean close path (transport.onclose → context cleanup)

A server that passes the probe but has a buggy close handler will accumulate session contexts over time. Export sessionContextMap.size as a Prometheus gauge and alert when it grows monotonically; that pattern indicates onclose is not firing or not deleting the entry.

Frequently asked questions

How is a stdio transport session different from an HTTP/SSE session?

A stdio transport is a one-to-one connection: one MCP server process handles exactly one client (the process that spawned it) for the server's entire lifetime. There is no sessionId because there is only one session. Per-session context can live in module scope. The session begins when the process starts, and ends when the process exits or stdin closes. There is no reconnection — a disconnect means the server process terminates and the client spawns a new one. Cleanup happens via SIGTERM and process exit handlers, not via transport.onclose.

Can multiple concurrent sessions access the same module-scope variable safely?

No, not if the variable holds per-session data, and this is the most common source of multi-tenant bugs. Anything that varies per user, per tenant, or per request must never live in a bare module-scope variable; it belongs in the sessionContextMap (keyed by session ID) or in AsyncLocalStorage. Node.js runs JavaScript on a single thread, but async operations interleave: session A writes to a module-scope variable and yields to the event loop, session B overwrites the same variable, and when session A resumes it reads session B's value and leaks it into its own response. Use the sessionContextMap + AsyncLocalStorage pattern from the section above.
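The interleaving is easy to reproduce. The following self-contained sketch (all names are illustrative) runs two simulated sessions concurrently, once through a shared module-scope variable and once through AsyncLocalStorage:

```typescript
import { AsyncLocalStorage } from 'node:async_hooks';

// Unsafe: a single module-scope slot shared by every session.
let currentTenant = '';

// Safe: each async call tree sees only the value it was started with.
const tenantStorage = new AsyncLocalStorage<string>();

const yieldToEventLoop = (): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, 0));

async function handleUnsafe(tenant: string): Promise<string> {
  currentTenant = tenant;   // write
  await yieldToEventLoop(); // the other session runs here
  return currentTenant;     // read: may now hold the other session's value
}

async function handleSafe(tenant: string): Promise<string> {
  return tenantStorage.run(tenant, async () => {
    await yieldToEventLoop();
    return tenantStorage.getStore() ?? 'missing';
  });
}

async function demoInterleaving(): Promise<{ unsafe: string[]; safe: string[] }> {
  const unsafe = await Promise.all([handleUnsafe('tenant-a'), handleUnsafe('tenant-b')]);
  const safe = await Promise.all([handleSafe('tenant-a'), handleSafe('tenant-b')]);
  return { unsafe, safe };
}
```

In the unsafe run both calls return 'tenant-b', because the second write lands before either read; in the safe run each call returns its own tenant.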

What happens to in-flight tool calls when the SSE connection drops?

In-flight tool calls whose handlers are still executing continue to run. The SDK passes each handler an AbortSignal as extra.signal; once that signal is aborted (on client cancellation, and on connection close in SDK versions that abort pending requests at that point), the handler sees extra.signal.aborted === true. If the handler has cancellation support, it will notice the abort and clean up; otherwise it runs to completion and the result is discarded because there is no connection to send it to. The handler's finally blocks still run, so resource cleanup happens regardless of connection state.
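A handler with cancellation support might look like the sketch below. HandlerExtra is a stand-in that models only the signal field, and slowTool is a hypothetical handler body; the AbortController plays the role of the SDK aborting the request.

```typescript
// Stand-in for the handler's `extra` argument; only the signal is modelled.
interface HandlerExtra {
  signal: AbortSignal;
}

const cleanupLog: string[] = [];

async function slowTool(steps: number, extra: HandlerExtra): Promise<string> {
  cleanupLog.push('acquired'); // e.g. a database connection checked out
  try {
    for (let step = 0; step < steps; step++) {
      // Check between units of work so an abort stops the loop early.
      if (extra.signal.aborted) return `aborted at step ${step}`;
      await new Promise<void>((resolve) => setTimeout(resolve, 5));
    }
    return 'completed';
  } finally {
    cleanupLog.push('released'); // runs on completion, abort, or error
  }
}

async function demoAbort(): Promise<string> {
  const controller = new AbortController();
  const pending = slowTool(100, { signal: controller.signal });
  controller.abort(); // simulates the request being aborted mid-flight
  return pending;
}
```

Without the aborted check the loop would run all 100 steps before returning, but the finally block would still release the resource.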

How do I implement server-side session invalidation (for logout)?

There is no MCP protocol message for server-initiated session termination. To invalidate a session from the server side, close the transport: calling transport.close() ends the SSE stream and fires transport.onclose on the server; the client sees the stream end and has to reconnect as a new session without the old auth context. Alternatively, mark the session as invalidated in a set, and check that set at the start of every tool call, returning isError: true with a "session expired" message until the client reconnects and re-authenticates.
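The second approach can be sketched as a small guard. All names here are hypothetical, and ToolResult models only the fields used in this guide's tool handlers.

```typescript
interface ToolResult {
  content: { type: 'text'; text: string }[];
  isError?: boolean;
}

const invalidatedSessions = new Set<string>();

// Called by the logout path.
function invalidateSession(sessionId: string): void {
  invalidatedSessions.add(sessionId);
}

// Called from the close handler so the set does not grow forever.
function forgetSession(sessionId: string): void {
  invalidatedSessions.delete(sessionId);
}

// Returns an error result for invalidated sessions, or undefined to let the call proceed.
function rejectIfInvalidated(sessionId: string): ToolResult | undefined {
  if (!invalidatedSessions.has(sessionId)) return undefined;
  return {
    content: [{ type: 'text', text: 'Session expired. Reconnect and re-authenticate.' }],
    isError: true,
  };
}
```

A tool handler would call rejectIfInvalidated first and return its result immediately when it is defined.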

How should I monitor session count in production?

Export sessionContextMap.size as a Prometheus gauge (an UpDownCounter in OpenTelemetry terms), updated on every add and delete. Alert on two conditions: absolute count (e.g., > 80% of MAX_SESSIONS, meaning capacity is needed soon) and sustained growth (size rising by more than N sessions per minute without corresponding deletes, which indicates onclose is not firing). AliveMCP's initialize probe contributes a session open and close event per probe run (every 60 seconds); these should appear as brief spikes of +1/−1 in the session count metric.
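In practice these alerts would live in your monitoring system's rule language, but the two conditions are simple enough to state as code. The following is an illustrative sketch with hypothetical function names, operating on a window of sampled gauge values:

```typescript
// True when the session count exceeds the given share of capacity (default 80%).
function nearCapacity(size: number, maxSessions: number, ratio = 0.8): boolean {
  return size > maxSessions * ratio;
}

// True when every sample in the window is strictly larger than the one before it,
// i.e. sessions were opened but the count never dropped.
function growsMonotonically(sizes: number[]): boolean {
  if (sizes.length < 2) return false;
  for (let i = 1; i < sizes.length; i++) {
    if (sizes[i] <= sizes[i - 1]) return false;
  }
  return true;
}
```

A healthy server under steady load produces a window that rises and falls; a window that only rises over a long enough period points at a close handler that is not deleting entries.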