Protocol guide · 2026-06-21 · MCP Protocol Primitives

Beyond Tools: The Four MCP Protocol Primitives That Make Servers Production-Ready

Most MCP server tutorials stop at tools: define a handler, return some text, done. Tools are the correct starting point; they are the most versatile MCP primitive and the one every client supports. But the MCP specification defines four additional surfaces beyond tools: resources (URI-addressable read-only data that LLMs consume as context), prompts (reusable message templates that seed multi-turn conversations), argument completions (autocomplete suggestions for prompt arguments and resource template variables), and notifications (server-to-client push events for catalog changes, resource updates, and task progress). Each primitive solves a problem that tools alone cannot: resources give clients structured read access without tool invocations; prompts let clients inject pre-built context into conversations; completions steer users to valid argument values before a request is sent; notifications let servers push state changes instead of forcing clients to poll. Each primitive also has a characteristic silent failure mode, one where /health returns 200, the tool ping succeeds, and monitoring sees no error, but the primitive is not working. This guide covers all four and shows how to tell when each one breaks.

The four primitives at a glance

Primitive	What it does	Silent failure mode	Health endpoint
Resources	Expose read-only data via URI-addressable catalog	Backend returns stale data; no error, wrong LLM context	`/health/resources`
Prompts	Seed multi-turn conversations with templated context	Data dependency broken; expansion silently returns empty or partial turns	`/health/prompts`
Completions	Autocomplete parameter values as users type	Unindexed query causes 3s response; client abandons, user types free-form invalid value	Latency check on `completion/complete`
Notifications	Push catalog changes and resource updates to clients without polling	SSE connection dies; server emits to /dev/null, client sees stale catalog	`/health/notifications`

How capabilities negotiation gates everything

Before examining each primitive, it is worth understanding the mechanism that controls which primitives a client and server can use. Every MCP session begins with a capabilities handshake: the client sends an initialize request declaring its protocol version and supported capabilities; the server responds with its own. Neither side should use a feature the other hasn't declared.

// Server constructor — declare all primitives you intend to use
const server = new Server(
  { name: 'my-mcp-server', version: '1.0.0' },
  {
    capabilities: {
      tools:     { listChanged: true },
      resources: { subscribe: true, listChanged: true },
      prompts:   { listChanged: true },
      logging:   {},
      completions: {}
    }
  }
);

Each capability sub-field is a contract. Declaring resources: { subscribe: true } tells clients they can send resources/subscribe requests and expect notifications/resources/updated events. Declaring a capability without implementing its handlers produces protocol errors. Not declaring a capability you do implement means clients never use that surface — the most common misconfiguration for servers that add features incrementally.
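
Both misconfigurations described above can be caught at startup. The following is a minimal sketch, assuming you keep your own list of the request methods you registered handlers for; auditCapabilities and impliedMethods are hypothetical helper names, not part of the SDK. The method names themselves come from the MCP specification.

```typescript
// Hypothetical startup audit: compares declared capabilities with the
// request methods that actually have handlers registered.
type DeclaredCapabilities = {
  tools?: { listChanged?: boolean };
  resources?: { subscribe?: boolean; listChanged?: boolean };
  prompts?: { listChanged?: boolean };
  completions?: Record<string, never>;
  logging?: Record<string, never>;
};

// Request methods each declared capability implies.
function impliedMethods(caps: DeclaredCapabilities): Array<[string, string[]]> {
  const implied: Array<[string, string[]]> = [];
  if (caps.tools) implied.push(['tools', ['tools/list', 'tools/call']]);
  if (caps.resources) {
    const methods = ['resources/list', 'resources/read'];
    if (caps.resources.subscribe) {
      methods.push('resources/subscribe', 'resources/unsubscribe');
    }
    implied.push(['resources', methods]);
  }
  if (caps.prompts) implied.push(['prompts', ['prompts/list', 'prompts/get']]);
  if (caps.completions) implied.push(['completions', ['completion/complete']]);
  if (caps.logging) implied.push(['logging', ['logging/setLevel']]);
  return implied;
}

// Returns one message per mismatch; an empty array means the two agree.
function auditCapabilities(caps: DeclaredCapabilities, registered: string[]): string[] {
  const problems: string[] = [];
  const covered: string[] = [];
  for (const [capability, methods] of impliedMethods(caps)) {
    for (const method of methods) {
      covered.push(method);
      if (registered.indexOf(method) === -1) {
        problems.push(`declared ${capability} but no handler for ${method}`);
      }
    }
  }
  for (const method of registered) {
    if (covered.indexOf(method) === -1) {
      problems.push(`handler for ${method} but capability not declared`);
    }
  }
  return problems;
}
```

Running the audit once before the server starts accepting connections, and refusing to start when the list is non-empty, turns an incremental-rollout mistake into a deploy-time failure.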

The handshake itself has a silent failure mode: a server that accepts TCP connections but hangs during the initialize exchange looks identical to a healthy server at the HTTP layer. AliveMCP's MCP-aware probe completes the full initialize → initialized three-step exchange on every check — because a server that never finishes the handshake cannot serve any primitive to any client, and standard HTTP probes cannot detect it.
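
For reference, the three handshake messages have the following JSON-RPC shapes. This is an illustrative sketch: the protocolVersion string and clientInfo values are example values, and isInitializeResult is a hypothetical helper a probe might use to decide whether step two actually completed.

```typescript
// Step 1: client -> server request. Values are examples only.
const initializeRequest = {
  jsonrpc: '2.0',
  id: 1,
  method: 'initialize',
  params: {
    protocolVersion: '2025-06-18',
    capabilities: {},
    clientInfo: { name: 'handshake-probe', version: '0.1.0' },
  },
};

// Step 3: client -> server notification (no id, no response expected).
const initializedNotification = {
  jsonrpc: '2.0',
  method: 'notifications/initialized',
};

// Step 2 check: the server's response must echo the request id and carry
// protocolVersion, capabilities, and serverInfo in its result.
function isInitializeResult(message: unknown, requestId: number): boolean {
  if (typeof message !== 'object' || message === null) return false;
  const m = message as { id?: unknown; result?: unknown };
  if (m.id !== requestId) return false;
  if (typeof m.result !== 'object' || m.result === null) return false;
  const r = m.result as {
    protocolVersion?: unknown;
    capabilities?: unknown;
    serverInfo?: unknown;
  };
  return typeof r.protocolVersion === 'string'
    && typeof r.capabilities === 'object' && r.capabilities !== null
    && typeof r.serverInfo === 'object' && r.serverInfo !== null;
}
```

A probe that sends the first message, validates the reply with a check like this under a deadline, and then sends the third message exercises the same path a real client does.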

Resources — the read-only data substrate

Resources are the substrate LLMs read before deciding what tools to call. A file server exposes file:///workspace/src/auth.ts; a database server exposes db://customers/customer-42; a monitoring server exposes metrics://uptime/alivemcp.com?window=7d. Where a tool performs an action, a resource exposes data for reading — and clients treat resources as safe to read without user confirmation, potentially pre-fetching and caching them in a resource browser.

The two handlers every resource layer needs:

import { Server } from '@modelcontextprotocol/sdk/server/index.js';
import { ListResourcesRequestSchema, ReadResourceRequestSchema,
  SubscribeRequestSchema } from '@modelcontextprotocol/sdk/types.js';

// ListResources — return the catalog (aim for fewer than 50 items)
server.setRequestHandler(ListResourcesRequestSchema, async () => {
  const recent = await db.customers.findMany({
    select: { id: true, name: true },
    orderBy: { updatedAt: 'desc' },
    take: 20
  });
  return {
    resources: [
      { uri: 'config://mcp/server', name: 'Server configuration',
        mimeType: 'application/json' },
      ...recent.map(c => ({
        uri: `db://customers/${c.id}`,
        name: c.name,
        mimeType: 'application/json'
      }))
    ]
  };
});

// ReadResource — return content for a specific URI
server.setRequestHandler(ReadResourceRequestSchema, async ({ params }) => {
  const uri = params.uri;
  if (uri === 'config://mcp/server') {
    return { contents: [{ uri, mimeType: 'application/json',
      text: JSON.stringify(serverConfig, null, 2) }] };
  }
  const match = uri.match(/^db:\/\/customers\/(.+)$/);
  if (match) {
    const customer = await db.customers.findUnique({ where: { id: match[1] } });
    if (!customer) throw new Error(`Customer ${match[1]} not found`);
    return { contents: [{ uri, mimeType: 'application/json',
      text: JSON.stringify(customer, null, 2) }] };
  }
  throw new Error(`Unknown resource URI: ${uri}`);
});

Resource subscriptions let clients receive push notifications when content changes. The server holds a Map of URI → Set of subscribing session IDs and fans out notifications/resources/updated when data changes:

const subscriptions = new Map(); // uri → Set of subscribing session IDs

server.setRequestHandler(SubscribeRequestSchema, async ({ params }, { sessionId }) => {
  if (!subscriptions.has(params.uri)) subscriptions.set(params.uri, new Set());
  subscriptions.get(params.uri).add(sessionId);
  return {};
});

// Assumption: serversBySession maps each session ID to the Server instance
// connected to that session's transport (one Server per session).
// When a customer record changes:
async function notifyResourceUpdate(customerId) {
  const uri = `db://customers/${customerId}`;
  const sessions = subscriptions.get(uri) ?? new Set();
  for (const sessionId of sessions) {
    try {
      // sendResourceUpdated emits notifications/resources/updated
      await serversBySession.get(sessionId).sendResourceUpdated({ uri });
    } catch {
      sessions.delete(sessionId); // clean up dead sessions
    }
  }
}

The silent failure for resources: a backend database goes read-only after a replica promotion, so writes stop landing while reads keep succeeding. Every ReadResource call succeeds but returns data that is 4 hours stale. No error appears at the protocol layer, and the LLM makes decisions on stale context. Point AliveMCP at a /health/resources endpoint that checks database reachability and data freshness, file watcher heartbeat (if you serve file-backed resources), and subscription map size; a Map that grows with no cleanup indicates leaked subscriptions from disconnected clients.
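
One way to keep the subscription map from leaking is to give it explicit unsubscribe and drop-session paths. The sketch below is an assumption about how that bookkeeping could look; SubscriptionRegistry and its method names are hypothetical and the class is independent of the SDK.

```typescript
// Hypothetical registry: uri -> set of session IDs, with cleanup paths.
class SubscriptionRegistry {
  private byUri = new Map<string, Set<string>>();

  subscribe(uri: string, sessionId: string): void {
    let sessions = this.byUri.get(uri);
    if (!sessions) {
      sessions = new Set<string>();
      this.byUri.set(uri, sessions);
    }
    sessions.add(sessionId);
  }

  unsubscribe(uri: string, sessionId: string): void {
    const sessions = this.byUri.get(uri);
    if (!sessions) return;
    sessions.delete(sessionId);
    // Remove empty entries so size reflects URIs with live subscribers.
    if (sessions.size === 0) this.byUri.delete(uri);
  }

  // Call when a transport closes: removes the session from every URI.
  dropSession(sessionId: string): void {
    // Copy the keys first so the Map is not mutated while iterating.
    Array.from(this.byUri.keys()).forEach((uri) => this.unsubscribe(uri, sessionId));
  }

  subscribers(uri: string): string[] {
    const sessions = this.byUri.get(uri);
    return sessions ? Array.from(sessions) : [];
  }

  get size(): number {
    return this.byUri.size;
  }
}
```

Calling dropSession from the transport's close handler keeps the size reported by the health endpoint tied to live sessions rather than to every session that ever subscribed.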

See the full MCP server resources guide for URI template design (RFC 6570 patterns), static vs. dynamic vs. file-backed vs. remote resource types, and the ListResources pagination cursor when your catalog exceeds 50 items.
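
As a rough illustration of the pagination cursor mentioned above: resources/list accepts an optional cursor and returns an optional nextCursor, and the cursor is an opaque string chosen by the server. The sketch below assumes a simple offset-based cursor; paginate, encodeCursor, and decodeCursor are hypothetical helpers, and a production server might prefer a keyset cursor instead.

```typescript
interface Page<T> {
  items: T[];
  nextCursor?: string; // omitted on the last page
}

// The cursor format is a server-side choice; clients treat it as opaque.
function encodeCursor(offset: number): string {
  return `offset:${offset}`;
}

function decodeCursor(cursor: string | undefined): number {
  if (cursor === undefined) return 0; // first page
  const match = /^offset:(\d+)$/.exec(cursor);
  if (!match) throw new Error('Invalid cursor');
  return Number(match[1]);
}

function paginate<T>(all: T[], cursor: string | undefined, pageSize = 50): Page<T> {
  const start = decodeCursor(cursor);
  const items = all.slice(start, start + pageSize);
  const next = start + pageSize;
  return next < all.length ? { items, nextCursor: encodeCursor(next) } : { items };
}
```

In a ListResources handler the result would be returned as { resources: page.items, nextCursor: page.nextCursor }, with params.cursor passed in as the cursor argument.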

Prompts — seed multi-turn conversations with server-defined context

An MCP prompt is not a system prompt sent at session start. It is a server-defined message template that clients can discover in a prompt browser, fill with arguments, and inject into the current conversation. A code review server exposes a review_pull_request prompt that accepts a PR number and returns a multi-turn message sequence pre-seeded with the diff, CI status, and review instructions. The LLM starts the review immediately with all context loaded — no preliminary tool calls needed.

The clearest way to decide between a tool and a prompt: if the operation produces a side effect, use a tool. If it populates context for a conversation that then uses tools, use a prompt.

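import { z } from 'zod';

// server.prompt() belongs to the SDK's high-level McpServer class
// (@modelcontextprotocol/sdk/server/mcp.js), not the low-level Server used above.
// The args schema is a plain object of zod types, and MCP prompt arguments
// always arrive as strings, so pr_number is validated as a numeric string.
server.prompt(
  'review_pull_request',
  'Seed a code review with diff, CI status, and review criteria',
  {
    pr_number: z.string().regex(/^[1-9]\d*$/)
      .describe('GitHub pull request number'),
    focus: z.enum(['security', 'performance', 'style', 'all']).optional()
      .describe('Review focus area. Default: all.')
  },
  async ({ pr_number, focus = 'all' }) => {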
    // Fetch data in parallel — prompt expansion blocks the client UI
    const [diff, ciStatus] = await Promise.all([
      github.getPullRequestDiff(pr_number),
      github.getCIStatus(pr_number)
    ]);

    const instruction = focus === 'all'
      ? 'Review for security vulnerabilities, performance regressions, and style consistency.'
      : `Focus this review specifically on ${focus}.`;

    return {
      description: `Code review for PR #${pr_number}`,
      messages: [
        // Embedded resource content — loads diff directly into context
        { role: 'user', content: { type: 'resource',
            resource: { uri: `github://pr/${pr_number}/diff`,
              mimeType: 'text/plain', text: diff } } },
        // CI status as text
        { role: 'user', content: { type: 'text',
            text: `CI status: ${JSON.stringify(ciStatus)}` } },
        // The instruction turn that starts the actual review
        { role: 'user', content: { type: 'text', text: instruction } }
      ]
    };
  }
);

Prompt messages most commonly carry three content types: text (string), image (base64-encoded binary with mimeType), and resource (embedded resource content loaded directly into context rather than requiring a separate ReadResource call). Recent revisions of the specification also define audio content. The resource type is the most powerful: it lets a prompt pre-load file content, database records, or fetched remote data before the LLM generates its first token.
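
The three content shapes can be written down as a discriminated union. This is a simplified sketch (embedded resources can also carry a base64 blob instead of text, which is omitted here), and contextChars is a hypothetical helper for estimating how much text a prompt expansion loads into context.

```typescript
type TextContent = { type: 'text'; text: string };
type ImageContent = { type: 'image'; data: string; mimeType: string }; // data is base64
type EmbeddedResource = {
  type: 'resource';
  resource: { uri: string; mimeType?: string; text: string };
};
type PromptContent = TextContent | ImageContent | EmbeddedResource;
type PromptMessage = { role: 'user' | 'assistant'; content: PromptContent };

// Rough size of the text a prompt expansion places in context.
function contextChars(messages: PromptMessage[]): number {
  let total = 0;
  for (const { content } of messages) {
    switch (content.type) {
      case 'text':
        total += content.text.length;
        break;
      case 'resource':
        total += content.resource.text.length;
        break;
      case 'image':
        break; // binary payloads are not counted as text
    }
  }
  return total;
}
```

A check like this before returning from a prompt handler gives you a place to truncate or summarize a diff that would otherwise exceed the context window.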

The silent failure for prompts: a GitHub API rate limit or a database timeout during expansion causes your GetPrompt handler to throw or, if the error is swallowed, to return empty or partial turns. The client shows nothing or a generic failure, and no server-level monitor notices. Meanwhile the client's prompt browser still lists the prompt as available, because ListPrompts returns the catalog independently of whether individual prompts can expand. The fix is a /health/prompts endpoint that performs a smoke-expansion of each registered prompt with fixed test arguments and a 5-second timeout race.
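
A smoke-expansion with a timeout could be sketched as follows. The expand callback stands in for however your server invokes a prompt handler with fixed test arguments; smokeExpand and withTimeout are hypothetical helper names, and the timeout is implemented as a manual race between the work and a timer.

```typescript
type ExpandResult = { messages: unknown[] };
type SmokeOutcome = { name: string; ok: boolean; reason?: string };

// Rejects if the work has not settled within ms; clears the timer otherwise
// so a finished check does not keep the process alive.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms}ms`)), ms);
    work.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); },
    );
  });
}

// Expands one prompt and classifies the result; never throws.
async function smokeExpand(
  name: string,
  expand: () => Promise<ExpandResult>,
  timeoutMs = 5000,
): Promise<SmokeOutcome> {
  try {
    const result = await withTimeout(expand(), timeoutMs);
    if (result.messages.length === 0) {
      return { name, ok: false, reason: 'empty expansion' };
    }
    return { name, ok: true };
  } catch (error) {
    const reason = error instanceof Error ? error.message : String(error);
    return { name, ok: false, reason };
  }
}
```

The /health/prompts endpoint can run smokeExpand for every registered prompt and return 503 if any outcome has ok set to false. Treating an empty expansion as a failure covers the case where a handler swallows an upstream error.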

See the full MCP server prompts guide for the ListPrompts handler, argument description best practices (the argument description is the only documentation a user sees before invoking a prompt), prompt list change notifications, and handling long context strategies when diffs or documents exceed context window limits.

Argument completions — eliminate invalid values before tool calls happen

Argument completions let clients show an autocomplete dropdown as users type argument values. The specification defines completion for prompt arguments and resource template variables, identified by a ref/prompt or ref/resource reference. When a user starts typing a customer ID in a client that supports completions, the client sends a completion/complete request with the partial value. Your server queries its data and returns a ranked list of matches. The user selects an exact value, so invalid IDs are filtered at completion time rather than failing in a handler after a round-trip.

import { CompleteRequestSchema } from '@modelcontextprotocol/sdk/types.js';

server.setRequestHandler(CompleteRequestSchema, async ({ params }) => {
  // ref is { type: 'ref/prompt', name } or { type: 'ref/resource', uri }
  const ref = params.ref;
  // argument is { name, value }; value is what the user has typed so far
  const { name: argName, value: partial } = params.argument;

  if (ref.type === 'ref/resource' && ref.uri === 'db://customers/{id}'
      && argName === 'id') {
    // Case-insensitive prefix query; needs a suitable index to stay fast
    const matches = await db.query(
      `SELECT id, name FROM customers
       WHERE id ILIKE $1 OR name ILIKE $1
       ORDER BY name LIMIT 11`,
      [`${partial}%`]
    );
    const hasMore = matches.rows.length > 10;
    return {
      completion: {
        // Return bare IDs so the selected value is valid as-is
        values: matches.rows.slice(0, 10).map(r => r.id),
        hasMore
      }
    };
  }

  return { completion: { values: [], hasMore: false } };
});

Three performance constraints that matter in practice:

100ms budget. Clients trigger completions on keystroke. A 300ms response means the dropdown lags one or two characters behind the cursor — users start ignoring it and type the value free-form. Keep every completion handler under 100ms at p99.
Index the prefix column. A sequential scan on ILIKE 'partial%' is fast at 10,000 rows and slow at 1,000,000. In PostgreSQL a plain B-tree index generally does not serve ILIKE, so add a trigram index (pg_trgm), or an expression index on lower(column) queried with LIKE, before going to production. An unindexed completion query at scale is the difference between 5ms and 3,000ms.
Return hasMore: true when truncating. Clients that see hasMore: false with 10 results treat those 10 as exhaustive and stop offering additional suggestions. When your query would return 11 or more, return 10 and set hasMore: true so clients know to refine the query with more characters.
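
The fetch-one-extra pattern behind the third constraint can be isolated in a small pure function. The sketch below uses an in-memory candidate list in place of the database query; completePrefix is a hypothetical helper, and the optional total field is included because the completion result allows it.

```typescript
type Completion = { values: string[]; total?: number; hasMore: boolean };

// Case-insensitive prefix match over an in-memory list, truncated to max.
function completePrefix(candidates: string[], partial: string, max = 10): Completion {
  const needle = partial.toLowerCase();
  const matches = candidates
    .filter((c) => c.toLowerCase().startsWith(needle))
    .sort();
  return {
    values: matches.slice(0, max),
    total: matches.length,
    // True only when results were dropped, so clients know to keep refining.
    hasMore: matches.length > max,
  };
}
```

With a database behind the handler, the same logic applies to the row count: query with LIMIT max + 1, return the first max rows, and set hasMore when the extra row came back.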

Clients that don't support completions simply skip completion/complete requests — your server continues to work correctly without them. Declare completions: {} only when you have implemented the CompleteRequestSchema handler; declaring it with a missing handler produces a MethodNotFound protocol error on the first keystroke.

The monitoring angle: completion latency above 200ms is the failure signal. Wire AliveMCP to a synthetic completion request against your live endpoint — a slow completion path is invisible to all other monitoring but produces exactly the failure mode that causes users to stop using the completion feature and revert to typing values manually. See the full MCP server completions guide for database-backed dynamic completions, prompt argument completions, tag completion with usage-count ranking, and Vitest tests that verify prefix-match accuracy and sub-100ms latency.

Notifications — push state changes without polling

MCP notifications are one-way messages that carry no response expectation. Either side can send them; this section focuses on the server-to-client direction, which is how servers tell clients that something has changed without the client polling. When a plugin is loaded and your tool catalog grows by twelve tools, you emit notifications/tools/list_changed; the client re-fetches the tool list and the new tools appear. When a monitored resource's content changes and a client has subscribed, you emit notifications/resources/updated with the URI; the client re-reads just that resource.

All seven notification types in one reference:

Notification	Method	Capability required	What triggers it
Tool list changed	`notifications/tools/list_changed`	`tools.listChanged: true`	Tool added, removed, or renamed at runtime
Resource list changed	`notifications/resources/list_changed`	`resources.listChanged: true`	Resource catalog changes (new file, deleted row)
Resource updated	`notifications/resources/updated`	`resources.subscribe: true`	Content at a subscribed URI has changed
Prompt list changed	`notifications/prompts/list_changed`	`prompts.listChanged: true`	Prompt added, removed, or argument schema changed
Progress	`notifications/progress`	(no capability; tied to `progressToken`)	Long-running tool reports incremental progress
Log message	`notifications/message`	`logging: {}`	Server emits a structured log event at or above client's set level
Cancelled	`notifications/cancelled`	(no capability)	Sender cancels an in-flight request it issued earlier (either direction; most often client to server)

Coalescing list-change notifications. A burst of events — ten files written in a loop, twelve tool registrations at startup — produces ten or twelve list-changed notifications. Each notification causes the client to re-fetch the full list. Use a 500ms debounce to coalesce burst events into a single notification:

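let listChangeTimer = null;

function scheduleToolListChanged() {
  if (listChangeTimer) clearTimeout(listChangeTimer);
  listChangeTimer = setTimeout(async () => {
    listChangeTimer = null;
    try {
      // SDK helper that emits notifications/tools/list_changed
      await server.sendToolListChanged();
    } catch (err) {
      // A rejected send inside a timer would otherwise go unhandled
      console.error('tools/list_changed notification failed', err);
    }
  }, 500);
}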

SSE heartbeats. On HTTP+SSE transports, dead connections are not detected until the next write. A client disconnects, the kernel buffers fill, and the server's next notification write either blocks or throws — minutes after the client left. A 30-second comment heartbeat forces an error on the next write cycle and triggers your cleanup path:

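const heartbeat = setInterval(() => {
  // A write to a dead socket rarely throws synchronously, so check state first
  if (res.destroyed || res.writableEnded) {
    clearInterval(heartbeat);
    cleanup(sessionId);
    return;
  }
  res.write(': heartbeat\n\n'); // SSE comment line, ignored by clients
}, 30_000);

// 'close' fires once the socket is torn down; cleanup() should be idempotent
res.on('close', () => {
  clearInterval(heartbeat);
  cleanup(sessionId);
});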

The silent failure for notifications: the SSE transport connection dies (network blip, proxy timeout, container restart) and the server's session map still contains the old session entry. The server keeps emitting notifications that go nowhere. The client reconnects on a new session, re-establishes subscriptions, and starts receiving again — but in the gap between the disconnect and the reconnect, every notification was lost silently. The server's /health/notifications endpoint should expose sent/failed counters and an error rate:

// Assumption: notificationCounters ({ sent, failed }) is incremented wherever
// notifications are sent; subscriptions is the uri → session-ID Map shown earlier.
app.get('/health/notifications', async (req, res) => {
  const attempts = notificationCounters.sent + notificationCounters.failed;
  const errorRate = notificationCounters.failed / (attempts || 1);

  const status = (errorRate > 0.05 || subscriptions.size > 10_000)
    ? 'degraded' : 'ok';

  res.status(status === 'ok' ? 200 : 503).json({
    status,
    sent: notificationCounters.sent,
    failed: notificationCounters.failed,
    error_rate: errorRate.toFixed(4),
    active_subscriptions: subscriptions.size
  });
});

A notification error rate above 5% indicates sessions that are being retained in the subscription map after the transport closes — the most common source of notification delivery failures. See the full MCP server notifications guide for progress notifications (monotonically increasing progress values, never after return, skip for operations under 1 second), logging notifications (setLevel handler, severity order, fail-safe wrapper), and client notifications the server should handle (notifications/initialized, notifications/cancelled, notifications/roots/list_changed).
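
The first two progress rules mentioned above (monotonically increasing values, nothing after return) can be enforced by a small wrapper. This is a sketch under assumptions: ProgressReporter is a hypothetical class, and send stands in for whatever function actually emits notifications/progress. The one-second skip rule is left to the caller.

```typescript
type ProgressParams = {
  progressToken: string | number;
  progress: number;
  total?: number;
};

class ProgressReporter {
  private last = -Infinity;
  private finished = false;
  private readonly token: string | number | undefined;
  private readonly send: (params: ProgressParams) => void;

  constructor(token: string | number | undefined, send: (params: ProgressParams) => void) {
    this.token = token;
    this.send = send;
  }

  // Returns true if a notification was emitted.
  report(progress: number, total?: number): boolean {
    // No token means the client did not ask for progress on this request.
    if (this.token === undefined || this.finished) return false;
    // Drop values that do not increase.
    if (progress <= this.last) return false;
    this.last = progress;
    this.send({
      progressToken: this.token,
      progress,
      ...(total !== undefined ? { total } : {}),
    });
    return true;
  }

  // Call just before the handler returns; later reports are ignored.
  finish(): void {
    this.finished = true;
  }
}
```

Constructing one reporter per request, from the progressToken in the request's _meta, keeps the rules local to the tool handler that owns the work.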

Unified monitoring: health endpoints for all four primitives

Each primitive produces a distinct class of silent failure. Together, the four health endpoints give AliveMCP a complete picture of the MCP protocol surface — not just whether the server is reachable, but whether each primitive is functioning correctly for the clients that depend on it.

Endpoint	Check interval	What it catches
`/health` (MCP-aware probe)	60s	Process alive; full initialize/initialized handshake completes; no hang on capabilities negotiation
`/health/resources`	2m	DB reachable; file watcher alive; subscription map size within bounds; no leaked sessions
`/health/prompts`	5m	Smoke-expansion of each prompt with test args completes within 5s; data dependencies (GitHub API, DB) reachable
Completion latency check	5m	Synthetic `completion/complete` round-trip under 200ms; prefix index still healthy
`/health/notifications`	2m	Notification error rate below 5%; active subscription map not growing unboundedly
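
The schedule in the table could be encoded as data and evaluated by a small function. The sketch below is illustrative only: it is not AliveMCP's configuration format, and the check names and dueChecks helper are hypothetical.

```typescript
type Check = { name: string; target: string; intervalMs: number };

// Intervals mirror the table above.
const CHECKS: Check[] = [
  { name: 'handshake', target: '/health', intervalMs: 60_000 },
  { name: 'resources', target: '/health/resources', intervalMs: 120_000 },
  { name: 'prompts', target: '/health/prompts', intervalMs: 300_000 },
  { name: 'completion-latency', target: 'completion/complete', intervalMs: 300_000 },
  { name: 'notifications', target: '/health/notifications', intervalMs: 120_000 },
];

// Returns the checks whose interval has elapsed since their last run.
// Timestamps are in milliseconds; a check with no recorded run is always due.
function dueChecks(now: number, lastRun: Record<string, number>): Check[] {
  return CHECKS.filter((check) => {
    const last = lastRun[check.name];
    return last === undefined || now - last >= check.intervalMs;
  });
}
```

Keeping the schedule as data makes it easy to see which layer a failing check belongs to when an alert fires.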

The implementation order that works in practice: capabilities negotiation first (declare exactly what you implement, nothing more); then resources (the simplest primitive and immediately useful, since a read-only data catalog requires no side-effect handling); then notifications (wire list-changed and resource-updated alongside the resource layer, since both are declared in the same capability block); then prompts (useful once you have resources to embed in prompt messages); and completions last (add them only when your prompt and resource template surface is stable, because completion handler maintenance cost scales with argument count). The health endpoints can be added at any step; they are cheap to implement and valuable from day one.

The single thread connecting all four: each primitive has a path where the protocol returns success but the client receives wrong information — stale resource content, a broken prompt template, an abandoned completion dropdown, a notification delivered to a dead connection. Standard uptime monitoring catches none of these. The health endpoints above catch all of them. AliveMCP polls each endpoint on configurable intervals so you know which layer broke and when, not just that something is wrong.

Frequently asked questions

Do I need to implement all four primitives, or can I pick just the ones I need?

Each primitive is optional. Declare only what you implement: don't include resources: {} in your capabilities if you have no resource handlers, or clients will attempt ListResources and receive a MethodNotFound error. The four primitives are additive. Most production servers start with tools, add resources when they have structured data clients should read before calling tools, add notifications when the resource catalog or content changes frequently enough that polling is inefficient, and add prompts when they have multi-turn interaction patterns worth packaging as reusable templates. Completions are worthwhile any time a prompt argument or resource template variable accepts a value from a bounded set (IDs, slugs, enum values) that users would benefit from seeing as suggestions.

What is the difference between returning data as a resource vs. returning it from a tool?

Direction and intent. A tool is an imperative action — the client explicitly requests a side-effecting or complex operation. A resource is a declarative read — the client fetches data it wants to include in context, potentially without user interaction. Practically: clients treat resources as safe to cache and pre-fetch; they surface resources in browser UIs where users can examine them before any LLM sees them; and they may include resource content in context automatically. Tool results appear only when explicitly requested. If your data is read-only and clients might want to reference it before calling any tool, a resource is the right model. If fetching the data is itself an operation with meaningful cost or side effects, a tool is more appropriate even if the result is data.

Can a prompt embed resources that are dynamically fetched, not pre-registered with server.resource()?

Yes. Embedded resource content in a prompt message is not required to correspond to a registered resource URI. The { type: 'resource', resource: { uri, mimeType, text } } content block is inline data — the client does not attempt to re-fetch it from your resource catalog. This means a prompt can fetch data from external APIs, compose multiple sources, and embed the result directly without those sources being discoverable via ListResources. The tradeoff: embedded resources are not subscribable and clients cannot independently refresh them. Use pre-registered resources when clients need live updates via subscription; use inline embedded content in prompts when the data is request-scoped and the prompt is the natural discovery surface.

How should I handle notifications when the server restarts and existing subscriptions are lost?

Subscriptions in MCP are ephemeral: they are not persisted across server restarts. Clients re-establish subscriptions at the start of each session, so a clean server restart causes no permanent subscription loss. The next client connection triggers a fresh capabilities handshake, and a client that tracks its own subscriptions can re-subscribe to whatever it previously watched. The practical issue is a restart in the middle of a client session: the client does not automatically know to re-subscribe. Once the client has reconnected, emit notifications/resources/list_changed to prompt it to re-sync the resource catalog, which implicitly signals that subscription state may have been lost. Clients that re-fetch the catalog will typically re-subscribe to resources they care about as part of their catalog processing logic.

Does adding completions require changes to existing tools, or is it additive?

Completions are purely additive. You add a single CompleteRequestSchema handler that routes by reference (a prompt name or a resource template URI) and argument name. Existing tool handlers are unchanged; the completion handler is entirely independent of tool invocation. Clients that don't support completions ignore the completions: {} capability declaration and never send completion/complete requests. The only coupling is declaring the capability correctly: declare completions: {} in your server constructor and implement the handler. Declaring without implementing produces a MethodNotFound error; implementing without declaring means clients never send completion requests. Neither case affects tool functionality, only completion availability.