Protocol guide · 2026-06-10 · Production MCP servers

MCP Protocol Features Beyond Tools: Resources, Prompts, Sampling, Roots, and Annotations

Most MCP servers start the same way: register a few tools, connect a transport, ship. That covers about one fifth of what the MCP protocol actually offers. The other four primitives — resources, prompts, sampling, and roots — plus the tool annotation system, each enable a qualitatively different category of capability. A server that knows about all five is an entirely different class of server than a tool executor: it can push live data into LLM context, control the interaction pattern from the server side, reason autonomously without a user prompting each step, discover the user's workspace without asking for file paths as arguments, and declare its own safety profile to agentic workflows. This guide covers when each primitive is the right choice, how each one works at the API level, and how they compose into production-ready server architectures.

The five-primitive capability map

Each MCP primitive answers a different question about how a server participates in an LLM conversation:

Primitive | Direction | Protocol method | What it returns | Best for
--- | --- | --- | --- | ---
Tools | Client → Server | tools/call | Text result or structured data | Actions, writes, computations, external API calls
Resources | Client → Server | resources/read | Content with MIME type and URI | Files, DB records, config snapshots, live data feeds
Prompts | Client → Server | prompts/get | Array of user/assistant messages | Guided workflows, reusable interaction patterns
Sampling | Server → Client → LLM | sampling/createMessage (server initiates) | LLM response back to server | Agentic loops, self-verification, sub-task reasoning
Roots | Server → Client | roots/list (server initiates) | Workspace URIs | Workspace-aware tools, auto-scoped file operations
Annotations | (tool metadata) | (no call needed) | Behavioral hints on tools/list | Safe auto-calling, confirmation dialogs, retry safety

Tools and resources are the two primitives that answer "what can the LLM get from the server." The distinction is intent: tools execute actions and may have side effects; resources expose data for reading and are expected to be side-effect-free. Prompts are server-authored interaction patterns that shift some UX control from the client to the server. Sampling and roots both involve the server reaching back toward the client — sampling to request LLM inference, roots to request workspace context. Annotations are metadata on tools that influence how clients handle them.

Resources: passive data for LLM context

The MCP Resources protocol is the mechanism for servers to expose readable data artifacts. A database row, a configuration file, a recent log slice, an API response snapshot — any piece of data the LLM should read for context but not modify is a resource candidate. Resources are registered with server.resource() using a URI and a read handler:

server.resource(
  'app-config',
  'config://app/settings',
  { name: 'Application Settings', mimeType: 'application/json' },
  async (uri) => ({
    contents: [{
      uri: uri.href,
      mimeType: 'application/json',
      text: JSON.stringify(await db.config.getAll(), null, 2),
    }],
  })
);

URI schemes identify the data domain. The convention is a descriptive scheme prefix — db:// for database records, config:// for configuration, git:// for version control state, logs:// for log content — followed by a path that locates the specific resource. Use a ResourceTemplate when a single registration covers many URIs (a parameterized pattern like db://users/{userId} that matches any user row). The template's list handler enables resources/list enumeration, which is how clients discover what resources are available before deciding what to fetch.
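To make the template idea concrete, here is a minimal sketch of what matching a URI against a pattern like db://users/{userId} involves. The matchUriTemplate helper is hypothetical and written for illustration only; it is not the SDK's ResourceTemplate implementation, and it assumes each variable spans exactly one path segment.

```typescript
// Hypothetical helper: matches a URI against a template such as
// "db://users/{userId}" and returns the captured variables, or null.
function matchUriTemplate(
  template: string,
  uri: string,
): Record<string, string> | null {
  const names: string[] = [];
  // Split on {name} placeholders; escape the literal parts for use in a RegExp.
  const pattern = template
    .split(/(\{[^}]+\})/)
    .map((part) => {
      if (part.startsWith("{") && part.endsWith("}")) {
        names.push(part.slice(1, -1));
        return "([^/]+)"; // assumption: one path segment per variable
      }
      return part.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
    })
    .join("");
  const match = new RegExp(`^${pattern}$`).exec(uri);
  if (!match) return null;
  const vars: Record<string, string> = {};
  names.forEach((name, i) => {
    vars[name] = decodeURIComponent(match[i + 1] ?? "");
  });
  return vars;
}
```

A read handler registered for the template would receive the extracted userId and use it to look up the row; URIs that do not fit the pattern fall through to other registrations.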

MIME types determine whether content goes in the text field (plain text, Markdown, JSON, CSV — any UTF-8) or the blob field (base64-encoded binary for images, PDFs, audio). Clients use the MIME type to decide how to display or process the content in the UI.
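The text/blob split can be sketched as a small helper. The toResourceContents function below is a hypothetical name, and the sketch assumes a Node.js runtime (for Buffer); it only illustrates the rule that UTF-8 strings travel in text while raw bytes are base64-encoded into blob.

```typescript
type ResourceContents =
  | { uri: string; mimeType: string; text: string }
  | { uri: string; mimeType: string; blob: string };

// Hypothetical helper: strings go out as `text`, raw bytes as base64 `blob`.
function toResourceContents(
  uri: string,
  mimeType: string,
  data: string | Uint8Array,
): ResourceContents {
  if (typeof data === "string") {
    return { uri, mimeType, text: data };
  }
  // Binary payloads (images, PDFs, audio) must be base64-encoded.
  return { uri, mimeType, blob: Buffer.from(data).toString("base64") };
}
```

Keeping the decision in one place makes it harder to accidentally ship binary data in the text field, which clients would render as garbage.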

For live data that changes while the client has an open session, resources support subscriptions. When a client calls resources/subscribe on a URI, it signals that it wants change notifications. Send a notifications/resources/updated notification whenever the underlying data changes; in the TypeScript SDK this is, to the best of our knowledge, sendResourceUpdated({ uri }) on the low-level server object (server.server) rather than on McpServer itself. The client then re-calls resources/read to get the fresh version. For changes to which resources exist (new rows added, files created), call server.sendResourceListChanged(). These notifications let the LLM's context window stay current without polling.
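The subscription bookkeeping can be sketched without any SDK. SubscriptionRegistry below is a hypothetical in-memory class; the notification method name and params shape follow the protocol, while the class itself and its single-session assumption are illustrative.

```typescript
// JSON-RPC notification shape for a single changed resource.
interface ResourceUpdatedNotification {
  jsonrpc: "2.0";
  method: "notifications/resources/updated";
  params: { uri: string };
}

// Hypothetical in-memory registry: remembers which URIs a session
// subscribed to and only produces notifications for those.
class SubscriptionRegistry {
  private readonly subscribed = new Set<string>();

  subscribe(uri: string): void {
    this.subscribed.add(uri);
  }

  unsubscribe(uri: string): void {
    this.subscribed.delete(uri);
  }

  // Returns the notification to send, or null if nobody is listening.
  onDataChanged(uri: string): ResourceUpdatedNotification | null {
    if (!this.subscribed.has(uri)) return null;
    return {
      jsonrpc: "2.0",
      method: "notifications/resources/updated",
      params: { uri },
    };
  }
}
```

Filtering on the subscribed set avoids notifying clients about resources they never asked to watch.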

The failure mode to understand: resources and tools share the same server process and transport. A crashed server makes both tools and resources unavailable simultaneously — but the LLM may not surface this symmetrically. An LLM that was pulling configuration context from a resource endpoint may silently fail to load context without an explicit error if the resource is unreachable. External protocol monitoring that probes the full initialize handshake catches this at the server level, before any individual resource read fails.

Prompts: server-controlled interaction patterns

The MCP Prompts protocol lets servers publish reusable, parameterized message templates. Unlike tools (which run code and return results) or resources (which return data artifacts), prompts return a messages array — an ordered sequence of user and assistant turns that the client injects directly into the LLM conversation. The server controls the message structure; the client handles delivery. This inversion of control is what makes prompts useful: a server that knows its domain — a code review tool, a database query assistant, a customer support bot — can ship the optimal interaction pattern as a protocol primitive that any compatible client can invoke by name.

import { z } from 'zod';

server.prompt(
  'code-review',
  'Structured code review for a specific language and focus',
  {
    language: z.string().describe('Programming language of the code'),
    focus: z.enum(['security', 'performance', 'readability'])
             .describe('Primary review dimension'),
  },
  async ({ language, focus }) => ({
    messages: [
      {
        role: 'user',
        content: {
          type: 'text',
          text: `You are an expert ${language} reviewer. Focus on ${focus}. ` +
                `Structure your feedback as: Summary, Critical Issues, Suggestions.`,
        },
      },
    ],
  })
);

Argument schemas are declared with Zod. The Zod schema generates the argument definitions that appear in the prompts/list response, so clients know what parameters to collect from the user. Arguments are strings in the MCP protocol; the client coerces user input to strings before calling prompts/get. If a required argument is missing, the SDK rejects the request with a protocol error before your handler runs.

Prompt handlers are async and can call your database or external APIs to generate dynamic content. A prompt that includes the current state of a database record — fetched fresh at call time — is more useful than one that embeds static template strings. You can also embed resource references in prompt messages using type: 'resource' content, which instructs the client to pull a live resource into the message when delivering it to the LLM.

Multi-turn prompts include assistant-role messages to pre-populate conversational context. This is how you prime the LLM into a specific persona or establish a stylistic baseline before the user's first real turn. The messages array flows into the LLM conversation exactly as structured — user/assistant interleaving is preserved.
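As an illustration, a multi-turn messages array might be assembled like this. The buildReviewPrompt function and its wording are hypothetical; only the role and content shape mirror the prompt example earlier in this guide.

```typescript
type Role = "user" | "assistant";

interface PromptMessage {
  role: Role;
  content: { type: "text"; text: string };
}

// Hypothetical builder: a two-turn primer followed by the real request.
function buildReviewPrompt(language: string, code: string): PromptMessage[] {
  const msg = (role: Role, text: string): PromptMessage => ({
    role,
    content: { type: "text", text },
  });
  return [
    msg("user", `I need a ${language} code review. Keep feedback terse and concrete.`),
    // The assistant turn establishes the response style before the real input.
    msg("assistant", "Understood. I will list issues by severity, one line each."),
    msg("user", `Review this code:\n${code}`),
  ];
}
```

A prompt handler could return { messages: buildReviewPrompt(language, code) } so that the pre-written assistant turn sets the tone before the model sees the code.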

When your prompt catalog changes at runtime (new prompt registered, old one retired), call server.sendPromptListChanged(). Clients listening for notifications/prompts/list_changed will re-call prompts/list to refresh their UI. This is how dynamic prompt catalogs — ones that adapt to the current user's permissions or workspace context — stay synchronized across open sessions.

Sampling: the server asks the LLM

MCP sampling is the most architecturally unusual primitive: it inverts the normal request direction. Instead of the LLM calling your server via a tool call, your server asks the LLM a question by routing a message array through the client using sampling/createMessage. The client may present the request to the user for approval before forwarding it to the model. The model's response comes back to your tool handler as a structured result. This is what enables agentic loops, self-verification, and multi-step reasoning without requiring the user to prompt each step individually.

The canonical use cases for sampling are: self-verification (generate a result with one tool call, then sample the LLM to verify or critique it within the same call), structured extraction (receive unstructured text as tool input, use sampling to parse it into a schema, continue the handler with the structured result), and agentic sub-tasks (break a complex request into steps and use sampling to resolve intermediate decisions before assembling the final result).

server.tool(
  'analyze_and_verify',
  'Analyze code and sample LLM for a self-review',
  { code: z.string() },
  async ({ code }) => {
    // Check that the client supports sampling. Client capabilities are read
    // from the low-level Server instance (server.server); the handler's
    // second argument carries request metadata such as the abort signal.
    const caps = server.server.getClientCapabilities();
    if (!caps?.sampling) {
      return { content: [{ type: 'text', text: 'Client does not support sampling.' }], isError: true };
    }

    // runStaticAnalysis is assumed to be defined elsewhere in your codebase
    const analysis = await runStaticAnalysis(code);

    const review = await server.server.createMessage({
      messages: [
        { role: 'user', content: { type: 'text',
          text: `Review this analysis for false positives:\n${JSON.stringify(analysis)}` } },
      ],
      maxTokens: 512,
      modelPreferences: {
        hints: [{ name: 'claude-3-5' }],
        intelligencePriority: 0.8,
        speedPriority: 0.2,
      },
    });

    // The sampled content is a union (text, image, audio); only text is usable here
    const verification = review.content.type === 'text'
      ? review.content.text
      : '(model returned non-text content)';

    return {
      content: [
        { type: 'text', text: `Analysis:\n${JSON.stringify(analysis, null, 2)}` },
        { type: 'text', text: `Verification:\n${verification}` },
      ],
    };
  }
);

Three things to get right with sampling. First, always check the capability before using it: not all clients implement sampling/createMessage, and calling it on an unsupported client produces a protocol error. Check server.server.getClientCapabilities()?.sampling at the start of any handler that uses sampling, and return an isError: true response if it is absent rather than throwing. Second, cap agentic loops at a fixed iteration limit (three is a reasonable default) and check the abort signal passed in the handler's second argument (signal.aborted) on each iteration to handle client-side cancellation. An unbounded loop that keeps sampling until it reaches a conclusion will eventually hit token limits, time out, or exhaust the user's patience for approvals. Third, handle sampling denial gracefully: the client can refuse to forward a sampling request if the user declines. Catch this and return a degraded-but-useful result rather than propagating an unhandled exception.
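The loop-capping and denial-handling advice can be sketched independently of the SDK. In this sketch, sample is an injected stand-in for a sampling/createMessage call, and refineWithSampling, LoopResult, and the "OK" convergence convention are all hypothetical choices made for illustration.

```typescript
// `sample` stands in for a sampling/createMessage call; it is injected so
// the loop logic can run (and be tested) without an MCP client.
type Sample = (prompt: string) => Promise<string>;

interface LoopResult {
  answer: string;
  iterations: number;
  degraded: boolean; // true when we stopped early or never converged
}

async function refineWithSampling(
  draft: string,
  sample: Sample,
  signal?: AbortSignal,
  maxIterations = 3,
): Promise<LoopResult> {
  let answer = draft;
  for (let i = 0; i < maxIterations; i++) {
    // Stop promptly if the client cancelled the originating request.
    if (signal?.aborted) return { answer, iterations: i, degraded: true };
    let critique: string;
    try {
      critique = await sample(`Critique this answer. Reply "OK" if it is correct:\n${answer}`);
    } catch {
      // Sampling was denied or failed: return what we have instead of throwing.
      return { answer, iterations: i, degraded: true };
    }
    if (critique.trim() === "OK") {
      return { answer, iterations: i + 1, degraded: false };
    }
    answer = `${answer}\n[revision note] ${critique}`;
  }
  // Hit the iteration cap without convergence.
  return { answer, iterations: maxIterations, degraded: true };
}
```

The degraded flag lets the tool handler tell the caller that the result was not fully verified, which is more useful than either an exception or a silent partial answer.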

Model preferences are hints, not requirements. The modelPreferences.hints array lists model name substrings in preference order; the client selects the actual model from its available options. The three priority weights — costPriority, speedPriority, intelligencePriority — are values between 0 and 1 that express relative preferences. For a self-verification step, high intelligencePriority and low costPriority is appropriate. For a quick classification, high speedPriority and low intelligencePriority is right. The client normalizes and maps these to an actual model.
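How a client resolves these preferences is up to the client. The following is one plausible sketch, not a description of any real client: the ModelInfo catalog, its 0-to-1 scores, and the pickModel function are hypothetical.

```typescript
interface ModelPreferences {
  hints?: { name: string }[];
  costPriority?: number;
  speedPriority?: number;
  intelligencePriority?: number;
}

// Hypothetical catalog entry: scores in [0, 1], higher is better
// (so `cheapness` is high for inexpensive models).
interface ModelInfo {
  id: string;
  cheapness: number;
  speed: number;
  intelligence: number;
}

function pickModel(prefs: ModelPreferences, available: ModelInfo[]): ModelInfo | undefined {
  // 1. Hints win if any substring matches, in the order given.
  for (const hint of prefs.hints ?? []) {
    const hit = available.find((m) => m.id.includes(hint.name));
    if (hit) return hit;
  }
  // 2. Otherwise rank by the weighted priorities.
  const cost = prefs.costPriority ?? 0;
  const speed = prefs.speedPriority ?? 0;
  const intel = prefs.intelligencePriority ?? 0;
  const score = (m: ModelInfo): number =>
    cost * m.cheapness + speed * m.speed + intel * m.intelligence;
  return [...available].sort((a, b) => score(b) - score(a))[0];
}
```

Because the server cannot see the client's catalog, it should treat the selected model as unknown and avoid logic that depends on a specific model having been chosen.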

Roots: workspace context from the client

MCP roots solve a specific friction point in file-system-aware servers: without roots, a tool that operates on files must require the user to provide a file path as an argument every time. With roots, the server can ask the client "what directories does the user have open?" and use that context to scope operations automatically. This is how Claude Desktop and other MCP hosts share workspace context — the folders and URIs visible in the IDE or editor — with servers that need to know where to look.

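Roots is a capability that the client declares, so the server does not advertise it. Instead, check the client's capabilities, fetch the initial list once the handshake completes, and subscribe to change notifications:

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { RootsListChangedNotificationSchema, type Root } from '@modelcontextprotocol/sdk/types.js';

const server = new McpServer({ name: 'workspace-server', version: '1.0.0' });

let currentRoots: Root[] = [];

// Re-fetch the root list; a client without roots support yields an empty list
async function refreshRoots(): Promise<void> {
  if (!server.server.getClientCapabilities()?.roots) {
    currentRoots = [];
    return;
  }
  const result = await server.server.listRoots(); // sends roots/list to the client
  currentRoots = result.roots;
  server.sendResourceListChanged(); // rebuild resource catalog
}

// On connect, fetch the initial root list once initialization completes
server.server.oninitialized = () => { void refreshRoots(); };

// Re-fetch whenever the client reports that its roots changed
server.server.setNotificationHandler(RootsListChangedNotificationSchema, refreshRoots);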

Each root in the roots/list response has a uri (typically a file:// URI) and an optional human-readable name. In tools that operate on files, convert the file:// URI to a filesystem path with fileURLToPath() from Node's built-in url module, then search across all current roots. When the user opens an additional workspace folder, the notifications/roots/list_changed notification fires and you re-fetch the roots list to get the updated set.

Security scoping matters when using roots for write operations. Before any write that targets a path derived from tool arguments, validate that the resolved path is actually inside one of the known root directories: resolve the target against the root, compute path.relative(rootPath, targetPath), and reject the write if the result is "..", starts with ".." followed by a path separator, or is an absolute path (which happens on Windows when the target is on a different drive). This prevents a malicious or poorly formed prompt from using a tool to write outside the workspace scope the user has authorized. Roots are advisory (a client may provide any URI scheme or none at all), so always design roots-aware tools to degrade gracefully when no roots are provided: fall back to requiring the path as an explicit tool argument.
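A minimal sketch of that containment check, using only Node's built-in path and url modules. The isInsideRoot and resolveInRoots names are hypothetical, and the sketch does not resolve symlinks; a production check would likely also compare real paths (for example via fs.realpath) before writing.

```typescript
import * as path from "node:path";
import { fileURLToPath } from "node:url";

// Returns true only when `target` resolves to the root itself or a descendant.
function isInsideRoot(rootUri: string, target: string): boolean {
  const rootPath = fileURLToPath(rootUri);
  const resolved = path.resolve(rootPath, target);
  const rel = path.relative(rootPath, resolved);
  // ".." or "../x" means the target escaped the root; an absolute result
  // means a different drive on Windows.
  return rel !== ".." && !rel.startsWith(`..${path.sep}`) && !path.isAbsolute(rel);
}

// Returns the absolute path inside the first root that contains `target`,
// or null when no root does (the caller should then reject the write).
function resolveInRoots(roots: { uri: string }[], target: string): string | null {
  for (const root of roots) {
    if (!root.uri.startsWith("file://")) continue; // skip non-file roots
    if (isInsideRoot(root.uri, target)) {
      return path.resolve(fileURLToPath(root.uri), target);
    }
  }
  return null;
}
```

Comparing against ".." plus a separator, rather than a bare ".." prefix, keeps legitimately named entries such as "..cache" from being rejected.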

Roots and resources compose naturally. When roots change, rebuild your resource catalog by calling server.sendResourceListChanged() and updating the list of resources your server exposes. A file-browser server might dynamically populate its resource list with one entry per file in the current workspace roots, refreshing whenever the root set changes. This gives the LLM an up-to-date inventory of what files are available to read without manually enumerating them.

Tool annotations: behavioral hints for safe workflows

MCP tool annotations are metadata attached to tool definitions that tell clients what each tool does to the world. They are hints — advisory declarations that help clients make better auto-approval decisions — not an enforcement mechanism. The four behavioral flags are:

Annotation | Default | What it signals | Client behavior
--- | --- | --- | ---
readOnlyHint | false | Tool makes no state changes | Safe to auto-call in loops without confirmation
destructiveHint | true | Tool may irreversibly delete or overwrite | Require user confirmation before executing
idempotentHint | false | Repeated calls produce the same result | Safe to retry automatically on failure
openWorldHint | true | Has side effects outside your server | Elevates confirmation requirement

The defaults are conservative: every unannotated tool is treated as potentially destructive, non-idempotent, and open-world. This means an agentic framework running tools in a loop will pause and ask the user before calling any tool unless you explicitly annotate it. Adding readOnlyHint: true to tools that genuinely make no writes — search, lookup, fetch, compute — is the most impactful annotation to add first. It removes friction from read-heavy agentic workflows without requiring the user to approve every retrieval.

// In server.tool(), the annotations object is passed directly as the fourth
// argument; the { annotations: ... } wrapper belongs to registerTool() configs.
server.tool(
  'search_database',
  'Search records by query string',
  { query: z.string() },
  { title: 'Search database', readOnlyHint: true, idempotentHint: true, openWorldHint: false },
  async ({ query }) => {
    const results = await db.search(query);
    return { content: [{ type: 'text', text: JSON.stringify(results) }] };
  }
);

server.tool(
  'delete_record',
  'Permanently delete a record by ID',
  { id: z.string(), confirm: z.literal(true) },
  { title: 'Delete record', destructiveHint: true, idempotentHint: false, openWorldHint: false },
  async ({ id }) => {
    await db.records.delete(id);
    return { content: [{ type: 'text', text: `Deleted record ${id}` }] };
  }
);

The title field is a human-readable display name for the tool, separate from the machine-readable name used in tools/call. Claude Desktop and other UI-surfacing clients display title in permission dialogs and tool listings, so a clear title directly improves user comprehension of what the LLM is requesting.

Annotations are not a security boundary. A malicious client can ignore them entirely, and the server has no way to enforce that the client respected a readOnlyHint before auto-calling. Use readOnlyHint to match client behavior to tool intent; use authentication, RBAC, and input validation to actually control what operations are permitted.
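To show how the conservative defaults play out on the client side, here is a small sketch. The default values follow the annotation table in this section; the needsConfirmation and safeToRetry policies are hypothetical examples of what a client might do, not behavior mandated by the protocol.

```typescript
interface ToolAnnotations {
  readOnlyHint?: boolean;
  destructiveHint?: boolean;
  idempotentHint?: boolean;
  openWorldHint?: boolean;
}

// Fill in the documented defaults for any hint the server omitted.
function withDefaults(a: ToolAnnotations = {}): Required<ToolAnnotations> {
  return {
    readOnlyHint: a.readOnlyHint ?? false,
    destructiveHint: a.destructiveHint ?? true,
    idempotentHint: a.idempotentHint ?? false,
    openWorldHint: a.openWorldHint ?? true,
  };
}

// Hypothetical client policy: read-only tools run unattended; anything
// that may be destructive or reach external systems asks the user first.
function needsConfirmation(a?: ToolAnnotations): boolean {
  const r = withDefaults(a);
  if (r.readOnlyHint) return false;
  return r.destructiveHint || r.openWorldHint;
}

// Hypothetical retry policy: only retry when a repeat call cannot do harm.
function safeToRetry(a?: ToolAnnotations): boolean {
  const r = withDefaults(a);
  return r.readOnlyHint || r.idempotentHint;
}
```

Under this policy an unannotated tool always prompts the user and is never retried automatically, which is why annotating read-only tools removes so much friction.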

Composing the primitives: a worked example

The primitives are most powerful when they work together. Consider a code repository assistant that uses all five:

None of these five requires the others to be present. A server that only uses resources is valid. A server that only annotates tools is valid. But the value compounds: roots eliminate argument friction, resources eliminate discovery friction, prompts eliminate client-side UX work, sampling eliminates multi-call orchestration, and annotations eliminate unnecessary confirmation dialogs.

Protocol-level monitoring covers all five primitives

Each primitive is a separate protocol handler registered on the same server. A server that crashes, becomes unreachable, or fails the initialize handshake loses all five capabilities simultaneously — but the failure may surface differently at the LLM layer. A missing resource might cause the LLM to proceed without context it was expecting, without an explicit error. A missing prompt might cause the client's UI to silently hide a feature. A sampling failure during an agentic loop might produce a degraded but uncommunicated result.

The monitoring consequence: testing whether the initialize handshake succeeds is a better proxy for server health than checking any individual protocol method. A probe that completes the full three-message initialize sequence confirms that the transport is reachable, the process is running, and all registered protocol handlers are active. A probe that only sends an HTTP GET to a health endpoint confirms the HTTP server is up but says nothing about whether tools, resources, prompts, sampling, or roots will function.
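The three messages of the handshake can be sketched as plain JSON-RPC values, independent of transport. The method names follow the protocol; the probe identity, the chosen protocolVersion string, and the helper names are assumptions made for illustration.

```typescript
interface JsonRpcMessage {
  jsonrpc: "2.0";
  id?: number;
  method?: string;
  params?: unknown;
  result?: unknown;
  error?: { code: number; message: string };
}

// Message 1: the probe's initialize request.
function initializeRequest(id: number): JsonRpcMessage {
  return {
    jsonrpc: "2.0",
    id,
    method: "initialize",
    params: {
      protocolVersion: "2025-06-18", // assumption: use the revision your server targets
      capabilities: {},
      clientInfo: { name: "health-probe", version: "0.0.1" }, // hypothetical identity
    },
  };
}

// Message 2 comes from the server; healthy means a well-formed result, not an error.
function isHealthyInitializeResponse(msg: JsonRpcMessage, id: number): boolean {
  if (msg.id !== id || msg.error !== undefined) return false;
  const result = msg.result as
    | { protocolVersion?: unknown; capabilities?: unknown; serverInfo?: unknown }
    | undefined;
  if (!result) return false;
  return (
    typeof result.protocolVersion === "string" &&
    typeof result.capabilities === "object" && result.capabilities !== null &&
    typeof result.serverInfo === "object" && result.serverInfo !== null
  );
}

// Message 3: the probe confirms with the initialized notification.
const initializedNotification: JsonRpcMessage = {
  jsonrpc: "2.0",
  method: "notifications/initialized",
};
```

A probe would send the first message over the server's transport, validate the reply with isHealthyInitializeResponse, and then send the notification; any timeout or malformed reply counts as a failed check.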

AliveMCP runs this kind of deep protocol probe, a full initialize → tools/list round trip, every 60 seconds per monitored server. The probe validates the entire MCP handshake, not just HTTP reachability, and fires an alert within 60 seconds of any protocol failure regardless of which primitive is affected. This is the production monitoring gap that in-process profiling, unit tests, and health-check endpoints cannot fill: they are all inside the process that may be failing to respond.

Where to go from here

Each primitive has a dedicated technical reference with full code examples, error handling patterns, and protocol flows:

For the broader production picture — transport selection, authentication, deployment, and uptime monitoring — see the full set of production guides in the AliveMCP blog.