Python MCP server — FastMCP SDK, tools, resources, and deployment

The Model Context Protocol Python SDK ships a high-level API called FastMCP that mirrors the decorator-based style of FastAPI. You define tools, resources, and prompts as annotated Python functions and FastMCP handles JSON schema generation, protocol framing, transport setup, and error mapping automatically. This guide covers everything from a minimal working server to production SSE deployment: tool definitions, Pydantic model inputs, both transport modes, client-side config for Claude Desktop and Cursor, and how to monitor your server once it is deployed.

TL;DR

Install with pip install mcp or uv add mcp. Import FastMCP, create an instance, decorate async functions with @mcp.tool(), and call mcp.run() for stdio or mcp.run(transport="sse") for HTTP/SSE. Python type annotations on function parameters become the tool's JSON schema automatically — no separate schema definition needed. For complex inputs, use Pydantic BaseModel subclasses as parameter types. Once deployed, add your SSE URL to AliveMCP to monitor the full initialize → tools/list handshake from outside your network.

Installation

The official Python SDK is the mcp package on PyPI. The FastMCP class is included in the same package — no separate install is needed:

# pip
pip install mcp

# uv (recommended for development and deployment)
uv add mcp

# with the optional CLI extra (adds the mcp command-line tool; the base package already pulls in uvicorn and starlette for SSE)
pip install "mcp[cli]"

Python 3.10 or later is required. The SDK depends on anyio for async I/O and pydantic v2 for schema generation. If your project already uses Pydantic v2, you will not have a version conflict.

Minimal server

A complete, working Python MCP server fits in fewer than ten lines of code:

from mcp.server.fastmcp import FastMCP

mcp = FastMCP("my-server")

@mcp.tool()
async def add(a: int, b: int) -> int:
    """Add two integers and return the result."""
    return a + b

if __name__ == "__main__":
    mcp.run()

FastMCP reads the function's type annotations (a: int, b: int) and return type (int) to generate the tool's JSON schema. The docstring becomes the tool's description — the text the LLM reads when deciding whether to call the tool. mcp.run() starts stdio transport by default, reading from stdin and writing to stdout.

Run it directly to verify:

python server.py

The process waits for JSON-RPC input. Use the MCP inspector to interact with it without writing raw protocol messages:

npx @modelcontextprotocol/inspector python server.py
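For reference, the raw messages that the inspector sends on your behalf look roughly like the sketch below. It builds the first three client messages of a session (the initialize request, the initialized notification, and a tools/list request) as newline-delimited JSON, which is how the stdio transport frames messages. The helper name, client name, and protocol version string are illustrative assumptions; a real client sends whichever protocol version it supports.

```python
import json


def handshake_messages(client_name: str = "example-client") -> list[str]:
    """Return the first three client messages of an MCP session as JSON lines."""
    initialize = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed; use the version your client supports
            "capabilities": {},
            "clientInfo": {"name": client_name, "version": "0.1.0"},
        },
    }
    # Notifications carry no "id" because no response is expected.
    initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
    list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}
    return [json.dumps(message) for message in (initialize, initialized, list_tools)]


if __name__ == "__main__":
    for line in handshake_messages():
        print(line)  # one JSON object per line
```

Piping these lines into python server.py by hand is possible but tedious, which is why the inspector is the more practical way to explore a server.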

Tool definitions and type annotations

FastMCP maps Python types to JSON schema types automatically:

Python type	JSON schema type
`str`	`string`
`int`	`integer`
`float`	`number`
`bool`	`boolean`
`list[str]`	`array` of strings
`dict[str, Any]`	`object`
`Optional[str]`	`string` or `null` (`anyOf`)
Pydantic `BaseModel`	`object` with all fields

Default values make parameters optional in the schema:

@mcp.tool()
async def search_docs(
    query: str,
    limit: int = 10,
    include_archived: bool = False
) -> list[dict]:
    """Search the documentation index.

    Args:
        query: Full-text search query string.
        limit: Maximum number of results to return (default 10).
        include_archived: Whether to include archived pages (default False).
    """
    results = await run_search(query, limit, include_archived)
    return [{"title": r.title, "url": r.url, "snippet": r.snippet} for r in results]

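The entire docstring, including the Args: block, is sent as the tool's description, so the LLM reads each parameter's purpose there. The official SDK does not parse the Args: block into per-field schema descriptions; to attach a description to an individual parameter in the schema, annotate it with Annotated[str, Field(description="...")] using typing.Annotated and pydantic.Field.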
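To make the mapping concrete, here is a minimal standard-library sketch of how type annotations and default values can be turned into a JSON schema. It is not the SDK's implementation (which delegates to Pydantic and handles far more types); sketch_input_schema and its small type table are hypothetical names used only for illustration.

```python
import inspect
import json
from typing import Any, get_type_hints

# Simplified mapping for illustration; anything else falls back to "object".
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def sketch_input_schema(func) -> dict[str, Any]:
    """Build a rough JSON schema for a function's parameters."""
    hints = get_type_hints(func)
    properties: dict[str, Any] = {}
    required: list[str] = []
    for name, param in inspect.signature(func).parameters.items():
        prop: dict[str, Any] = {"type": _JSON_TYPES.get(hints.get(name), "object")}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default: the caller must supply it
        else:
            prop["default"] = param.default  # has a default: optional in the schema
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}


async def search_docs(query: str, limit: int = 10, include_archived: bool = False) -> list[dict]:
    """Stand-in with the same signature as the search_docs tool."""
    return []


if __name__ == "__main__":
    print(json.dumps(sketch_input_schema(search_docs), indent=2))
```

Only query ends up in the required list; limit and include_archived carry their defaults, which is the behavior described above.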

Pydantic models as tool inputs

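For tools with more than three or four parameters, or for inputs with validation constraints, use a Pydantic BaseModel as the parameter type. FastMCP derives the schema from the model (the equivalent of calling .model_json_schema()) and nests it under the parameter name in the tool's input schema, so in the example below the client sends the fields inside an issue object: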

from pydantic import BaseModel, Field, model_validator

class CreateIssueInput(BaseModel):
    title: str = Field(..., min_length=5, max_length=200,
                       description="Issue title, 5–200 characters")
    body: str = Field(..., description="Issue body in Markdown")
    labels: list[str] = Field(default_factory=list,
                               description="Label names to apply")
    priority: str = Field("normal",
                          pattern="^(low|normal|high|critical)$",
                          description="Priority level")

    @model_validator(mode="after")
    def critical_requires_body(self) -> "CreateIssueInput":
        if self.priority == "critical" and len(self.body) < 50:
            raise ValueError("Critical issues require a body of at least 50 characters")
        return self

@mcp.tool()
async def create_issue(issue: CreateIssueInput) -> dict:
    """Create a new issue in the project tracker."""
    result = await tracker.create(issue.model_dump())
    return {"id": result.id, "url": result.url}

The @model_validator runs before the tool handler receives the input. If validation fails, FastMCP returns an isError: true tool result with the validation message — the LLM can read the error and correct the input rather than crashing the session.
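Because create_issue takes a single parameter named issue, a client is expected to send the model's fields nested under that key. The sketch below builds such a tools/call request; the helper name and the field values are made up for illustration, and the nesting is an assumption based on arguments being keyed by the function's parameter names.

```python
import json


def build_create_issue_call(request_id: int, issue: dict) -> dict:
    """Build a JSON-RPC tools/call request for the create_issue tool."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {
            "name": "create_issue",
            # Assumption: arguments are keyed by the handler's parameter name.
            "arguments": {"issue": issue},
        },
    }


if __name__ == "__main__":
    request = build_create_issue_call(
        7,
        {"title": "Login page returns 500", "body": "Steps to reproduce...", "priority": "high"},
    )
    print(json.dumps(request, indent=2))
```

A payload whose priority does not match the pattern, or a critical issue with a short body, would be rejected by the model before create_issue runs.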

See MCP server Pydantic validation for nested models, discriminated unions, and custom validators.

Resources and prompts

FastMCP supports all three MCP primitive types — tools, resources, and prompts:

# Resource: read-only data exposed at a URI
@mcp.resource("config://app/{key}")
async def get_config(key: str) -> str:
    """Expose application configuration as a readable resource."""
    value = config.get(key)
    if value is None:
        raise KeyError(f"Config key not found: {key}")
    return str(value)

# Prompt: server-controlled message template
@mcp.prompt()
async def code_review_prompt(pr_url: str, focus: str = "security") -> list[dict]:
    """Generate a structured code review prompt for a pull request."""
    diff = await fetch_pr_diff(pr_url)
    return [
        {"role": "user", "content": f"Review this PR for {focus} issues:\n\n{diff}"}
    ]

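Resources are accessed via their URI template: reading config://app/rate_limit calls get_config with key="rate_limit". Clients that support subscriptions can be told that a resource changed through a notifications/resources/updated message, after which they can refetch. FastMCP has no mcp.resource_updated() helper for this; the notification is sent through the active session, for example with await ctx.session.send_resource_updated(...) from a handler that receives a Context.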
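The URI template matching can be pictured with a small standard-library sketch. It converts a template such as config://app/{key} into a regular expression and extracts the parameters from a concrete URI. This illustrates the idea only; match_uri_template is a hypothetical helper, not the SDK's matcher.

```python
import re
from typing import Optional


def match_uri_template(template: str, uri: str) -> Optional[dict[str, str]]:
    """Return the parameters captured from uri, or None if it does not match."""
    pattern = ""
    # re.split with a capturing group alternates literal text and placeholder names.
    for index, part in enumerate(re.split(r"\{(\w+)\}", template)):
        if index % 2:
            pattern += f"(?P<{part}>[^/]+)"  # placeholder: one path segment
        else:
            pattern += re.escape(part)       # literal text must match exactly
    match = re.fullmatch(pattern, uri)
    return match.groupdict() if match else None


if __name__ == "__main__":
    print(match_uri_template("config://app/{key}", "config://app/rate_limit"))
```

The captured dictionary corresponds to the keyword arguments passed to the resource function, which is why the placeholder name must match the parameter name.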

Stdio transport (local servers)

Stdio is the right choice for servers that run locally on the same machine as the client. The client spawns your Python script as a subprocess:

if __name__ == "__main__":
    mcp.run()  # stdio is the default

Client configuration for Claude Desktop:

{
  "mcpServers": {
    "my-server": {
      "command": "/absolute/path/to/.venv/bin/python",
      "args": ["/absolute/path/to/server.py"],
      "env": {
        "DATABASE_URL": "postgresql://localhost/mydb",
        "API_KEY": "sk-..."
      }
    }
  }
}

Critical: use the full absolute path to the Python interpreter, not just python or python3. Claude Desktop's subprocess does not inherit your shell's PATH, virtual environment activation, or pyenv shims. Get the correct path with:

# In your virtual environment
which python
# → /Users/you/.venv/bin/python (use this in the config)

# With uv
which uv
# Use uv as the command to let it manage the venv automatically
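If you would rather generate the entry than copy paths by hand, a short script can print it. The sketch below uses sys.executable for the interpreter and resolves the script path to an absolute one; claude_desktop_entry is a hypothetical helper, and it assumes you run it with the same virtual environment's interpreter that should run the server.

```python
import json
import sys
from pathlib import Path
from typing import Optional


def claude_desktop_entry(name: str, script: str, env: Optional[dict] = None) -> dict:
    """Build an mcpServers entry that uses absolute paths throughout."""
    entry = {
        "command": sys.executable,              # path of the interpreter running this script
        "args": [str(Path(script).resolve())],  # absolute path to the server script
    }
    if env:
        entry["env"] = dict(env)  # secrets belong here, not in the script
    return {"mcpServers": {name: entry}}


if __name__ == "__main__":
    print(json.dumps(claude_desktop_entry("my-server", "server.py"), indent=2))
```

Run it from the project directory and paste the output into the client's config file.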

With uv, the config becomes simpler because uv run executes the command inside the project's virtual environment, so you do not need to point at the interpreter yourself:

{
  "mcpServers": {
    "my-server": {
      "command": "/Users/you/.cargo/bin/uv",
      "args": ["run", "--directory", "/absolute/path/to/project", "python", "server.py"]
    }
  }
}

Never write to stdout from a stdio MCP server — print statements, logging to stdout, or any output on stdout other than JSON-RPC messages breaks the protocol silently. Configure your logger to write to stderr or a file:

import logging
import sys

logging.basicConfig(
    level=logging.INFO,
    stream=sys.stderr,  # MCP clients read stderr but not as protocol data
    format="%(asctime)s %(name)s %(levelname)s %(message)s"
)

SSE transport (remote servers)

For servers deployed remotely — accessible to multiple clients, or requiring network-level monitoring — use SSE transport:

from mcp.server.fastmcp import FastMCP
import uvicorn

mcp = FastMCP("remote-server")

@mcp.tool()
async def fetch_data(resource_id: str) -> dict:
    """Fetch resource from the data API."""
    return await data_api.get(resource_id)

if __name__ == "__main__":
    # SSE transport: starts Starlette app on port 8000
    uvicorn.run(mcp.sse_app(), host="0.0.0.0", port=8000)

The mcp.sse_app() method returns a Starlette ASGI application with the SSE endpoint at /sse and the message POST endpoint at /messages. Instead of calling uvicorn.run() in the script, you can also start it from the command line:

uvicorn server:mcp.sse_app --factory --host 0.0.0.0 --port 8000
# or, if server.py defines app = mcp.sse_app() at module level
uvicorn server:app --host 0.0.0.0 --port 8000
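To see what travels over the /sse connection, it helps to parse the event stream by hand. In the HTTP+SSE transport the server first sends an endpoint event telling the client where to POST its messages, then delivers JSON-RPC responses as message events. The sketch below is a minimal text/event-stream parser; the sample stream, including the session ID, is made up for illustration.

```python
import json


def parse_sse(stream: str) -> list[dict[str, str]]:
    """Parse a text/event-stream payload into a list of {event, data} dicts."""
    events = []
    # Events are separated by a blank line; normalise CRLF first.
    for block in stream.replace("\r\n", "\n").split("\n\n"):
        event, data = "message", []
        for line in block.split("\n"):
            if not line or line.startswith(":"):
                continue  # blank line, or a comment (often a keep-alive ping)
            field, _, value = line.partition(":")
            value = value[1:] if value.startswith(" ") else value
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)
        if data:
            events.append({"event": event, "data": "\n".join(data)})
    return events


# Hypothetical stream: the path and session ID are illustrative only.
SAMPLE = (
    "event: endpoint\n"
    "data: /messages/?session_id=abc123\n"
    "\n"
    ": ping\n"
    "\n"
    "event: message\n"
    'data: {"jsonrpc":"2.0","id":1,"result":{}}\n'
    "\n"
)

if __name__ == "__main__":
    for item in parse_sse(SAMPLE):
        print(item["event"], item["data"])
```

A client reads the endpoint event, POSTs its initialize request to that path, and receives the response back on the open stream.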

Environment variables for configuration:

import os
from mcp.server.fastmcp import FastMCP

mcp = FastMCP(
    "remote-server",
    host=os.getenv("MCP_HOST", "0.0.0.0"),
    port=int(os.getenv("PORT", "8000")),
)

These settings take effect when the server is started with mcp.run(transport="sse"). Set PORT from the environment: Railway, Render, and Fly.io all inject this variable automatically. Do not hardcode port 8000 in production.

Once deployed, add your SSE URL (https://yourserver.example.com/sse) to AliveMCP to monitor the full MCP handshake from an external network. AliveMCP probes the initialize → tools/list sequence continuously, alerting you when the protocol fails even if your HTTP health check endpoint returns 200.

Error handling

FastMCP catches exceptions in tool handlers and converts them to isError: true tool results — the protocol-level mechanism for reporting recoverable failures to the LLM. The LLM receives your error message and can decide to retry, fall back, or inform the user:

@mcp.tool()
async def get_user(user_id: str) -> dict:
    """Fetch user profile by ID."""
    if not user_id.startswith("usr_"):
        raise ValueError(f"Invalid user ID format: {user_id!r}. Expected 'usr_' prefix.")
    user = await db.users.find(user_id)
    if user is None:
        raise KeyError(f"User not found: {user_id}")
    return user.model_dump()

For unexpected errors (database connection failure, external API timeout), FastMCP still returns isError: true with the exception message. Add structured logging around tool boundaries to capture these for your observability pipeline:

import logging
logger = logging.getLogger(__name__)

@mcp.tool()
async def risky_tool(param: str) -> str:
    try:
        return await external_api.call(param)
    except TimeoutError as exc:
        logger.error("external_api timeout", extra={"param": param})
        # Chain the original exception so the traceback survives in the logs
        raise RuntimeError("External API timed out; try again in 30 seconds") from exc
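The conversion from exception to tool result can be pictured with a short sketch. call_tool_safely is a hypothetical stand-in for what the framework does around a handler, and the result dictionary mirrors the content and isError fields of an MCP tool result. The exact error text FastMCP produces may differ from this.

```python
import asyncio
from typing import Any, Awaitable, Callable


async def call_tool_safely(handler: Callable[..., Awaitable[Any]], **arguments: Any) -> dict:
    """Run a handler and wrap its outcome in a tool-result-shaped dict."""
    try:
        value = await handler(**arguments)
    except Exception as exc:  # any handler failure becomes a tool-level error
        return {"content": [{"type": "text", "text": str(exc)}], "isError": True}
    return {"content": [{"type": "text", "text": str(value)}], "isError": False}


async def get_user(user_id: str) -> dict:
    """Hypothetical handler with the same ID check as the get_user tool."""
    if not user_id.startswith("usr_"):
        raise ValueError(f"Invalid user ID format: {user_id!r}. Expected 'usr_' prefix.")
    return {"id": user_id}


if __name__ == "__main__":
    print(asyncio.run(call_tool_safely(get_user, user_id="42")))
    print(asyncio.run(call_tool_safely(get_user, user_id="usr_1")))
```

Because the failure comes back as an ordinary result rather than a protocol error, the session stays open and the LLM can correct the argument and try again.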

See MCP server error handling for the two-tier error model (protocol errors vs. tool errors) and the right error type for each failure mode.

Environment variables and secrets

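python-dotenv works well alongside FastMCP for local development. Call load_dotenv() before reading os.environ so that values from a .env file are available:

from dotenv import load_dotenv
load_dotenv()  # loads .env into os.environ before the lookups below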

import os
from mcp.server.fastmcp import FastMCP

DATABASE_URL = os.environ["DATABASE_URL"]  # fail at startup if missing
API_KEY = os.environ["API_KEY"]

mcp = FastMCP("my-server")

Using os.environ["KEY"] (not os.getenv("KEY")) causes the server to fail at startup with a KeyError if a required variable is missing. This is better than silently starting and failing on the first tool call.
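When a server needs several variables, it can be friendlier to report all missing ones at once instead of failing on the first. The sketch below does that; require_env is a hypothetical helper, and treating empty strings as missing is a design choice you may not want.

```python
import os
from typing import Mapping, Optional


def require_env(names: list[str], environ: Optional[Mapping[str, str]] = None) -> dict[str, str]:
    """Return the requested variables, or raise one error naming every missing one."""
    env = os.environ if environ is None else environ
    # Empty strings count as missing here; relax this check if blanks are valid for you.
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}


if __name__ == "__main__":
    try:
        settings = require_env(["DATABASE_URL", "API_KEY"])
        print("configuration loaded:", sorted(settings))
    except RuntimeError as exc:
        print(exc)
```

Calling it at import time, before creating the FastMCP instance, keeps the fail-at-startup behavior described above.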

For stdio servers, pass secrets in the env block of the client's config file rather than baking them into the script. For SSE servers, inject via platform environment variables (Railway variables, Render env groups, Fly.io secrets). See MCP server secrets management for rotation and audit patterns.

Monitoring a deployed Python MCP server

A Python MCP server running in SSE mode is observable from outside your network — the same way any HTTP service is. AliveMCP probes your /sse endpoint every 60 seconds, running the full initialize → tools/list handshake and checking that your registered tools are advertised correctly. If the server crashes, runs out of memory, or starts returning error responses, AliveMCP alerts you before your users encounter the failure.
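The check that the expected tools are advertised is easy to reproduce in your own smoke tests. The sketch below compares a tools/list response against a set of expected names. missing_tools is a hypothetical helper and says nothing about how AliveMCP implements its probe.

```python
def missing_tools(tools_list_response: dict, expected: set[str]) -> set[str]:
    """Return expected tool names absent from a tools/list JSON-RPC response."""
    if "error" in tools_list_response:
        return set(expected)  # a protocol error means nothing was advertised
    tools = tools_list_response.get("result", {}).get("tools", [])
    advertised = {tool["name"] for tool in tools}
    return set(expected) - advertised


if __name__ == "__main__":
    response = {
        "jsonrpc": "2.0",
        "id": 2,
        "result": {"tools": [{"name": "add", "inputSchema": {"type": "object"}}]},
    }
    print(missing_tools(response, {"add", "search_docs"}))
```

An empty result means every expected tool was listed; anything else is worth an alert even when the HTTP layer looks healthy.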

Internal monitoring complements external probing. Log tool invocations with duration and outcome to stderr (or a log aggregator), and track error rates with a simple counter:

import logging
import time
logger = logging.getLogger("mcp.tools")

@mcp.tool()
async def monitored_tool(param: str) -> str:
    start = time.monotonic()
    try:
        result = await do_work(param)
        logger.info("tool.ok", extra={"tool": "monitored_tool", "ms": (time.monotonic()-start)*1000})
        return result
    except Exception as exc:
        logger.error("tool.error", extra={"tool": "monitored_tool", "error": str(exc)})
        raise
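The simple counter mentioned above could be a decorator that wraps each handler. The sketch below counts calls, errors, and cumulative milliseconds per tool in a collections.Counter. The names counted, tool_stats, and error_rate are hypothetical, and the in-memory counter resets on restart, so treat it as a starting point rather than a metrics system.

```python
import asyncio
import functools
import time
from collections import Counter

# Keys look like "flaky.calls", "flaky.errors", "flaky.ms".
tool_stats: Counter = Counter()


def counted(func):
    """Count calls, errors, and cumulative milliseconds for an async handler."""
    @functools.wraps(func)  # keeps name, docstring, and signature visible to introspection
    async def wrapper(*args, **kwargs):
        start = time.monotonic()
        tool_stats[f"{func.__name__}.calls"] += 1
        try:
            return await func(*args, **kwargs)
        except Exception:
            tool_stats[f"{func.__name__}.errors"] += 1
            raise  # let the framework turn it into a tool error
        finally:
            tool_stats[f"{func.__name__}.ms"] += int((time.monotonic() - start) * 1000)
    return wrapper


def error_rate(tool: str) -> float:
    """Fraction of calls that raised, or 0.0 if the tool was never called."""
    calls = tool_stats[f"{tool}.calls"]
    return tool_stats[f"{tool}.errors"] / calls if calls else 0.0


@counted
async def flaky(param: str) -> str:
    """Hypothetical tool body: fails on empty input."""
    if not param:
        raise ValueError("param must not be empty")
    return param.upper()
```

To use it on a real tool, place @mcp.tool() above @counted so the wrapped function is what gets registered. This assumes the framework's signature inspection follows functools.wraps metadata, which is worth verifying with the inspector.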

See MCP server observability for combining AliveMCP external probing with internal structured logging and metrics.