FastAPI MCP server — mounting SSE transport alongside REST routes

FastAPI and FastMCP solve adjacent problems — FastAPI exposes REST endpoints to browsers and programmatic HTTP clients, FastMCP exposes tools to AI agents via the Model Context Protocol. In many production services you need both: a REST API for your web frontend and a companion MCP interface for AI integrations. Rather than running two separate processes, you can mount FastMCP's SSE application as a Starlette sub-application inside FastAPI. Both interfaces share the same process, the same database connections, and the same Pydantic models. This guide shows how to wire them together, add authentication, and operate the combined server in production.

TL;DR

Create a FastMCP instance and a FastAPI app separately, then mount the MCP SSE app inside FastAPI: app.mount("/mcp", mcp.sse_app()). Your REST endpoints live at /api/... and the MCP SSE endpoint lives at /mcp/sse. Share Pydantic models between both. Run with uvicorn app:app --host 0.0.0.0 --port 8000. Add your /mcp/sse URL to AliveMCP to monitor the MCP protocol layer independently of your REST health checks.

Installation

pip install fastapi mcp uvicorn
# or with uv
uv add fastapi mcp uvicorn

FastMCP's SSE transport uses Starlette internally. FastAPI is built on Starlette too, so the two normally resolve to a single shared Starlette version, and the MCP app mounts like any other ASGI sub-application.

Minimal combined server

The core pattern is straightforward: create both instances, register tools on the MCP instance, and mount it inside FastAPI:

from fastapi import FastAPI
from mcp.server.fastmcp import FastMCP
import uvicorn

# FastAPI for REST endpoints
app = FastAPI(title="My Service API")

# FastMCP for MCP tools
mcp = FastMCP("my-service")

# Mount MCP SSE app under /mcp
app.mount("/mcp", mcp.sse_app())

# REST endpoint
@app.get("/api/status")
async def status():
    return {"status": "ok"}

# MCP tool
@mcp.tool()
async def get_status() -> dict:
    """Get service status."""
    return {"status": "ok", "version": "1.0.0"}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)

After starting, your service exposes:

GET /api/status — REST endpoint for browsers and HTTP clients
GET /mcp/sse — MCP SSE connection endpoint for AI clients
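POST /mcp/messages/ — MCP message endpoint for client requests such as tool calls (note the trailing slash; the client learns the full URL, including its session_id, from the SSE stream)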
GET /docs — FastAPI's automatic Swagger UI for the REST API
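
To make the relationship between the two MCP endpoints concrete: under the MCP SSE transport, the server announces the POST URL to the client in an `endpoint` event on the SSE stream. The sketch below parses that event from raw SSE text using only the standard library. The helper names are hypothetical and the sample stream and session id are made up for illustration; the parser is deliberately simplified and is not a full SSE implementation.

```python
from typing import Iterator, List, Optional, Tuple


def iter_sse_events(raw: str) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from raw SSE text.

    Simplified on purpose: handles `event:` and `data:` fields,
    comment lines, and blank-line delimiters only.
    """
    event: str = "message"  # SSE default event name
    data_lines: List[str] = []
    for line in raw.splitlines():
        if line == "":
            # A blank line ends the current event
            if data_lines:
                yield event, "\n".join(data_lines)
            event, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    if data_lines:
        yield event, "\n".join(data_lines)


def find_message_endpoint(raw: str) -> Optional[str]:
    """Return the data of the first `endpoint` event, or None."""
    for event, data in iter_sse_events(raw):
        if event == "endpoint":
            return data
    return None


# Made-up sample of what the start of a stream might look like
SAMPLE_STREAM = (
    ": ping\n"
    "\n"
    "event: endpoint\n"
    "data: /mcp/messages/?session_id=0123abcd\n"
    "\n"
)

if __name__ == "__main__":
    print(find_message_endpoint(SAMPLE_STREAM))
```

A client that cannot find this event within a few seconds of connecting has hit a protocol-level failure even though the HTTP connection itself succeeded.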

Sharing Pydantic models between REST and MCP

The main advantage of combining FastAPI and FastMCP in one process is sharing your Pydantic models. Define them once and use them as both FastAPI request/response bodies and FastMCP tool input schemas:

from pydantic import BaseModel, Field
from fastapi import FastAPI
from mcp.server.fastmcp import FastMCP

app = FastAPI()
mcp = FastMCP("inventory-service")

class ProductQuery(BaseModel):
    sku: str = Field(..., description="Product stock-keeping unit identifier")
    warehouse_id: str = Field("default", description="Warehouse to query")
    include_variants: bool = Field(False, description="Include product variants")

class ProductResult(BaseModel):
    sku: str
    name: str
    quantity: int
    location: str

# `inventory` stands in for your own data-access layer (not defined in this snippet)

# Same model used in FastAPI endpoint
@app.post("/api/products/query", response_model=ProductResult)
async def query_product_rest(query: ProductQuery) -> ProductResult:
    return await inventory.query(query)

# And in the MCP tool: same model, same validation.
# MCP clients send the fields nested under the "query" argument.
@mcp.tool()
async def query_product(query: ProductQuery) -> dict:
    """Query product inventory by SKU."""
    result: ProductResult = await inventory.query(query)
    return result.model_dump()

app.mount("/mcp", mcp.sse_app())

When you update ProductQuery — adding a field, changing a constraint — both the REST endpoint and the MCP tool pick up the change simultaneously. No duplicate schema maintenance.
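
One detail worth checking when sharing a model this way is the wire shape each interface expects. As a sketch (assuming Pydantic v2, which the model_dump() call implies), the REST endpoint takes the model's fields as the top-level JSON body, while a tool declared as query_product(query: ProductQuery) should receive them nested under the query argument, since FastMCP derives tool arguments from the function's parameters. The payloads below are illustrative only:

```python
from pydantic import BaseModel, Field, ValidationError


class ProductQuery(BaseModel):
    sku: str = Field(..., description="Product stock-keeping unit identifier")
    warehouse_id: str = Field("default", description="Warehouse to query")
    include_variants: bool = Field(False, description="Include product variants")


# Illustrative payloads: REST sends the fields as the body,
# an MCP client sends them under the parameter name ("query").
rest_body = {"sku": "ABC-123", "include_variants": True}
mcp_arguments = {"query": {"sku": "ABC-123", "include_variants": True}}

from_rest = ProductQuery.model_validate(rest_body)
from_mcp = ProductQuery.model_validate(mcp_arguments["query"])

# Both paths yield the same validated object, defaults included
assert from_rest == from_mcp
print(from_rest.model_dump())

# The JSON schema is derived from the one class definition
schema = ProductQuery.model_json_schema()
print(schema["required"])

# The same validation rule rejects a missing SKU on either path
try:
    ProductQuery.model_validate({"warehouse_id": "east"})
except ValidationError as exc:
    print(exc.error_count(), "validation error")
```

If you would rather have identical flat arguments on both sides, declaring the tool with individual parameters (sku, warehouse_id, include_variants) and constructing ProductQuery inside the handler is one alternative.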

Shared database connections

Use FastAPI's lifespan context manager to open database connections at startup and close them at shutdown. Both REST and MCP handlers access the same connection pool via a module-level variable:

import os
from contextlib import asynccontextmanager

import asyncpg
from fastapi import FastAPI, HTTPException
from mcp.server.fastmcp import FastMCP

db_pool: asyncpg.Pool | None = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    global db_pool
    # Open the pool once at startup; REST and MCP handlers both use it
    db_pool = await asyncpg.create_pool(
        dsn=os.environ["DATABASE_URL"],
        min_size=2,
        max_size=10
    )
    yield
    await db_pool.close()

app = FastAPI(lifespan=lifespan)
mcp = FastMCP("db-service")

@app.get("/api/users/{user_id}")
async def get_user_rest(user_id: str):
    row = await db_pool.fetchrow("SELECT id, name, email FROM users WHERE id=$1", user_id)
    if row is None:
        # Without this check, dict(None) raises and the client sees a 500
        raise HTTPException(status_code=404, detail="User not found")
    return dict(row)

@mcp.tool()
async def get_user(user_id: str) -> dict:
    """Fetch a user by their unique ID."""
    row = await db_pool.fetchrow("SELECT id, name, email FROM users WHERE id=$1", user_id)
    if row is None:
        raise KeyError(f"User not found: {user_id}")
    return dict(row)

app.mount("/mcp", mcp.sse_app())

The connection pool opens once and is shared across both REST requests and MCP tool calls. Size the pool for the combined peak concurrency, not just the REST or MCP load separately.
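
The sizing advice can be illustrated with a stand-in pool. The sketch below uses an asyncio.Semaphore in place of a real asyncpg pool (the FakePool class, the handler names, and the timings are hypothetical) to show that REST and MCP callers draw from one budget: however many callers of each kind arrive, no more than max_size queries run at once and the rest wait.

```python
import asyncio


class FakePool:
    """Stand-in for a connection pool. NOT asyncpg; illustration only."""

    def __init__(self, max_size: int) -> None:
        self._slots = asyncio.Semaphore(max_size)
        self.in_use = 0
        self.peak = 0

    async def fetchval(self, delay: float = 0.01) -> int:
        async with self._slots:  # wait for a free "connection"
            self.in_use += 1
            self.peak = max(self.peak, self.in_use)
            try:
                await asyncio.sleep(delay)  # pretend to run a query
                return 1
            finally:
                self.in_use -= 1


async def rest_handler(pool: FakePool) -> int:
    """Hypothetical REST handler using the shared pool."""
    return await pool.fetchval()


async def mcp_tool(pool: FakePool) -> int:
    """Hypothetical MCP tool using the same pool."""
    return await pool.fetchval()


async def simulate(max_size: int, rest_calls: int, mcp_calls: int) -> int:
    """Run both kinds of callers concurrently; return peak connections used."""
    pool = FakePool(max_size)
    tasks = [rest_handler(pool) for _ in range(rest_calls)]
    tasks += [mcp_tool(pool) for _ in range(mcp_calls)]
    await asyncio.gather(*tasks)
    return pool.peak


if __name__ == "__main__":
    print("peak connections in use:", asyncio.run(simulate(3, 6, 6)))
```

In this model, a burst of MCP tool calls delays REST requests just as a REST burst delays tool calls, which is why the pool has to be sized for the sum of both.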

Authentication middleware

Add API key or Bearer token authentication as FastAPI middleware. The middleware runs on all routes including the mounted MCP sub-app:

import hmac
import os

from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from mcp.server.fastmcp import FastMCP

# Comma-separated keys; blanks are skipped so an empty token can never match
VALID_KEYS = {k.strip().encode() for k in os.environ["API_KEYS"].split(",") if k.strip()}

app = FastAPI()

@app.middleware("http")
async def require_api_key(request: Request, call_next):
    # Allow health checks without auth
    if request.url.path in ("/health", "/"):
        return await call_next(request)

    auth = request.headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return JSONResponse({"error": "Missing Authorization header"}, status_code=401)

    token = auth.removeprefix("Bearer ").strip().encode()
    # Constant-time comparison per key; bytes avoid a TypeError on non-ASCII tokens
    if not any(hmac.compare_digest(token, key) for key in VALID_KEYS):
        return JSONResponse({"error": "Invalid API key"}, status_code=401)

    return await call_next(request)

mcp = FastMCP("auth-service")
app.mount("/mcp", mcp.sse_app())

MCP clients configured with the key send the Authorization header on the SSE connection request and on every subsequent POST, and the middleware validates it on both. For more granular tool-level authorization (different permissions per tool), check the key's associated scopes inside the tool handler rather than in the middleware.
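
A minimal sketch of such a tool-level check, using only the standard library. The key-to-scope table, the scope names, and the scopes_for and require_scope helpers are all hypothetical; in a real service the table would come from your key store, and how the token reaches the tool handler depends on your setup.

```python
import hmac
from typing import Dict, FrozenSet

# Hypothetical key -> scopes table (would normally live in a key store)
KEY_SCOPES: Dict[str, FrozenSet[str]] = {
    "key-readonly": frozenset({"inventory:read"}),
    "key-admin": frozenset({"inventory:read", "inventory:write"}),
}


class ScopeError(PermissionError):
    """Raised when a key is unknown or lacks the required scope."""


def scopes_for(token: str) -> FrozenSet[str]:
    """Return the scopes granted to a token, or an empty set if unknown."""
    granted: FrozenSet[str] = frozenset()
    token_bytes = token.encode()
    for key, scopes in KEY_SCOPES.items():
        # No early exit: every key is compared regardless of a match
        if hmac.compare_digest(token_bytes, key.encode()):
            granted = scopes
    return granted


def require_scope(token: str, scope: str) -> None:
    """Raise ScopeError unless the token carries the given scope."""
    if scope not in scopes_for(token):
        raise ScopeError(f"missing scope: {scope}")


def update_stock(token: str, sku: str, quantity: int) -> dict:
    """Body of a hypothetical write tool, guarded by a scope check."""
    require_scope(token, "inventory:write")
    return {"sku": sku, "quantity": quantity}


if __name__ == "__main__":
    print(update_stock("key-admin", "ABC-123", 5))
```

Keeping the middleware responsible for "is this a valid key" and the handler responsible for "may this key do this" lets read-only and read-write keys share the same /mcp endpoint.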

See MCP server authentication for JWT validation, OAuth2 token introspection, and JWKS-based key rotation patterns.

Health checks that cover both layers

A /health route on the FastAPI app covers the HTTP layer. Add a separate probe for the MCP protocol layer so you can detect cases where HTTP is up but MCP initialization is broken:

@app.get("/health")
async def health():
    """HTTP health check for load balancers and orchestrators."""
    try:
        # Verify database connectivity
        await db_pool.fetchval("SELECT 1")
        return {"status": "ok", "db": "ok"}
    except Exception:
        # Keep driver error details out of an unauthenticated endpoint
        return JSONResponse({"status": "degraded", "db": "unreachable"}, status_code=503)

@app.get("/health/mcp")
async def health_mcp():
    """Verify MCP tools are registered (shallow check)."""
    tools = await mcp.list_tools()
    return {"status": "ok", "tools": len(tools)}

For true external MCP protocol validation — testing that the initialize handshake succeeds from outside your network — use AliveMCP. It probes /mcp/sse every 60 seconds and alerts when the protocol layer fails even if /health returns 200. This catches cases where uvicorn is up but the FastMCP layer has a startup error or tool registration bug.
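
For context on what a protocol-level probe exercises: after opening the SSE stream, an MCP client POSTs a JSON-RPC initialize request and expects a result carrying the server's protocol version, capabilities, and server info. The sketch below builds such a request and checks the shape of a response using only the standard library. The client name, the protocol version string, and the helper names are assumptions for illustration; a real probe would send the request to the message endpoint and read the reply from the SSE stream.

```python
import json
from typing import Any, Dict


def build_initialize_request(request_id: int = 1) -> Dict[str, Any]:
    """Build a JSON-RPC `initialize` request as an MCP client would send it."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed version string
            "capabilities": {},
            "clientInfo": {"name": "health-probe", "version": "0.1.0"},  # hypothetical
        },
    }


def is_successful_initialize(response: Dict[str, Any], request_id: int = 1) -> bool:
    """Check that a response looks like a successful initialize result."""
    if response.get("jsonrpc") != "2.0" or response.get("id") != request_id:
        return False
    if "error" in response:
        return False
    result = response.get("result")
    return (
        isinstance(result, dict)
        and isinstance(result.get("protocolVersion"), str)
        and isinstance(result.get("capabilities"), dict)
        and isinstance(result.get("serverInfo"), dict)
    )


if __name__ == "__main__":
    print(json.dumps(build_initialize_request(), indent=2))
```

A shallow check such as /health/mcp confirms that tools are registered; a probe along these lines confirms that a client can actually complete the handshake.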

Production deployment with uvicorn

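Run the combined app with uvicorn in production. Multiple workers help with CPU-bound REST routes, but each worker is a separate process with its own FastMCP instance and its own in-memory SSE sessions. A POST to /mcp/messages/ has to reach the worker that holds the matching /mcp/sse connection, so with more than one worker, MCP requests can land on a worker that does not know the session; serve /mcp from a single-worker process or put session-aware routing in front if you see such failures. Any other state shared across MCP sessions belongs in Redis or Postgres, not in memory: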

# Single worker (shared in-memory state works)
uvicorn app:app --host 0.0.0.0 --port $PORT

# Multiple workers (no shared in-memory state between workers)
uvicorn app:app --host 0.0.0.0 --port $PORT --workers 4

# With gunicorn as process manager (production standard)
gunicorn app:app \
  -k uvicorn.workers.UvicornWorker \
  --workers 4 \
  --bind 0.0.0.0:$PORT \
  --timeout 120
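
Why worker count matters for the SSE transport can be shown with a toy model. Each worker process keeps its SSE sessions in its own memory, so a POST only succeeds on the worker that accepted the matching GET /mcp/sse. The Worker class and the routing helpers below are hypothetical and simulate processes with plain objects; they are not part of uvicorn, gunicorn, or the MCP SDK, and the status codes are illustrative.

```python
import itertools
import zlib
from typing import Callable, Dict, List


class Worker:
    """Toy worker process with its own in-memory session table."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions: Dict[str, List[str]] = {}

    def open_sse(self, session_id: str) -> None:
        self.sessions[session_id] = []

    def post_message(self, session_id: str, body: str) -> int:
        if session_id not in self.sessions:
            return 404  # this worker never saw the SSE connection
        self.sessions[session_id].append(body)
        return 202


Picker = Callable[[str], Worker]


def make_round_robin(workers: List[Worker]) -> Picker:
    """Each request goes to the next worker, ignoring the session."""
    cycle = itertools.cycle(workers)
    return lambda session_id: next(cycle)


def make_sticky(workers: List[Worker]) -> Picker:
    """Requests for one session always go to the same worker."""
    return lambda session_id: workers[zlib.crc32(session_id.encode()) % len(workers)]


def run_session(pick: Picker, session_id: str, posts: int) -> List[int]:
    """Open an SSE session, then send `posts` messages; return status codes."""
    pick(session_id).open_sse(session_id)
    return [pick(session_id).post_message(session_id, f"msg-{i}") for i in range(posts)]


if __name__ == "__main__":
    four = [Worker(f"w{i}") for i in range(4)]
    print("round robin:", run_session(make_round_robin(four), "s1", 3))
    four = [Worker(f"w{i}") for i in range(4)]
    print("sticky:     ", run_session(make_sticky(four), "s1", 3))
```

With a single worker every request trivially reaches the right process, which is why the single-worker command is the simplest safe default for the /mcp path.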

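SSE connections are long-lived: a single MCP session holds a persistent GET /mcp/sse connection for the duration of the agent's session. With the asynchronous UvicornWorker, gunicorn's --timeout acts as a worker liveness limit rather than a per-request limit, so it does not normally cut off open SSE streams, and 120 is a comfortable value. A reverse proxy in front is the more likely place for a stream to be dropped or buffered. In Nginx, set proxy_read_timeout 300 (or higher) and proxy_buffering off for the /mcp path. In Caddy, flush_interval -1 disables response buffering so events reach the client immediately.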

# Caddy reverse proxy for SSE: flush each event to the client immediately
handle /mcp/* {
    reverse_proxy localhost:8000 {
        transport http {
            read_buffer 4096
        }
        flush_interval -1
    }
}

See MCP server deployment guide for platform-specific configuration on Railway, Render, and Fly.io.

Separating REST and MCP into different path prefixes

By convention, mount MCP at /mcp and keep REST routes under /api/v1. This makes routing configuration cleaner and lets you apply different middleware or rate limits to each path group:

from fastapi import APIRouter

api_router = APIRouter(prefix="/api/v1")

@api_router.get("/products")
async def list_products():
    return await db.products.all()  # `db` is a placeholder for your data layer

app.include_router(api_router)
app.mount("/mcp", mcp.sse_app())

# Caddy can route /mcp/* to the same backend
# or to a separate MCP-only process if you want independent scaling

If MCP tool call volume grows much faster than REST traffic, consider splitting them into separate uvicorn processes with a reverse proxy routing by path. The FastMCP instance can run standalone, for example by exposing app = mcp.sse_app() in mcp_app.py and starting it with uvicorn mcp_app:app, while the REST API runs separately. This is usually not necessary until MCP traffic becomes a significant fraction of total service load.