FastAPI MCP server — mounting SSE transport alongside REST routes
FastAPI and FastMCP solve adjacent problems — FastAPI exposes REST endpoints to browsers and programmatic HTTP clients, FastMCP exposes tools to AI agents via the Model Context Protocol. In many production services you need both: a REST API for your web frontend and a companion MCP interface for AI integrations. Rather than running two separate processes, you can mount FastMCP's SSE application as a Starlette sub-application inside FastAPI. Both interfaces share the same process, the same database connections, and the same Pydantic models. This guide shows how to wire them together, add authentication, and operate the combined server in production.
TL;DR
Create a FastMCP instance and a FastAPI app separately, then mount the MCP SSE app inside FastAPI: app.mount("/mcp", mcp.sse_app()). Your REST endpoints live at /api/... and the MCP SSE endpoint lives at /mcp/sse. Share Pydantic models between both. Run with uvicorn app:app --host 0.0.0.0 --port 8000. Add your /mcp/sse URL to AliveMCP to monitor the MCP protocol layer independently of your REST health checks.
Installation
pip install fastapi mcp uvicorn
# or with uv
uv add fastapi mcp uvicorn
FastMCP's SSE transport uses Starlette internally. FastAPI is built on Starlette too, so the two packages normally install against a single shared Starlette version without a dependency conflict.
Minimal combined server
The core pattern is straightforward: create both instances, register tools on the MCP instance, and mount it inside FastAPI:
from fastapi import FastAPI
from mcp.server.fastmcp import FastMCP
import uvicorn

# FastAPI for REST endpoints
app = FastAPI(title="My Service API")

# FastMCP for MCP tools
mcp = FastMCP("my-service")

# Mount MCP SSE app under /mcp
app.mount("/mcp", mcp.sse_app())

# REST endpoint
@app.get("/api/status")
async def status():
    return {"status": "ok"}

# MCP tool
@mcp.tool()
async def get_status() -> dict:
    """Get service status."""
    return {"status": "ok", "version": "1.0.0"}

if __name__ == "__main__":
    uvicorn.run(app, host="0.0.0.0", port=8000)
After starting, your service exposes:
- GET /api/status: REST endpoint for browsers and HTTP clients
- GET /mcp/sse: MCP SSE connection endpoint for AI clients
- POST /mcp/messages: MCP message endpoint for tool calls
- GET /docs: FastAPI's automatic Swagger UI for the REST API
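To make the relationship between the two MCP endpoints concrete: when a client opens the SSE connection, the transport first sends an endpoint event telling the client where to POST its messages. The sketch below parses a text/event-stream payload with the standard library only. The sample payload and its session_id value are illustrative assumptions, not output captured from a real server.

```python
from dataclasses import dataclass


@dataclass
class SSEEvent:
    event: str
    data: str


def parse_sse(stream_text: str) -> list:
    """Split a text/event-stream payload into events; a blank line ends an event."""
    events = []
    event_name, data_lines = "message", []
    for line in stream_text.splitlines():
        if line == "":
            # Blank line: dispatch the pending event if it carried any data.
            if data_lines:
                events.append(SSEEvent(event_name, "\n".join(data_lines)))
            event_name, data_lines = "message", []
        elif line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        elif line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data_lines.append(line[len("data:"):].strip())
    # An event not terminated by a blank line is incomplete and is dropped.
    return events


# Illustrative payload; the session_id value is made up.
sample = (
    ": ping\n"
    "event: endpoint\n"
    "data: /mcp/messages/?session_id=abc123\n"
    "\n"
)
events = parse_sse(sample)
```

A client would take the data of the endpoint event as the URL for its subsequent POST requests.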
Sharing Pydantic models between REST and MCP
The main advantage of combining FastAPI and FastMCP in one process is sharing your Pydantic models. Define them once and use them as both FastAPI request/response bodies and FastMCP tool input schemas:
from pydantic import BaseModel, Field
from fastapi import FastAPI
from mcp.server.fastmcp import FastMCP

app = FastAPI()
mcp = FastMCP("inventory-service")

class ProductQuery(BaseModel):
    sku: str = Field(..., description="Product stock-keeping unit identifier")
    warehouse_id: str = Field("default", description="Warehouse to query")
    include_variants: bool = Field(False, description="Include product variants")

class ProductResult(BaseModel):
    sku: str
    name: str
    quantity: int
    location: str

# NOTE: `inventory` stands in for your own data-access layer; it is not defined in this snippet.

# Same model used in FastAPI endpoint
@app.post("/api/products/query", response_model=ProductResult)
async def query_product_rest(query: ProductQuery) -> ProductResult:
    return await inventory.query(query)

# And in the MCP tool: same model, same validation.
# FastMCP derives the tool schema from the function signature, so the
# arguments are expected nested under the parameter name: {"query": {...}}.
@mcp.tool()
async def query_product(query: ProductQuery) -> dict:
    """Query product inventory by SKU."""
    result: ProductResult = await inventory.query(query)
    return result.model_dump()

app.mount("/mcp", mcp.sse_app())
When you update ProductQuery — adding a field, changing a constraint — both the REST endpoint and the MCP tool pick up the change simultaneously. No duplicate schema maintenance.
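As a quick check of what "define once" buys you, the sketch below inspects the JSON Schema that Pydantic generates for ProductQuery and validates two payloads against it. It assumes Pydantic v2 (consistent with the model_dump() call used in this guide). The payload values are made up for illustration.

```python
from pydantic import BaseModel, Field, ValidationError


class ProductQuery(BaseModel):
    sku: str = Field(..., description="Product stock-keeping unit identifier")
    warehouse_id: str = Field("default", description="Warehouse to query")
    include_variants: bool = Field(False, description="Include product variants")


# One schema, generated from the model, describes the input for both interfaces.
schema = ProductQuery.model_json_schema()
required_fields = schema["required"]
warehouse_default = schema["properties"]["warehouse_id"]["default"]

# A payload with only the required field picks up the declared defaults.
query = ProductQuery.model_validate({"sku": "ABC-123"})

# A payload missing the required field is rejected by the same rules everywhere.
try:
    ProductQuery.model_validate({"warehouse_id": "east"})
    missing_sku_rejected = False
except ValidationError:
    missing_sku_rejected = True
```

Adding a field or tightening a constraint on the model changes this schema and the validation behaviour in one place.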
Shared database connections
Use FastAPI's lifespan context manager to open database connections at startup and close them at shutdown. Both REST and MCP handlers access the same connection pool via a module-level variable:
import os
from contextlib import asynccontextmanager

from fastapi import FastAPI, HTTPException
from mcp.server.fastmcp import FastMCP
import asyncpg

db_pool: asyncpg.Pool | None = None

@asynccontextmanager
async def lifespan(app: FastAPI):
    # Open the pool once at startup and close it at shutdown
    global db_pool
    db_pool = await asyncpg.create_pool(
        dsn=os.environ["DATABASE_URL"],
        min_size=2,
        max_size=10,
    )
    yield
    await db_pool.close()

app = FastAPI(lifespan=lifespan)
mcp = FastMCP("db-service")

@app.get("/api/users/{user_id}")
async def get_user_rest(user_id: str):
    row = await db_pool.fetchrow("SELECT * FROM users WHERE id=$1", user_id)
    if row is None:
        # dict(None) would raise a TypeError and surface as a 500
        raise HTTPException(status_code=404, detail="User not found")
    return dict(row)

@mcp.tool()
async def get_user(user_id: str) -> dict:
    """Fetch a user by their unique ID."""
    row = await db_pool.fetchrow("SELECT id, name, email FROM users WHERE id=$1", user_id)
    if row is None:
        raise KeyError(f"User not found: {user_id}")
    return dict(row)

app.mount("/mcp", mcp.sse_app())
The connection pool opens once and is shared across both REST requests and MCP tool calls. Size the pool for the combined peak concurrency, not just the REST or MCP load separately.
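A back-of-the-envelope way to size the pool is sketched below. The peak figures, the 20% headroom, and the helper names are all hypothetical; the one structural point it relies on is that each worker process runs its own lifespan and therefore opens its own pool.

```python
def pool_max_size(rest_peak: int, mcp_peak: int, headroom_pct: int = 20) -> int:
    """Per-process max_size covering both peaks plus a percentage of headroom."""
    combined = rest_peak + mcp_peak
    # Integer ceiling division keeps the result exact (no float rounding).
    return (combined * (100 + headroom_pct) + 99) // 100


def total_db_connections(per_process_max: int, workers: int) -> int:
    """Upper bound on connections the database sees across all worker processes."""
    return per_process_max * workers


# Hypothetical peaks: 6 concurrent REST queries, 4 concurrent MCP tool calls.
size = pool_max_size(rest_peak=6, mcp_peak=4)
total = total_db_connections(size, workers=4)
```

Compare the total against your database's connection limit before raising the worker count.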
Authentication middleware
Add API key or Bearer token authentication as FastAPI middleware. The middleware runs on all routes including the mounted MCP sub-app:
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
from mcp.server.fastmcp import FastMCP
import hmac, os

# Comma-separated keys from the environment; ignore blanks and stray whitespace
VALID_KEYS = {k.strip() for k in os.environ["API_KEYS"].split(",") if k.strip()}

app = FastAPI()

@app.middleware("http")
async def require_api_key(request: Request, call_next):
    # Allow health checks without auth
    if request.url.path in ("/health", "/"):
        return await call_next(request)
    auth = request.headers.get("Authorization", "")
    if not auth.startswith("Bearer "):
        return JSONResponse({"error": "Missing Authorization header"}, status_code=401)
    token = auth.removeprefix("Bearer ").strip()
    # Constant-time comparison to prevent timing attacks.
    # Compare bytes: compare_digest raises TypeError on non-ASCII str input.
    if not any(hmac.compare_digest(token.encode(), key.encode()) for key in VALID_KEYS):
        return JSONResponse({"error": "Invalid API key"}, status_code=401)
    return await call_next(request)

mcp = FastMCP("auth-service")
app.mount("/mcp", mcp.sse_app())
MCP clients pass the Authorization header in the SSE connection request and in every subsequent POST. The middleware validates it on both. For more granular tool-level authorization (different permissions per tool), check the key's associated scopes inside the tool handler rather than in the middleware.
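A tool-level scope check can be sketched with the standard library alone. The key names, the scope strings, and the KEY_SCOPES table below are hypothetical, and how the token reaches the tool handler depends on your setup and is not shown here.

```python
import hmac

# Hypothetical mapping from API key to the scopes it grants.
KEY_SCOPES = {
    "key-reader": frozenset({"inventory:read"}),
    "key-admin": frozenset({"inventory:read", "inventory:write"}),
}


def scopes_for(token: str) -> frozenset:
    """Return the scopes for a token, or an empty set if the token is unknown."""
    granted = frozenset()
    token_bytes = token.encode("utf-8")
    # Check every key (no early exit) and compare each one in constant time.
    for key, scopes in KEY_SCOPES.items():
        if hmac.compare_digest(token_bytes, key.encode("utf-8")):
            granted = scopes
    return granted


def require_scope(token: str, scope: str) -> None:
    """Raise PermissionError unless the token grants the given scope."""
    if scope not in scopes_for(token):
        raise PermissionError(f"Missing scope: {scope}")
```

A tool that mutates data would call require_scope(token, "inventory:write") at the top of its handler, while read-only tools would ask for the narrower scope.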
See MCP server authentication for JWT validation, OAuth2 token introspection, and JWKS-based key rotation patterns.
Health checks that cover both layers
FastAPI's /health route covers the HTTP layer. Add a separate probe for the MCP protocol layer so you can detect cases where HTTP is up but MCP initialization is broken:
@app.get("/health")
async def health():
    """HTTP health check for load balancers and orchestrators."""
    try:
        # Verify database connectivity
        await db_pool.fetchval("SELECT 1")
        return {"status": "ok", "db": "ok"}
    except Exception as exc:
        return JSONResponse({"status": "degraded", "db": str(exc)}, status_code=503)

@app.get("/health/mcp")
async def health_mcp():
    """Verify MCP tools are registered (shallow check)."""
    tools = await mcp.list_tools()
    return {"status": "ok", "tools": len(tools)}
For true external MCP protocol validation — testing that the initialize handshake succeeds from outside your network — use AliveMCP. It probes /mcp/sse every 60 seconds and alerts when the protocol layer fails even if /health returns 200. This catches cases where uvicorn is up but the FastMCP layer has a startup error or tool registration bug.
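For a sense of what a protocol-level probe checks, the sketch below builds the JSON-RPC initialize request an MCP client sends and validates the shape of the reply. It covers only message construction and checking, not the network round trip. The protocol version string, the client name, and the sample replies are assumptions for illustration.

```python
import json


def build_initialize_request(request_id: int = 1) -> str:
    """Serialize an MCP initialize request as a JSON-RPC 2.0 message."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2024-11-05",  # assumed; use the revision you target
            "capabilities": {},
            "clientInfo": {"name": "health-probe", "version": "0.1.0"},
        },
    })


def handshake_ok(response_text: str, request_id: int = 1) -> bool:
    """Return True if the reply looks like a successful initialize result."""
    try:
        msg = json.loads(response_text)
    except json.JSONDecodeError:
        return False
    if not isinstance(msg, dict) or msg.get("id") != request_id or "error" in msg:
        return False
    result = msg.get("result")
    return isinstance(result, dict) and "protocolVersion" in result and "serverInfo" in result


# Made-up replies for illustration.
good_reply = json.dumps({
    "jsonrpc": "2.0", "id": 1,
    "result": {"protocolVersion": "2024-11-05", "capabilities": {},
               "serverInfo": {"name": "my-service", "version": "1.0.0"}},
})
error_reply = json.dumps({
    "jsonrpc": "2.0", "id": 1,
    "error": {"code": -32603, "message": "Internal error"},
})
```

A server can return HTTP 200 on /health and still fail this check, which is the gap a protocol-layer probe is meant to close.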
Production deployment with uvicorn
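Run the combined app with uvicorn in production. Multiple workers help with CPU-bound REST routes, but each worker process has its own FastMCP instance and its own in-memory state. If you need shared state across MCP sessions, store it in Redis or Postgres, not in memory. The SSE transport also tracks each session in the worker that accepted the GET /mcp/sse connection, so the follow-up POST requests need to reach that same worker. Unless something guarantees that affinity, requests can fail with an unknown-session error, and a single worker is the safer choice for the MCP path: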
# Single worker (shared in-memory state works)
uvicorn app:app --host 0.0.0.0 --port $PORT

# Multiple workers (no shared in-memory state between workers)
uvicorn app:app --host 0.0.0.0 --port $PORT --workers 4

# With gunicorn as process manager (production standard)
gunicorn app:app \
  -k uvicorn.workers.UvicornWorker \
  --workers 4 \
  --bind 0.0.0.0:$PORT \
  --timeout 120
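To see why in-memory state does not carry across workers, the toy simulation below models two worker processes that each keep their own session table. The Worker class and the 202/404 status codes are illustrative stand-ins, not the SDK's internals.

```python
import uuid


class Worker:
    """Stand-in for one server process with its own in-memory session table."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.sessions = {}  # session_id -> list of received message bodies

    def open_sse(self) -> str:
        """Simulate GET /mcp/sse: create a session that lives only in this worker."""
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = []
        return session_id

    def post_message(self, session_id: str, body: str) -> int:
        """Simulate POST /mcp/messages: succeeds only where the session was created."""
        if session_id not in self.sessions:
            return 404  # this worker has never seen the session
        self.sessions[session_id].append(body)
        return 202


worker_a, worker_b = Worker("a"), Worker("b")
session = worker_a.open_sse()

same_worker_status = worker_a.post_message(session, '{"method": "tools/list"}')
other_worker_status = worker_b.post_message(session, '{"method": "tools/list"}')
```

The second POST fails because the session exists only in the first worker's memory, which is the situation a load balancer creates when it spreads one client's requests across processes.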
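SSE connections are long-lived: a single MCP session holds a persistent GET /mcp/sse connection for the duration of the agent's session. With the async UvicornWorker, gunicorn's --timeout acts as a worker heartbeat limit rather than a per-request limit, so it does not by itself cut off an open SSE stream; 120 is a reasonable value to keep. The reverse proxy matters more. In Nginx, set proxy_read_timeout to 300 or higher and turn proxy_buffering off for the /mcp path. In Caddy, disable response buffering with flush_interval -1: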
# Caddy reverse proxy with response buffering disabled for SSE
handle /mcp/* {
    reverse_proxy localhost:8000 {
        transport http {
            read_buffer 4096
        }
        # -1 flushes each event to the client immediately
        flush_interval -1
    }
}
See MCP server deployment guide for platform-specific configuration on Railway, Render, and Fly.io.
Separating REST and MCP into different path prefixes
By convention, mount MCP at /mcp and keep REST routes under /api/v1. This makes routing configuration cleaner and lets you apply different middleware or rate limits to each path group:
from fastapi import APIRouter

api_router = APIRouter(prefix="/api/v1")

@api_router.get("/products")
async def list_products():
    return await db.products.all()

app.include_router(api_router)
app.mount("/mcp", mcp.sse_app())

# Caddy can route /mcp/* to the same backend
# or to a separate MCP-only process if you want independent scaling
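If MCP tool call volume is very high independently of REST traffic, consider splitting them into separate uvicorn processes with a reverse proxy routing by path. A FastMCP instance is not itself an ASGI application, so the standalone process serves the app returned by mcp.sse_app(): for example, assign mcp_asgi = mcp.sse_app() in mcp_app.py and run uvicorn mcp_app:mcp_asgi, or call mcp.run(transport="sse"). Standalone, the endpoints sit at /sse and /messages/ by default rather than under /mcp, so adjust the proxy paths accordingly. The REST API runs separately. This is usually not necessary until MCP traffic becomes a significant fraction of total service load.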
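Path-based routing of this kind reduces to a prefix match. The sketch below shows the decision a reverse proxy would make; the backend addresses and the BACKENDS table are hypothetical, and a real deployment would express this in the proxy's own configuration rather than in Python.

```python
from typing import Optional

# Hypothetical backends: one process for MCP, one for REST.
BACKENDS = {
    "/mcp": "http://127.0.0.1:8001",
    "/api": "http://127.0.0.1:8000",
}


def pick_backend(path: str) -> Optional[str]:
    """Return the backend whose prefix matches the request path, if any."""
    for prefix, backend in BACKENDS.items():
        # Match the prefix itself or anything below it, but not "/mcpx".
        if path == prefix or path.startswith(prefix + "/"):
            return backend
    return None
```

Matching on prefix plus "/" avoids sending an unrelated path such as /mcp-admin to the MCP process.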
Related questions
Can I use FastAPI's dependency injection inside MCP tool handlers?
Not directly — FastMCP tool functions are plain async functions, not FastAPI request handlers, so FastAPI's Depends() system does not apply. The workaround is to access shared resources (database pools, clients) via module-level variables initialized in the FastAPI lifespan context manager, as shown in the shared connections example above. For more complex dependency graphs, create a simple service locator or use a module-level singleton pattern.
Does mounting FastMCP inside FastAPI add latency to MCP requests?
The overhead is negligible — Starlette's ASGI routing adds microseconds. The SSE connection for MCP is long-lived, so the mount overhead is amortized over the entire session. If you observe high latency, the cause is almost always the tool handler itself (slow database query, external API call) rather than the routing layer.
How do I handle CORS for the MCP SSE endpoint in a browser-based context?
FastAPI's CORSMiddleware applies to all routes including mounted sub-apps. Add it before mounting FastMCP to cover both REST and MCP paths. Be explicit with origins — never use allow_origins=["*"] with allow_credentials=True. See MCP server CORS configuration for the correct setup.
Can I use the FastAPI app directly as the MCP transport without mounting?
FastMCP's SSE transport is a Starlette app — it needs to be mounted or run independently. You cannot register MCP routes as regular FastAPI routes because the SSE protocol requires specific response handling (streaming, event framing) that differs from standard FastAPI response types. Always use app.mount("/mcp", mcp.sse_app()) rather than trying to replicate the SSE routing manually.
Further reading
- Python MCP server — FastMCP SDK overview and getting started
- Pydantic MCP server validation — BaseModel schemas and cross-field validators
- Python MCP server asyncio — concurrent tool execution and resource limits
- MCP server authentication — API key, JWT, and Bearer token patterns
- MCP server CORS — origin allowlists and credentials mode pitfalls
- MCP server deployment guide — Railway, Render, Fly.io, and Docker
- MCP server health checks — HTTP vs. protocol-layer validation
- AliveMCP — external uptime monitoring for FastAPI MCP servers