Guide · Database & Event Architecture
PostgreSQL Connection Patterns for MCP Servers — pooling, PgBouncer, and health checks
MCP tool calls are short-lived and can arrive in parallel bursts. Each call needs a database connection, holds it for a few milliseconds, and releases it. That pattern interacts badly with PostgreSQL's per-connection process model: too many simultaneous connections exhaust max_connections, new connections are rejected with ECONNREFUSED, and the MCP server returns isError:true responses — with no visible protocol failure. This guide covers how to size a pg.Pool correctly, when to introduce PgBouncer in transaction pooling mode, how to detect pool exhaustion before it cascades, and how to write health checks that validate the full path from MCP client to PostgreSQL row.
TL;DR
Use pg.Pool with max set to floor((max_connections - reserved) / instance_count). Set connectionTimeoutMillis to 3000 — fail fast instead of letting tool calls queue indefinitely. Add a /health endpoint that runs SELECT 1 and checks pool.waitingCount === 0. For more than ~5 instances sharing one PostgreSQL server, add PgBouncer in transaction mode between the MCP servers and PostgreSQL. Register your server with AliveMCP: a canary tool call that hits the database validates the full stack, catching connection pool failures that the protocol-level probe misses.
Why PostgreSQL connections are different for MCP servers
A traditional web server handles browser requests: each request is typically initiated by a human, arrives at moderate concurrency, and the application holds a database connection for tens to hundreds of milliseconds. Connection pools are sized to match observed browser traffic.
MCP server traffic has a different profile. An AI agent running a multi-step plan can issue dozens of tool calls in rapid succession — a planning loop, a search phase, a refinement phase. Each tool call arrives as a separate MCP request, each needs a database connection, and the agent may parallelize them. An agent orchestrating five sub-agents, each calling three tools, generates fifteen concurrent tool calls in the same second. A connection pool sized for browser-style traffic will saturate under this load.
The failure mode is dangerous because it is invisible at the protocol layer. When pg.Pool exhausts its connections and a new checkout times out, it throws an Error: timeout exceeded when trying to connect. The MCP server catches this and returns a valid MCP response with isError: true and a message like "database unavailable." The MCP protocol layer — the initialize handshake, the tools/list response — remains healthy. External uptime monitors that probe only the protocol layer see a green status while tool calls are failing.
Sizing a pg.Pool for MCP server workloads
PostgreSQL has a hard limit on simultaneous connections set by the max_connections parameter (default: 100). Every connection consumes approximately 5–10 MB of server RAM. The pool on each MCP server instance holds open connections that count against this limit.
Pool sizing formula
Start with the total available connections and divide by the number of MCP server instances:
// Pool sizing calculation
const MAX_PG_CONNECTIONS = 100; // PostgreSQL max_connections
const RESERVED_CONNECTIONS = 10; // superuser + migrations + admin + replication
const INSTANCE_COUNT = 3; // MCP server replicas
const poolMax = Math.floor(
(MAX_PG_CONNECTIONS - RESERVED_CONNECTIONS) / INSTANCE_COUNT
);
// = Math.floor((100 - 10) / 3) = 30
const pool = new Pool({
host: process.env.PGHOST,
database: process.env.PGDATABASE,
user: process.env.PGUSER,
password: process.env.PGPASSWORD,
max: poolMax, // 30 per instance
idleTimeoutMillis: 30_000, // release idle connections after 30s
connectionTimeoutMillis: 3_000, // fail fast if no connection available
statement_timeout: 10_000, // prevent long-running queries blocking pool
});
The connectionTimeoutMillis setting is critical. Without it, tool calls queue indefinitely waiting for a pool slot. A single slow query that holds a connection for 30 seconds causes a cascade: new tool calls queue, the queue grows, users see 30-second hangs. With connectionTimeoutMillis: 3000, tool calls fail fast and return a clear error instead of blocking the server.
Headroom rule
Size the pool at 70–80% of the mathematically available limit, not 100%. Leave headroom for:
- Temporary spikes from agent parallel fan-out
- Health check connections (discussed below)
- Administrative queries during incidents
- Replication connections if using streaming replication
A pool at 100% capacity has no slack. The first agent fan-out that exceeds it starts failing. A pool at 75% capacity absorbs burst traffic without error.
Monitoring pool utilization
// Pool utilization metrics — expose via /metrics or structured log
function poolMetrics(pool: Pool) {
return {
total: pool.totalCount, // connections in pool
idle: pool.idleCount, // available connections
waiting: pool.waitingCount, // requests waiting for a connection
utilization: pool.totalCount > 0
? (pool.totalCount - pool.idleCount) / pool.totalCount
: 0,
};
}
// Log on every tool call, or expose via metrics endpoint
setInterval(() => {
const m = poolMetrics(pool);
if (m.waiting > 0 || m.utilization > 0.8) {
console.log(JSON.stringify({
level: m.waiting > 0 ? 'warn' : 'info',
event: 'pool_utilization',
...m,
}));
}
}, 5000);
Alert when waitingCount > 0 for more than a few seconds — that is the leading indicator of pool exhaustion. Alert when utilization > 0.85 sustained — that signals it is time to either reduce pool consumers or add PgBouncer.
PgBouncer in transaction pooling mode
Once you have more than five MCP server instances sharing a single PostgreSQL server, direct connection pools from each instance multiply quickly. Three instances × 30 connections = 90, already near the default limit. PgBouncer sits between the MCP servers and PostgreSQL, multiplexing many application connections into a smaller number of PostgreSQL connections.
Why transaction mode is the correct fit for MCP servers
PgBouncer supports three pooling modes: session (one PostgreSQL connection per application connection for its entire lifetime), transaction (one PostgreSQL connection per transaction, released immediately after), and statement (one PostgreSQL connection per statement). MCP tool calls map perfectly to transaction mode: each tool call typically runs one or a few queries in a discrete transaction, then releases the connection.
| Pooling mode | Connection held for | Suitable for MCP servers? |
|---|---|---|
| Session | Entire application connection lifetime | No — defeats the purpose; one connection per pool slot |
| Transaction | One transaction (BEGIN → COMMIT/ROLLBACK) | Yes — matches the short-lived tool call pattern exactly |
| Statement | One SQL statement | Avoid — breaks multi-statement transactions |
With PgBouncer in transaction mode, 10 MCP server instances each with a pool of 50 connections (500 application-side connections total) multiplex down to 30–50 actual PostgreSQL connections. The PostgreSQL server sees a manageable load; the MCP servers get fast connection checkout from a large application-side pool.
PgBouncer caveat: prepared statements
Transaction pooling mode is incompatible with server-side prepared statements (pg's default for cached queries). Disable prepared statements when using PgBouncer:
// PgBouncer-compatible connection string
const pool = new Pool({
connectionString: process.env.DATABASE_URL, // points to PgBouncer port
max: 50,
// Disable statement caching — required for transaction pooling mode
statement_cache_size: 0,
});
Or use Drizzle ORM with pg and disable prepared statements in its configuration. Prisma handles this automatically when the connection string points to PgBouncer with ?pgbouncer=true appended.
Health checks that validate the full database path
A process-level health check that returns 200 OK if the Node.js process is alive is not sufficient. The MCP server process can be running, accepting connections, and passing protocol-layer probes while the database pool is exhausted and all tool calls are failing.
A correct health endpoint for an MCP server backed by PostgreSQL validates three things:
- The pool has an available connection (not fully saturated)
- PostgreSQL is reachable and responsive (not just that the pool object exists)
- There is no growing backlog of waiting connection requests
// Health endpoint that validates the full DB path
app.get('/health', async (req, res) => {
const metrics = poolMetrics(pool);
// Check 1: pool not saturated
if (metrics.waiting > 5) {
return res.status(503).json({
status: 'degraded',
reason: 'connection_pool_backlog',
pool: metrics,
});
}
// Check 2: PostgreSQL is actually reachable
let client;
try {
client = await pool.connect();
await client.query('SELECT 1');
} catch (err) {
return res.status(503).json({
status: 'unhealthy',
reason: 'database_unreachable',
error: err.message,
pool: metrics,
});
} finally {
client?.release();
}
// Check 3: connection checkout was fast
res.json({
status: 'ok',
pool: metrics,
});
});
The SELECT 1 query validates the connection to PostgreSQL is live, not just that a connection object exists in the pool. The pool metrics in the response body let you see utilization trends when debugging incidents.
Separating /live from /ready
In Kubernetes deployments, separate the liveness probe (is the process alive?) from the readiness probe (is the server ready to serve traffic?). The /live endpoint should always return 200 as long as the process is running — it controls restart behavior. The /ready endpoint includes the database check — it controls load balancer routing. A server that cannot reach PostgreSQL should be removed from rotation, not restarted.
// /live — for Kubernetes liveness probe (controls restart)
app.get('/live', (req, res) => res.json({ status: 'ok' }));
// /ready — for Kubernetes readiness probe (controls traffic routing)
app.get('/ready', async (req, res) => {
// ... full database validation as above
});
Read/write splitting with two pools
Many MCP tool implementations have a natural read/write split: list, search, and get tools are read-only; create, update, and delete tools require a write to the primary. Read tools can be served by a read replica, reducing load on the primary and improving throughput.
// Two pools: primary for writes, replica for reads
const primaryPool = new Pool({
host: process.env.PGHOST_PRIMARY,
max: 20,
connectionTimeoutMillis: 3000,
});
const replicaPool = new Pool({
host: process.env.PGHOST_REPLICA,
max: 30, // Replicas can serve more connections
connectionTimeoutMillis: 3000,
});
// Route by operation type
const READ_TOOLS = new Set(['search_items', 'list_items', 'get_item', 'count_items']);
function dbPool(toolName: string): Pool {
return READ_TOOLS.has(toolName) ? replicaPool : primaryPool;
}
// Usage in tool handler
server.setRequestHandler(CallToolRequestSchema, async (request) => {
const pool = dbPool(request.params.name);
const client = await pool.connect();
try {
// ... execute query
} finally {
client.release();
}
});
Be aware that async replication means the replica may be slightly behind the primary. Avoid routing a read that immediately follows a write to the replica — the write may not have propagated yet. If a tool creates a record and then returns a list that should include it, route both operations to the primary, or introduce a short delay before the read.
Replica lag monitoring
// Check replica lag on the replica connection
async function replicaLagSeconds(pool: Pool): Promise {
const client = await pool.connect();
try {
const res = await client.query<{ lag: string }>(
`SELECT extract(epoch FROM (now() - pg_last_xact_replay_timestamp()))
AS lag`
);
return parseFloat(res.rows[0]?.lag ?? 'null');
} catch {
return null;
} finally {
client.release();
}
}
// In health endpoint: fail readiness if replica is too far behind
const lagSeconds = await replicaLagSeconds(replicaPool);
if (lagSeconds === null || lagSeconds > 30) {
return res.status(503).json({
status: 'degraded',
reason: 'replica_lag_exceeded',
lag_seconds: lagSeconds,
});
}
Graceful shutdown: draining the pool
When the MCP server process receives a shutdown signal (SIGTERM from Kubernetes, PM2 restart, systemd stop), in-flight tool calls may hold database connections. If the process exits immediately, those connections are abruptly closed, leaving PostgreSQL in a state where it must wait for the TCP keepalive timeout to detect the dead connection before freeing it — potentially blocking future connections for up to a minute.
// Graceful pool shutdown
process.on('SIGTERM', async () => {
console.log(JSON.stringify({ event: 'shutdown_started' }));
// Stop accepting new MCP connections
server.close();
// Wait for in-flight tool calls to complete, then drain pool
// Pool.end() waits for all checked-out connections to be released
await pool.end();
console.log(JSON.stringify({ event: 'pool_drained' }));
process.exit(0);
});
In Kubernetes, set terminationGracePeriodSeconds to a value longer than your longest expected tool call duration (typically 30–60 seconds). The sequence is: SIGTERM → MCP server stops accepting new requests → in-flight requests complete → pool drains → process exits cleanly.
AliveMCP and database-layer monitoring
AliveMCP's protocol probe (TCP → initialize → tools/list) verifies that the MCP server is accepting connections and returning a valid tool manifest. It does not validate that tool calls succeed — a server with an exhausted connection pool will pass the protocol probe while failing every actual tool call.
The two mechanisms to extend AliveMCP's coverage to the database layer:
- Custom health URL: point AliveMCP's health check at
/healthor/ready. AliveMCP will alert if the endpoint returns non-2xx, which happens when the database is unreachable or the pool is saturated. - Canary tool call: add a
health_checktool to your MCP server that does aSELECT 1and returns the pool metrics. Configure AliveMCP to call this tool on each probe cycle and alert if it returnsisError: true. This validates the full path: MCP protocol → connection pool → PostgreSQL → response.
// health_check tool — validates DB connectivity from a tool call
server.tool('health_check', {}, async () => {
const client = await pool.connect().catch(err => {
throw new Error(`pool_exhausted: ${err.message}`);
});
try {
await client.query('SELECT 1');
const m = poolMetrics(pool);
return {
content: [{ type: 'text', text: JSON.stringify({ ok: true, pool: m }) }],
};
} finally {
client.release();
}
});
This approach catches the most common silent failure for database-backed MCP servers: the pool is healthy but PostgreSQL is unreachable due to a firewall rule change, credential rotation, or PostgreSQL restart. The protocol probe stays green; the canary tool call goes red.
Frequently asked questions
What is the right pool size for a single MCP server instance with a dedicated PostgreSQL server?
For a dedicated PostgreSQL server (not shared with other applications), set max to approximately 20–30 connections for a single MCP server instance. PostgreSQL performs best below 100 total connections — above that, locking overhead and process scheduling degrade throughput even if connections are available. If your tool calls are fast (under 10ms average), 20 connections can serve over 2,000 tool calls per second via multiplexing. If calls are slow (100ms average), you may need more. Monitor waitingCount to determine if you've sized too small.
When does PgBouncer stop being optional and become necessary?
Add PgBouncer when: (1) you have more than 5–6 MCP server instances, and each instance × pool size approaches max_connections; (2) your PostgreSQL server is showing CPU pressure from connection overhead even at moderate concurrency; (3) you need zero-downtime restarts of MCP server instances (PgBouncer keeps connections alive during the restart). For a single-instance deployment on a dedicated PostgreSQL server, a well-tuned pg.Pool is sufficient.
What happens to in-flight queries when I set statement_timeout?
statement_timeout cancels any single query that runs longer than the threshold (sending a PostgreSQL cancellation signal). The connection is returned to the pool in a clean state. The tool call receives a QueryCanceledError, which you should catch and return as an isError: true response with a user-readable message. This is intentional: a tool call that holds a database connection for 10+ seconds is either running an unindexed query or operating on a table under lock contention. Failing fast with a clear error is better than letting it queue.
How do I detect connection leaks (connections checked out but never released)?
A connection leak shows as pool.idleCount continuously decreasing while traffic is normal. The pool eventually saturates even during low traffic. To detect leaks: log poolMetrics() every 30 seconds and alert when idleCount === 0 persists for more than 60 seconds. To prevent leaks: always use a try/finally block that calls client.release() even when the query throws. If using Drizzle or Prisma, their built-in connection management handles this. For raw pg queries, never await a query outside of a try/finally.
Should I use a single shared Pool instance or create a Pool per tool?
A single shared Pool instance per MCP server process is strongly preferred. Multiple pools add up to PostgreSQL's max_connections limit independently — three pools of 10 connections each consume 30 PostgreSQL connections even if each pool is mostly idle. A single pool shares all connections across all tool handlers, providing better utilization and a clear view of total connection consumption in the pool metrics.
Further reading
- MCP Server Connection Pooling — pool patterns and saturation detection
- Read Replica Routing for MCP Servers — write splitting, lag detection, and health checks
- Building Database Query Tools for MCP Servers — safe queries, ORM patterns
- MCP Server Health Checks — liveness, readiness, and custom probes
- Backpressure and Bounded Queues for MCP Servers — preventing pool exhaustion
- Graceful Shutdown for MCP Servers — draining connections without losing requests