Guide · Multi-Tenant SaaS

Row-Level Security for MCP Servers — PostgreSQL RLS for multi-tenant data isolation

Schema-per-tenant isolation gives each tenant their own PostgreSQL schema, but it complicates migrations and caps tenant count at a few hundred before operational overhead becomes unmanageable. Row-level security (RLS) offers a different tradeoff: all tenants share the same tables, but PostgreSQL enforces at the database level that each query only sees rows belonging to the current tenant. This means one schema, one migration, and thousands of tenants in the same database — with isolation guaranteed by the database engine, not by application code. The catch is that RLS must be set up correctly: one missed policy, a superuser bypass, or a misconfigured application role can expose all tenant data. This guide covers RLS policy design, session variable injection via PostgreSQL connection settings, PgBouncer compatibility, and how to test that your policies actually work.

TL;DR

Enable RLS on every shared table, add a tenant_id column, create a non-superuser application role, and set app.current_tenant_id as a session variable before each query. Write policies that filter on current_setting('app.current_tenant_id'). Never use the superuser role for application queries — superusers bypass RLS. Test every policy with SET ROLE app_user; SET app.current_tenant_id = 'other-tenant'; SELECT * FROM tool_calls; — you should get zero rows.

Table design and RLS setup

Start by adding a tenant_id column to every table that holds tenant data and enabling RLS on those tables:

-- Schema setup (run once, not per-tenant)

-- 1. Add tenant_id to shared tables
ALTER TABLE tool_calls ADD COLUMN tenant_id UUID NOT NULL;
ALTER TABLE tool_events ADD COLUMN tenant_id UUID NOT NULL;
ALTER TABLE tenant_config ADD COLUMN tenant_id UUID NOT NULL;

-- 2. Index on tenant_id — critical for performance with RLS
CREATE INDEX CONCURRENTLY idx_tool_calls_tenant ON tool_calls (tenant_id);
CREATE INDEX CONCURRENTLY idx_tool_events_tenant ON tool_events (tenant_id);
CREATE INDEX CONCURRENTLY idx_tenant_config_tenant ON tenant_config (tenant_id);

-- 3. Enable RLS on each table
ALTER TABLE tool_calls ENABLE ROW LEVEL SECURITY;
ALTER TABLE tool_events ENABLE ROW LEVEL SECURITY;
ALTER TABLE tenant_config ENABLE ROW LEVEL SECURITY;

-- 4. Create a non-superuser application role
CREATE ROLE app_user NOINHERIT;
GRANT CONNECT ON DATABASE your_db TO app_user;
GRANT USAGE ON SCHEMA public TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON tool_calls TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON tool_events TO app_user;
GRANT SELECT, INSERT, UPDATE, DELETE ON tenant_config TO app_user;
-- Superusers bypass RLS — never use superuser for app queries

RLS policies

Policies define which rows a role can see and modify. Use PostgreSQL session variables (app.current_tenant_id) to pass the current tenant's ID into the policy without embedding it in the query itself:

-- SELECT policy: only see your own rows
CREATE POLICY tenant_isolation_select ON tool_calls
  FOR SELECT
  TO app_user
  USING (tenant_id = current_setting('app.current_tenant_id', true)::UUID);

-- INSERT policy: can only insert rows for your own tenant
CREATE POLICY tenant_isolation_insert ON tool_calls
  FOR INSERT
  TO app_user
  WITH CHECK (tenant_id = current_setting('app.current_tenant_id', true)::UUID);

-- UPDATE policy
CREATE POLICY tenant_isolation_update ON tool_calls
  FOR UPDATE
  TO app_user
  USING (tenant_id = current_setting('app.current_tenant_id', true)::UUID)
  WITH CHECK (tenant_id = current_setting('app.current_tenant_id', true)::UUID);

-- DELETE policy
CREATE POLICY tenant_isolation_delete ON tool_calls
  FOR DELETE
  TO app_user
  USING (tenant_id = current_setting('app.current_tenant_id', true)::UUID);

-- Apply the same four policies to every tenant-data table
-- (Automate with a stored procedure or migration script for large schema counts)

The true second argument to current_setting prevents an error when the variable is not set (returns NULL instead). When the variable is NULL, tenant_id = NULL evaluates to NULL (not TRUE), so no rows are returned — a safe default that makes unset context fail-closed.

Setting tenant context in the MCP server

Before executing any query, set the session variable to the current tenant's ID. In a connection pool, this must happen on every connection checkout — not once at pool creation — because connections are shared across tenants.

// db.ts — shared connection pool with RLS context injection
import { Pool } from 'pg';
import { AsyncLocalStorage } from 'node:async_hooks';

const pool = new Pool({
  connectionString: process.env.DATABASE_URL,
  max: 20,
  // IMPORTANT: connect as app_user, never as superuser
  // The connection string user should be app_user
});

// AsyncLocalStorage carries tenantId through async call chains
const tenantStorage = new AsyncLocalStorage<string>();

export function getCurrentTenantId(): string {
  const id = tenantStorage.getStore();
  if (!id) throw new Error('No tenant context — call withTenantContext() first');
  return id;
}

export function withTenantContext<T>(tenantId: string, fn: () => Promise<T>): Promise<T> {
  return tenantStorage.run(tenantId, fn);
}

// Every query must go through this function
export async function query<T = unknown>(
  sql: string,
  params: unknown[] = [],
): Promise<{ rows: T[] }> {
  const tenantId = getCurrentTenantId();
  const client = await pool.connect();

  try {
    // Set tenant context for this connection before any query
    // Use a transaction to ensure atomicity: SET + query run together
    await client.query('BEGIN');
    await client.query(
      `SELECT set_config('app.current_tenant_id', $1, true)`,  // true = transaction-local
      [tenantId]
    );
    const result = await client.query<T>(sql, params);
    await client.query('COMMIT');
    return result;
  } catch (err) {
    await client.query('ROLLBACK');
    throw err;
  } finally {
    client.release();
  }
}

The true third argument to set_config makes the variable transaction-local: it reverts when the transaction ends. This is safer than session-local settings because the connection returns to the pool with a clean state — no risk of one tenant's context leaking to the next tenant that checks out the connection.

PgBouncer compatibility

PgBouncer in transaction pooling mode releases connections back to the pool after each transaction. Session-level settings (SET app.current_tenant_id) do not survive the pool release in transaction mode — which is exactly why you must use transaction-local set_config(..., true) rather than session-level SET. A transaction-local setting is automatically cleaned up at COMMIT or ROLLBACK, so PgBouncer in transaction mode is compatible.

Do not use PgBouncer session pooling mode with RLS — in session mode, the same PostgreSQL session can be reused across multiple MCP tool calls from different tenants, and if the context variable isn't reset (e.g., due to a bug or crash), the next tenant inherits the previous tenant's context. Transaction pooling + transaction-local set_config is the safe combination.

Testing RLS policies

Testing RLS requires verifying that policies work as the application role, not as the test runner's superuser. Most test frameworks run as the database owner (superuser), which bypasses RLS. Explicitly switch roles in tests:

-- Manual verification (run as superuser, but switch to app_user to test)
BEGIN;
SET ROLE app_user;
SET LOCAL app.current_tenant_id = 'tenant-a-uuid';

-- Should return only tenant-a rows
SELECT COUNT(*) FROM tool_calls;

-- Attempt to see tenant-b rows — should return 0
SET LOCAL app.current_tenant_id = 'tenant-b-uuid';
SELECT * FROM tool_calls WHERE tenant_id = 'tenant-a-uuid';
-- Should be empty

ROLLBACK;

// Integration test (Node.js + pg)
it('RLS prevents cross-tenant reads', async () => {
  // Insert rows for tenant A
  await withTenantContext('tenant-a', () =>
    query('INSERT INTO tool_calls (tenant_id, tool_name, called_at, success) VALUES ($1, $2, NOW(), true)', ['tenant-a', 'search'])
  );

  // Query as tenant B — should see zero rows even though tenant-a row exists
  const result = await withTenantContext('tenant-b', () =>
    query('SELECT * FROM tool_calls')
  );

  expect(result.rows).toHaveLength(0);

  // Verify tenant A can see their own row
  const ownResult = await withTenantContext('tenant-a', () =>
    query('SELECT * FROM tool_calls')
  );

  expect(ownResult.rows).toHaveLength(1);
});

it('RLS prevents cross-tenant writes', async () => {
  // Insert row for tenant A
  await withTenantContext('tenant-a', () =>
    query('INSERT INTO tool_calls (tenant_id, tool_name, called_at, success) VALUES ($1, $2, NOW(), true)', ['tenant-a', 'search'])
  );

  // Attempt to insert a row with tenant-a's ID while running as tenant-b
  // The WITH CHECK policy should reject this
  await expect(
    withTenantContext('tenant-b', () =>
      query('INSERT INTO tool_calls (tenant_id, tool_name, called_at, success) VALUES ($1, $2, NOW(), true)', ['tenant-a', 'forge'])
    )
  ).rejects.toThrow(/new row violates row-level security policy/);
});

RLS health monitoring

RLS failures are silent: if the session variable is not set (due to a code path that skips context injection), queries return zero rows rather than an error. An agent that receives empty results from a tool that should return data may misinterpret this as "no matching records" rather than "context injection bug." Monitor for unexpectedly empty query results on canary tool calls:

// In /health: verify RLS context injection is working
async function checkRlsContextHealth(): Promise<'ok' | 'broken'> {
  // Use the canary tenant (a known tenant with seed data)
  const CANARY_TENANT = process.env.HEALTH_CHECK_TENANT_ID!;

  const result = await withTenantContext(CANARY_TENANT, () =>
    query('SELECT COUNT(*) AS count FROM tenant_config')
  );

  const count = parseInt(result.rows[0].count);

  // Canary tenant always has seed config rows — if count is 0, RLS context is broken
  if (count === 0) return 'broken';
  return 'ok';
}

AliveMCP probing your /health endpoint catches RLS context injection failures within one probe cycle (60 seconds) — before multiple tenant tool calls fail with empty results that look like data issues rather than infrastructure bugs.

Frequently asked questions

Does RLS add significant query overhead compared to schema-per-tenant?

With a proper index on tenant_id, RLS adds near-zero overhead — PostgreSQL appends the policy condition to the WHERE clause, and the query planner uses the index just like an explicit WHERE tenant_id = $1. Without the index, queries do a full table scan applying the policy row-by-row, which degrades O(total rows) rather than O(tenant rows). Always create an index on tenant_id before enabling RLS on tables with significant row counts. For a 10-million-row tool_calls table across 1,000 tenants, an indexed lookup for one tenant returns in the same time as if it were a dedicated table of 10,000 rows.

How do I run admin queries that need to see all tenants?

Admin queries (billing reconciliation, platform analytics, support debugging) run as a separate database role that does not have RLS policies applied — or as a superuser, which bypasses RLS entirely. Create a dedicated admin_user role with no RLS policies (policies are role-specific via the TO clause) and use it exclusively for admin operations. Never use the admin role in tool handler code. Alternatively, use SECURITY DEFINER functions that run as the definer (superuser) for specific admin operations, keeping the privilege escalation scoped to a named function rather than a role used in connection strings.

What happens if a tool handler forgets to set the tenant context?

With transaction-local set_config('app.current_tenant_id', ..., true), missing context means current_setting('app.current_tenant_id', true) returns NULL. The policy condition tenant_id = NULL evaluates to NULL, not TRUE — so the policy filters out all rows. The tool handler gets an empty result set. This is fail-closed behavior: a missing context leaks no data, but the tool returns empty results rather than an error. Add an explicit check at the start of every tool handler — if getCurrentTenantId() throws, return isError: true with a clear missing_tenant_context error code rather than letting it surface as an unexplained empty response.

Can I mix RLS (shared tables) and schema-per-tenant in the same deployment?

Yes. Use schema-per-tenant for large, tenant-specific datasets where isolation between tenants matters most for performance (e.g., a tenant's private product catalog with millions of rows). Use RLS shared tables for small, uniform data that all tenants share the same structure for (e.g., audit logs, tool call records, config). Many production multi-tenant deployments use a hybrid: shared schema with RLS for operational metadata + per-tenant schemas or databases for tenant business data that varies in schema or volume.

How do I handle tenant deletion with shared RLS tables?

With RLS on shared tables, deleting a tenant's data is a single DELETE FROM tool_calls WHERE tenant_id = $1 per table — much simpler than dropping a schema. Run deletions as the admin role (bypasses RLS so you can explicitly target the tenant's rows). Add ON DELETE CASCADE from the tenants table to child tables where appropriate so a single DELETE FROM tenants WHERE id = $1 cascades to all tenant data. For GDPR/data-retention compliance, log the deletion event with a timestamp and retain the tenant's UUID (but not PII) in a deleted_tenants table for audit purposes.