
MCP server contract testing

An MCP client hardcodes the tool names it calls, the argument shapes it sends, and the output fields it reads. When an MCP server renames a tool, adds a required argument, or removes a response field, every client depending on that server breaks — and nothing in the usual test pyramid catches it. Consumer-driven contract testing flips the dependency: the client writes down what it expects from the server ("I call get_user with {id: string} and expect {name: string, email: string}"), and the server's CI pipeline verifies it can still satisfy that contract before every deploy. The non-obvious part: this requires verifying JSON Schema compatibility, not just field presence — because adding a required field is a breaking change even when the field name is brand new.

TL;DR

Consumer-driven contract testing for MCP has three moving parts. The consumer (your MCP client) writes a JSON contract file declaring which tools it uses, what inputs it sends, and what output fields it reads. The provider (your MCP server) reads that contract in its test suite and verifies that tools/list contains the expected tools with compatible schemas. CI exchanges the contract as a shared artifact so the server cannot deploy if it breaks any published consumer expectation. The subtle rule: under JSON Schema backward compatibility you can add optional properties but not required ones. A server that promotes an existing optional field to required, or introduces a new required field, breaks every client that was calling the tool without that field. Use a JSON Schema compatibility check in your provider verification step, not just a structural diff.

The MCP compatibility problem

MCP clients and servers communicate through a JSON-RPC protocol where tools are first-class named entities. A client that wants to call the search_documents tool must know: the exact tool name, the shape of arguments the tool accepts, and the structure of the content array it returns. These facts are discovered at runtime via tools/list, but the client code that uses the tool is written ahead of time and compiled into the agent or application.

This creates a fragile coupling. Consider what happens when a server team makes what looks like an incremental improvement:

They rename search_documents to search_docs to match a new naming convention.
They add a required workspace_id argument because the server is now multi-tenant.
They change the output from { title, body } to { title, content, metadata }.

Each change is invisible to the server's own test suite. The server tests pass because they call the tool with the new arguments and read the new output shape. The client fails at runtime, usually with a cryptic error: the MCP SDK reports "tool not found" for the old name, the tool call returns isError: true because the required argument is missing, or the client crashes reading a field that no longer exists.

Unit tests on the client side do not help either — they mock the server. Integration tests help only if the client and server test suites share the same deployed instance, which is rarely true across repository boundaries. Contract testing is the mechanism that closes this gap.
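
To make the mocking gap concrete, here is a small sketch. The names getUserEmail and CallTool, the synchronous call signature, and the email_address rename are all hypothetical; the point is only that a mocked unit test keeps passing after the real server changes a field.

```typescript
// Hypothetical client helper: reads `email` from a get_user response.
// The call is synchronous here purely to keep the sketch short.
type CallTool = (toolName: string, args: Record<string, unknown>) => Record<string, unknown>;

function getUserEmail(callTool: CallTool, id: string): string {
  const user = callTool('get_user', { id });
  if (typeof user.email !== 'string') {
    throw new Error('get_user response has no "email" field');
  }
  return user.email;
}

// The client's unit test mocks the server with the shape the client expects.
const mockServer: CallTool = () => ({ name: 'Ada', email: 'ada@example.com' });

// Assumed drift: the real server now returns `email_address` instead of `email`.
const driftedServer: CallTool = () => ({ name: 'Ada', email_address: 'ada@example.com' });

// Runs the helper and turns a thrown error into a readable string.
function attempt(callTool: CallTool): string {
  try {
    return getUserEmail(callTool, 'u_test_123');
  } catch (err) {
    return `failed: ${(err as Error).message}`;
  }
}

const mockedResult = attempt(mockServer);     // the unit test's view
const driftedResult = attempt(driftedServer); // what production would see
console.log({ mockedResult, driftedResult });
```

The mocked path succeeds while the drifted path fails, and nothing in the client's own suite exercises the second case.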

Contract testing model for MCP

Contract testing was popularised by Pact for REST microservices. The same pattern adapts directly to MCP's JSON-RPC tool call model. There are two roles:

Role	Responsibility	When it runs
Consumer	Publishes a contract JSON file declaring tool expectations	During client CI build
Provider	Reads the contract and verifies each expectation is satisfied	Before server deploy

A contract file for an MCP interaction captures three things:

Tool existence: the consumer names the tools it depends on and asserts they appear in tools/list.
Input schema compatibility: the consumer declares the argument shape it sends; the provider verifies the tool's inputSchema is backward-compatible (i.e., accepts that shape without error).
Output shape compatibility: the consumer declares the fields it reads from the response; the provider sends the declared input and verifies the response contains those fields.

Unlike Pact for HTTP, MCP contracts do not record HTTP request/response pairs. They record tool call interactions: a tools/list assertion plus one or more tools/call interaction records. The MCP SDK's InMemoryTransport is the right harness for running these verifications — the server boots in-process, the contract test calls tools through the real MCP protocol, and the transport adds no network latency. See MCP server unit testing for the InMemoryTransport setup pattern.
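
For orientation, the two interactions a contract records look roughly like this on the wire. The sketch builds the JSON-RPC 2.0 request envelopes by hand; the helper names are hypothetical, and in practice the SDK client constructs these messages for you.

```typescript
// JSON-RPC 2.0 request envelope, as used by MCP.
interface JsonRpcRequest {
  jsonrpc: '2.0';
  id: number;
  method: string;
  params?: Record<string, unknown>;
}

let nextRequestId = 1;

// tools/list: asks the server to enumerate its tools and their input schemas.
function toolsListRequest(): JsonRpcRequest {
  return { jsonrpc: '2.0', id: nextRequestId++, method: 'tools/list' };
}

// tools/call: invokes one tool by name with an arguments object.
function toolsCallRequest(toolName: string, args: Record<string, unknown>): JsonRpcRequest {
  return {
    jsonrpc: '2.0',
    id: nextRequestId++,
    method: 'tools/call',
    params: { name: toolName, arguments: args },
  };
}

const listReq = toolsListRequest();
const callReq = toolsCallRequest('get_user', { id: 'u_test_123' });
console.log(JSON.stringify(listReq));
console.log(JSON.stringify(callReq));
```

A contract clause is then an expectation about the result of each request: the tool name appears in the tools/list result, and the tools/call result carries the fields the consumer reads.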

Writing consumer expectations

The consumer writes a contract file — a plain JSON document — as part of its own test suite. The test that generates this file acts as executable documentation: if the client code changes to call a different tool or read a different field, the contract file must be regenerated and re-published.

// contract-generator/src/generate-contract.ts
import * as fs from 'node:fs/promises';
import * as path from 'node:path';

// This type describes what the consumer cares about — not the full MCP schema.
// Keep it minimal: only the fields your client code actually reads.
interface ToolContract {
  toolName: string;
  // A sample of arguments the consumer sends in real use
  exampleInput: Record<string, unknown>;
  // Minimum output fields the consumer reads — absence of any of these is a breaking change
  requiredOutputFields: string[];
}

interface McpConsumerContract {
  consumerName: string;
  providerName: string;
  generatedAt: string;
  tools: ToolContract[];
}

// Exported so the self-consistency test can import it.
export const contract: McpConsumerContract = {
  consumerName: 'agent-app-v2',
  providerName: 'docs-mcp-server',
  generatedAt: new Date().toISOString(),
  tools: [
    {
      toolName: 'get_user',
      exampleInput: { id: 'u_test_123' },
      requiredOutputFields: ['name', 'email'],
    },
    {
      toolName: 'search_documents',
      exampleInput: { query: 'contract testing', limit: 5 },
      requiredOutputFields: ['results'],
    },
    {
      toolName: 'create_document',
      exampleInput: { title: 'Test Doc', body: 'Hello world' },
      requiredOutputFields: ['id', 'created_at'],
    },
  ],
};

// Note: this write runs whenever the module is loaded, including when a test imports it.
const outputPath = path.resolve(process.cwd(), 'contracts', 'agent-app-v2--docs-mcp-server.json');
await fs.mkdir(path.dirname(outputPath), { recursive: true });
await fs.writeFile(outputPath, JSON.stringify(contract, null, 2));
console.log(`Contract written to ${outputPath}`);

The consumer also needs to assert that the contract matches its own code. The right place for this is a test that imports both the contract generator and the actual client code and checks they agree. If the client calls a tool named get_user but the contract says getUser, the mismatch surfaces in the consumer's own CI before it ever reaches the provider.

// contract-generator/src/generate-contract.test.ts
import { describe, it, expect } from 'vitest';
import { TOOLS_USED } from '../../client/src/tool-registry.js';
import { contract } from './generate-contract.js';

describe('consumer contract self-consistency', () => {
  it('contract tool names match the tools the client actually calls', () => {
    const contractNames = new Set(contract.tools.map(t => t.toolName));
    for (const toolName of TOOLS_USED) {
      expect(contractNames).toContain(toolName);
    }
  });

  it('contract has no tool names the client never calls', () => {
    for (const tool of contract.tools) {
      expect(TOOLS_USED).toContain(tool.toolName);
    }
  });
});
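
The test above imports TOOLS_USED from a tool-registry module that is not shown. One way such a module might look is sketched below; the file layout, the callContractedTool wrapper, and the isContractedTool guard are assumptions for illustration. Routing every call through a typed wrapper means the compiler rejects tool names that are missing from the registry, which helps keep TOOLS_USED honest.

```typescript
// client/src/tool-registry.ts (hypothetical layout; export these in a real module)
// Single source of truth for the tool names this client calls.
const TOOLS_USED = ['get_user', 'search_documents', 'create_document'] as const;

type ToolName = (typeof TOOLS_USED)[number];

// Minimal stand-in for whatever function actually sends the tools/call request.
type RawCall = (toolName: string, args: Record<string, unknown>) => Promise<unknown>;

// Client code calls tools through this wrapper, so a name that is not in
// TOOLS_USED is a compile-time error rather than a runtime surprise.
function callContractedTool(
  raw: RawCall,
  toolName: ToolName,
  args: Record<string, unknown>
): Promise<unknown> {
  return raw(toolName, args);
}

// Runtime guard for names that arrive as plain strings (for example from config).
function isContractedTool(candidate: string): candidate is ToolName {
  return (TOOLS_USED as readonly string[]).includes(candidate);
}

console.log(isContractedTool('get_user'), isContractedTool('getUser'));
```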

Verifying contracts on the server side

The provider verification test reads the contract file, boots the MCP server in-process via InMemoryTransport, and checks each contract clause. There are two distinct checks: a tools/list check for tool existence and schema compatibility, and a tools/call check that sends the example input and inspects the response fields.

// src/contract-verification.test.ts
import { describe, it, expect, beforeAll, afterAll } from 'vitest';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';
import { readFile } from 'node:fs/promises';
import { resolve } from 'node:path';
import { createServer } from './server.js';
import { checkInputCompatibility } from './contract-verification.js';

// Load contract from shared artifact location (populated by CI)
const contractPath = resolve(process.env.CONTRACT_DIR ?? './contracts', 'agent-app-v2--docs-mcp-server.json');
const contractRaw = await readFile(contractPath, 'utf-8');
const contract = JSON.parse(contractRaw);

let client: Client;

beforeAll(async () => {
  const [serverTransport, clientTransport] = InMemoryTransport.createLinkedPair();
  const server = createServer(); // real server, no mocks
  await server.connect(serverTransport);

  client = new Client(
    { name: 'contract-verifier', version: '1.0.0' },
    { capabilities: {} }
  );
  await client.connect(clientTransport);
});

afterAll(async () => {
  await client.close();
});

describe(`Provider verification — ${contract.providerName} satisfies ${contract.consumerName}`, () => {
  it('all contracted tools appear in tools/list', async () => {
    const { tools } = await client.listTools();
    const toolNames = new Set(tools.map(t => t.name));

    for (const expected of contract.tools) {
      expect(toolNames, `Tool "${expected.toolName}" is missing from tools/list`).toContain(expected.toolName);
    }
  });

  it('contracted tool inputSchemas are backward-compatible with consumer example inputs', async () => {
    const { tools } = await client.listTools();
    const toolMap = new Map(tools.map(t => [t.name, t]));

    for (const expected of contract.tools) {
      const tool = toolMap.get(expected.toolName);
      if (!tool) continue; // already caught above

      const compat = checkInputCompatibility(tool.inputSchema, expected.exampleInput);
      expect(compat.compatible, `Tool "${expected.toolName}" inputSchema is incompatible: ${compat.reason}`).toBe(true);
    }
  });

  for (const expected of contract.tools) {
    it(`"${expected.toolName}" response contains required output fields`, async () => {
      const result = await client.callTool({
        name: expected.toolName,
        arguments: expected.exampleInput,
      });

      expect(result.isError, `"${expected.toolName}" returned isError with input ${JSON.stringify(expected.exampleInput)}`).toBeFalsy();

      // Parse the text content as JSON to check output fields
      const text = (result.content as Array<{ type: string; text: string }>)
        .filter(c => c.type === 'text')
        .map(c => c.text)
        .join('');

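      let parsed: Record<string, unknown>;
      try {
        parsed = JSON.parse(text);
      } catch {
        // Non-JSON text has no fields to inspect, so it only satisfies the contract
        // when the consumer declared no required output fields.
        expect(expected.requiredOutputFields, `"${expected.toolName}" returned non-JSON text but the contract expects output fields`).toEqual([]);
        return;
      }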

      for (const field of expected.requiredOutputFields) {
        expect(parsed, `Field "${field}" missing from "${expected.toolName}" response`).toHaveProperty(field);
      }
    });
  }
});

The checkInputCompatibility helper is the key function: it validates that the server's inputSchema accepts the consumer's example input. Because it checks one concrete example rather than every input the consumer might send, the example should be representative of real calls. We implement it using JSON Schema validation:

// src/contract-verification.ts
import Ajv from 'ajv';

const ajv = new Ajv({ allErrors: true });

interface CompatResult {
  compatible: boolean;
  reason?: string;
}

export function checkInputCompatibility(
  schema: Record<string, unknown>,
  exampleInput: Record<string, unknown>
): CompatResult {
  let validate: ReturnType<typeof ajv.compile>;
  try {
    validate = ajv.compile(schema);
  } catch (err) {
    return { compatible: false, reason: `Schema is not valid JSON Schema: ${err}` };
  }

  const valid = validate(exampleInput);
  if (!valid) {
    const errors = (validate.errors ?? []).map(e => `${e.instancePath} ${e.message}`).join('; ');
    return { compatible: false, reason: errors };
  }

  return { compatible: true };
}
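
One further hardening step: the verification test passes the result of JSON.parse straight into the test loop, so a malformed contract file would surface as a confusing TypeError mid-test. A small guard could fail fast with a clearer message. The sketch below uses a hypothetical parseContract helper that mirrors the McpConsumerContract interface from the consumer side.

```typescript
interface ToolContract {
  toolName: string;
  exampleInput: Record<string, unknown>;
  requiredOutputFields: string[];
}

interface McpConsumerContract {
  consumerName: string;
  providerName: string;
  generatedAt: string;
  tools: ToolContract[];
}

function isPlainObject(value: unknown): value is Record<string, unknown> {
  return typeof value === 'object' && value !== null && !Array.isArray(value);
}

function isToolContract(value: unknown): value is ToolContract {
  if (!isPlainObject(value)) return false;
  const fields = value.requiredOutputFields;
  return (
    typeof value.toolName === 'string' &&
    isPlainObject(value.exampleInput) &&
    Array.isArray(fields) &&
    fields.every((f: unknown) => typeof f === 'string')
  );
}

// Parses raw JSON and throws a descriptive error if the shape is wrong.
function parseContract(raw: string): McpConsumerContract {
  const data: unknown = JSON.parse(raw);
  if (!isPlainObject(data)) throw new Error('Contract must be a JSON object');
  for (const key of ['consumerName', 'providerName', 'generatedAt']) {
    if (typeof data[key] !== 'string') {
      throw new Error(`Contract field "${key}" must be a string`);
    }
  }
  const tools = data.tools;
  if (!Array.isArray(tools) || !tools.every(isToolContract)) {
    throw new Error('Contract field "tools" must be an array of tool contracts');
  }
  return data as unknown as McpConsumerContract;
}
```

In the verification test, parseContract(contractRaw) would stand in for the bare JSON.parse call.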

JSON Schema compatibility rules

The subtlety in MCP contract testing is that "the tool still exists" is not sufficient. What matters is whether the server's current inputSchema is backward-compatible with what the consumer sends. JSON Schema has precise rules about what constitutes a breaking change. Developers who are not familiar with schema evolution regularly introduce breaking changes while believing they are making additive improvements.

Change	Breaking?	Why
Add an optional property (`required` list unchanged)	No	Existing inputs still validate; the new property has a default or is ignored
Add a new `required` property	Yes	Existing inputs that omit the property now fail schema validation
Promote an existing optional property to `required`	Yes	Same as above — clients that omit it are now invalid
Remove a property from `properties`	Yes	Clients sending that property may fail if `additionalProperties: false`
Narrow a property type (`string | null` → `string`)	Yes	Inputs that were valid are now invalid
Widen a property type (`string` → `string | number`)	No	All previously valid inputs are still valid
Add an enum value (with open "any other" handling)	No	Existing inputs still match; new value is additive
Remove an enum value	Yes	Clients sending the removed value now fail
Rename a tool	Yes	The old name disappears from `tools/list`
Change `additionalProperties` from `true` to `false`	Yes	Clients sending extra fields are now rejected

The most common mistake is adding a required field during a "non-breaking" refactor. A developer adds a required tenant_id argument to search_documents to support multi-tenancy. The server tests all pass because they always provide tenant_id. But every client calling search_documents without it — which is all of them, because the argument did not exist before — now receives a validation error or a tool call failure. Contract testing catches this because the consumer's example input does not include tenant_id, so the provider verification step fails on checkInputCompatibility.
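
The tenant_id scenario can be reproduced in a few lines. To stay dependency-free, this sketch uses a deliberately tiny stand-in for a validator (it checks only required and additionalProperties: false) rather than Ajv; the schemas and the inputProblems helper are hypothetical.

```typescript
interface ObjectSchema {
  type: 'object';
  properties: Record<string, { type: string }>;
  required?: string[];
  additionalProperties?: boolean;
}

function has(obj: object, key: string): boolean {
  return Object.prototype.hasOwnProperty.call(obj, key);
}

// Toy check: reports required keys that are absent and, when
// additionalProperties is false, keys the schema does not declare.
function inputProblems(schema: ObjectSchema, input: Record<string, unknown>): string[] {
  const problems: string[] = [];
  for (const key of schema.required ?? []) {
    if (!has(input, key)) problems.push(`missing required property "${key}"`);
  }
  if (schema.additionalProperties === false) {
    for (const key of Object.keys(input)) {
      if (!has(schema.properties, key)) problems.push(`unexpected property "${key}"`);
    }
  }
  return problems;
}

const schemaBefore: ObjectSchema = {
  type: 'object',
  properties: { query: { type: 'string' }, limit: { type: 'number' } },
  required: ['query'],
};

// The "non-breaking" refactor: tenant_id is added and marked required.
const schemaAfter: ObjectSchema = {
  type: 'object',
  properties: { ...schemaBefore.properties, tenant_id: { type: 'string' } },
  required: ['query', 'tenant_id'],
};

// Example input as published in the consumer contract.
const contractedInput = { query: 'contract testing', limit: 5 };

const problemsBefore = inputProblems(schemaBefore, contractedInput);
const problemsAfter = inputProblems(schemaAfter, contractedInput);
console.log({ problemsBefore, problemsAfter });
```

The contracted input passes against the old schema and fails against the new one, which is the signal the provider verification step relies on.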

For output schema changes, the symmetric rule applies: the provider must continue to return at least the fields the consumer declared it reads. Removing a field from the output, even renaming it, is a breaking change. Adding new fields is safe — the consumer ignores fields it does not know about.
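
The output rule reduces to a subset check. A minimal sketch, using a hypothetical missingOutputFields helper and the { title, body } example from earlier:

```typescript
// Returns the contracted field names that are absent from a parsed response.
function missingOutputFields(
  response: Record<string, unknown>,
  requiredFields: string[]
): string[] {
  return requiredFields.filter(
    field => !Object.prototype.hasOwnProperty.call(response, field)
  );
}

const contractedFields = ['title', 'body'];

// Additive change: an extra `metadata` field, contracted fields still present.
const afterAdditiveChange = missingOutputFields(
  { title: 'T', body: 'B', metadata: {} },
  contractedFields
);

// Rename: `body` became `content`, which the consumer experiences as a removal.
const afterRename = missingOutputFields(
  { title: 'T', content: 'B', metadata: {} },
  contractedFields
);

console.log({ afterAdditiveChange, afterRename });
```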

// Detecting breaking changes programmatically before writing the contract.
// Run this as part of a server schema changelog tool.
// Covers two rules only: newly required fields and removed properties.

export function isBackwardCompatible(
  oldSchema: Record<string, unknown>,
  newSchema: Record<string, unknown>
): { compatible: boolean; breakingChanges: string[] } {
  const breakingChanges: string[] = [];

  const oldRequired = new Set<string>((oldSchema.required as string[]) ?? []);
  const newRequired = new Set<string>((newSchema.required as string[]) ?? []);

  // New required fields are always breaking
  for (const field of newRequired) {
    if (!oldRequired.has(field)) {
      breakingChanges.push(`New required field "${field}" — clients that omit it will fail`);
    }
  }

  // Removed properties are flagged unconditionally; they only reject input when additionalProperties is false
  const oldProps = Object.keys((oldSchema.properties as object) ?? {});
  const newProps = new Set(Object.keys((newSchema.properties as object) ?? {}));
  for (const prop of oldProps) {
    if (!newProps.has(prop)) {
      breakingChanges.push(`Property "${prop}" removed — clients sending it may fail with additionalProperties:false`);
    }
  }

  return { compatible: breakingChanges.length === 0, breakingChanges };
}
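
isBackwardCompatible covers only two rows of the table. The sketch below shows how a few more rows might be detected: additionalProperties tightened to false, a changed property type, and removed enum values. The function name findPropertyLevelBreaks and the restriction to flat properties with a single string type are simplifying assumptions, and the type check is conservative in that it flags every change, including safe widenings.

```typescript
type Schema = Record<string, unknown>;

function asRecord(value: unknown): Schema {
  return typeof value === 'object' && value !== null ? (value as Schema) : {};
}

function findPropertyLevelBreaks(oldSchema: Schema, newSchema: Schema): string[] {
  const breaks: string[] = [];

  // additionalProperties defaults to allowed; tightening it to false is breaking.
  if (oldSchema.additionalProperties !== false && newSchema.additionalProperties === false) {
    breaks.push('additionalProperties changed to false');
  }

  const oldProps = asRecord(oldSchema.properties);
  const newProps = asRecord(newSchema.properties);

  for (const propName of Object.keys(oldProps)) {
    if (!(propName in newProps)) continue; // removals are reported by isBackwardCompatible
    const oldProp = asRecord(oldProps[propName]);
    const newProp = asRecord(newProps[propName]);

    // Conservative: any change of a single string `type` is flagged.
    const oldType = oldProp.type;
    const newType = newProp.type;
    if (typeof oldType === 'string' && typeof newType === 'string' && oldType !== newType) {
      breaks.push(`Property "${propName}" type changed from ${oldType} to ${newType}`);
    }

    // Every value the old schema accepted must still be accepted.
    const oldEnum = oldProp.enum;
    const newEnum = newProp.enum;
    if (Array.isArray(newEnum)) {
      if (!Array.isArray(oldEnum)) {
        breaks.push(`Property "${propName}" gained an enum restriction`);
      } else {
        const allowed = new Set<unknown>(newEnum);
        for (const value of oldEnum) {
          if (!allowed.has(value)) {
            breaks.push(`Property "${propName}" no longer accepts enum value ${JSON.stringify(value)}`);
          }
        }
      }
    }
  }
  return breaks;
}
```

A real checker would also need to handle nested objects, arrays of types, and composition keywords such as oneOf, which this sketch ignores.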

CI workflow: contract exchange

Contract testing only works if the server's CI pipeline actually reads the latest contracts published by the consumer. There are three common exchange mechanisms, in increasing order of sophistication: committing the contract file to the provider repository, uploading it to S3 or a GCS bucket, or using a Pact Broker instance. The simplest approach that works across repositories is a shared S3 bucket.

The consumer workflow publishes the contract after its tests pass:

# .github/workflows/publish-contract.yml  (consumer repository)
name: Publish MCP consumer contract

on:
  push:
    branches: [main]
    paths:
      - 'src/**'
      - 'client/**'
      - 'contract-generator/**'
      - 'contracts/**'

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'npm'

      - run: npm ci

      - name: Run consumer self-consistency tests
        run: npm test -- --reporter=verbose

      - name: Generate contract file
        run: npx tsx contract-generator/src/generate-contract.ts

      - name: Upload contract to shared artifact bucket
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.CONTRACT_S3_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.CONTRACT_S3_SECRET }}
          AWS_DEFAULT_REGION: us-east-1
        run: |
          aws s3 cp \
            contracts/agent-app-v2--docs-mcp-server.json \
            s3://my-org-mcp-contracts/agent-app-v2--docs-mcp-server.json \
            --metadata "commit=${{ github.sha }},consumer=agent-app-v2"

The provider workflow downloads all contracts for its service and runs verification before every deploy:

# .github/workflows/verify-contracts.yml  (provider repository)
name: Verify MCP provider contracts

on:
  push:
    branches: [main]
  pull_request:

jobs:
  verify:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - uses: actions/setup-node@v4
        with:
          node-version: '22'
          cache: 'npm'

      - run: npm ci

      - name: Download all consumer contracts for this provider
        env:
          AWS_ACCESS_KEY_ID: ${{ secrets.CONTRACT_S3_KEY_ID }}
          AWS_SECRET_ACCESS_KEY: ${{ secrets.CONTRACT_S3_SECRET }}
          AWS_DEFAULT_REGION: us-east-1
        run: |
          mkdir -p contracts
          # Download any contract file that ends with "--docs-mcp-server.json"
          aws s3 cp s3://my-org-mcp-contracts/ contracts/ \
            --recursive \
            --exclude "*" \
            --include "*--docs-mcp-server.json"

      - name: Run contract verification tests
        env:
          CONTRACT_DIR: ./contracts
        run: npm test -- --reporter=verbose src/contract-verification.test.ts

      - name: Block deploy if verification fails
        if: failure()
        run: |
          echo "::error::Contract verification failed. One or more consumers depend on a tool schema you changed."
          exit 1

  deploy:
    needs: [verify]
    runs-on: ubuntu-latest
    if: github.ref == 'refs/heads/main'
    steps:
      - run: echo "Deploying — all contracts verified"
      # ... your actual deploy steps

The needs: [verify] dependency on the deploy job is the enforcement gate. A server that breaks a consumer contract cannot reach production. The consumer's published contract is the binding agreement, and the provider's CI is the place where that agreement is enforced.
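
One gap worth noting: the verification test shown earlier loads a single, hardcoded contract file, whereas the workflow downloads every file ending in --docs-mcp-server.json. A sketch of enumerating them, so the test can loop over all consumers, follows. The listContractFiles helper is hypothetical and relies on the consumer--provider.json naming convention used above; the temporary directory exists only to demonstrate it.

```typescript
import { mkdtempSync, readdirSync, writeFileSync } from 'node:fs';
import { tmpdir } from 'node:os';
import { join } from 'node:path';

// Returns the paths of every contract in `dir` addressed to `providerName`.
function listContractFiles(dir: string, providerName: string): string[] {
  const suffix = `--${providerName}.json`;
  return readdirSync(dir)
    .filter(file => file.endsWith(suffix))
    .sort()
    .map(file => join(dir, file));
}

// Demonstration against a throwaway directory.
const demoDir = mkdtempSync(join(tmpdir(), 'contracts-'));
const demoFiles = [
  'agent-app-v2--docs-mcp-server.json',
  'cli-tool--docs-mcp-server.json',
  'agent-app-v2--billing-mcp-server.json', // different provider, should be skipped
];
for (const file of demoFiles) {
  writeFileSync(join(demoDir, file), '{}');
}

const docsContracts = listContractFiles(demoDir, 'docs-mcp-server');
console.log(docsContracts);
```

In the Vitest file, the describe block could then be generated once per returned path, with CONTRACT_DIR supplying the directory.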

Contract tests vs AliveMCP monitoring

Contract testing and runtime monitoring solve different problems, and both are required for production reliability.

Concern	Contract tests	AliveMCP monitoring
Breaking schema changes	Detected at build time, before deploy	Not detected — schema compatibility is not checked at runtime
Server is up and reachable	Not checked — tests run in-process	Probed every 60 seconds via network
MCP protocol handshake succeeds	Tested in-process via InMemoryTransport	Tested over real network to production URL
tools/list returns expected tools	Verified against contract at build time	Verified against expected set in monitor config
Deployment failure (server crash on startup)	Not detected	Detected within 60 seconds via health check failure
Dependency outage (DB down, external API unreachable)	Not detected	Detected if tool calls return errors or server fails liveness checks
Gradual performance degradation	Not detected	Tracked via response time metrics and alerting

The split is clean: contract tests are your build-time guarantee that your server can still fulfill its client obligations. AliveMCP is your runtime guarantee that the server is actually running and reachable. A server can pass every contract test in CI and then fail to start in production because of a missing environment variable or a database migration that did not complete. AliveMCP catches that within a minute and pages your on-call rotation.

The combination eliminates two distinct failure modes: silent schema drift (caught by contracts before deploy) and silent runtime failures (caught by AliveMCP after deploy). Neither layer substitutes for the other. See also MCP server health checks for the endpoint patterns AliveMCP uses to probe your server.