MCP server property-based testing
Example-based tests verify specific inputs you thought to write down. Property-based tests verify claims that must hold for every input in a defined domain — and they generate hundreds of random inputs automatically to check them. For MCP tool handlers, property testing is particularly valuable because LLM clients send unpredictable arguments: empty strings, very long text, Unicode emoji, nulls inside objects, integers at the boundaries of 32-bit ranges. A handler that works on your three example inputs can still crash on input 47. fast-check is the standard property-testing library for TypeScript/JavaScript; this guide covers how to integrate it with Vitest, derive arbitraries from Zod schemas, and verify the four invariants every MCP handler must satisfy.
TL;DR
Install fast-check. Write fc.assert(fc.property(arbitrary, (args) => { /* assert invariant */ })) rather than one it() per example, and use fc.asyncProperty when the check awaits a tool call. The four invariants worth testing: (1) the handler never throws and always returns a valid CallToolResult, (2) when content[0].type === 'text' and the content is supposed to be structured JSON, JSON.parse() succeeds, (3) read-only tools are idempotent: the same args produce the same output, (4) error results always have isError: true, content[0].type === 'text', and a non-empty message. When fast-check finds a failing input it automatically shrinks it to the simplest still-failing example; read that shrunk counterexample first, not the original random input.
What property tests catch that example tests miss
Consider a search_documents tool that accepts a query string and a limit integer. An example-based test suite might look like this:
// Example-based — only tests what you thought to write
it('returns results for a normal query', async () => {
const result = await client.callTool({
name: 'search_documents',
arguments: { query: 'typescript', limit: 10 },
});
expect(result.isError).toBeFalsy();
});
it('returns isError when limit is zero', async () => {
const result = await client.callTool({
name: 'search_documents',
arguments: { query: 'typescript', limit: 0 },
});
expect(result.isError).toBe(true);
});
These pass. But the handler contains this bug buried inside a helper:
// Inside the handler — a subtle crash waiting to happen
function buildSqlLike(query: string): string {
// Interpolates the query unescaped: a backslash followed by a quote yields malformed SQL, and the database call throws
return `'%${query}%'`;
}
A property test can catch it without anyone having to think of that input in advance:
import fc from 'fast-check';
it('never throws for any valid string query and positive limit', async () => {
await fc.assert(
fc.asyncProperty(
fc.string(), // any string — including '', '\0', '\\\'', emoji
fc.integer({ min: 1, max: 1000 }),
async (query, limit) => {
const result = await client.callTool({
name: 'search_documents',
arguments: { query, limit },
});
// Invariant: always returns a result object — never throws
expect(result).toHaveProperty('content');
expect(Array.isArray(result.content)).toBe(true);
}
),
{ numRuns: 200 }
);
});
fast-check generates 200 random (query, limit) pairs. Sooner or later one of them contains a sequence the helper mishandles, such as a backslash followed by a quote; the handler throws instead of returning a result, and the test fails. How quickly that happens depends on the seed, so do not expect it within a fixed number of runs. fast-check then shrinks the failing input: the final reported counterexample might be query: "\\'", the minimal string that reproduces the crash, rather than the original 47-character random string that first triggered it.
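Once the counterexample points at buildSqlLike, one possible repair is to escape the LIKE wildcards and pass the pattern to the database driver as a bound parameter instead of interpolating it. The sketch below is only an illustration: it assumes a SQLite-style `?` placeholder and a column named `content`, and `escapeLikePattern` and `buildLikeClause` are hypothetical names, not part of the original handler.

```typescript
// Hypothetical helper: escape the characters that LIKE treats specially
// (backslash, percent, underscore) by prefixing each with a backslash.
function escapeLikePattern(query: string): string {
  return query.replace(/[\\%_]/g, '\\$&');
}

// Hypothetical replacement for buildSqlLike. The user-supplied text never
// becomes part of the SQL string; it travels as a bound parameter.
// Assumption: the driver accepts `?` placeholders and the column is `content`.
function buildLikeClause(query: string): { sql: string; params: string[] } {
  return {
    sql: "content LIKE ? ESCAPE '\\'",
    params: [`%${escapeLikePattern(query)}%`],
  };
}
```

With this shape, quotes in the query are harmless because they are never parsed as SQL, and `%` or `_` typed by the user match literally rather than acting as wildcards.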
The inputs that most often expose bugs in MCP handlers: empty string (""), string with only whitespace (" "), string containing SQL metacharacters (%_\), very long string (50,000 characters), Unicode combining characters and right-to-left marks, null bytes ("\0"), integers at Number.MAX_SAFE_INTEGER, negative integers where only positive is documented, and arrays with zero elements.
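The list above can be written down once as a small corpus, so these values are always exercised instead of being left to chance. The constant names `EDGE_CASE_STRINGS` and `EDGE_CASE_INTEGERS` are hypothetical, and the specific values are just one reasonable choice for each category.

```typescript
// Hypothetical corpus of strings that frequently break tool handlers.
const EDGE_CASE_STRINGS: readonly string[] = [
  '',                   // empty string
  '   ',                // whitespace only
  '%_\\',               // SQL LIKE metacharacters
  "\\'",                // backslash followed by a quote
  'a'.repeat(50_000),   // very long string
  'e\u0301',            // "e" followed by a combining acute accent
  '\u200Fabc\u202E',    // right-to-left mark and right-to-left override
  'before\0after',      // embedded null byte
];

// Hypothetical corpus of integers at common boundaries.
const EDGE_CASE_INTEGERS: readonly number[] = [
  0,
  -1,
  2 ** 31 - 1,              // largest signed 32-bit integer
  -(2 ** 31),               // smallest signed 32-bit integer
  Number.MAX_SAFE_INTEGER,
  Number.MIN_SAFE_INTEGER,
];
```

In a property test, a corpus like this can be mixed with random generation, for example fc.oneof(fc.constantFrom(...EDGE_CASE_STRINGS), fc.string()), so that every run covers the known troublemakers as well as fresh inputs.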
Setting up fast-check with Vitest
npm install --save-dev fast-check
fast-check works with both Jest and Vitest without any additional configuration — it is a plain TypeScript library that exports fc.assert() and fc.property(). For async tool calls, use fc.asyncProperty():
// src/search.property.test.ts
import { describe, it, beforeEach, afterEach } from 'vitest';
import fc from 'fast-check';
import { Client } from '@modelcontextprotocol/sdk/client/index.js';
import { InMemoryTransport } from '@modelcontextprotocol/sdk/inMemory.js';
import { createServer } from './server.js';
import { createFakeDeps } from './test-helpers.js';
describe('search_documents — property invariants', () => {
let client: Client;
beforeEach(async () => {
const [serverTransport, clientTransport] = InMemoryTransport.createLinkedPair();
const server = createServer(createFakeDeps());
await server.connect(serverTransport);
client = new Client({ name: 'prop-test-client', version: '1.0.0' }, { capabilities: {} });
await client.connect(clientTransport);
});
afterEach(async () => {
await client.close();
});
it('never crashes for any string query, including the empty string', async () => {
await fc.assert(
fc.asyncProperty(
fc.string({ minLength: 0, maxLength: 10_000 }),
fc.integer({ min: 1, max: 500 }),
async (query, limit) => {
// Must not throw — always return a result
const result = await client.callTool({
name: 'search_documents',
arguments: { query, limit },
});
return Array.isArray(result.content) && result.content.length > 0;
}
),
{ numRuns: 300, verbose: true }
);
});
});
The verbose: true option makes the failure report also list every failing value fast-check encountered while shrinking; the seed and the final counterexample are reported either way. In CI you can leave it off; during local debugging it is invaluable.
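Because numRuns, verbose, and seed are ordinary options passed to fc.assert, they can be derived from the environment, so CI and local runs differ without editing tests. The following is a minimal sketch: the function name `propertyRunParams` and the variables `FC_NUM_RUNS` and `FC_SEED` are hypothetical, and the default run counts are arbitrary.

```typescript
// Subset of fast-check's run options used in this guide.
interface PropertyRunParams {
  numRuns: number;
  verbose: boolean;
  seed?: number;
}

// Hypothetical helper: build run options from environment variables.
// FC_NUM_RUNS and FC_SEED are made-up names; CI is set by most CI providers.
function propertyRunParams(env: Record<string, string | undefined>): PropertyRunParams {
  const isCi = env['CI'] === 'true';
  const parsedRuns = Number.parseInt(env['FC_NUM_RUNS'] ?? '', 10);
  const parsedSeed = Number.parseInt(env['FC_SEED'] ?? '', 10);

  const params: PropertyRunParams = {
    // Explicit run count wins; otherwise run more iterations in CI than locally.
    numRuns: Number.isNaN(parsedRuns) ? (isCi ? 300 : 100) : parsedRuns,
    // Extra shrinking detail is mostly useful at a developer's terminal.
    verbose: !isCi,
  };
  // Only pin the seed when one was supplied, e.g. to replay a reported failure.
  if (!Number.isNaN(parsedSeed)) params.seed = parsedSeed;
  return params;
}
```

A test would then call fc.assert(property, propertyRunParams(process.env)), and a failure reported with a given seed could be replayed by setting FC_SEED to that value.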
Generating valid MCP tool arguments from a Zod schema
If your tool validates arguments with Zod, you want to generate inputs that satisfy the schema — not random garbage. fast-check does not read Zod schemas automatically, but you can build a matching arbitrary with a small helper:
import fc, { Arbitrary } from 'fast-check';
import { z } from 'zod';
// Maps a flat Zod object schema to a fast-check arbitrary
function arbitraryFromZodObject<T extends z.ZodRawShape>(
  shape: T
): Arbitrary<z.infer<z.ZodObject<T>>> {
  const fields: Record<string, Arbitrary<unknown>> = {};
  for (const [key, schema] of Object.entries(shape)) {
    fields[key] = arbitraryFromZodType(schema);
  }
  return fc.record(fields) as Arbitrary<z.infer<z.ZodObject<T>>>;
}
function arbitraryFromZodType(schema: z.ZodTypeAny): Arbitrary<unknown> {
  if (schema instanceof z.ZodString) {
    // Honour .min()/.max() so generated strings pass validation
    // (assumes Zod's minLength/maxLength getters, which return null when unset)
    return fc.string({ minLength: schema.minLength ?? 0, maxLength: schema.maxLength ?? 200 });
  }
  if (schema instanceof z.ZodNumber) {
    // Honour .int()/.min()/.max(); the ±1,000,000 defaults are arbitrary bounds
    const min = schema.minValue ?? -1_000_000;
    const max = schema.maxValue ?? 1_000_000;
    return schema.isInt
      ? fc.integer({ min: Math.ceil(min), max: Math.floor(max) })
      : fc.double({ min, max, noNaN: true });
  }
  if (schema instanceof z.ZodBoolean) return fc.boolean();
  if (schema instanceof z.ZodEnum) return fc.constantFrom(...(schema.options as string[]));
  if (schema instanceof z.ZodArray) return fc.array(arbitraryFromZodType(schema.element), { maxLength: 20 });
  if (schema instanceof z.ZodOptional) return fc.option(arbitraryFromZodType(schema.unwrap()), { nil: undefined });
  if (schema instanceof z.ZodObject) return arbitraryFromZodObject(schema.shape);
  // Fallback for unsupported types: always generates null (runs are NOT skipped),
  // so extend this function before relying on it for other Zod types
  return fc.constant(null);
}
Use it to generate valid arguments directly from your existing schema:
const SearchArgsSchema = z.object({
query: z.string().min(1).max(500),
limit: z.number().int().min(1).max(100),
category: z.enum(['docs', 'issues', 'prs']).optional(),
});
const validArgsArbitrary = arbitraryFromZodObject(SearchArgsSchema.shape);
Four key invariants to test in every MCP handler
1. Never throws — always returns a valid CallToolResult
A handler that throws causes the MCP SDK to emit a JSON-RPC error response, which is a different shape from a tool result. LLM clients handle these differently, and often not gracefully. The invariant is that for any well-formed input, the handler returns an object with a content array — it never throws.
it('invariant: never throws for any valid args', async () => {
await fc.assert(
fc.asyncProperty(validArgsArbitrary, async (args) => {
let result: Awaited<ReturnType<typeof client.callTool>>;
try {
result = await client.callTool({ name: 'search_documents', arguments: args });
} catch (e) {
// A thrown error means the handler threw — this is the failure
throw new Error(`Handler threw instead of returning isError:true — ${String(e)}`);
}
// Must have content array
if (!Array.isArray(result.content) || result.content.length === 0) {
throw new Error('Result has no content array');
}
}),
{ numRuns: 300 }
);
});
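On the handler side, one way to make this invariant hold by construction is to wrap every handler so that a thrown error is converted into an isError result. The sketch below uses local stand-in types rather than the SDK's own, and `withErrorResult` and `toErrorResult` are hypothetical helper names; treat it as an illustration of the idea, not as the SDK's mechanism.

```typescript
// Local stand-ins for the result shape (assumption: text-only content).
interface TextBlock { type: 'text'; text: string }
interface ToolResult { content: TextBlock[]; isError?: boolean }
type ToolHandler<A> = (args: A) => Promise<ToolResult>;

// Hypothetical helper: turn any thrown value into a well-formed error result.
function toErrorResult(error: unknown): ToolResult {
  const raw = error instanceof Error ? error.message : String(error);
  // Never return an empty message; the LLM needs something to act on.
  const text = raw.trim().length > 0 ? raw : 'Unknown error';
  return { content: [{ type: 'text', text }], isError: true };
}

// Hypothetical wrapper: the returned handler resolves with a result
// even when the wrapped handler throws or rejects.
function withErrorResult<A>(handler: ToolHandler<A>): ToolHandler<A> {
  return async (args: A) => {
    try {
      return await handler(args);
    } catch (error) {
      return toErrorResult(error);
    }
  };
}
```

The property test above remains worthwhile even with such a wrapper, because it also exercises code paths outside the wrapped function, such as argument validation and serialisation.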
2. Structured text content is always valid JSON
Many MCP tools return structured data serialised as a JSON string inside a text content block. If your handler sometimes produces truncated or malformed JSON, the LLM cannot parse it. The property: whenever content[0].type === 'text' and the result is not an error, JSON.parse(content[0].text) must succeed.
it('invariant: non-error text content is always valid JSON', async () => {
await fc.assert(
fc.asyncProperty(validArgsArbitrary, async (args) => {
const result = await client.callTool({ name: 'search_documents', arguments: args });
if (result.isError) return; // error path is tested separately
const block = result.content[0];
if (block.type !== 'text') return; // image/resource blocks are exempt
try {
JSON.parse((block as { type: string; text: string }).text);
} catch {
throw new Error(
`Non-error text content is not valid JSON.\nArgs: ${JSON.stringify(args)}\nContent: ${(block as { type: string; text: string }).text.slice(0, 200)}`
);
}
}),
{ numRuns: 300 }
);
});
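One common source of malformed JSON is cutting an already-serialised string at a character limit. A possible alternative, sketched below, is to drop whole items until the serialised payload fits. The function name `serializeWithinLimit` and the `{ items, truncated }` envelope are assumptions made for illustration, and the items are assumed to be JSON-serialisable.

```typescript
// Hypothetical helper: serialise a result list so the output is always
// valid JSON, dropping trailing items rather than slicing the string.
function serializeWithinLimit(items: unknown[], maxChars: number): string {
  let kept = items.length;
  let text = JSON.stringify({ items, truncated: false });

  // Remove one item at a time until the payload fits. This is quadratic in
  // the worst case; a binary search over `kept` would be faster for big lists.
  while (text.length > maxChars && kept > 0) {
    kept -= 1;
    text = JSON.stringify({ items: items.slice(0, kept), truncated: true });
  }
  // If even the empty envelope exceeds maxChars, it is still returned whole:
  // valid JSON is preferred over honouring the limit exactly.
  return text;
}
```

The `truncated` flag gives the LLM a signal that more results exist, which a silently shortened string cannot do.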
3. Read tools are idempotent — same input, same output
A read-only tool (one that does not mutate state) should return identical content for identical arguments on consecutive calls. If it does not — if the order of results changes, or a timestamp leaks in — the LLM gets inconsistent answers when it retries. The property: calling the tool twice with the same args returns the same content[0].text.
it('invariant: read tool returns same result on repeated calls', async () => {
await fc.assert(
fc.asyncProperty(validArgsArbitrary, async (args) => {
const first = await client.callTool({ name: 'search_documents', arguments: args });
const second = await client.callTool({ name: 'search_documents', arguments: args });
if (first.isError !== second.isError) {
throw new Error('isError differs between calls with same args');
}
const text1 = (first.content[0] as { type: string; text: string }).text;
const text2 = (second.content[0] as { type: string; text: string }).text;
if (text1 !== text2) {
throw new Error(
`Non-deterministic output for args: ${JSON.stringify(args)}\nFirst: ${text1.slice(0, 100)}\nSecond: ${text2.slice(0, 100)}`
);
}
}),
{ numRuns: 150 } // fewer runs — two calls per iteration
);
});
If your tool has inherent non-determinism (random IDs, current time), inject a controllable clock and a seeded random source via your dependency injection layer so the invariant holds in tests.
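A controllable clock and a seeded random source can be very small. The sketch below assumes a hypothetical `Deps` shape with `now` and `random` members (the real shape returned by createFakeDeps is not shown in this guide) and uses a mulberry32-style generator as the seeded source.

```typescript
// Hypothetical dependency shape; adapt to whatever your handlers receive.
interface Deps {
  now: () => Date;
  random: () => number;
}

// Small seeded pseudo-random generator (mulberry32-style).
// The same seed always yields the same sequence of numbers in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Deterministic deps: time only moves when the test calls advance().
function createDeterministicDeps(
  seed = 42,
  startMs = Date.UTC(2024, 0, 1),
): Deps & { advance: (ms: number) => void } {
  let currentMs = startMs;
  return {
    now: () => new Date(currentMs),
    random: mulberry32(seed),
    advance: (ms: number) => { currentMs += ms; },
  };
}
```

With dependencies like these, two consecutive calls in the idempotency test see the same time and, if the server is rebuilt with the same seed, the same random sequence.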
4. Error results are always well-formed
When a handler encounters an error it cannot recover from (invalid arguments, upstream failure), it should return a result with isError: true, a content array with at least one text block, and a message with enough detail for the LLM to understand what went wrong. A malformed error result — empty content, missing isError, numeric text — is almost as bad as a thrown error.
// Arbitraries that SHOULD trigger the error path — violate schema constraints
const invalidArgsArbitrary = fc.oneof(
fc.record({ query: fc.constant(''), limit: fc.integer({ min: 1, max: 10 }) }),
fc.record({ query: fc.string(), limit: fc.integer({ min: -1000, max: 0 }) }),
fc.record({ query: fc.string({ maxLength: 10_000 }), limit: fc.constant(9999) }),
);
it('invariant: error results are always well-formed', async () => {
await fc.assert(
fc.asyncProperty(invalidArgsArbitrary, async (args) => {
const result = await client.callTool({ name: 'search_documents', arguments: args });
if (!result.isError) return; // handler accepted the input — fine, not an error path
// isError: true must come with a text content block
const block = result.content[0] as { type: string; text: string } | undefined;
if (!block) throw new Error('isError result has empty content array');
if (block.type !== 'text') throw new Error(`isError result has non-text content[0]: ${block.type}`);
if (!block.text || block.text.trim().length === 0) throw new Error('isError result has empty message');
}),
{ numRuns: 200 }
);
});
Shrinking and MCP debugging
Shrinking is fast-check's most valuable feature for MCP debugging. When a property fails, fast-check does not just report the first random input that caused it; it iteratively reduces that input, trying smaller variants and re-running the property, until it finds the simplest input that still fails. For string arbitraries, shrinking means the string gets shorter. For records, optional fields get dropped and the remaining values collapse toward zero or empty. For arrays, elements are removed.
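To make the idea concrete, here is a deliberately simplified greedy shrinker for strings. It is not fast-check's actual algorithm, which is considerably more sophisticated; `shrinkString` is a hypothetical name and the sketch only illustrates the "remove a part, retry, keep it if it still fails" loop.

```typescript
// Toy shrinker: repeatedly delete one character as long as the property
// still fails on the shorter string. Assumes stillFails(input) is true.
function shrinkString(input: string, stillFails: (candidate: string) => boolean): string {
  let current = input;
  let progressed = true;
  while (progressed) {
    progressed = false;
    for (let i = 0; i < current.length; i++) {
      // Candidate with the character at position i removed
      const candidate = current.slice(0, i) + current.slice(i + 1);
      if (stillFails(candidate)) {
        current = candidate; // smaller input still fails: keep it and restart
        progressed = true;
        break;
      }
    }
  }
  return current; // no single-character deletion fails any more
}
```

For a failure that depends on a null byte, calling shrinkString('hello\0world', (s) => s.includes('\0')) reduces the input to the single character '\0', which mirrors how a shrunk counterexample isolates the one character that matters.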
A real example. fast-check finds that the search_documents handler crashes. The original failing input is:
// Original failing input — random, hard to read
{
query: "hello\0world\u200Ftest ",
limit: 47
}
After shrinking, fast-check reports:
// Shrunk counterexample — minimal, diagnostic
{
query: "\0",
limit: 1
}
// Counterexample seed — paste into your test to reproduce exactly
fc.assert(fc.property(...), { seed: 1718272841, path: "3:1:0", endOnFailure: true })
The shrunk counterexample tells you immediately: the bug involves a null byte in the query string, not Unicode RTL marks, not long strings. You open the handler and find that query.trim() does not remove