How to Build a Production MCP Server (I Added One to My Site)
How to build a production MCP server: a hands-on guide to JSON-RPC, the Streamable HTTP transport, tools, and discovery — from one I shipped on Cloudflare.

Most sites are built for humans to read and for crawlers to scrape. But the agents showing up now — Claude, ChatGPT, Cursor — don’t want your HTML. They want to call you. Parsing a page to extract three facts is wasteful and fragile; calling a typed tool that returns those three facts is neither.
That’s what the Model Context Protocol (MCP) is for. And the fastest way to understand it is to build one. So I added a production MCP server to this site — it lets an agent search my posts, fetch one as clean Markdown, list my topic hubs, and read my profile — and this is exactly how I did it, with the real code.
No framework, no database, about 300 lines on a Cloudflare Worker.
TL;DR
- An MCP server exposes tools (functions) that AI agents call over JSON-RPC 2.0 — turning your site from agent-readable into agent-callable.
- Use the Streamable HTTP transport: one endpoint, POST /mcp, that speaks JSON-RPC. A stateless server that returns plain JSON is fully spec-compliant and the easiest to run.
- You need exactly four method handlers: initialize, tools/list, tools/call, and ping, plus a no-op for notifications.
- You don’t need new infrastructure. Back your tools with assets you already publish (a JSON feed, your Markdown pages). One source of truth, nothing to sync.
- Make it discoverable with a manifest at a well-known URL, an entry in your API catalog, and a Link header.
What is an MCP server?
An MCP server is a small service that exposes tools an AI agent can invoke over a standard protocol. The protocol is JSON-RPC 2.0; the “tools” are named functions with a JSON-Schema for their arguments. When an agent connects, it asks the server “what can you do?” (tools/list), gets back a list of tools, then calls them (tools/call) and receives structured results.
Think of it as a typed API designed specifically for language models. Where a REST API is built for your frontend, an MCP server is built for an agent’s reasoning loop: the descriptions are written for a model to read, the inputs are schema-validated, and errors are reported in a way the model can recover from.
💡 Key insight: REST is for your app. MCP is for the agent. The difference isn’t the wire format — it’s that every field is written to be understood by a model, not a developer.
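To make that exchange concrete, here is a rough sketch of the messages in one tools/call round trip. The type names (JsonRpcRequest, ToolResult) and the sample values are illustrative assumptions, not taken from the server described in this post; only the field names follow JSON-RPC 2.0 and the result shape used later on.

```typescript
// Illustrative type names; only the field names come from JSON-RPC 2.0 / MCP.
type JsonRpcRequest = {
  jsonrpc: '2.0';
  id?: number | string; // absent on notifications
  method: string;
  params?: Record<string, unknown>;
};

type ToolResult = {
  content: { type: 'text'; text: string }[];
  isError: boolean;
};

// What an agent sends to invoke a tool (sample values are made up).
const callRequest: JsonRpcRequest = {
  jsonrpc: '2.0',
  id: 2,
  method: 'tools/call',
  params: { name: 'search_posts', arguments: { query: 'RAG', limit: 3 } },
};

// What the server sends back: the same id, plus a result the model can read.
const callResponse: { jsonrpc: '2.0'; id: number | string; result: ToolResult } = {
  jsonrpc: '2.0',
  id: 2,
  result: {
    content: [{ type: 'text', text: 'Example result text returned by the tool' }],
    isError: false,
  },
};

console.log(JSON.stringify(callRequest));
console.log(JSON.stringify(callResponse));
```

The id is what ties a response to its request, which is why a message without one (a notification) cannot receive a reply.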
Why build one for your own site
Search and chat are moving inside agents. When someone asks Claude or ChatGPT about a topic you’ve written about, the model is far more likely to use you well if it can call a search_posts tool than if it has to guess your URL structure and scrape rendered HTML.
Three concrete wins:
- Precision over scraping. A tool returns exactly the fields the agent needs — title, URL, summary — with no markup noise.
- You control the surface. You decide what’s callable and what each tool returns. That’s a far stronger signal than hoping a crawler parses your page correctly.
- It compounds with the rest of your AI-readiness. An MCP server sits naturally alongside llms.txt, structured data, and an API catalog as part of making your site first-class for agents.
It will not, on its own, make every agent “pick” your site; that still depends on relevance and authority. But it removes the main technical obstacles to an agent using you well.
What we’re building
Four read-only tools:
| Tool | What it does | Backed by |
|---|---|---|
| search_posts | Ranked search over blog posts | a JSON feed I already publish |
| get_post | Returns one post as clean Markdown | prerendered /blog/<slug>.md |
| list_topics | Lists curated topic hubs | a small constant |
| get_profile | Returns the author profile | my existing llms.txt |
The whole thing runs on a Cloudflare Worker as a stateless JSON-RPC handler. Stateless matters: with no session to track, every request is self-contained, which is the simplest possible thing to host and scale.
Step 1 — The transport
MCP defines two transports. For local tools you use stdio; for a remote server you use Streamable HTTP — a single endpoint that accepts JSON-RPC messages over POST. The spec lets the server reply with either an SSE stream or a plain JSON body. A read-only server has no streaming notifications to push, so plain JSON is the right call and the simplest.
Every MCP message is JSON-RPC 2.0. Two tiny helpers cover all our responses:
    function rpcResult(id: unknown, result: unknown) {
      return { jsonrpc: '2.0', id, result };
    }

    function rpcError(id: unknown, code: number, message: string) {
      return { jsonrpc: '2.0', id, error: { code, message } };
    }

The endpoint parses the POST body, routes on method, and returns the JSON-RPC response. Requests carry an id; notifications don’t, and a notification gets no response body, just a 202 Accepted.
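The post describes the endpoint in words only, so here is a minimal sketch of what that HTTP layer might look like. The names handleMcpRequest, RpcHandler, and isRpcMessage are hypothetical, the router is passed in as a parameter to keep the example self-contained, and answering a malformed body with HTTP 400 is an assumption rather than something the post states. Batched requests are not handled.

```typescript
type RpcMessage = { jsonrpc: '2.0'; id?: number | string | null; method: string; params?: unknown };
type RpcHandler = (msg: RpcMessage) => Promise<object | null>;

const json = (body: unknown, status = 200) =>
  new Response(JSON.stringify(body), {
    status,
    headers: { 'Content-Type': 'application/json' },
  });

// Narrow an arbitrary parsed body to something the router can work with.
function isRpcMessage(value: unknown): value is RpcMessage {
  return (
    typeof value === 'object' &&
    value !== null &&
    typeof (value as { method?: unknown }).method === 'string'
  );
}

// Hypothetical HTTP wrapper around a JSON-RPC router passed in as `handle`.
async function handleMcpRequest(request: Request, handle: RpcHandler): Promise<Response> {
  if (request.method !== 'POST') {
    return new Response(null, { status: 405, headers: { Allow: 'POST' } });
  }

  let parsed: unknown;
  try {
    parsed = await request.json();
  } catch {
    // Unparseable body: JSON-RPC parse error. The id is unknown, so it is null.
    return json({ jsonrpc: '2.0', id: null, error: { code: -32700, message: 'Parse error' } }, 400);
  }
  if (!isRpcMessage(parsed)) {
    return json({ jsonrpc: '2.0', id: null, error: { code: -32600, message: 'Invalid Request' } }, 400);
  }

  const response = await handle(parsed);
  // Notifications carry no id and get no body, only 202 Accepted.
  const isNotification = parsed.id === undefined || parsed.id === null;
  if (isNotification || response === null) {
    return new Response(null, { status: 202 });
  }
  return json(response);
}
```

Keeping the HTTP concerns (verb, parsing, status codes) in this wrapper means the router itself only ever sees well-formed messages.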
Step 2 — Define your tools
A tool is metadata plus an input schema. The description is not for you — it’s the prompt the model reads to decide whether and how to call the tool. Write it like you’re briefing a smart colleague who can’t see your code:
    const TOOLS = [
      {
        name: 'search_posts',
        title: 'Search blog posts',
        description:
          'Full-text search across the blog (titles, summaries, tags). Returns matching ' +
          'posts with slug, title, URL, summary, tags and publish date. Use for topics ' +
          'like AI engineering, LLMs, RAG, Claude Code, or web development.',
        inputSchema: {
          type: 'object',
          properties: {
            query: { type: 'string', description: 'Search terms.' },
            limit: { type: 'integer', description: 'Max results (default 10, max 30).' }
          },
          required: ['query']
        }
      }
      // get_post, list_topics, get_profile ...
    ];

💡 Key insight: Tool descriptions are prompt engineering. A vague description means the model calls the wrong tool or skips it. Spell out when to use it and what it returns.
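One detail in the schema above is worth spelling out: the default of 10 and the cap of 30 for limit live only in the description text, which the model reads but nothing enforces. A small guard like the one below is one way to enforce them on the server. The function name parseSearchArgs and the 200-character query cap are assumptions made for illustration.

```typescript
type SearchArgs = { query: string; limit: number };

// Hypothetical guard for search_posts arguments. The default (10) and cap (30)
// come from the tool's schema description; the 200-character query cap is an assumption.
function parseSearchArgs(args: unknown): SearchArgs {
  if (typeof args !== 'object' || args === null) {
    throw new Error('Arguments must be an object with a "query" string.');
  }
  const { query, limit } = args as { query?: unknown; limit?: unknown };

  if (typeof query !== 'string' || query.trim() === '') {
    throw new Error('"query" is required and must be a non-empty string.');
  }
  if (limit !== undefined && (typeof limit !== 'number' || !Number.isInteger(limit))) {
    throw new Error('"limit" must be an integer.');
  }

  // Fall back to the documented default, then clamp into the documented range.
  const requested = typeof limit === 'number' ? limit : 10;
  return {
    query: query.trim().slice(0, 200),
    limit: Math.min(Math.max(requested, 1), 30),
  };
}
```

The error messages are written for the model as much as for a developer, since a thrown message can be passed back to the agent as the tool result.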
Step 3 — Back tools with data you already have
This is the part most tutorials overcomplicate. You don’t need a database. I back every tool with assets the site already prerenders:
- search_posts fetches my existing /feed.json (a JSON Feed of every post) and ranks it.
- get_post fetches the already-generated /blog/<slug>.md Markdown variant.
- get_profile returns my llms.txt.
On a Cloudflare Worker you reach those via the assets binding, so there’s one source of truth and nothing to keep in sync:
    // assets: the Worker's static-assets binding; origin: the site origin used to build URLs.
    async function searchPosts(assets, origin, query, limit) {
      const res = await assets.fetch(new URL('/feed.json', origin));
      if (!res.ok) throw new Error('Post index unavailable');
      const { items = [] } = await res.json();
      // Split the query on whitespace and drop empty terms.
      const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
      return items
        .map((item) => {
          // Guard against a missing summary so "undefined" never ends up in the haystack.
          const hay = `${item.title} ${(item.tags || []).join(' ')} ${item.summary || ''}`.toLowerCase();
          // weight title hits over tags over summary
          const score = terms.reduce((s, t) => s + (item.title.toLowerCase().includes(t) ? 3 : 0)
            + ((item.tags || []).join(' ').toLowerCase().includes(t) ? 2 : 0)
            + (hay.includes(t) ? 1 : 0), 0);
          return { item, score };
        })
        .filter((x) => x.score > 0)
        .sort((a, b) => b.score - a.score)
        .slice(0, limit)
        .map(({ item }) => ({ title: item.title, url: item.url, summary: item.summary }));
    }

Always validate inputs before using them. get_post takes a slug straight from the model, so it gets a strict regex check before it ever touches a path:
    const SLUG_RE = /^[a-z0-9][a-z0-9-]{0,120}$/;
    if (!SLUG_RE.test(slug)) {
      throw new Error(`Invalid slug "${slug}". Use a slug from search_posts.`);
    }

Step 4 — Handle the protocol
The router is small. Four real methods, plus notification handling:
    async function handleRpc(msg, assets, origin) {
      const { id, method, params } = msg;
      const isNotification = id === undefined || id === null;
      switch (method) {
        case 'initialize':
          return rpcResult(id, {
            protocolVersion: '2025-06-18',
            capabilities: { tools: { listChanged: false } },
            serverInfo: { name: 'my-site', version: '1.0.0' },
            instructions: 'Tools for querying my blog and profile.'
          });
        case 'ping':
          return rpcResult(id, {});
        case 'tools/list':
          return rpcResult(id, { tools: TOOLS });
        case 'tools/call': {
          const { name, arguments: args = {} } = params || {};
          try {
            const text = await callTool(assets, origin, name, args);
            return rpcResult(id, { content: [{ type: 'text', text }], isError: false });
          } catch (err) {
            // Report tool errors IN-BAND so the model can see and react to them.
            return rpcResult(id, { content: [{ type: 'text', text: err.message }], isError: true });
          }
        }
        default:
          if (isNotification) return null; // ignore unknown notifications
          return rpcError(id, -32601, `Method not found: ${method}`);
      }
    }

Three things people get wrong here, and they all live in this function:
- initialize must echo a protocolVersion the client understands and declare your capabilities. Skip it and the handshake fails before any tool runs.
- Tool failures are not protocol errors. A bad slug returns a normal result with isError: true and a message, not a JSON-RPC error, so the model can read the failure and retry. Reserve error (-32601, -32700, etc.) for protocol-level failures such as an unknown method or unparseable JSON.
- Notifications get no response. If notifications/initialized arrives, acknowledge with 202 and an empty body. Returning a JSON-RPC object for a notification breaks strict clients.
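The router calls callTool, which this post does not show. One plausible shape is a lookup table keyed by tool name, sketched below. Everything here is an assumption for illustration: the ToolHandler type, the HANDLERS table, and the toToolResult helper are hypothetical, the handlers return canned strings instead of reading from the assets binding, and the signature is simplified to (name, args) rather than the (assets, origin, name, args) used above.

```typescript
type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Promise<string>;

// Stand-in handlers: a real server would read from its assets binding here.
const HANDLERS: Record<string, ToolHandler> = {
  list_topics: async () => JSON.stringify(['example-topic']),
  get_post: async (args) => {
    const slug = String(args.slug ?? '');
    if (!/^[a-z0-9][a-z0-9-]{0,120}$/.test(slug)) {
      throw new Error(`Invalid slug "${slug}". Use a slug from search_posts.`);
    }
    return `# Placeholder Markdown for ${slug}`;
  },
};

async function callTool(name: string, args: ToolArgs): Promise<string> {
  // Own-property check so names like "toString" do not resolve via the prototype.
  const known = Object.prototype.hasOwnProperty.call(HANDLERS, name);
  if (!known) {
    throw new Error(`Unknown tool "${name}". Call tools/list to see what is available.`);
  }
  return HANDLERS[name](args);
}

// Wrap any thrown error as an in-band tool result, mirroring the tools/call case.
async function toToolResult(name: string, args: ToolArgs) {
  try {
    return { content: [{ type: 'text', text: await callTool(name, args) }], isError: false };
  } catch (err) {
    const text = err instanceof Error ? err.message : String(err);
    return { content: [{ type: 'text', text }], isError: true };
  }
}
```

With this shape, adding a tool means adding one entry to the table and one entry to the tool list; the router does not change.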
Step 5 — Make it discoverable
A server nobody can find is useless. Advertise it three ways:
- A manifest at /.well-known/mcp: name, endpoint, transport, and the tool list.
- An entry in your API catalog (/.well-known/api-catalog, RFC 9727) pointing at the manifest.
- A Link header on your HTML responses: Link: </.well-known/mcp>; rel="service-desc"; type="application/json".
Then point an MCP client straight at https://yoursite.com/mcp.
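As a rough illustration of the manifest and the Link header, the sketch below builds both. The manifest keys are an assumption: they mirror the fields listed above (name, endpoint, transport, tools), but this post does not give the exact format, so treat the shape as hypothetical. The function names buildManifest and withDiscoveryLink are also made up for this example.

```typescript
type ToolSummary = { name: string; description: string };

// Hypothetical manifest shape: the fields mirror the ones listed above
// (name, endpoint, transport, tools); the exact keys are an assumption.
function buildManifest(origin: string, tools: ToolSummary[]) {
  return {
    name: 'my-site',
    endpoint: new URL('/mcp', origin).toString(),
    transport: 'streamable-http',
    tools: tools.map(({ name, description }) => ({ name, description })),
  };
}

// Value for the Link header quoted in the list above.
const MCP_LINK_HEADER = '</.well-known/mcp>; rel="service-desc"; type="application/json"';

// Return a copy of an HTML response with the discovery header appended.
function withDiscoveryLink(response: Response): Response {
  const headers = new Headers(response.headers);
  headers.append('Link', MCP_LINK_HEADER);
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers,
  });
}
```

Generating the manifest from the same tool list the server returns for tools/list keeps the two from drifting apart.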
Testing your MCP server
You don’t need a fancy client to test — curl speaks JSON-RPC fine. List the tools:
    curl -s -X POST https://yoursite.com/mcp \
      -H 'Content-Type: application/json' \
      -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

Call one:

    curl -s -X POST https://yoursite.com/mcp \
      -H 'Content-Type: application/json' \
      -d '{"jsonrpc":"2.0","id":2,"method":"tools/call",
           "params":{"name":"search_posts","arguments":{"query":"RAG","limit":3}}}'

Work through the lifecycle: initialize → tools/list → tools/call, then confirm the edges: an invalid slug returns isError: true, a notification returns 202 with no body, an unknown method returns -32601, and a GET returns 405. If all of those behave, most real clients should connect without trouble.
Common mistakes
- Treating tool errors as protocol errors. The single most common bug. Use isError: true in the result; keep JSON-RPC error for malformed requests only.
- Building stateful sessions you don’t need. A read-only server should be stateless. Sessions add complexity and a scaling headache for zero benefit here.
- Thin tool descriptions. “Search” tells the model nothing. Say what it searches, what it returns, and when to reach for it.
- Duplicating your data. Don’t copy your content into the server. Point tools at what you already publish so there’s nothing to keep in sync.
- Forgetting CORS. Browser-based MCP clients need it. Handle OPTIONS and allow the Mcp-Session-Id / Mcp-Protocol-Version headers.
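For the CORS item, a minimal sketch of preflight handling might look like this. The wildcard origin is an assumption that suits a public, read-only endpoint; tighten it if your tools ever return anything private. The names CORS_HEADERS, handlePreflight, and withCors are hypothetical.

```typescript
// Assumed policy: any origin may call this public, read-only endpoint.
const CORS_HEADERS: Record<string, string> = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Methods': 'POST, OPTIONS',
  'Access-Control-Allow-Headers': 'Content-Type, Accept, Mcp-Session-Id, Mcp-Protocol-Version',
  'Access-Control-Max-Age': '86400',
};

// Answer the browser's preflight without touching the JSON-RPC router.
// Returns null for any other verb so the caller can carry on as usual.
function handlePreflight(request: Request): Response | null {
  if (request.method !== 'OPTIONS') return null;
  return new Response(null, { status: 204, headers: CORS_HEADERS });
}

// Add the same headers to a normal response so the browser exposes it to the page.
function withCors(response: Response): Response {
  const headers = new Headers(response.headers);
  Object.keys(CORS_HEADERS).forEach((key) => headers.set(key, CORS_HEADERS[key]));
  return new Response(response.body, { status: response.status, headers });
}
```

The preflight check runs first in the request handler, and every other response is passed through withCors on the way out.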
Best practices
- Stateless first. Reach for sessions only when a tool genuinely needs continuity.
- Validate every argument. Treat tool inputs like any untrusted input — schema plus a guard.
- Write descriptions as prompts. They’re the only thing the model sees when deciding to call a tool.
- Reuse existing assets. Your feed, your Markdown, your profile file — one source of truth.
- Advertise it. Manifest + API catalog + Link header, so agents can find it without being told.
- Test the edges, not just the happy path. Notifications, unknown methods, invalid inputs, wrong HTTP verb.
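The edge checks in the last item are easy to script once they are written down. Below is a sketch of a small smoke test that walks the lifecycle and the edges named in the testing section. The names smokeTest, Send, and httpSend are hypothetical, and the transport is injected so the same checks can run against a live URL or an in-memory fake.

```typescript
type Reply = { status: number; body: any };
type Send = (method: 'GET' | 'POST', payload?: unknown) => Promise<Reply>;

// Runs the lifecycle and edge checks; returns a list of failed check labels.
async function smokeTest(send: Send): Promise<string[]> {
  const failures: string[] = [];
  const check = (ok: boolean, label: string) => { if (!ok) failures.push(label); };
  const rpc = (id: number, method: string, params?: unknown) =>
    send('POST', { jsonrpc: '2.0', id, method, params });

  const init = await rpc(1, 'initialize', {
    protocolVersion: '2025-06-18',
    capabilities: {},
    clientInfo: { name: 'smoke-test', version: '0.0.0' },
  });
  check(typeof init.body?.result?.protocolVersion === 'string', 'initialize returns protocolVersion');

  const list = await rpc(2, 'tools/list');
  check(Array.isArray(list.body?.result?.tools), 'tools/list returns tools');

  const badSlug = await rpc(3, 'tools/call', { name: 'get_post', arguments: { slug: '../nope' } });
  check(badSlug.body?.result?.isError === true, 'invalid slug is reported in-band');

  const note = await send('POST', { jsonrpc: '2.0', method: 'notifications/initialized' });
  check(note.status === 202 && note.body == null, 'notification returns 202 with no body');

  const unknown = await rpc(4, 'no/such/method');
  check(unknown.body?.error?.code === -32601, 'unknown method returns -32601');

  const get = await send('GET');
  check(get.status === 405, 'GET returns 405');

  return failures;
}

// One possible `send` built on fetch. The URL is whatever your deployment uses.
function httpSend(url: string): Send {
  return async (method, payload) => {
    const res = await fetch(url, {
      method,
      headers: { 'Content-Type': 'application/json', Accept: 'application/json, text/event-stream' },
      body: payload === undefined ? undefined : JSON.stringify(payload),
    });
    const text = await res.text();
    let body: any = null;
    try { body = text ? JSON.parse(text) : null; } catch { /* non-JSON body, leave as null */ }
    return { status: res.status, body };
  };
}
```

Calling smokeTest(httpSend('https://yoursite.com/mcp')) resolves to an empty array when every check passes, and to the labels of the failed checks otherwise.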
Conclusion
An MCP server is less code than you expect — a JSON-RPC router, four well-described tools, and a thin layer over content you already ship. The mental shift is the real work: stop thinking of your site as pages to be read and start thinking of it as capabilities to be called. That’s the interface agents actually want.
I built mine on a Cloudflare Worker in an afternoon, and it now sits alongside the rest of this site’s agent-readiness as a first-class surface. If you’ve already got a JSON feed and Markdown pages, you’re most of the way there.
If this was useful, go deeper next: see how the pieces fit together across LLM Engineering — RAG, Fine-Tuning & Production LLMs and AI Coding Agents — Agentic AI for Developers, or read the official MCP specification for the full protocol.
About the Author
Software engineer writing about AI, Claude Code, LLMs, OpenAI, Anthropic, and developer tooling. 5+ years building production systems at Expedia Group, Tekion, and BYJU'S.