How to Build a Production MCP Server (I Added One to My Site)

How to build a production MCP server: a hands-on guide to JSON-RPC, the Streamable HTTP transport, tools, and discovery — from one I shipped on Cloudflare.

12 min read
Architecture of a production Model Context Protocol server on Cloudflare Workers

Most sites are built for humans to read and for crawlers to scrape. But the agents showing up now — Claude, ChatGPT, Cursor — don’t want your HTML. They want to call you. Parsing a page to extract three facts is wasteful and fragile; calling a typed tool that returns those three facts is neither.

That’s what the Model Context Protocol (MCP) is for. And the fastest way to understand it is to build one. So I added a production MCP server to this site — it lets an agent search my posts, fetch one as clean Markdown, list my topic hubs, and read my profile — and this is exactly how I did it, with the real code.

No framework, no database, about 300 lines on a Cloudflare Worker.

TL;DR

  • An MCP server exposes tools (functions) that AI agents call over JSON-RPC 2.0 — turning your site from agent-readable into agent-callable.
  • Use the Streamable HTTP transport: one endpoint, POST /mcp, that speaks JSON-RPC. A stateless server that returns plain JSON is fully spec-compliant and the easiest to run.
  • You need exactly four method handlers: initialize, tools/list, tools/call, and ping — plus a no-op for notifications.
  • You don’t need new infrastructure. Back your tools with assets you already publish (a JSON feed, your Markdown pages). One source of truth, nothing to sync.
  • Make it discoverable with a manifest at a well-known URL, an entry in your API catalog, and a Link header.

What is an MCP server?

An MCP server is a small service that exposes tools an AI agent can invoke over a standard protocol. The protocol is JSON-RPC 2.0; the “tools” are named functions, each with a JSON Schema describing its arguments. When an agent connects, it asks the server “what can you do?” (tools/list), gets back a list of tools, then calls them (tools/call) and receives structured results.
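To make that exchange concrete, here is a rough sketch of the four messages involved, written as TypeScript object literals. The type names and all field values are illustrative assumptions; only the overall shapes (jsonrpc, id, method, params, result) follow the description above.

```typescript
// Illustrative message shapes; the field values are made up for this sketch.
type JsonRpcRequest = { jsonrpc: '2.0'; id: number; method: string; params?: unknown };
type JsonRpcResponse = { jsonrpc: '2.0'; id: number; result: unknown };

// 1. The agent asks what the server can do.
const listRequest: JsonRpcRequest = { jsonrpc: '2.0', id: 1, method: 'tools/list' };

// 2. The server describes each tool: a name, a description, a JSON Schema for its arguments.
const listResponse: JsonRpcResponse = {
  jsonrpc: '2.0',
  id: 1,
  result: {
    tools: [{
      name: 'search_posts',
      description: 'Full-text search across the blog.',
      inputSchema: { type: 'object', properties: { query: { type: 'string' } }, required: ['query'] }
    }]
  }
};

// 3. The agent calls a tool by name, with arguments that match the schema.
const callRequest: JsonRpcRequest = {
  jsonrpc: '2.0',
  id: 2,
  method: 'tools/call',
  params: { name: 'search_posts', arguments: { query: 'RAG' } }
};

// 4. The server returns content blocks; the matching id ties the reply to the request.
const callResponse: JsonRpcResponse = {
  jsonrpc: '2.0',
  id: 2,
  result: { content: [{ type: 'text', text: '(search results as text)' }], isError: false }
};
```

The id is what lets a client match each reply to the request that caused it, which is why a response always repeats it.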

Think of it as a typed API designed specifically for language models. Where a REST API is built for your frontend, an MCP server is built for an agent’s reasoning loop: the descriptions are written for a model to read, the inputs are schema-validated, and errors are reported in a way the model can recover from.

💡 Key insight: REST is for your app. MCP is for the agent. The difference isn’t the wire format — it’s that every field is written to be understood by a model, not a developer.

Why build one for your own site

Search and chat are moving inside agents. When someone asks Claude or ChatGPT about a topic you’ve written about, the model is far more likely to use you well if it can call a search_posts tool than if it has to guess your URL structure and scrape rendered HTML.

Three concrete wins:

  1. Precision over scraping. A tool returns exactly the fields the agent needs — title, URL, summary — with no markup noise.
  2. You control the surface. You decide what’s callable and what each tool returns. That’s a far stronger signal than hoping a crawler parses your page correctly.
  3. It compounds with the rest of your AI-readiness. An MCP server sits naturally alongside llms.txt, structured data, and an API catalog as part of making your site first-class for agents.

It will not, on its own, make every agent “pick” your site; that still depends on relevance and authority. But it removes the main technical obstacles to an agent using you well.

What we’re building

Four read-only tools:

Tool          | What it does                        | Backed by
search_posts  | Ranked search over blog posts       | a JSON feed I already publish
get_post      | Returns one post as clean Markdown  | prerendered /blog/<slug>.md
list_topics   | Lists curated topic hubs            | a small constant
get_profile   | Returns the author profile          | my existing llms.txt

The whole thing runs on a Cloudflare Worker as a stateless JSON-RPC handler. Stateless matters: with no session to track, every request is self-contained, which is the simplest possible thing to host and scale.

Step 1 — The transport

MCP defines two transports. For local tools you use stdio; for a remote server you use Streamable HTTP — a single endpoint that accepts JSON-RPC messages over POST. The spec lets the server reply with either an SSE stream or a plain JSON body. A read-only server has no streaming notifications to push, so plain JSON is the right call and the simplest.

Every MCP message is JSON-RPC 2.0. Two tiny helpers cover all our responses:

function rpcResult(id: unknown, result: unknown) {
  return { jsonrpc: '2.0', id, result };
}
function rpcError(id: unknown, code: number, message: string) {
  return { jsonrpc: '2.0', id, error: { code, message } };
}

The endpoint parses the POST body, routes on method, and returns the JSON-RPC response. Requests carry an id; notifications don’t — and a notification gets no response body, just a 202 Accepted.
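The endpoint described above can be sketched roughly as follows. This is a minimal sketch under assumptions, not the site's actual handler: the names handleMcpRequest, RpcHandler and jsonResponse are hypothetical, the router is passed in as a parameter so the block stands alone, and replying to malformed bodies with HTTP 400 is my choice rather than something the article specifies.

```typescript
type RpcMessage = { jsonrpc?: string; id?: unknown; method?: string; params?: unknown };
// Stand-in for the router: resolves to a JSON-RPC response, or null when there is nothing to send.
type RpcHandler = (msg: RpcMessage) => Promise<object | null>;

const JSON_HEADERS = { 'Content-Type': 'application/json' };

function jsonResponse(body: object, status: number): Response {
  return new Response(JSON.stringify(body), { status, headers: JSON_HEADERS });
}

async function handleMcpRequest(request: Request, handleRpc: RpcHandler): Promise<Response> {
  // Only POST is supported in this sketch; anything else gets 405.
  if (request.method !== 'POST') {
    return new Response(null, { status: 405, headers: { Allow: 'POST' } });
  }

  let parsed: unknown;
  try {
    parsed = await request.json();
  } catch {
    // -32700 is the JSON-RPC "Parse error" code; the id is unknowable, so it is null.
    return jsonResponse({ jsonrpc: '2.0', id: null, error: { code: -32700, message: 'Parse error' } }, 400);
  }
  if (typeof parsed !== 'object' || parsed === null || Array.isArray(parsed)) {
    // -32600 is the JSON-RPC "Invalid Request" code.
    return jsonResponse({ jsonrpc: '2.0', id: null, error: { code: -32600, message: 'Invalid Request' } }, 400);
  }

  const msg = parsed as RpcMessage;
  const reply = await handleRpc(msg);

  // A message without an id is a notification: acknowledge it and send no body.
  if (msg.id === undefined || reply === null) {
    return new Response(null, { status: 202 });
  }
  return jsonResponse(reply, 200);
}
```

In a Worker, the fetch handler would call this for requests whose path is /mcp and pass in the real router.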

Step 2 — Define your tools

A tool is metadata plus an input schema. The description is not for you — it’s the prompt the model reads to decide whether and how to call the tool. Write it like you’re briefing a smart colleague who can’t see your code:

const TOOLS = [
  {
    name: 'search_posts',
    title: 'Search blog posts',
    description:
      'Full-text search across the blog (titles, summaries, tags). Returns matching ' +
      'posts with slug, title, URL, summary, tags and publish date. Use for topics ' +
      'like AI engineering, LLMs, RAG, Claude Code, or web development.',
    inputSchema: {
      type: 'object',
      properties: {
        query: { type: 'string', description: 'Search terms.' },
        limit: { type: 'integer', description: 'Max results (default 10, max 30).' }
      },
      required: ['query']
    }
  }
  // get_post, list_topics, get_profile ...
];

💡 Key insight: Tool descriptions are prompt engineering. A vague description means the model calls the wrong tool or skips it. Spell out when to use it and what it returns.
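As an illustration of that advice, here is one way a second entry might be written. The wording of this get_post definition is hypothetical (the article elides the real one), and the ToolDefinition type is an assumption added only so the shape is checked by the compiler.

```typescript
// Assumed shape of a tool entry, mirroring the search_posts example above.
type ToolDefinition = {
  name: string;
  title: string;
  description: string;
  inputSchema: {
    type: 'object';
    properties: Record<string, { type: string; description: string }>;
    required?: string[];
  };
};

// Hypothetical wording for the get_post entry; the article does not show the real one.
const GET_POST_TOOL: ToolDefinition = {
  name: 'get_post',
  title: 'Get a blog post',
  description:
    'Fetch one blog post as clean Markdown. Call this after search_posts, passing ' +
    'the slug of a result, when you need the full text rather than the summary.',
  inputSchema: {
    type: 'object',
    properties: {
      slug: { type: 'string', description: 'Post slug as returned by search_posts.' }
    },
    required: ['slug']
  }
};
```

The description names the tool to call first and says when the full text is worth fetching, which is the kind of guidance a model cannot infer from the schema alone.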

Step 3 — Back tools with data you already have

This is the part most tutorials overcomplicate. You don’t need a database. Three of the four tools are backed by assets the site already prerenders (list_topics simply returns a small constant):

  • search_posts fetches my existing /feed.json (a JSON Feed of every post) and ranks it.
  • get_post fetches the already-generated /blog/<slug>.md Markdown variant.
  • get_profile returns my llms.txt.

On a Cloudflare Worker you reach those via the assets binding, so there’s one source of truth and nothing to keep in sync:

async function searchPosts(assets, origin, query, limit) {
  const res = await assets.fetch(new URL('/feed.json', origin));
  if (!res.ok) throw new Error('Post index unavailable');
  const { items = [] } = await res.json();

  const terms = query.toLowerCase().split(/\s+/).filter(Boolean);
  return items
    .map((item) => {
      const hay = `${item.title} ${(item.tags || []).join(' ')} ${item.summary || ''}`.toLowerCase();
      // weight title hits over tags over summary
      const score = terms.reduce((s, t) => s + (item.title.toLowerCase().includes(t) ? 3 : 0)
        + ((item.tags || []).join(' ').toLowerCase().includes(t) ? 2 : 0)
        + (hay.includes(t) ? 1 : 0), 0);
      return { item, score };
    })
    .filter((x) => x.score > 0)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit)
    .map(({ item }) => ({ title: item.title, url: item.url, summary: item.summary }));
}
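The search_posts schema promises a default of 10 and a maximum of 30 results, but the function above slices with whatever limit it receives. One way to close that gap is a small normalising helper, sketched here. The name clampLimit is hypothetical, and falling back to the default for anything that is not a positive number is my assumption, not the author's stated behaviour.

```typescript
// Hypothetical helper: normalise the model-supplied `limit` before it reaches slice().
function clampLimit(raw: unknown, fallback = 10, max = 30): number {
  // Anything that is not a number (missing, string, null) falls back to the default.
  const n = typeof raw === 'number' ? Math.floor(raw) : Number.NaN;
  if (!Number.isFinite(n) || n < 1) return fallback;
  return Math.min(n, max);
}
```

It would be applied at the call site, for example searchPosts(assets, origin, query, clampLimit(args.limit)), so the search function itself never sees an out-of-range value.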

Always validate inputs before using them. get_post takes a slug straight from the model, so it gets a strict regex check before it ever touches a path:

const SLUG_RE = /^[a-z0-9][a-z0-9-]{0,120}$/;
if (!SLUG_RE.test(slug)) {
  throw new Error(`Invalid slug "${slug}". Use a slug from search_posts.`);
}
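Putting the guard in context, a get_post implementation might look roughly like this. It is a sketch under assumptions: the Assets type models only the fetch method of the Worker assets binding, the function name getPost and the error wording for missing posts are mine, and the /blog/<slug>.md path comes from the table earlier in the article.

```typescript
// Minimal shape of the assets binding as used here (assumed: only `fetch` is needed).
type Assets = { fetch(input: URL | string): Promise<Response> };

const SLUG_RE = /^[a-z0-9][a-z0-9-]{0,120}$/;

async function getPost(assets: Assets, origin: string, slug: unknown): Promise<string> {
  // Reject anything that is not a plain slug before it is interpolated into a path.
  if (typeof slug !== 'string' || !SLUG_RE.test(slug)) {
    throw new Error(`Invalid slug "${String(slug)}". Use a slug from search_posts.`);
  }
  const res = await assets.fetch(new URL(`/blog/${slug}.md`, origin));
  // A readable message helps the model recover, e.g. by searching again.
  if (res.status === 404) throw new Error(`No post found for slug "${slug}".`);
  if (!res.ok) throw new Error('Post content unavailable');
  return res.text();
}
```

Because the regex allows only lowercase letters, digits and hyphens, inputs such as ../secret or a full URL never reach the fetch call.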

Step 4 — Handle the protocol

The router is small. Four real methods, plus notification handling:

async function handleRpc(msg, assets, origin) {
  const { id, method, params } = msg;
  const isNotification = id === undefined || id === null;

  switch (method) {
    case 'initialize':
      return rpcResult(id, {
        protocolVersion: '2025-06-18',
        capabilities: { tools: { listChanged: false } },
        serverInfo: { name: 'my-site', version: '1.0.0' },
        instructions: 'Tools for querying my blog and profile.'
      });
    case 'ping':
      return rpcResult(id, {});
    case 'tools/list':
      return rpcResult(id, { tools: TOOLS });
    case 'tools/call': {
      const { name, arguments: args = {} } = params || {};
      try {
        const text = await callTool(assets, origin, name, args);
        return rpcResult(id, { content: [{ type: 'text', text }], isError: false });
      } catch (err) {
        // Report tool errors IN-BAND so the model can see and react to them.
        return rpcResult(id, { content: [{ type: 'text', text: err.message }], isError: true });
      }
    }
    default:
      if (isNotification) return null;            // ignore unknown notifications
      return rpcError(id, -32601, `Method not found: ${method}`);
  }
}

Three things people get wrong here, and they all live in this function:

  • initialize must echo a protocolVersion the client understands and declare your capabilities. Skip it and the handshake fails before any tool runs.
  • Tool failures are not protocol errors. A bad slug returns a normal result with isError: true and a message — so the model reads the failure and retries — not a JSON-RPC error. Reserve error (-32601, -32700, etc.) for malformed protocol.
  • Notifications get no response. If notifications/initialized arrives, acknowledge with 202 and an empty body. Returning a JSON-RPC object for a notification breaks strict clients.
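Two pieces the router relies on are not shown in the article: choosing which protocolVersion to echo, and the callTool dispatcher. The sketch below is one possible shape for both, under stated assumptions. The list of supported versions is assumed (the article only uses 2025-06-18), and makeCallTool takes a hypothetical handler registry instead of the (assets, origin, name, args) signature used above, so that the block stands alone.

```typescript
// Assumed list of protocol revisions this server can speak, newest first.
const LATEST_VERSION = '2025-06-18';
const SUPPORTED_VERSIONS = [LATEST_VERSION, '2025-03-26'];

// Echo the client's requested version when supported; otherwise offer the server's newest.
function negotiateVersion(requested: unknown): string {
  return typeof requested === 'string' && SUPPORTED_VERSIONS.indexOf(requested) !== -1
    ? requested
    : LATEST_VERSION;
}

type ToolArgs = Record<string, unknown>;
type ToolHandler = (args: ToolArgs) => Promise<string>;

// Hypothetical dispatcher: in a real server each handler would wrap searchPosts, getPost, and so on.
function makeCallTool(handlers: Map<string, ToolHandler>) {
  return async (name: unknown, args: ToolArgs): Promise<string> => {
    const handler = typeof name === 'string' ? handlers.get(name) : undefined;
    if (!handler) {
      const known = Array.from(handlers.keys()).join(', ');
      // A thrown error is caught by the router and returned in-band with isError: true.
      throw new Error(`Unknown tool "${String(name)}". Available tools: ${known}.`);
    }
    return handler(args);
  };
}
```

In the initialize case the router would then return negotiateVersion(params?.protocolVersion) instead of a hard-coded string. Listing the available tools in the error message gives the model something to retry with.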

Step 5 — Make it discoverable

A server nobody can find is useless. Advertise it three ways:

  1. A manifest at /.well-known/mcp — name, endpoint, transport, and the tool list.
  2. An entry in your API catalog (/.well-known/api-catalog, RFC 9727) pointing at the manifest.
  3. A Link header on your HTML responses: Link: </.well-known/mcp>; rel="service-desc"; type="application/json".

Then point an MCP client straight at https://yoursite.com/mcp.
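A rough sketch of the manifest and the Link header follows. The manifest field names are assumptions: the article lists what the manifest contains (name, endpoint, transport, tool list) but not its exact schema, and the helper names buildManifest and withDiscoveryLink are hypothetical.

```typescript
type ToolSummary = { name: string; description: string };

// Assumed manifest layout; only the four kinds of information come from the article.
function buildManifest(origin: string, tools: ToolSummary[]) {
  return {
    name: 'my-site',
    endpoint: new URL('/mcp', origin).toString(),
    transport: 'streamable-http',
    tools: tools.map(({ name, description }) => ({ name, description }))
  };
}

const MCP_LINK = '</.well-known/mcp>; rel="service-desc"; type="application/json"';

// Copy a response and append the discovery Link header, leaving other headers intact.
function withDiscoveryLink(response: Response): Response {
  const headers = new Headers(response.headers);
  headers.append('Link', MCP_LINK);
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers
  });
}
```

Serving JSON.stringify(buildManifest(origin, TOOLS)) at /.well-known/mcp and wrapping HTML responses with withDiscoveryLink would cover the first and third items; the API catalog entry is a static file and needs no code.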

Testing your MCP server

You don’t need a fancy client to test — curl speaks JSON-RPC fine. List the tools:

curl -s -X POST https://yoursite.com/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'

Call one:

curl -s -X POST https://yoursite.com/mcp \
  -H 'Content-Type: application/json' \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call",
       "params":{"name":"search_posts","arguments":{"query":"RAG","limit":3}}}'

Work through the lifecycle (initialize → tools/list → tools/call), then confirm the edges: an invalid slug returns isError: true, a notification returns 202 with no body, an unknown method returns -32601, and a GET returns 405. If all of those behave, real clients will too.

Common mistakes

  • Treating tool errors as protocol errors. The single most common bug. Use isError: true in the result; keep JSON-RPC error for malformed requests only.
  • Building stateful sessions you don’t need. A read-only server should be stateless. Sessions add complexity and a scaling headache for zero benefit here.
  • Thin tool descriptions. “Search” tells the model nothing. Say what it searches, what it returns, and when to reach for it.
  • Duplicating your data. Don’t copy your content into the server. Point tools at what you already publish so there’s nothing to keep in sync.
  • Forgetting CORS. Browser-based MCP clients need it. Handle OPTIONS and allow the Mcp-Session-Id / Mcp-Protocol-Version headers.
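For the CORS point, a minimal sketch might look like this. The wildcard origin is an assumption that suits a public, read-only server with no credentials; the header list and the helper names handlePreflight and withCors are likewise illustrative rather than taken from the article's code.

```typescript
// Assumed policy: any origin may call this public, read-only endpoint.
const CORS_HEADERS: Record<string, string> = {
  'Access-Control-Allow-Origin': '*',
  'Access-Control-Allow-Methods': 'POST, OPTIONS',
  'Access-Control-Allow-Headers': 'Content-Type, Accept, Mcp-Session-Id, Mcp-Protocol-Version',
  'Access-Control-Max-Age': '86400'
};

// Answer a preflight request; return null for anything else so normal routing continues.
function handlePreflight(request: Request): Response | null {
  if (request.method !== 'OPTIONS') return null;
  return new Response(null, { status: 204, headers: CORS_HEADERS });
}

// Copy a response and add the CORS headers so browsers expose it to the calling page.
function withCors(response: Response): Response {
  const headers = new Headers(response.headers);
  Object.keys(CORS_HEADERS).forEach((key) => headers.set(key, CORS_HEADERS[key]));
  return new Response(response.body, {
    status: response.status,
    statusText: response.statusText,
    headers
  });
}
```

The fetch handler would try handlePreflight first and wrap every other /mcp response in withCors, including 202 and error responses, since a browser client needs the headers on those too.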

Best practices

  1. Stateless first. Reach for sessions only when a tool genuinely needs continuity.
  2. Validate every argument. Treat tool inputs like any untrusted input — schema plus a guard.
  3. Write descriptions as prompts. They’re the only thing the model sees when deciding to call a tool.
  4. Reuse existing assets. Your feed, your Markdown, your profile file — one source of truth.
  5. Advertise it. Manifest + API catalog + Link header, so agents can find it without being told.
  6. Test the edges, not just the happy path. Notifications, unknown methods, invalid inputs, wrong HTTP verb.

Conclusion

An MCP server is less code than you expect — a JSON-RPC router, four well-described tools, and a thin layer over content you already ship. The mental shift is the real work: stop thinking of your site as pages to be read and start thinking of it as capabilities to be called. That’s the interface agents actually want.

I built mine on a Cloudflare Worker in an afternoon, and it now sits alongside the rest of this site’s agent-readiness as a first-class surface. If you’ve already got a JSON feed and Markdown pages, you’re most of the way there.

If this was useful, go deeper next: see how the pieces fit together across LLM Engineering — RAG, Fine-Tuning & Production LLMs and AI Coding Agents — Agentic AI for Developers, or read the official MCP specification for the full protocol.
