Babayaga 0906b299f4 Maintenance Love :)

2026-03-21 20:18:25 +01:00

15 KiB

Raw Blame History

PoolyPress MCP Server

Model Context Protocol (MCP) server that lets any LLM search, browse, and read content on a PoolyPress instance.

Architecture

┌──────────────────────────────┐
│  MCP Client (Claude, etc.)   │
└──────────┬───────────────────┘
           │  POST /api/mcp
           │  JSON-RPC 2.0
┌──────────▼───────────────────┐
│   McpProduct (Hono handler)  │
│       handlers.ts            │
│   ─────────────────────────  │
│   initialize · tools/list    │
│   tools/call → tools.ts      │
└──────────┬───────────────────┘
           │  direct function calls
┌──────────▼───────────────────┐
│   Server-side logic          │
│   searchDirect · categories  │
│   pages-data · site-scrape   │
└──────────────────────────────┘

Key decisions

Decision	Choice	Rationale
SDK	None — raw JSON-RPC 2.0	Zero deps; MCP spec is just JSON-RPC over HTTP
Transport	`POST /api/mcp` (HTTP)	Single endpoint; works with any HTTP client
Auth	`Bearer <supabase-token>`	Reuses existing `getUserCached()` — no new auth layer
Code reuse	Direct imports from `products/serving/`	No REST-over-HTTP round-trips; zero duplication

Source Files

server/src/products/mcp/
├── index.ts       # McpProduct class (extends AbstractProduct)
├── routes.ts      # POST /api/mcp route definition
├── handlers.ts    # JSON-RPC 2.0 dispatcher
├── tools.ts       # 17 tool definitions + handlers
└── __tests__/
    └── mcp.e2e.test.ts  # E2E tests

File	Source	Purpose
index.ts	Product entry	Registers with platform product system
routes.ts	Route	`POST /api/mcp` — private (auth required)
handlers.ts	Handler	Dispatches `initialize`, `tools/list`, `tools/call`
tools.ts	Tools	All tool schemas + handler functions
registry.ts	Registration	`'mcp': McpProduct` entry
products.json	Config	`mcp` enabled, depends on `serving`

Upstream dependencies

Import	Source file	Used by
`searchDirect()`	db-search.ts	`search_content`, `find_pages`, `find_pictures`, `find_files`
`fetchCategoriesServer()`	db-categories.ts	`list_categories`
`getCategoryState()`	db-categories.ts	`find_by_category`
`filterVisibleCategories()`	db-categories.ts	`list_categories`
`getPagesState()`	pages-data.ts	`get_page_content`, `find_by_category`
`enrichPageData()`	pages-data.ts	`get_page_content`
`JSDOM` + `Readability`	jsdom, @mozilla/readability	`markdown_scraper`
`getPageTranslations()`	pages-i18n.ts	`get_page_translations`, `set_page_translations`
`getTranslationGaps()`	db-i18n.ts	`get_translation_gaps`

Tools

`search_content`

Full-text search across pages, posts, pictures, and VFS files.

{
  "name": "search_content",
  "inputSchema": {
    "type": "object",
    "properties": {
      "query": { "type": "string", "description": "Search query (full-text)" },
      "limit": { "type": "number", "description": "Max results (default 20, max 50)" },
      "type":  { "type": "string", "enum": ["all","pages","posts","pictures","files"] }
    },
    "required": ["query"]
  }
}

Returns: [{ id, title, description, type, rank, url, created_at }]

Backend: searchDirect({ q, limit, type, userId })

`find_pages`

Search specifically for pages.

{ "name": "find_pages", "inputSchema": { "properties": { "query": {}, "limit": {} }, "required": ["query"] } }

Returns: [{ id, title, slug, description, rank, created_at }]

`find_pictures`

Search specifically for pictures/images.

{ "name": "find_pictures", "inputSchema": { "properties": { "query": {}, "limit": {} }, "required": ["query"] } }

Returns: [{ id, title, description, image_url, rank, created_at }]

`find_files`

Search for files and folders in the Virtual File System (VFS).

{ "name": "find_files", "inputSchema": { "properties": { "query": {}, "limit": {} }, "required": ["query"] } }

Returns: [{ id, title, path, type, url, created_at }]

`get_page_content`

Get the full content of a specific page by slug or ID.

{
  "name": "get_page_content",
  "inputSchema": {
    "type": "object",
    "properties": {
      "slug": { "type": "string", "description": "Page slug (e.g. \"about-us\")" },
      "id":   { "type": "string", "description": "Page UUID (alternative to slug)" }
    }
  }
}

Returns: { id, title, slug, description, content, tags, is_public, created_at, updated_at, meta }

`list_categories`

List all content categories with hierarchy.

{
  "name": "list_categories",
  "inputSchema": {
    "type": "object",
    "properties": {
      "parentSlug":      { "type": "string", "description": "Filter children of parent" },
      "includeChildren": { "type": "boolean", "description": "Include nested children (default true)" }
    }
  }
}

Returns: [{ id, name, slug, description, children: [{ id, name, slug }] }]

`find_by_category`

Get all pages belonging to a category (and descendants).

{
  "name": "find_by_category",
  "inputSchema": {
    "type": "object",
    "properties": {
      "slug":               { "type": "string" },
      "limit":              { "type": "number", "description": "Max items (default 50)" },
      "includeDescendants": { "type": "boolean", "description": "Include child categories (default true)" }
    },
    "required": ["slug"]
  }
}

Returns: { category: { id, name, slug, description }, total, items: [{ id, title, slug, description, variables, created_at }] }

`markdown_scraper`

Scrape a URL and return clean Markdown.

{
  "name": "markdown_scraper",
  "inputSchema": {
    "type": "object",
    "properties": {
      "url": { "type": "string", "description": "URL to scrape" }
    },
    "required": ["url"]
  }
}

Returns: { markdown, title } or { error }

Uses lightweight fetch + Readability + Turndown. For JavaScript-heavy pages, the full Scrapeless-powered endpoint at POST /api/scrape/markdown (site-scrape.ts) is available separately.

`get_page_translations`

Get existing translations for a page. Returns all widget translations and meta (title/description) for a specific target language.

{
  "name": "get_page_translations",
  "inputSchema": {
    "type": "object",
    "properties": {
      "slug":        { "type": "string", "description": "Page slug" },
      "id":          { "type": "string", "description": "Page UUID (alternative to slug)" },
      "target_lang": { "type": "string", "description": "Target language code (e.g. \"es\", \"de\")" },
      "source_lang": { "type": "string", "description": "Source language code (default \"en\")" }
    },
    "required": ["target_lang"]
  }
}

Returns: { page_id, page_title, slug, target_lang, source_lang, translations: [{ widget_id, prop_path, source_text, translated_text, status, outdated }], summary: { total, translated, missing, outdated } }

`set_page_translations`

Save translations for a page. Batch-upserts widget translations for a target language. The LLM performs the translation — this tool persists the results.

{
  "name": "set_page_translations",
  "inputSchema": {
    "type": "object",
    "properties": {
      "slug":        { "type": "string" },
      "id":          { "type": "string" },
      "target_lang": { "type": "string" },
      "source_lang": { "type": "string", "description": "default \"en\"" },
      "translations": {
        "type": "array",
        "items": {
          "type": "object",
          "properties": {
            "widget_id":       { "type": "string", "description": "Widget instance ID or \"__meta__\"" },
            "translated_text": { "type": "string" },
            "prop_path":       { "type": "string", "description": "default \"content\"" },
            "status":          { "type": "string", "enum": ["draft","machine","reviewed","published"] }
          },
          "required": ["widget_id", "translated_text"]
        }
      }
    },
    "required": ["target_lang", "translations"]
  }
}

Returns: { success, page_id, slug, target_lang, count, message }

Auth: Owner only

`get_translation_gaps`

Find pages/entities with missing or outdated translations for a given language.

{
  "name": "get_translation_gaps",
  "inputSchema": {
    "type": "object",
    "properties": {
      "target_lang":  { "type": "string", "description": "Target language code (e.g. \"de\")" },
      "entity_type":  { "type": "string", "enum": ["page","category","type"], "description": "default \"page\"" },
      "mode":         { "type": "string", "enum": ["missing","outdated","all"], "description": "default \"all\"" },
      "source_lang":  { "type": "string", "description": "default \"en\"" }
    },
    "required": ["target_lang"]
  }
}

Returns: Array of entities with their untranslated/outdated source text

Protocol

The endpoint speaks JSON-RPC 2.0 — no MCP SDK required on either side.

Methods

Method	Purpose
`initialize`	Handshake — returns server info and capabilities
`tools/list`	Lists all 17 tools with schemas
`tools/call`	Execute a tool by name with arguments

Request format

{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "search_content",
    "arguments": { "query": "plastic", "limit": 5 }
  }
}

Response format

{
  "jsonrpc": "2.0",
  "id": 1,
  "result": {
    "content": [{
      "type": "text",
      "text": "[{\"id\":\"...\",\"title\":\"...\"}]"
    }]
  }
}

Error codes

Code	Meaning
`-32700`	Parse error (malformed JSON)
`-32600`	Invalid request (missing jsonrpc/method)
`-32601`	Method/tool not found
`-32603`	Internal error

Usage

curl

# Initialize
curl -X POST http://localhost:3001/api/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{}}'

# List tools
curl -X POST http://localhost:3001/api/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/list"}'

# Search
curl -X POST http://localhost:3001/api/mcp \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -d '{"jsonrpc":"2.0","id":3,"method":"tools/call","params":{"name":"search_content","arguments":{"query":"plastic","limit":3}}}'

Claude Desktop / Cursor / Windsurf

These clients expect an stdio transport or an SSE endpoint. To use the HTTP endpoint, you can wrap it with a thin stdio ↔ HTTP bridge:

{
  "mcpServers": {
    "poolypress": {
      "url": "http://localhost:3001/api/mcp",
      "headers": {
        "Authorization": "Bearer YOUR_TOKEN"
      }
    }
  }
}

Note: Claude Desktop 2025+ supports HTTP MCP servers natively via the url field.

Configuration

The MCP product is enabled in server/config/products.json:

{
  "name": "mcp",
  "enabled": true,
  "workers": 0,
  "deps": ["serving"]
}

To disable the MCP endpoint, set "enabled": false.

Testing

cd server
npm run test:mcp

Runs 13 E2E tests covering all tools, error handling, and protocol compliance.

Test file: server/src/products/mcp/__tests__/mcp.e2e.test.ts

How to Add a New Tool

Define the tool in tools.ts:

const myNewTool: McpTool = {
    name: 'my_tool',
    description: 'What this tool does — shown to the LLM.',
    inputSchema: {
        type: 'object',
        properties: {
            param1: { type: 'string', description: '...' }
        },
        required: ['param1']
    },
    handler: async (args, userId) => {
        // Call server-side logic directly
        const result = await someServerFunction(args.param1);
        return result;
    }
};

Register it — add to the MCP_TOOLS array at the bottom of tools.ts:

export const MCP_TOOLS: McpTool[] = [
    // … existing tools …
    myNewTool
];

That's it. The handler in handlers.ts auto-discovers tools via the MCP_TOOLS_MAP. No route changes needed.

Add tests — add a test case in mcp.e2e.test.ts and update the tool count assertion.

Tool design guidelines

Call server-side functions directly — never make HTTP requests to your own server
Accept userId as second argument — pass it through for visibility/ACL filtering
Return structured data — the handler serializes it to JSON automatically
Use existing caches — getPagesState(), getCategoryState(), etc. are all cached
Keep schemas minimal — LLMs work better with fewer, well-described parameters

Security

Auth gating: Every tool call resolves the user from the Bearer token. Anonymous requests get limited visibility (public content only).
VFS ACL: File searches respect the existing ACL layer.
Visibility filtering: searchDirect() applies owner/public/private filtering based on userId.
Rate limiting: Inherits the platform's apiRateLimiter middleware.
Write operations: Content creation, editing, and translation tools require authentication and verify page ownership (userId === page.owner). Admin-only actions are not available.

15 KiB Raw Blame History