Babayaga 0906b299f4 Maintenance Love :)

2026-03-21 20:18:25 +01:00

SEO & Discoverability on Polymech

Polymech is built as an SEO-first platform. Every piece of content — whether it's a media post, a CMS page, or a product listing — is automatically discoverable by search engines, social platforms, AI agents, and feed readers. No plugins, no external services, no config files. It's all baked in.

This document covers every SEO-related feature the platform offers.

Multi-Format Content Export
Discovery Endpoints
Open Graph & Social Meta
JSON-LD Structured Data
Server-Side Rendering & Initial State Injection
Responsive Image Optimization
Internationalization (i18n)
Embeddable Content
API-First Architecture
Developer Experience
Client-Side SEO & Performance
Route Reference

Multi-Format Content Export

Every content entity on Polymech (posts and pages) can be exported in multiple formats by simply changing the file extension in the URL. No API keys, no special headers — just append the extension.

Source: Page exports → pages-routes.ts, Post exports → db-post-exports.ts

Pages

Pages are rich, widget-based documents built with a visual editor. They export to:

pages-rich-html.ts · pages-html.ts · pages-pdf.ts · pages-markdown.ts · pages-email.ts · pages-data.ts

Format	URL Pattern	Content-Type	Description
XHTML	`/user/:id/pages/:slug.xhtml`	`text/html`	Standalone rich HTML with Tailwind CSS styling, full meta tags, JSON-LD, and responsive layout. Ready to share or archive.
HTML	`/user/:id/pages/:slug.html`	`text/html`	SPA shell with injected Open Graph metadata for crawlers and social previews.
PDF	`/user/:id/pages/:slug.pdf`	`application/pdf`	Print-ready PDF export. Great for invoices, reports, or offline sharing.
Markdown	`/user/:id/pages/:slug.md`	`text/markdown`	Clean Markdown export of the page content. Useful for migration, backups, or feeding to other systems.
JSON	`/user/:id/pages/:slug.json`	`application/json`	Raw page data including content tree, metadata, and author profile. Perfect for headless CMS integrations.
Email HTML	`/user/:id/pages/:slug.email.html`	`text/html`	Email-client-optimized HTML with inlined styles and table-based layout. Compatible with Outlook, Gmail, Apple Mail, and others.
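
As a rough illustration of the extension-based addressing in the table above, the following sketch builds export URLs for a page. The helper name `pageExportUrl` and its parameters are hypothetical and not part of the platform; only the URL pattern, extensions, and content types are taken from the table.

```typescript
// Extensions and content types are copied from the table above.
type PageExportFormat = "xhtml" | "html" | "pdf" | "md" | "json" | "email.html";

const PAGE_EXPORT_CONTENT_TYPES: Record<PageExportFormat, string> = {
  xhtml: "text/html",
  html: "text/html",
  pdf: "application/pdf",
  md: "text/markdown",
  json: "application/json",
  "email.html": "text/html",
};

// Hypothetical helper: builds /user/:id/pages/:slug.<ext> on a given origin.
function pageExportUrl(
  siteOrigin: string,
  userId: string,
  slug: string,
  format: PageExportFormat
): string {
  const base = siteOrigin.replace(/\/+$/, ""); // drop trailing slashes
  return `${base}/user/${encodeURIComponent(userId)}/pages/${encodeURIComponent(slug)}.${format}`;
}

const aboutAsMarkdown = pageExportUrl("https://polymech.info", "admin", "about", "md");
console.log(aboutAsMarkdown, PAGE_EXPORT_CONTENT_TYPES.md);
```

A client can compare the response's Content-Type header against the expected value from the map before parsing the body.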

Posts

Posts are media-centric entries (photos, videos, link cards). They export to:

db-post-exports.ts · db-posts.ts

Format	URL Pattern	Content-Type	Description
XHTML	`/post/:id.xhtml`	`text/html`	Standalone rich HTML with Tailwind CSS, responsive image gallery, OG meta, and JSON-LD structured data.
PDF	`/post/:id.pdf`	`application/pdf`	PDF export of the post with embedded images.
Markdown	`/post/:id.md`	`text/markdown`	Markdown with title, description, and linked images.
JSON	`/post/:id.json`	`application/json`	Full post data with pictures array and author profile.

How it works

The export system doesn't use templates or pre-rendered files. Each format is generated server-side on-the-fly from the same canonical content tree, which means:

Exports are always up-to-date — no build step needed
All formats share the same data pipeline — update once, export everywhere
The widget-based content system is format-agnostic — markdown text, photo cards, galleries, tabs, and nested layouts all render correctly in every format
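
The idea of one content tree feeding every format can be sketched as a registry of renderers keyed by extension. This is a simplified illustration, not the platform's code: the widget model (`SketchWidget`), the renderer interface, and `renderExport` are all hypothetical names, and only three formats are shown.

```typescript
// Hypothetical, heavily simplified widget model; the real content tree is richer.
type SketchWidget =
  | { kind: "markdown"; text: string }
  | { kind: "photo"; url: string; alt: string };

interface SketchRenderer {
  contentType: string;
  render(widgets: SketchWidget[]): string;
}

function escapeHtmlText(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// One renderer per extension, all reading the same widget list.
const sketchRenderers = new Map<string, SketchRenderer>([
  ["md", {
    contentType: "text/markdown",
    render: (ws) =>
      ws.map((w) => (w.kind === "markdown" ? w.text : `![${w.alt}](${w.url})`)).join("\n\n"),
  }],
  ["json", {
    contentType: "application/json",
    render: (ws) => JSON.stringify({ content: ws }),
  }],
  ["xhtml", {
    contentType: "text/html",
    render: (ws) =>
      ws.map((w) =>
        w.kind === "markdown"
          ? `<p>${escapeHtmlText(w.text)}</p>`
          : `<img src="${escapeHtmlText(w.url)}" alt="${escapeHtmlText(w.alt)}" />`
      ).join("\n"),
  }],
]);

// Returns undefined for unknown extensions so the caller can answer 404.
function renderExport(
  ext: string,
  widgets: SketchWidget[]
): { contentType: string; body: string } | undefined {
  const renderer = sketchRenderers.get(ext);
  return renderer
    ? { contentType: renderer.contentType, body: renderer.render(widgets) }
    : undefined;
}
```

Because every renderer receives the same widget list, a content edit shows up in all formats on the next request without any rebuild step.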

Discovery Endpoints

Source: content.ts · routes.ts

RSS Feed — `/feed.xml`

Standard RSS 2.0 feed of the latest posts and pages. Supports filtering by category via query parameters: → content.ts handleGetFeedXml

/feed.xml?categorySlugs=tutorials&limit=50&sortBy=latest

Image enclosures with optimized proxy URLs
Per-item author attribution
Category filtering (by ID or slug, including descendants)
Configurable sort order (latest or top)
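
A small sketch of how a client might assemble the feed URL from the query parameters shown above. The `feedUrl` helper and `FeedQuery` type are hypothetical; the parameter names come from the example URL, and the comma-separated form for several slugs is an assumption.

```typescript
interface FeedQuery {
  categorySlugs?: string[];
  limit?: number;
  sortBy?: "latest" | "top";
}

function feedUrl(siteOrigin: string, query: FeedQuery = {}): string {
  const params: string[] = [];
  if (query.categorySlugs && query.categorySlugs.length > 0) {
    // Assumption: several slugs are sent comma-separated in one parameter.
    params.push(
      "categorySlugs=" + query.categorySlugs.map((s) => encodeURIComponent(s)).join(",")
    );
  }
  if (query.limit !== undefined) params.push(`limit=${query.limit}`);
  if (query.sortBy) params.push(`sortBy=${query.sortBy}`);
  const base = siteOrigin.replace(/\/+$/, "") + "/feed.xml";
  return params.length > 0 ? `${base}?${params.join("&")}` : base;
}

console.log(feedUrl("https://polymech.info", { categorySlugs: ["tutorials"], limit: 50, sortBy: "latest" }));
```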

Google Merchant Feed — `/products.xml`

A Google Merchant Center compatible XML feed for products. Automatically includes only items with pricing data set through the type system: → content.ts handleGetMerchantFeed

<g:id>product-uuid</g:id>
<g:title>Product Name</g:title>
<g:price>29.99 EUR</g:price>
<g:product_type>Category > Subcategory</g:product_type>
<g:image_link>https://service.polymech.info/api/images/cache/optimized.jpg</g:image_link>

Automatically resolves price, currency, and condition from the type system & page variables
Full category path hierarchy
Optimized product images via the image proxy
All items link to their canonical page/post URL
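
The filtering and field mapping described above could look roughly like the sketch below. The `SketchProduct` shape and the function names are hypothetical, and wrapping the fields in an RSS-style `<item>` element is an assumption; the `g:` field names follow the sample output.

```typescript
// Hypothetical product shape; the real data comes from the type system and page variables.
interface SketchProduct {
  id: string;
  title: string;
  link: string;
  imageUrl: string;
  categoryPath: string[];
  price?: number;
  currency?: string;
}

function escapeXml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Returns null for items without pricing data, which are left out of the feed.
function merchantItemXml(p: SketchProduct): string | null {
  if (p.price === undefined || !p.currency) return null;
  return [
    "<item>",
    `  <g:id>${escapeXml(p.id)}</g:id>`,
    `  <g:title>${escapeXml(p.title)}</g:title>`,
    `  <g:link>${escapeXml(p.link)}</g:link>`,
    `  <g:price>${p.price.toFixed(2)} ${escapeXml(p.currency)}</g:price>`,
    `  <g:product_type>${escapeXml(p.categoryPath.join(" > "))}</g:product_type>`,
    `  <g:image_link>${escapeXml(p.imageUrl)}</g:image_link>`,
    "</item>",
  ].join("\n");
}

function merchantItems(products: SketchProduct[]): string[] {
  const items: string[] = [];
  for (const p of products) {
    const xml = merchantItemXml(p);
    if (xml !== null) items.push(xml);
  }
  return items;
}
```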

Sitemap — `/sitemap-en.xml`

Auto-generated XML sitemap of all public, visible pages: → content.ts handleGetSitemap

<url>
  <loc>https://polymech.info/user/username/pages/my-page</loc>
  <lastmod>2025-03-01T12:00:00.000Z</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.8</priority>
</url>

Only includes public + visible pages (respects content visibility settings)
Uses updated_at for accurate <lastmod> timestamps
Ready to submit to Google Search Console, Bing Webmaster Tools, etc.
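
To make the visibility filter and the `<lastmod>` mapping concrete, here is a minimal sketch of sitemap generation. The row shape (`SitemapPageRow`) and `buildSitemap` are hypothetical; the `changefreq` and `priority` values mirror the sample entry, and the `urlset` namespace is the one defined by the sitemaps.org protocol.

```typescript
// Hypothetical row shape; only updated_at is named in the text above.
interface SitemapPageRow {
  username: string;
  slug: string;
  isPublic: boolean;
  visible: boolean;
  updated_at: string;
}

function escapeXmlText(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

function buildSitemap(siteOrigin: string, rows: SitemapPageRow[]): string {
  const urls = rows
    .filter((r) => r.isPublic && r.visible) // private or hidden pages never appear
    .map((r) => {
      const loc = `${siteOrigin}/user/${encodeURIComponent(r.username)}/pages/${encodeURIComponent(r.slug)}`;
      return [
        "  <url>",
        `    <loc>${escapeXmlText(loc)}</loc>`,
        `    <lastmod>${new Date(r.updated_at).toISOString()}</lastmod>`,
        "    <changefreq>weekly</changefreq>",
        "    <priority>0.8</priority>",
        "  </url>",
      ].join("\n");
    });
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...urls,
    "</urlset>",
  ].join("\n");
}
```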

LLM-Readable Content — `/llms.txt` & `/llms.md`

Following the proposed llms.txt convention, Polymech generates a machine-readable summary of the site at /llms.txt (and at /llms.md, served with a Markdown content type): → content.ts handleGetLLMText

# Polymech

> A full-stack media platform...

## Pages

- [Getting Started](https://polymech.info/user/admin/pages/getting-started): Introduction to...
- [Product Catalog](https://polymech.info/user/admin/pages/catalog): Browse our...

## Posts

- [New Release](https://polymech.info/post/abc123) by admin: Announcing...

## Public API

- Post Details JSON: /api/posts/{id}
- Page XHTML Export: /user/{username}/pages/{slug}.xhtml
- RSS Feed: /feed.xml
- Sitemap: /sitemap-en.xml

This endpoint is designed for AI agents (ChatGPT, Claude, Perplexity, etc.) to quickly understand what the site contains and how to access it. It includes:

Site description from app-config.json
Top 20 public pages with links and descriptions
Top 20 recent posts with author attribution
Full public API reference with URL patterns
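
The Markdown layout shown above can be assembled with a few lines of string building. This sketch is illustrative only: `LlmsLink` and `buildLlmsTxt` are hypothetical names, and the limit of 20 entries per section follows the list above.

```typescript
interface LlmsLink {
  title: string;
  url: string;
  summary: string;
  author?: string;
}

// One bullet per entry: "- [Title](url) by author: summary"
function llmsLine(link: LlmsLink): string {
  const by = link.author ? ` by ${link.author}` : "";
  return `- [${link.title}](${link.url})${by}: ${link.summary}`;
}

function buildLlmsTxt(
  site: { name: string; description: string },
  pages: LlmsLink[],
  posts: LlmsLink[],
  maxItems = 20
): string {
  return [
    `# ${site.name}`,
    "",
    `> ${site.description}`,
    "",
    "## Pages",
    "",
    ...pages.slice(0, maxItems).map((p) => llmsLine(p)),
    "",
    "## Posts",
    "",
    ...posts.slice(0, maxItems).map((p) => llmsLine(p)),
    "",
  ].join("\n");
}
```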

OpenAPI / Scalar API Reference — `/api/reference`

Every API endpoint is documented via OpenAPI 3.1 and served through a Scalar interactive UI. This is more than documentation: it is a live, testable interface for every route in the system.

Open Graph & Social Meta

For every content URL, the server injects Open Graph and Twitter Card metadata into the HTML <head>. This happens on the server before the SPA loads, so crawlers and social platforms always get the right preview.

Source: SPA injection → renderer.ts, Posts → db-post-exports.ts, Pages XHTML → pages-rich-html.ts, Pages HTML → pages-html.ts

What gets injected

Meta Tag	Source
`og:title`	Page title or post title with author attribution
`og:description`	Page description, extracted from content, or auto-generated fallback
`og:image`	First photo card, gallery image, or markdown image — resolved through the image optimization proxy
`og:type`	`article` for pages/posts, `product` for product pages
`og:url`	Canonical URL
`twitter:card`	`summary_large_image` (when image is available)
`twitter:title`	Same as `og:title`
`twitter:image`	Same as `og:image`
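
The table above maps to a small set of `<meta>` tags. The sketch below shows one plausible way to build and inject them; `SocialMeta`, `buildSocialMeta`, and `injectIntoHead` are hypothetical names, and the real renderer may work differently.

```typescript
interface SocialMeta {
  title: string;
  description: string;
  url: string;
  image?: string;
  isProduct?: boolean;
}

function escapeAttr(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/"/g, "&quot;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

function buildSocialMeta(meta: SocialMeta): string {
  const tags = [
    `<meta property="og:title" content="${escapeAttr(meta.title)}" />`,
    `<meta property="og:description" content="${escapeAttr(meta.description)}" />`,
    `<meta property="og:type" content="${meta.isProduct ? "product" : "article"}" />`,
    `<meta property="og:url" content="${escapeAttr(meta.url)}" />`,
    `<meta name="twitter:title" content="${escapeAttr(meta.title)}" />`,
  ];
  if (meta.image) {
    // The large-image card is only emitted when an image is available.
    tags.push(`<meta property="og:image" content="${escapeAttr(meta.image)}" />`);
    tags.push(`<meta name="twitter:card" content="summary_large_image" />`);
    tags.push(`<meta name="twitter:image" content="${escapeAttr(meta.image)}" />`);
  }
  return tags.join("\n");
}

// A function replacer keeps "$" sequences in titles from being treated as patterns.
function injectIntoHead(html: string, tags: string): string {
  return html.replace("</head>", () => `${tags}\n</head>`);
}
```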

Image resolution priority

The system walks the content tree to find the best display image:

Photo Card widget — highest priority, uses picture ID for resolution
Gallery widget — uses first image from the gallery
Explicit image widget — direct image URL
Markdown image — extracted from inline markdown ![](url)
Page meta thumbnail — fallback from page metadata
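
The priority order above can be expressed as a ranked search over the content tree. The sketch below is an interpretation, not the platform's code: the node shape (`TreeNode`), the `pictureUrl` resolver, and the rule that the best-ranked widget anywhere in the tree wins (with document order breaking ties) are all assumptions.

```typescript
// Hypothetical node shape; widget type names are illustrative.
interface TreeNode {
  type: "photo-card" | "gallery" | "image" | "markdown" | "container";
  pictureId?: string;
  images?: string[];
  url?: string;
  text?: string;
  children?: TreeNode[];
}

interface ImageCandidate {
  rank: number; // lower rank means higher priority
  url: string;
}

const MARKDOWN_IMAGE = /!\[[^\]]*\]\(([^)\s]+)/;

function candidateFor(node: TreeNode, pictureUrl: (id: string) => string): ImageCandidate | null {
  switch (node.type) {
    case "photo-card":
      return node.pictureId ? { rank: 0, url: pictureUrl(node.pictureId) } : null;
    case "gallery": {
      const first = node.images?.[0];
      return first ? { rank: 1, url: first } : null;
    }
    case "image":
      return node.url ? { rank: 2, url: node.url } : null;
    case "markdown": {
      const found = node.text ? MARKDOWN_IMAGE.exec(node.text)?.[1] : undefined;
      return found ? { rank: 3, url: found } : null;
    }
    default:
      return null;
  }
}

function collectCandidates(node: TreeNode, pictureUrl: (id: string) => string, out: ImageCandidate[]): void {
  const c = candidateFor(node, pictureUrl);
  if (c) out.push(c);
  for (const child of node.children ?? []) collectCandidates(child, pictureUrl, out);
}

// Falls back to the page meta thumbnail when no widget provides an image.
function resolveDisplayImage(
  root: TreeNode,
  metaThumbnail: string | undefined,
  pictureUrl: (id: string) => string
): string | undefined {
  const found: ImageCandidate[] = [];
  collectCandidates(root, pictureUrl, found);
  if (found.length === 0) return metaThumbnail;
  return found.reduce((best, c) => (c.rank < best.rank ? c : best)).url;
}
```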

All images are proxied through the image optimization service (see below) to ensure optimal dimensions and format for social previews.

Home Page

The home page (/) gets its own meta injection using site config from app-config.json, with optional override from the _site/home system page. This includes full JSON-LD with WebSite and Organization schemas, plus a SearchAction for sitelinks search box.

JSON-LD Structured Data

Polymech generates context-appropriate JSON-LD structured data for every content type:

Posts → `SocialMediaPosting`

{
  "@context": "https://schema.org",
  "@type": "SocialMediaPosting",
  "headline": "Post Title",
  "image": ["https://...optimized.jpg"],
  "datePublished": "2025-03-01T12:00:00Z",
  "author": {
    "@type": "Person",
    "name": "Author Name"
  }
}

Pages → `Article`

{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Page Title by Author | PolyMech",
  "author": { "@type": "Person", "name": "Author" },
  "description": "...",
  "image": "https://..."
}

Product Pages → `Product` with `Offer`

When a page belongs to a products category, the structured data automatically switches to the Product schema with pricing:

{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Product Name",
  "description": "...",
  "image": "https://...",
  "category": "Products > Subcategory",
  "offers": {
    "@type": "Offer",
    "price": "29.99",
    "priceCurrency": "EUR",
    "availability": "https://schema.org/InStock",
    "itemCondition": "https://schema.org/NewCondition"
  }
}

Price, currency, condition, and availability are resolved from the type system / page variables — no manual JSON-LD editing needed.
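
As a sketch of how resolved variables might be mapped onto the Product schema shown above: the `ProductVars` shape and `productJsonLd` are hypothetical, while the schema.org URLs for availability and condition are the standard enumeration values.

```typescript
// Hypothetical shape of the values resolved from the type system and page variables.
interface ProductVars {
  name: string;
  description: string;
  image: string;
  categoryPath: string[];
  price: number;
  currency: string;
  inStock: boolean;
  condition: "new" | "used" | "refurbished";
}

const CONDITION_URLS: Record<ProductVars["condition"], string> = {
  new: "https://schema.org/NewCondition",
  used: "https://schema.org/UsedCondition",
  refurbished: "https://schema.org/RefurbishedCondition",
};

function productJsonLd(p: ProductVars): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "Product",
    name: p.name,
    description: p.description,
    image: p.image,
    category: p.categoryPath.join(" > "),
    offers: {
      "@type": "Offer",
      price: p.price.toFixed(2), // schema.org accepts the price as a string
      priceCurrency: p.currency,
      availability: p.inStock ? "https://schema.org/InStock" : "https://schema.org/OutOfStock",
      itemCondition: CONDITION_URLS[p.condition],
    },
  };
}

console.log(JSON.stringify(productJsonLd({
  name: "Product Name",
  description: "...",
  image: "https://example.com/product.jpg",
  categoryPath: ["Products", "Subcategory"],
  price: 29.99,
  currency: "EUR",
  inStock: true,
  condition: "new",
}), null, 2));
```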

Home Page → `WebSite` + `Organization`

{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "WebSite",
      "name": "PolyMech",
      "url": "https://polymech.info",
      "potentialAction": {
        "@type": "SearchAction",
        "target": "https://polymech.info/search?q={search_term_string}",
        "query-input": "required name=search_term_string"
      }
    },
    {
      "@type": "Organization",
      "name": "Polymech",
      "url": "https://polymech.info",
      "logo": "https://..."
    }
  ]
}

Server-Side Rendering & Initial State Injection

Polymech is a React SPA, but it doesn't sacrifice SEO for interactivity. The server pre-fetches data and injects it into the HTML before sending it to the client:

Source: Home/post/embed injection → index.ts, Embed pages → content.ts, Profile injection → db-user.ts

Home page (/): Feed data and site home page content are fetched in parallel and injected as window.__INITIAL_STATE__
Post pages (/post/:id): Post metadata is resolved and injected as OG/Twitter/JSON-LD meta tags
User pages (/user/:id/pages/:slug): Page content, author profile, category paths, and meta image are all resolved server-side

This means:

Google sees a fully populated <head> with title, description, image, and structured data
Social platforms (Facebook, Twitter, LinkedIn, Discord, Slack) render rich link previews immediately
The React app hydrates instantly without a loading spinner — the data is already there
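
Injecting state into HTML needs care, because a string such as `</script>` inside the data would otherwise end the script element early. The following sketch shows one common way to serialize the state safely; the function names are hypothetical and the platform's own injection code may differ.

```typescript
// Escape characters that are valid in JSON but unsafe inside an inline <script>.
function serializeForScript(value: Record<string, unknown>): string {
  return JSON.stringify(value)
    .replace(/</g, "\\u003c")
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

function initialStateScript(state: Record<string, unknown>): string {
  return `<script>window.__INITIAL_STATE__ = ${serializeForScript(state)};</script>`;
}

// Hypothetical shell injection: place the script right before </body>.
function injectInitialState(html: string, state: Record<string, unknown>): string {
  return html.replace("</body>", () => `${initialStateScript(state)}\n</body>`);
}
```

On the client, the app can then read `window.__INITIAL_STATE__` before issuing any request and only fall back to the API when a key is missing.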

Responsive Image Optimization

Every image served through Polymech's SEO routes is automatically optimized:

Source: db-pictures.ts · html-generator.ts

Format negotiation: Images are served in modern formats (AVIF, WebP) with JPEG fallback
Responsive srcsets: Multiple size variants (320w, 640w, 1024w) are pre-generated and cached on disk
Aspect-ratio preservation: Height is calculated from source metadata to prevent layout shift
LCP optimization: The first image in any export gets fetchpriority="high", subsequent images get loading="lazy"
Edge caching: Optimized variants are served from /api/images/cache/ after first generation

The XHTML exports use <img> tags with proper loading and fetchpriority attributes. The RSS and Merchant feeds use the image proxy URLs for optimized product images at 1200px width.
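
The attributes described above can be illustrated with a small tag builder. This is a sketch under assumptions: `ExportImage`, `responsiveImgTag`, and the `variantUrl` callback are hypothetical, and the actual cache URL scheme is not shown in this document; only the three widths and the `fetchpriority` / `loading` rule come from the text.

```typescript
const VARIANT_WIDTHS = [320, 640, 1024];

interface ExportImage {
  alt: string;
  sourceWidth: number;
  sourceHeight: number;
  // Hypothetical: maps a target width to the URL of the cached variant.
  variantUrl: (width: number) => string;
}

function escapeAttrValue(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

function responsiveImgTag(image: ExportImage, index: number): string {
  const width = Math.max(...VARIANT_WIDTHS);
  // Height follows the source aspect ratio so the browser can reserve space.
  const height = Math.round((width * image.sourceHeight) / image.sourceWidth);
  const srcset = VARIANT_WIDTHS.map((w) => `${image.variantUrl(w)} ${w}w`).join(", ");
  // The first image is the likely LCP element; the rest load lazily.
  const priority = index === 0 ? 'fetchpriority="high"' : 'loading="lazy"';
  return (
    `<img src="${escapeAttrValue(image.variantUrl(width))}" srcset="${escapeAttrValue(srcset)}" ` +
    `width="${width}" height="${height}" alt="${escapeAttrValue(image.alt)}" ${priority} />`
  );
}
```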

Internationalization (i18n)

Polymech's SEO features are fully i18n-aware, all the way down to the widget level.

Source: pages-i18n.ts · db-i18n.ts

How it works

Widget-level translations — Each widget in a page (markdown text, photo cards, tabs, etc.) can have its content translated to any language. Translations are stored per widget_id + prop_path + target_lang.
Page meta translations — Title and description can be translated using a special __meta__ sentinel in the translations table.
Feed translations — The home feed widget in XHTML exports translates page titles and descriptions when a ?lang=xx parameter is provided.

Where i18n applies

Feature	i18n Support
XHTML page export	✅ `?lang=de` translates all widget content, title, and description
XHTML rich HTML export	✅ Feed items within home widgets are translated
HTML meta injection	✅ Translated title/description used for OG tags
Markdown export	✅ Widget content translated before Markdown conversion
Email export	✅ Full widget translation applied before email rendering
RSS feed	Pages in feed use translated descriptions
Sitemap	URLs point to canonical (untranslated) versions
llms.txt	Currently English only (descriptions from source content)

Usage

Append ?lang=xx to any page export URL:

/user/admin/pages/about.xhtml?lang=de     → German rich HTML
/user/admin/pages/about.md?lang=fr         → French Markdown
/user/admin/pages/about.email.html?lang=es → Spanish email

Translation management is handled through the platform's built-in glossary system and widget translation API, with AI-assisted translation support.

Embeddable Content

Posts and pages can be embedded in external sites via iframe using the embed routes: → content.ts

/embed/:postId     → Embeddable post viewer
/embed/page/:pageId → Embeddable page viewer

Embed pages are served with injected initial state (no API call needed on load) and include proper meta for social previews when the embed URL itself is shared.

API-First Architecture

All SEO endpoints are part of the OpenAPI 3.1 spec and documented at /api/reference. This means:

Source: Route definitions → routes.ts, Product registration → index.ts

Every route has proper request/response schemas
Rate limiting and caching headers are standardized
Third-party tools (Zapier, n8n, custom scripts) can programmatically access all content
The API is browsable and testable through the interactive Scalar UI

Relevant data endpoints

Endpoint	Description
`GET /api/posts/:id`	Full post data with pictures, responsive variants, and video job status
`GET /api/user-page/:identifier/:slug`	Full page data with content tree, profile, and metadata
`GET /api/feed`	Paginated feed with category filtering, sorting, and user-specific likes
`GET /api/profiles?ids=...`	Batch user profile lookup
`GET /api/media-items?ids=...`	Batch media item lookup with responsive image generation
`GET /api/serving/site-info?url=...`	Extract OG/JSON-LD metadata from any external URL → site-info.ts
`GET /api/search?q=...`	Full-text search across posts and pages → db-search.ts

Route Reference

Content Exports

Route	Method	Description
`/post/:id.xhtml`	GET	Post as standalone rich HTML
`/post/:id.pdf`	GET	Post as PDF
`/post/:id.md`	GET	Post as Markdown
`/post/:id.json`	GET	Post as JSON
`/user/:id/pages/:slug.xhtml`	GET	Page as standalone rich HTML
`/user/:id/pages/:slug.html`	GET	Page with OG meta injection
`/user/:id/pages/:slug.pdf`	GET	Page as PDF
`/user/:id/pages/:slug.md`	GET	Page as Markdown
`/user/:id/pages/:slug.json`	GET	Page as JSON
`/user/:id/pages/:slug.email.html`	GET	Page as email-optimized HTML

Discovery & Feeds

Route	Method	Description
`/feed.xml`	GET	RSS 2.0 feed
`/products.xml`	GET	Google Merchant XML feed
`/sitemap-en.xml`	GET	XML Sitemap
`/llms.txt`	GET	LLM-readable site summary
`/llms.md`	GET	LLM summary (Markdown content-type)
`/api/reference`	GET	Interactive OpenAPI documentation

Meta Injection

Route	Method	Description
`/`	GET	Home page with feed injection + WebSite/Organization JSON-LD
`/post/:id`	GET	Post page with OG/Twitter/JSON-LD injection
`/user/:id/pages/:slug`	GET	Page with OG/Twitter meta injection
`/embed/:id`	GET	Embeddable post with initial state
`/embed/page/:id`	GET	Embeddable page with initial state

Developer Experience

Polymech isn't just SEO-friendly for end users — it's built to be a joy for developers integrating with or extending the platform.

Source: Server entry point → index.ts · routes.ts

OpenAPI 3.1 Specification — `/doc`

The entire API is described by a machine-readable OpenAPI 3.1 spec served at /doc. Every route — from feed endpoints to image uploads to page CRUD — is fully typed with Zod schemas that auto-generate the spec. No hand-written YAML, no drift between code and docs.

GET /doc → OpenAPI 3.1 JSON spec

This spec can be imported directly into Postman, Insomnia, or any OpenAPI-compatible tool for instant client generation.

Swagger UI — `/ui`

Classic Swagger UI is available at /ui for developers who prefer the traditional interactive API explorer. It connects to the same live OpenAPI spec:

Try-it-out for every endpoint
Request/response schema visualization
Bearer token authentication built in
Auto-generated curl commands

Scalar API Reference — `/reference` & `/api/reference`

Scalar provides a modern, polished alternative to Swagger UI. Polymech serves it at both /reference and /api/reference:

Beautiful, searchable interface — grouped by tag (Serving, Posts, Media, Storage, etc.)
Pre-authenticated — Bearer token auto-filled from SCALAR_AUTH_TOKEN env var
Live request testing — send requests directly from the browser with real responses
Code generation — copy-paste ready snippets in curl, JavaScript, Python, Go, and more
Dark mode — because of course

Modular Product Architecture

The server is organized as a registry of Products — self-contained modules that each own their routes, handlers, workers, and lifecycle:

Product	Description
Serving	Content delivery, SEO, feeds, exports, meta injection
Images	Upload, optimization, proxy, responsive variant generation
Videos	Upload, transcoding (HLS), thumbnail extraction
Email	Page-to-email rendering, SMTP delivery, template management
Storage	Virtual file system with ACL, mounts, and glob queries
OpenAI	AI chat, image generation, markdown tools
Analytics	Request tracking, geo-lookup, real-time streaming
Ecommerce	Cart, checkout, payment integration

Each product registers its own OpenAPI routes via app.openapi(route, handler), so the spec always reflects exactly what's deployed. Adding a new product automatically exposes it in Swagger, Scalar, and /doc.

Zod-Powered Schema Validation

All request and response schemas are defined with Zod using @hono/zod-openapi. This gives you:

Runtime validation — invalid requests are rejected with structured error messages before hitting business logic
Type safety — TypeScript types are inferred from schemas, zero manual type definitions
Auto-docs — Zod schemas feed directly into the OpenAPI spec with examples and descriptions
Composability — shared schemas (e.g., pagination, media items) are reused across products

Background Job Queue (PgBoss)

Long-running tasks (video transcoding, email sending, cache warming) are managed through PgBoss, a PostgreSQL-backed job queue:

Jobs are submittable via API: POST /api/boss/job
Job status is queryable: GET /api/boss/job/:id
Jobs can be cancelled, resumed, completed, or failed via dedicated endpoints
Workers auto-register on startup and process jobs in the background

Real-Time Log Streaming

System logs and analytics are streamable in real-time via SSE (Server-Sent Events):

GET /api/logs/system/stream   → Live system logs
GET /api/analytics/stream     → Live request analytics

This makes debugging in staging or production trivial — just open the stream in a browser tab or curl.
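
For consumers that read the stream outside a browser, the Server-Sent Events wire format is simple to parse: events are separated by a blank line and payload lines start with `data:`. The sketch below handles a complete text buffer only; `parseSseData` is a hypothetical helper, and the shape of the log payloads themselves is not specified here, so they are returned as raw strings.

```typescript
// Parse an SSE text buffer into the data payload of each event.
// Per the SSE format, multiple data lines in one event are joined with "\n"
// and a single space after the colon is dropped.
function parseSseData(buffer: string): string[] {
  const events: string[] = [];
  for (const block of buffer.split(/\r?\n\r?\n/)) {
    const dataLines = block
      .split(/\r?\n/)
      .filter((line) => line.indexOf("data:") === 0)
      .map((line) => line.slice("data:".length).replace(/^ /, ""));
    if (dataLines.length > 0) events.push(dataLines.join("\n"));
  }
  return events;
}

console.log(parseSseData("data: first\n\ndata: second\ndata: line\n\n"));
```

A streaming client would keep the trailing incomplete block in a buffer and only parse up to the last blank line.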

WebSocket Support

When ENABLE_WEBSOCKETS=true, the server initializes a WebSocket manager for real-time features like live feed updates and collaborative editing notifications.

Security & Middleware Stack

The server applies a layered middleware stack to all routes: → see security.md

Source: auth.ts · analytics.ts · rateLimiter.ts · blocklist.ts

Layer	Description
CORS	Fully permissive for API consumption from any origin
Analytics	Request tracking with IP resolution and geo-lookup
Auth	Optional JWT-based authentication via `Authorization: Bearer` header
Admin	Role-based access control for admin-only endpoints
Compression	Brotli/gzip compression on all responses
Secure Headers	CSP, X-Frame-Options (permissive for embeds), CORP disabled for cross-origin media
Rate Limiting	Configurable per-route rate limiting (disabled by default)

Client-Side SEO & Performance

The React SPA contributes to SEO through smart hydration, code splitting, and i18n support.

Source: App.tsx · i18n.tsx · formatDetection.ts

HelmetProvider — Dynamic `<head>` Management

The app is wrapped in react-helmet-async's <HelmetProvider>, enabling any component to dynamically inject <title>, <meta>, and <link> tags into the document head. This complements the server-side meta injection — the server provides OG/Twitter tags for crawlers, while Helmet handles client-side navigation.

Route-Based Code Splitting

25+ routes use React.lazy() for on-demand loading, keeping the initial bundle small for faster First Contentful Paint:

Eagerly loaded (in initial bundle): Index, Auth, Profile, UserProfile, TagPage, SearchResults — the high-traffic, SEO-critical pages
Lazy loaded: Post, UserPage, Wizard, AdminPage, all playground routes, FileBrowser, Tetris, ecommerce routes

This split ensures that unauthenticated, view-only visitors (including crawlers) get the fastest possible load time.

Initial State Hydration

The client reads window.__INITIAL_STATE__ injected by the server (see Server-Side Rendering) to avoid waterfall API calls on first load. This covers:

feed — Home page feed data
siteHomePage — Home page CMS content
profile — User profile on /user/:id pages

Client-Side i18n — Language Detection & `<T>` Component

Source: i18n.tsx · JSON translations in src/i18n/*.json

The <T> component wraps translatable strings and resolves them against per-language JSON dictionaries. Language is determined via a cascading priority chain:

URL parameter (?lang=de) — highest priority, enables shareable translated links
Cookie (lang=de) — persists across navigation, set when URL param is used
Browser language (navigator.languages) — automatic fallback
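
The cascade above can be sketched as a pure function that takes the three inputs explicitly, which also makes it easy to test outside a browser. The function name, its parameters, and the `"en"` fallback are assumptions made for illustration.

```typescript
function detectLanguage(
  search: string,                       // e.g. window.location.search
  cookie: string,                       // e.g. document.cookie
  browserLanguages: readonly string[],  // e.g. navigator.languages
  supported: readonly string[],
  fallback = "en"                       // assumption: English is the default
): string {
  const fromUrl = /[?&]lang=([^&#]+)/.exec(search)?.[1];
  const fromCookie = /(?:^|;\s*)lang=([^;]+)/.exec(cookie)?.[1];
  // Order encodes the priority: URL, then cookie, then browser preferences.
  const candidates = [fromUrl, fromCookie, ...browserLanguages];
  for (const raw of candidates) {
    if (!raw) continue;
    const primary = raw.toLowerCase().split("-")[0]; // "de-AT" -> "de"
    if (primary && supported.indexOf(primary) !== -1) return primary;
  }
  return fallback;
}

console.log(detectLanguage("?lang=de", "lang=fr", ["es-ES"], ["en", "de", "fr", "es"]));
```

An unsupported value at one level simply falls through to the next level instead of failing.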

Supported languages: English, Français, Kiswahili, Deutsch, Español, Nederlands, 日本語, 한국어, Português, Русский, Türkçe, 中文

Translation dictionaries are loaded eagerly via Vite's import.meta.glob for instant availability. Missing keys auto-collect into localStorage for dictionary building (downloadTranslations() exports them as JSON).

Format Detection

On app boot, initFormatDetection() probes browser support for modern image formats (AVIF, WebP). This informs the responsive image system which <source> elements to include in <picture> tags, ensuring optimal Core Web Vitals scores.

Summary

Polymech treats SEO as a core platform feature, not an afterthought. Every content entity is automatically:

Discoverable — via sitemap, RSS, merchant feed, and LLM endpoints
Previewable — with Open Graph, Twitter Cards, and JSON-LD for rich social sharing
Exportable — in 6+ formats (XHTML, HTML, PDF, Markdown, JSON, Email)
Translatable — with widget-level i18n that flows through all export formats
Optimized — with responsive images, lazy loading, LCP prioritization, and edge caching
Programmable — with a full OpenAPI spec and interactive documentation

All of this works out of the box. No configuration needed.

TODO — Pending Improvements

Critical

Canonical URLs — Add <link rel="canonical"> to all XHTML/HTML exports and SPA pages to prevent duplicate content penalties across .xhtml, .html, and SPA routes
robots.txt — Serve a dynamic robots.txt at the root with sitemap references and crawl-delay directives. Currently missing entirely
Hreflang tags — Add <link rel="alternate" hreflang="..."> tags to multi-language pages so search engines serve the correct language variant per region
Meta description per page — Pages and posts currently inherit a generic description. Wire the post description / page meta.description field into the <meta name="description"> tag

High Priority

Structured data expansion — Add BreadcrumbList schema for page navigation paths and WebSite schema with SearchAction for sitelinks search box
[-] Sitemap pagination — Current sitemap is a single XML file. For large catalogs (1000+ products), split into sitemap index + per-entity sitemaps (sitemap-posts.xml, sitemap-pages.xml, sitemap-products.xml)
Last-modified headers — Set Last-Modified and ETag on all content routes (posts, pages, feeds) to support conditional requests and improve crawler efficiency
Dynamic OG images — Auto-generate Open Graph images for pages/posts that don't have a cover image, using title + brand overlay
JSON-LD for products — Add Product schema with offers, aggregateRating, and brand to product pages for rich shopping results

Medium Priority

[-] AMP pages — Generate AMP-compliant HTML exports for posts to enable AMP carousel in Google mobile search
RSS per-user feeds — Currently only a global /feed.xml. Add per-user feeds at /user/:id/feed.xml so individual creators can be subscribed to
Merchant feed i18n — Product feed currently exports in the default language. Generate per-locale feeds (/products-de.xml, /products-fr.xml) using the i18n translation system
Preconnect / DNS-prefetch hints — Add <link rel="preconnect"> for known external domains (CDN, image proxy, analytics) in the SPA shell
llms.txt expansion — Current llms.txt covers posts. Extend to include pages, products, and user profiles for broader AI agent discovery → content.ts
WebSub / PubSubHubbub — Add <link rel="hub"> to RSS feeds and implement WebSub pings on content publish for real-time feed reader updates

Low Priority / Nice-to-Have

Core Web Vitals monitoring — Integrate the CrUX API or the web-vitals library to track LCP, INP (which replaced FID as a Core Web Vital in 2024), and CLS, and surface the results in the analytics dashboard
Schema.org FAQ / HowTo — Auto-detect FAQ-style and tutorial page content and inject corresponding structured data
Twitter Cards validation — Add twitter:site and twitter:creator meta tags from user profiles for proper attribution
Video schema — Add VideoObject JSON-LD for posts containing video media items
IndexNow — Implement IndexNow API pings to Bing/Yandex on content publish for near-instant indexing

AEO — Answer Engine Optimization

Optimize content to be cited as direct answers by AI answer engines (Google AI Overviews, Bing Copilot, Perplexity, ChatGPT).

Answer-first content blocks — In XHTML/HTML exports, structure pages with concise 40-60 word answer summaries at the top of each section, before the detailed explanation. AI engines pull individual passages — clarity wins
FAQPage schema injection — Auto-detect Q&A patterns in page widgets (heading + paragraph pairs) and inject FAQPage JSON-LD. This is the #1 schema type cited by answer engines
QAPage schema for posts — When a post title is phrased as a question, wrap the body in QAPage structured data with acceptedAnswer
Text fragment identifiers — Add #:~:text= fragment links in sitemaps and llms.txt to guide AI engines to the most relevant passage in long-form pages
Featured snippet optimization — Ensure XHTML exports use <table>, <ol>, and <dl> for comparison content, definitions, and step-by-step guides — these are the formats Google AI Overview pulls from
Concise <meta name="description"> per section — For long pages with multiple sections, consider generating per-section meta descriptions via anchor-targeted structured data

GEO — Generative Engine Optimization

Optimize content to be referenced and summarized by generative AI systems (ChatGPT, Gemini, Claude, Perplexity).

Entity authority via JSON-LD — Add Organization, Person, and WebSite schema with consistent @id URIs across all pages. AI models use entity graphs to determine source authority
E-E-A-T signals — Inject author schema with credentials, link to author profile pages, and add datePublished / dateModified to all content. Generative engines weight experience and freshness
Comparison and "X vs Y" pages — Create comparison page templates that AI systems frequently pull from when users ask evaluative questions
Fact-dense content markers — Add ClaimReview or Dataset schema where applicable. AI models prioritize statistically-backed and verifiable claims
Citation-optimized exports — In Markdown and JSON exports, include source_url, author, published_date, and license fields so AI systems can properly attribute when citing
AI Share of Voice tracking — Track brand mentions across ChatGPT, Perplexity, and Google AI Overviews to measure GEO effectiveness. Consider building an internal monitoring endpoint or integrating third-party tools

AI Crawler Management

Control and optimize how AI training bots and inference crawlers interact with the platform.

Dynamic robots.txt with AI directives — Serve a robots.txt that explicitly manages AI crawlers: allow GPTBot, ClaudeBot, PerplexityBot on content routes, but disallow on admin/API routes. Consider Google-Extended for training opt-in/out
llms.txt v2 — Expand current llms.txt beyond posts to include: pages with summaries, product catalog overview, author profiles, and a structured capability description. Follow the emerging llms.txt spec with Markdown formatting
llms-full.txt — Generate a comprehensive full-content version at /llms-full.txt with all page content flattened into Markdown for deep AI ingestion
AI crawler rate limiting — Apply custom rate limits for known AI user agents (GPTBot, ClaudeBot, CCBot, PerplexityBot) to prevent content scraping from overloading the server while still allowing indexing
AI access analytics — Track and surface AI bot traffic separately in the analytics dashboard: which bots, how often, which routes, and bandwidth consumed. Use the existing user-agent parsing in analytics.ts
Structured content API for AI — Create a dedicated /api/content endpoint that returns semantically structured content (title, sections, facts, entities) optimized for LLM consumption, distinct from the user-facing API
IETF AI Preferences compliance — Monitor the IETF "AI Preferences Working Group" (launched 2025) for the standardized machine-readable AI access rules spec. Implement when finalized — will likely supersede or extend robots.txt for AI

AI-Native Content Formats

Markdown-first content pipeline — Ensure all page widgets can export clean, semantic Markdown. This is the preferred format for LLM ingestion and is used by llms.txt, llms-full.txt, and AI-friendly feeds
Structured knowledge base export — Generate a /knowledge.json endpoint that exports the entire content catalog as a structured knowledge graph (entities, relationships, facts) for RAG pipelines and enterprise AI integrations
MCP (Model Context Protocol) server — Expose platform content as an MCP resource so AI assistants (Claude, Cursor, etc.) can directly query posts, pages, and products as context — leveraging the existing REST API as the backend
AI-friendly RSS — Extend RSS feed items with full content (not just excerpts), structured metadata, and <media:content> tags so AI feed consumers get complete context without needing to crawl
