# SEO & Discoverability on Polymech

Polymech is built as an SEO-first platform. Every piece of content, whether it's a media post, a CMS page, or a product listing, is automatically discoverable by search engines, social platforms, AI agents, and feed readers. No plugins, no external services, no per-page SEO setup. It's all baked in.

This document covers every SEO-related feature the platform offers.

---

## Table of Contents

- [Multi-Format Content Export](#multi-format-content-export)
- [Discovery Endpoints](#discovery-endpoints)
- [Open Graph & Social Meta](#open-graph--social-meta)
- [JSON-LD Structured Data](#json-ld-structured-data)
- [Server-Side Rendering & Initial State Injection](#server-side-rendering--initial-state-injection)
- [Responsive Image Optimization](#responsive-image-optimization)
- [Internationalization (i18n)](#internationalization-i18n)
- [Embeddable Content](#embeddable-content)
- [API-First Architecture](#api-first-architecture)
- [Route Reference](#route-reference)
- [Developer Experience](#developer-experience)
- [Client-Side SEO & Performance](#client-side-seo--performance)

---

## Multi-Format Content Export

Every content entity on Polymech (posts and pages) can be exported in multiple formats by changing the file extension in the URL. No API keys, no special headers: just append the extension.

> **Source:** Page exports → [pages-routes.ts](../server/src/products/serving/pages/pages-routes.ts), Post exports → [db-post-exports.ts](../server/src/products/serving/db/db-post-exports.ts)

### Pages

> [pages-rich-html.ts](../server/src/products/serving/pages/pages-rich-html.ts) · [pages-html.ts](../server/src/products/serving/pages/pages-html.ts) · [pages-pdf.ts](../server/src/products/serving/pages/pages-pdf.ts) · [pages-markdown.ts](../server/src/products/serving/pages/pages-markdown.ts) · [pages-email.ts](../server/src/products/serving/pages/pages-email.ts) · [pages-data.ts](../server/src/products/serving/pages/pages-data.ts)

Pages are rich, widget-based documents built with a visual editor. They export to:

| Format | URL Pattern | Content-Type | Description |
|--------|-------------|--------------|-------------|
| **XHTML** | `/user/:id/pages/:slug.xhtml` | `text/html` | Standalone rich HTML with Tailwind CSS styling, full meta tags, JSON-LD, and responsive layout. Ready to share or archive. |
| **HTML** | `/user/:id/pages/:slug.html` | `text/html` | SPA shell with injected Open Graph metadata for crawlers and social previews. |
| **PDF** | `/user/:id/pages/:slug.pdf` | `application/pdf` | Print-ready PDF export. Great for invoices, reports, or offline sharing. |
| **Markdown** | `/user/:id/pages/:slug.md` | `text/markdown` | Clean Markdown export of the page content. Useful for migration, backups, or feeding to other systems. |
| **JSON** | `/user/:id/pages/:slug.json` | `application/json` | Raw page data including content tree, metadata, and author profile. Perfect for headless CMS integrations. |
| **Email HTML** | `/user/:id/pages/:slug.email.html` | `text/html` | Email-client-optimized HTML with inlined styles and table-based layout. Compatible with Outlook, Gmail, Apple Mail, and others. |

### Posts

> [db-post-exports.ts](../server/src/products/serving/db/db-post-exports.ts) · [db-posts.ts](../server/src/products/serving/db/db-posts.ts)

Posts are media-centric entries (photos, videos, link cards). They export to:

| Format | URL Pattern | Content-Type | Description |
|--------|-------------|--------------|-------------|
| **XHTML** | `/post/:id.xhtml` | `text/html` | Standalone rich HTML with Tailwind CSS, responsive image gallery, OG meta, and JSON-LD structured data. |
| **PDF** | `/post/:id.pdf` | `application/pdf` | PDF export of the post with embedded images. |
| **Markdown** | `/post/:id.md` | `text/markdown` | Markdown with title, description, and linked images. |
| **JSON** | `/post/:id.json` | `application/json` | Full post data with pictures array and author profile. |

### How it works

The export system doesn't use templates or pre-rendered files. Each format is generated server-side, on the fly, from the same canonical content tree, which means:

- Exports are always up to date: no build step needed
- All formats share the same data pipeline: update once, export everywhere
- The widget-based content system is format-agnostic: markdown text, photo cards, galleries, tabs, and nested layouts all render correctly in every format
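
Because the export format is selected purely by URL suffix, a client can derive every export link from a post ID or a user/slug pair. The following is a minimal TypeScript sketch of that idea, not code from the Polymech repository: `pageExportUrl`, `postExportUrl`, and the two format lists are hypothetical helpers built from the URL patterns in the tables above, and the optional `lang` argument assumes the `?lang=xx` convention described under Internationalization.

```typescript
// Export suffixes taken from the format tables above.
const PAGE_FORMATS = ["xhtml", "html", "pdf", "md", "json", "email.html"] as const;
const POST_FORMATS = ["xhtml", "pdf", "md", "json"] as const;

type PageFormat = (typeof PAGE_FORMATS)[number];
type PostFormat = (typeof POST_FORMATS)[number];

// Hypothetical helper: builds `/user/:id/pages/:slug.<format>` on a given origin.
function pageExportUrl(
  base: string,
  userId: string,
  slug: string,
  format: PageFormat,
  lang?: string,
): string {
  const path = `/user/${encodeURIComponent(userId)}/pages/${encodeURIComponent(slug)}.${format}`;
  const url = new URL(path, base);
  if (lang) url.searchParams.set("lang", lang); // assumed to map to ?lang=xx
  return url.toString();
}

// Hypothetical helper: builds `/post/:id.<format>` on a given origin.
function postExportUrl(base: string, postId: string, format: PostFormat): string {
  return new URL(`/post/${encodeURIComponent(postId)}.${format}`, base).toString();
}

// Every export link for one page, keyed by format.
function allPageExports(base: string, userId: string, slug: string): Record<PageFormat, string> {
  const entries = PAGE_FORMATS.map((f) => [f, pageExportUrl(base, userId, slug, f)] as const);
  return Object.fromEntries(entries) as Record<PageFormat, string>;
}
```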

---

## Discovery Endpoints

> **Source:** [content.ts](../server/src/products/serving/content.ts) · [routes.ts](../server/src/products/serving/routes.ts)

### RSS Feed: `/feed.xml`

Standard RSS 2.0 feed of the latest posts and pages, generated by `handleGetFeedXml` in [content.ts](../server/src/products/serving/content.ts). It supports filtering by category via query parameters:

```
/feed.xml?categorySlugs=tutorials&limit=50&sortBy=latest
```

- Image enclosures with optimized proxy URLs
- Per-item author attribution
- Category filtering (by ID or slug, including descendants)
- Configurable sort order (`latest` or `top`)

### Google Merchant Feed: `/products.xml`

A Google Merchant Center compatible XML feed for products, generated by `handleGetMerchantFeed` in [content.ts](../server/src/products/serving/content.ts). It automatically includes only items with pricing data set through the type system:

```xml
<g:id>product-uuid</g:id>
<g:title>Product Name</g:title>
<g:price>29.99 EUR</g:price>
<g:product_type>Category > Subcategory</g:product_type>
<g:image_link>https://service.polymech.info/api/images/cache/optimized.jpg</g:image_link>
```

- Automatically resolves price, currency, and condition from the type system & page variables
- Full category path hierarchy
- Optimized product images via the image proxy
- All items link to their canonical page/post URL

### Sitemap: `/sitemap-en.xml`

Auto-generated XML sitemap of all public, visible pages, generated by `handleGetSitemap` in [content.ts](../server/src/products/serving/content.ts):

```xml
<url>
  <loc>https://polymech.info/user/username/pages/my-page</loc>
  <lastmod>2025-03-01T12:00:00.000Z</lastmod>
  <changefreq>weekly</changefreq>
  <priority>0.8</priority>
</url>
```

- Only includes public + visible pages (respects content visibility settings)
- Uses `updated_at` for accurate `<lastmod>` timestamps
- Ready to submit to Google Search Console, Bing Webmaster Tools, etc.
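
As a rough illustration of how such a sitemap can be assembled, here is a small TypeScript sketch. It is not the actual `handleGetSitemap` implementation: the `SitemapPage` shape, the `is_public` and `visible` flags, and the helper names are assumptions made for the example; only `updated_at`, the URL pattern, and the `weekly` / `0.8` values come from the description above.

```typescript
// Hypothetical page row; only `updated_at` is named in the text above.
interface SitemapPage {
  username: string;
  slug: string;
  updated_at: string; // ISO 8601 timestamp
  is_public: boolean; // assumed flag name
  visible: boolean;   // assumed flag name
}

// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Render one <url> element for a page.
function sitemapEntry(base: string, page: SitemapPage): string {
  const path = `/user/${encodeURIComponent(page.username)}/pages/${encodeURIComponent(page.slug)}`;
  const loc = new URL(path, base).toString();
  return [
    "<url>",
    `  <loc>${escapeXml(loc)}</loc>`,
    `  <lastmod>${new Date(page.updated_at).toISOString()}</lastmod>`,
    "  <changefreq>weekly</changefreq>",
    "  <priority>0.8</priority>",
    "</url>",
  ].join("\n");
}

// Keep only public, visible pages and wrap the entries in a <urlset>.
function buildSitemap(base: string, pages: SitemapPage[]): string {
  const entries = pages
    .filter((p) => p.is_public && p.visible)
    .map((p) => sitemapEntry(base, p));
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}
```

Filtering before rendering keeps the visibility rule in one place, so a hidden or private page cannot leak into the output through a formatting path.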

### LLM-Readable Content: `/llms.txt` & `/llms.md`

Following the emerging [llms.txt standard](https://llmstxt.org/), Polymech generates a machine-readable summary of the entire site at `/llms.txt` (and at `/llms.md` with a Markdown content type), via `handleGetLLMText` in [content.ts](../server/src/products/serving/content.ts):

```markdown
# Polymech

> A full-stack media platform...

## Pages

- [Getting Started](https://polymech.info/user/admin/pages/getting-started): Introduction to...
- [Product Catalog](https://polymech.info/user/admin/pages/catalog): Browse our...

## Posts

- [New Release](https://polymech.info/post/abc123) by admin: Announcing...

## Public API

- Post Details JSON: /api/posts/{id}
- Page XHTML Export: /user/{username}/pages/{slug}.xhtml
- RSS Feed: /feed.xml
- Sitemap: /sitemap-en.xml
```

This endpoint is designed for AI agents (ChatGPT, Claude, Perplexity, etc.) to quickly understand what the site contains and how to access it. It includes:

- Site description from `app-config.json`
- Top 20 public pages with links and descriptions
- Top 20 recent posts with author attribution
- Full public API reference with URL patterns

### OpenAPI / Scalar API Reference: `/api/reference`

Every API endpoint is described in the OpenAPI spec and served through an interactive Scalar UI. This isn't just documentation: it's a live, testable interface for every route in the system.

---

## Open Graph & Social Meta

Every content URL automatically gets proper Open Graph and Twitter Card metadata injected into the HTML `<head>`. This happens at the server level before the SPA loads, so crawlers and social platforms always get the right preview.

> **Source:** SPA injection → [renderer.ts](../server/src/products/serving/renderer.ts), Posts → [db-post-exports.ts](../server/src/products/serving/db/db-post-exports.ts), Pages XHTML → [pages-rich-html.ts](../server/src/products/serving/pages/pages-rich-html.ts), Pages HTML → [pages-html.ts](../server/src/products/serving/pages/pages-html.ts)

### What gets injected

| Meta Tag | Source |
|----------|--------|
| `og:title` | Page title or post title with author attribution |
| `og:description` | Page description, extracted from content, or auto-generated fallback |
| `og:image` | First photo card, gallery image, or markdown image, resolved through the image optimization proxy |
| `og:type` | `article` for pages/posts, `product` for product pages |
| `og:url` | Canonical URL |
| `twitter:card` | `summary_large_image` (when image is available) |
| `twitter:title` | Same as `og:title` |
| `twitter:image` | Same as `og:image` |

### Image resolution priority

The system walks the content tree to find the best display image:

1. **Photo Card widget**: highest priority, uses picture ID for resolution
2. **Gallery widget**: uses first image from the gallery
3. **Explicit image widget**: direct image URL
4. **Markdown image**: extracted from inline Markdown image syntax (`![alt](url)`)
5. **Page meta thumbnail**: fallback from page metadata

All images are proxied through the image optimization service (see [Responsive Image Optimization](#responsive-image-optimization)) to ensure optimal dimensions and format for social previews.
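
The priority order above can be sketched as a tree walk that tries each source in turn. This is an illustrative TypeScript sketch under stated assumptions, not the platform's code: the `Widget` shape, the widget type names (`photo-card`, `gallery`, `image`, `markdown`), and the prop names are all hypothetical.

```typescript
// Hypothetical widget shape; type and prop names are illustrative only.
interface Widget {
  type: string;
  props?: { pictureId?: string; images?: string[]; url?: string; markdown?: string };
  children?: Widget[];
}

type ImageSource = "photo-card" | "gallery" | "image" | "markdown" | "meta";

// `ref` is a picture ID for photo cards and a URL for every other source.
interface ImageRef {
  source: ImageSource;
  ref: string;
}

// Depth-first flattening of the nested content tree, in document order.
function flatten(widgets: Widget[]): Widget[] {
  return widgets.flatMap((w) => [w, ...flatten(w.children ?? [])]);
}

// Matches ![alt](url) or ![alt](url "title") and captures the URL.
const MARKDOWN_IMAGE = /!\[[^\]]*\]\(([^)\s]+)[^)]*\)/;

// One extractor per priority level, highest priority first.
const EXTRACTORS: Array<[ImageSource, (w: Widget) => string | undefined]> = [
  ["photo-card", (w) => (w.type === "photo-card" ? w.props?.pictureId : undefined)],
  ["gallery", (w) => (w.type === "gallery" ? w.props?.images?.[0] : undefined)],
  ["image", (w) => (w.type === "image" ? w.props?.url : undefined)],
  ["markdown", (w) => (w.type === "markdown" ? w.props?.markdown?.match(MARKDOWN_IMAGE)?.[1] : undefined)],
];

function resolveMetaImage(tree: Widget[], metaThumbnail?: string): ImageRef | undefined {
  const all = flatten(tree);
  for (const [source, extract] of EXTRACTORS) {
    for (const widget of all) {
      const ref = extract(widget);
      if (ref) return { source, ref };
    }
  }
  // Lowest priority: the thumbnail stored in page metadata, if any.
  return metaThumbnail ? { source: "meta", ref: metaThumbnail } : undefined;
}
```

Looping over priority levels on the outside and widgets on the inside means a gallery deep in the tree still wins over a markdown image that appears earlier in the document, which matches the ordering described above.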

### Home Page

The home page (`/`) gets its own meta injection using site config from `app-config.json`, with an optional override from the `_site/home` system page. This includes full JSON-LD with `WebSite` and `Organization` schemas, plus a `SearchAction` for the sitelinks search box.

---

## JSON-LD Structured Data

Polymech generates context-appropriate JSON-LD structured data for every content type:

### Posts → `SocialMediaPosting`

```json
{
  "@context": "https://schema.org",
  "@type": "SocialMediaPosting",
  "headline": "Post Title",
  "image": ["https://...optimized.jpg"],
  "datePublished": "2025-03-01T12:00:00Z",
  "author": {
    "@type": "Person",
    "name": "Author Name"
  }
}
```

### Pages → `Article`

```json
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Page Title by Author | PolyMech",
  "author": { "@type": "Person", "name": "Author" },
  "description": "...",
  "image": "https://..."
}
```

### Product Pages → `Product` with `Offer`

When a page belongs to a `products` category, the structured data automatically switches to the `Product` schema with pricing:

```json
{
  "@context": "https://schema.org",
  "@type": "Product",
  "name": "Product Name",
  "description": "...",
  "image": "https://...",
  "category": "Products > Subcategory",
  "offers": {
    "@type": "Offer",
    "price": "29.99",
    "priceCurrency": "EUR",
    "availability": "https://schema.org/InStock",
    "itemCondition": "https://schema.org/NewCondition"
  }
}
```

Price, currency, condition, and availability are resolved from the type system / page variables, so no manual JSON-LD editing is needed.
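
To make the mapping from page variables to the `Product` schema concrete, here is a hedged TypeScript sketch. The `ProductVars` and `PageSummary` shapes and their field names are assumptions for illustration; the real type system may expose these values differently. Only the output structure mirrors the JSON example above.

```typescript
// Hypothetical page-variable shape; field names are assumptions.
interface ProductVars {
  price?: number;
  currency?: string;
  condition?: "new" | "used" | "refurbished";
  inStock?: boolean;
}

// Hypothetical summary of the page fields used in the schema.
interface PageSummary {
  title: string;
  description: string;
  image?: string;
  categoryPath: string[];
}

// schema.org OfferItemCondition values for each supported condition.
const CONDITION = {
  new: "NewCondition",
  used: "UsedCondition",
  refurbished: "RefurbishedCondition",
} as const;

// Returns Product JSON-LD, or undefined when the page carries no pricing data.
function productJsonLd(page: PageSummary, vars: ProductVars): Record<string, unknown> | undefined {
  if (vars.price === undefined || !vars.currency) return undefined;
  return {
    "@context": "https://schema.org",
    "@type": "Product",
    name: page.title,
    description: page.description,
    ...(page.image ? { image: page.image } : {}),
    category: page.categoryPath.join(" > "),
    offers: {
      "@type": "Offer",
      price: vars.price.toFixed(2),
      priceCurrency: vars.currency,
      availability: `https://schema.org/${vars.inStock === false ? "OutOfStock" : "InStock"}`,
      itemCondition: `https://schema.org/${CONDITION[vars.condition ?? "new"]}`,
    },
  };
}
```

Returning `undefined` for pages without a price and currency lets the caller fall back to the `Article` schema rather than emitting an incomplete `Offer`.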

### Home Page → `WebSite` + `Organization`

```json
{
  "@context": "https://schema.org",
  "@graph": [
    {
      "@type": "WebSite",
      "name": "PolyMech",
      "url": "https://polymech.info",
      "potentialAction": {
        "@type": "SearchAction",
        "target": "https://polymech.info/search?q={search_term_string}",
        "query-input": "required name=search_term_string"
      }
    },
    {
      "@type": "Organization",
      "name": "Polymech",
      "url": "https://polymech.info",
      "logo": "https://..."
    }
  ]
}
```

---

## Server-Side Rendering & Initial State Injection

> **Source:** Home/post/embed injection → [index.ts](../server/src/products/serving/index.ts), Embed pages → [content.ts](../server/src/products/serving/content.ts), Profile injection → [db-user.ts](../server/src/products/serving/db/db-user.ts)

Polymech is a React SPA, but it doesn't sacrifice SEO for interactivity. The server pre-fetches data and injects it into the HTML before sending it to the client:

- **Home page** (`/`): Feed data and site home page content are fetched in parallel and injected as `window.__INITIAL_STATE__`
- **Post pages** (`/post/:id`): Post metadata is resolved and injected as OG/Twitter/JSON-LD meta tags
- **User pages** (`/user/:id/pages/:slug`): Page content, author profile, category paths, and meta image are all resolved server-side

This means:

- **Google** sees a fully populated `<head>` with title, description, image, and structured data
- **Social platforms** (Facebook, Twitter, LinkedIn, Discord, Slack) render rich link previews immediately
- **The React app** renders its first view from the injected state instead of showing a loading spinner while it waits for an API call
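
One detail worth illustrating is how state can be embedded in an inline script without letting content such as `</script>` break out of the tag. The sketch below shows one common way to do it in TypeScript. It assumes the state is JSON-serializable and that the script is placed just before `</head>`; both the placement and the helper names (`serializeState`, `injectInitialState`) are assumptions, not the server's actual implementation.

```typescript
// Serialize state for use inside an inline <script>. Characters that could end the
// script element or break JavaScript parsing are replaced with \uXXXX escapes,
// which JSON.parse and the JS engine both decode back to the original characters.
function serializeState(state: unknown): string {
  return JSON.stringify(state ?? null)
    .replace(/</g, "\\u003c")
    .replace(/>/g, "\\u003e")
    .replace(/&/g, "\\u0026")
    .replace(/\u2028/g, "\\u2028")
    .replace(/\u2029/g, "\\u2029");
}

// Insert the state script before the closing </head> tag of the SPA shell.
function injectInitialState(html: string, state: unknown): string {
  const script = `<script>window.__INITIAL_STATE__ = ${serializeState(state)};</script>`;
  // A replacer function is used so that `$` sequences in the state are not
  // interpreted as replacement patterns by String.prototype.replace.
  return html.replace("</head>", () => `${script}</head>`);
}

// Example: a feed item whose title contains markup is embedded safely.
const shell = "<html><head><title>Polymech</title></head><body></body></html>";
const page = injectInitialState(shell, { feed: [{ title: "</script><b>hi</b>" }] });
console.log(page.includes("</script><b>")); // false
```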

---

## Responsive Image Optimization

> **Source:** [db-pictures.ts](../server/src/products/serving/db/db-pictures.ts) · [html-generator.ts](../server/src/products/serving/pages/html-generator.ts)

Every image served through Polymech's SEO routes is automatically optimized:

- **Format negotiation**: Images are served in modern formats (AVIF, WebP) with JPEG fallback
- **Responsive srcsets**: Multiple size variants (320w, 640w, 1024w) are pre-generated and cached on disk
- **Aspect-ratio preservation**: Height is calculated from source metadata to prevent layout shift
- **LCP optimization**: The first image in any export gets `fetchpriority="high"`, subsequent images get `loading="lazy"`
- **Edge caching**: Optimized variants are served from `/api/images/cache/` after first generation

The XHTML exports use `<img>` tags with proper `loading` and `fetchpriority` attributes. The RSS and Merchant feeds use the image proxy URLs for optimized product images at 1200px width.
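
The srcset, aspect-ratio, and LCP rules above can be combined in a single `<img>` builder. The TypeScript below is a sketch under stated assumptions: the variant URL layout in `exampleVariantUrl` is invented for the example (the real proxy URLs are not documented here), and `imgTag`, `SourceImage`, and `VariantUrl` are hypothetical names. Only the three widths and the `fetchpriority` / `loading` behavior come from the text.

```typescript
// Width variants named in the text above.
const WIDTHS = [320, 640, 1024] as const;

interface SourceImage {
  src: string;
  width: number;  // intrinsic width from source metadata
  height: number; // intrinsic height from source metadata
  alt: string;
}

// Caller-supplied mapping from (source, width) to an optimized variant URL.
type VariantUrl = (src: string, width: number) => string;

// Minimal escaping for values placed inside double-quoted HTML attributes.
function escapeAttr(value: string): string {
  return value.replace(/&/g, "&amp;").replace(/"/g, "&quot;").replace(/</g, "&lt;");
}

// Build an <img> tag; `index` is the image's position within the export.
function imgTag(image: SourceImage, index: number, variantUrl: VariantUrl): string {
  const srcset = WIDTHS.map((w) => `${variantUrl(image.src, w)} ${w}w`).join(", ");
  const width = WIDTHS[WIDTHS.length - 1];
  // Height follows the source aspect ratio so the browser can reserve space.
  const height = Math.round((width * image.height) / image.width);
  // First image is the likely LCP element; the rest load lazily.
  const priority = index === 0 ? 'fetchpriority="high"' : 'loading="lazy"';
  return (
    `<img src="${escapeAttr(variantUrl(image.src, width))}" srcset="${escapeAttr(srcset)}" ` +
    `sizes="100vw" width="${width}" height="${height}" ${priority} alt="${escapeAttr(image.alt)}">`
  );
}

// Illustrative only: this path layout is made up for the example.
const exampleVariantUrl: VariantUrl = (src, w) =>
  `/api/images/cache/${w}/${encodeURIComponent(src)}`;
```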

---

## Internationalization (i18n)

Polymech's SEO features are fully i18n-aware, all the way down to the widget level.

> **Source:** [pages-i18n.ts](../server/src/products/serving/pages/pages-i18n.ts) · [db-i18n.ts](../server/src/products/serving/db/db-i18n.ts)

### How it works

1. **Widget-level translations**: Each widget in a page (markdown text, photo cards, tabs, etc.) can have its content translated to any language. Translations are stored per `widget_id` + `prop_path` + `target_lang`.

2. **Page meta translations**: Title and description can be translated using a special `__meta__` sentinel in the translations table.

3. **Feed translations**: The home feed widget in XHTML exports translates page titles and descriptions when a `?lang=xx` parameter is provided.

### Where i18n applies

| Feature | i18n Support |
|---------|-------------|
| XHTML page export | ✅ `?lang=de` translates all widget content, title, and description |
| XHTML rich HTML export | ✅ Feed items within home widgets are translated |
| HTML meta injection | ✅ Translated title/description used for OG tags |
| Markdown export | ✅ Widget content translated before Markdown conversion |
| Email export | ✅ Full widget translation applied before email rendering |
| RSS feed | Pages in feed use translated descriptions |
| Sitemap | URLs point to canonical (untranslated) versions |
| llms.txt | Currently English only (descriptions from source content) |

### Usage

Append `?lang=xx` to any page export URL:

```
/user/admin/pages/about.xhtml?lang=de      → German rich HTML
/user/admin/pages/about.md?lang=fr         → French Markdown
/user/admin/pages/about.email.html?lang=es → Spanish email
```

Translation management is handled through the platform's built-in glossary system and widget translation API, with AI-assisted translation support.

---

## Embeddable Content

Posts and pages can be embedded in external sites via iframe using the embed routes defined in [content.ts](../server/src/products/serving/content.ts):

```
/embed/:postId       → Embeddable post viewer
/embed/page/:pageId  → Embeddable page viewer
```

Embed pages are served with injected initial state (no API call needed on load) and include proper meta for social previews when the embed URL itself is shared.

---

## API-First Architecture

> **Source:** Route definitions → [routes.ts](../server/src/products/serving/routes.ts), Product registration → [index.ts](../server/src/products/serving/index.ts)

All SEO endpoints are part of the OpenAPI spec and documented at `/api/reference`. This means:

- Every route has proper request/response schemas
- Rate limiting and caching headers are standardized
- Third-party tools (Zapier, n8n, custom scripts) can programmatically access all content
- The API is browsable and testable through the interactive Scalar UI

### Relevant data endpoints

| Endpoint | Description |
|----------|-------------|
| `GET /api/posts/:id` | Full post data with pictures, responsive variants, and video job status |
| `GET /api/user-page/:identifier/:slug` | Full page data with content tree, profile, and metadata |
| `GET /api/feed` | Paginated feed with category filtering, sorting, and user-specific likes |
| `GET /api/profiles?ids=...` | Batch user profile lookup |
| `GET /api/media-items?ids=...` | Batch media item lookup with responsive image generation |
| `GET /api/serving/site-info?url=...` | Extract OG/JSON-LD metadata from any external URL → [site-info.ts](../server/src/products/serving/site-info.ts) |
| `GET /api/search?q=...` | Full-text search across posts and pages → [db-search.ts](../server/src/products/serving/db/db-search.ts) |

---

## Route Reference

### Content Exports

| Route | Method | Description |
|-------|--------|-------------|
| `/post/:id.xhtml` | GET | Post as standalone rich HTML |
| `/post/:id.pdf` | GET | Post as PDF |
| `/post/:id.md` | GET | Post as Markdown |
| `/post/:id.json` | GET | Post as JSON |
| `/user/:id/pages/:slug.xhtml` | GET | Page as standalone rich HTML |
| `/user/:id/pages/:slug.html` | GET | Page with OG meta injection |
| `/user/:id/pages/:slug.pdf` | GET | Page as PDF |
| `/user/:id/pages/:slug.md` | GET | Page as Markdown |
| `/user/:id/pages/:slug.json` | GET | Page as JSON |
| `/user/:id/pages/:slug.email.html` | GET | Page as email-optimized HTML |

### Discovery & Feeds

| Route | Method | Description |
|-------|--------|-------------|
| `/feed.xml` | GET | RSS 2.0 feed |
| `/products.xml` | GET | Google Merchant XML feed |
| `/sitemap-en.xml` | GET | XML Sitemap |
| `/llms.txt` | GET | LLM-readable site summary |
| `/llms.md` | GET | LLM summary (Markdown content-type) |
| `/api/reference` | GET | Interactive OpenAPI documentation |

### Meta Injection

| Route | Method | Description |
|-------|--------|-------------|
| `/` | GET | Home page with feed injection + WebSite/Organization JSON-LD |
| `/post/:id` | GET | Post page with OG/Twitter/JSON-LD injection |
| `/user/:id/pages/:slug` | GET | Page with OG/Twitter meta injection |
| `/embed/:id` | GET | Embeddable post with initial state |
| `/embed/page/:id` | GET | Embeddable page with initial state |

---

## Developer Experience

Polymech isn't just SEO-friendly for end users: it's built to be a joy for developers integrating with or extending the platform.

> **Source:** Server entry point → [index.ts](../server/src/products/serving/index.ts) · [routes.ts](../server/src/products/serving/routes.ts)

### OpenAPI 3.1 Specification: `/doc`

The entire API is described by a machine-readable OpenAPI 3.1 spec served at `/doc`. Every route, from feed endpoints to image uploads to page CRUD, is fully typed with Zod schemas that auto-generate the spec. No hand-written YAML, no drift between code and docs.

```
GET /doc → OpenAPI 3.1 JSON spec
```

This spec can be imported directly into Postman, Insomnia, or any OpenAPI-compatible tool for instant client generation.

### Swagger UI: `/ui`

Classic Swagger UI is available at `/ui` for developers who prefer the traditional interactive API explorer. It connects to the same live OpenAPI spec:

- Try-it-out for every endpoint
- Request/response schema visualization
- Bearer token authentication built in
- Auto-generated curl commands

### Scalar API Reference: `/reference` & `/api/reference`

[Scalar](https://scalar.com/) provides a modern, polished alternative to Swagger UI. Polymech serves it at both `/reference` and `/api/reference`:

- **Beautiful, searchable interface**: grouped by tag (Serving, Posts, Media, Storage, etc.)
- **Pre-authenticated**: Bearer token auto-filled from `SCALAR_AUTH_TOKEN` env var
- **Live request testing**: send requests directly from the browser with real responses
- **Code generation**: copy-paste ready snippets in curl, JavaScript, Python, Go, and more
- **Dark mode**: because of course

### Modular Product Architecture

The server is organized as a registry of **Products**, self-contained modules that each own their routes, handlers, workers, and lifecycle:

| Product | Description |
|---------|-------------|
| **Serving** | Content delivery, SEO, feeds, exports, meta injection |
| **Images** | Upload, optimization, proxy, responsive variant generation |
| **Videos** | Upload, transcoding (HLS), thumbnail extraction |
| **Email** | Page-to-email rendering, SMTP delivery, template management |
| **Storage** | Virtual file system with ACL, mounts, and glob queries |
| **OpenAI** | AI chat, image generation, markdown tools |
| **Analytics** | Request tracking, geo-lookup, real-time streaming |
| **Ecommerce** | Cart, checkout, payment integration |

Each product registers its own OpenAPI routes via `app.openapi(route, handler)`, so the spec always reflects exactly what's deployed. Adding a new product automatically exposes it in Swagger, Scalar, and `/doc`.

### Zod-Powered Schema Validation

All request and response schemas are defined with [Zod](https://zod.dev/) using `@hono/zod-openapi`. This gives you:

- **Runtime validation**: invalid requests are rejected with structured error messages before hitting business logic
- **Type safety**: TypeScript types are inferred from schemas, with no manual type definitions
- **Auto-docs**: Zod schemas feed directly into the OpenAPI spec with examples and descriptions
- **Composability**: shared schemas (e.g., pagination, media items) are reused across products

### Background Job Queue (PgBoss)

Long-running tasks (video transcoding, email sending, cache warming) are managed through [PgBoss](https://github.com/timgit/pg-boss), a PostgreSQL-backed job queue:

- Jobs can be submitted via the API: `POST /api/boss/job`
- Job status can be queried: `GET /api/boss/job/:id`
- Jobs can be cancelled, resumed, completed, or failed via dedicated endpoints
- Workers auto-register on startup and process jobs in the background

### Real-Time Log Streaming

System logs and analytics can be streamed in real time via SSE (Server-Sent Events):

```
GET /api/logs/system/stream  → Live system logs
GET /api/analytics/stream    → Live request analytics
```

This simplifies debugging in staging or production: open the stream in a browser tab or with curl.

### WebSocket Support

When `ENABLE_WEBSOCKETS=true`, the server initializes a WebSocket manager for real-time features like live feed updates and collaborative editing notifications.

### Security & Middleware Stack

> **Source:** [auth.ts](../server/src/middleware/auth.ts) · [analytics.ts](../server/src/middleware/analytics.ts) · [rateLimiter.ts](../server/src/middleware/rateLimiter.ts) · [blocklist.ts](../server/src/middleware/blocklist.ts)

The server applies a layered middleware stack to all routes (see [security.md](./security.md)):

| Layer | Description |
|-------|-------------|
| **CORS** | Fully permissive for API consumption from any origin |
| **Analytics** | Request tracking with IP resolution and geo-lookup |
| **Auth** | Optional JWT-based authentication via `Authorization: Bearer` header |
| **Admin** | Role-based access control for admin-only endpoints |
| **Compression** | Brotli/gzip compression on all responses |
| **Secure Headers** | CSP, X-Frame-Options (permissive for embeds), CORP disabled for cross-origin media |
| **Rate Limiting** | Configurable per-route rate limiting (disabled by default) |

---

## Client-Side SEO & Performance

The React SPA contributes to SEO through smart hydration, code splitting, and i18n support.

> **Source:** [App.tsx](../src/App.tsx) · [i18n.tsx](../src/i18n.tsx) · [formatDetection.ts](../src/utils/formatDetection.ts)

### HelmetProvider: Dynamic `<head>` Management

The app is wrapped in `react-helmet-async`'s `<HelmetProvider>`, enabling any component to dynamically inject `<title>`, `<meta>`, and `<link>` tags into the document head. This complements the server-side meta injection: the server provides OG/Twitter tags for crawlers, while Helmet handles client-side navigation.

### Route-Based Code Splitting

25+ routes use `React.lazy()` for on-demand loading, keeping the initial bundle small for faster First Contentful Paint:

- **Eagerly loaded** (in initial bundle): `Index`, `Auth`, `Profile`, `UserProfile`, `TagPage`, `SearchResults` (the high-traffic, SEO-critical pages)
- **Lazy loaded**: `Post`, `UserPage`, `Wizard`, `AdminPage`, all playground routes, `FileBrowser`, `Tetris`, ecommerce routes

This split keeps the initial load small for unauthenticated, view-only visitors (including crawlers).

### Initial State Hydration

The client reads `window.__INITIAL_STATE__` injected by the server (see [Server-Side Rendering](#server-side-rendering--initial-state-injection)) to avoid waterfall API calls on first load. This covers:

- `feed`: Home page feed data
- `siteHomePage`: Home page CMS content
- `profile`: User profile on `/user/:id` pages

### Client-Side i18n: Language Detection & `<T>` Component

> **Source:** [i18n.tsx](../src/i18n.tsx) · JSON translations in [src/i18n/*.json](../src/i18n/)

The `<T>` component wraps translatable strings and resolves them against per-language JSON dictionaries. Language is determined via a cascading priority chain:

1. **URL parameter** (`?lang=de`): highest priority, enables shareable translated links
2. **Cookie** (`lang=de`): persists across navigation, set when the URL parameter is used
3. **Browser language** (`navigator.languages`): automatic fallback

**Supported languages:** English, Français, Kiswahili, Deutsch, Español, Nederlands, 日本語, 한국어, Português, Русский, Türkçe, 中文

Translation dictionaries are loaded eagerly via Vite's `import.meta.glob` for instant availability. Missing keys auto-collect into localStorage for dictionary building (`downloadTranslations()` exports them as JSON).
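
The cascading language lookup described above can be sketched as a pure function, which keeps it testable outside the browser. This is an illustrative TypeScript sketch, not the code in `i18n.tsx`: the language codes in `SUPPORTED` are inferred from the language names listed above, and `resolveLanguage`, `normalize`, `readCookie`, and the `en` fallback are assumptions.

```typescript
// Codes inferred from the language names listed above (an assumption).
const SUPPORTED = ["en", "fr", "sw", "de", "es", "nl", "ja", "ko", "pt", "ru", "tr", "zh"] as const;
type Lang = (typeof SUPPORTED)[number];

// Reduce a tag such as "pt-BR" to its primary subtag and check that it is supported.
function normalize(tag: string | null | undefined): Lang | undefined {
  if (!tag) return undefined;
  const primary = tag.trim().toLowerCase().split("-")[0] ?? "";
  return (SUPPORTED as readonly string[]).includes(primary) ? (primary as Lang) : undefined;
}

// Read one cookie value from a `document.cookie`-style string.
function readCookie(cookieHeader: string, name: string): string | undefined {
  for (const part of cookieHeader.split(";")) {
    const [key, ...rest] = part.trim().split("=");
    if (key === name) return decodeURIComponent(rest.join("="));
  }
  return undefined;
}

// Priority: URL parameter, then cookie, then browser languages, then fallback.
function resolveLanguage(
  search: string,
  cookie: string,
  browserLanguages: readonly string[],
  fallback: Lang = "en",
): Lang {
  const fromUrl = normalize(new URLSearchParams(search).get("lang"));
  if (fromUrl) return fromUrl;

  const fromCookie = normalize(readCookie(cookie, "lang"));
  if (fromCookie) return fromCookie;

  for (const tag of browserLanguages) {
    const match = normalize(tag);
    if (match) return match;
  }
  return fallback;
}
```

In the browser this would be called as `resolveLanguage(window.location.search, document.cookie, navigator.languages)`. Passing the inputs explicitly avoids touching globals, so the same function can run on the server or in unit tests.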

### Format Detection

On app boot, `initFormatDetection()` probes browser support for modern image formats (AVIF, WebP). This tells the responsive image system which `<source>` elements to include in `<picture>` tags, so each browser downloads the smallest format it can decode, which helps Core Web Vitals.

---

## Summary

Polymech treats SEO as a core platform feature, not an afterthought. Every content entity is automatically:

- **Discoverable**: via sitemap, RSS, merchant feed, and LLM endpoints
- **Previewable**: with Open Graph, Twitter Cards, and JSON-LD for rich social sharing
- **Exportable**: in up to six formats (XHTML, HTML, PDF, Markdown, JSON, Email)
- **Translatable**: with widget-level i18n that flows through all export formats
- **Optimized**: with responsive images, lazy loading, LCP prioritization, and edge caching
- **Programmable**: with a full OpenAPI spec and interactive documentation

All of this works out of the box; no per-page configuration is needed.

---

## TODO: Pending Improvements

### Critical
|
|
|
|
- [x] **Canonical URLs** — Add `<link rel="canonical">` to all XHTML/HTML exports and SPA pages to prevent duplicate content penalties across `.xhtml`, `.html`, and SPA routes
|
|
- [ ] **robots.txt** — Serve a dynamic `robots.txt` at the root with sitemap references and crawl-delay directives. Currently missing entirely
|
|
- [x] **Hreflang tags** — Add `<link rel="alternate" hreflang="...">` tags to multi-language pages so search engines serve the correct language variant per region
|
|
- [x] **Meta description per page** — Pages and posts currently inherit a generic description. Wire the post `description` / page `meta.description` field into the `<meta name="description">` tag
|
|
|
|
### High Priority
|
|
|
|
- [x] **Structured data expansion** — Add `BreadcrumbList` schema for page navigation paths and `WebSite` schema with `SearchAction` for sitelinks search box
|
|
- [-] **Sitemap pagination** — Current sitemap is a single XML file. For large catalogs (1000+ products), split into sitemap index + per-entity sitemaps (`sitemap-posts.xml`, `sitemap-pages.xml`, `sitemap-products.xml`)
|
|
- [x] **Last-modified headers** — Set `Last-Modified` and `ETag` on all content routes (posts, pages, feeds) to support conditional requests and improve crawler efficiency
|
|
- [ ] **Dynamic OG images** — Auto-generate Open Graph images for pages/posts that don't have a cover image, using title + brand overlay
|
|
- [x] **JSON-LD for products** — Add `Product` schema with `offers`, `aggregateRating`, and `brand` to product pages for rich shopping results
|
|
|
|
### Medium Priority

- [-] **AMP pages** — Generate AMP-compliant HTML exports for posts to enable the AMP carousel in Google mobile search
- [ ] **RSS per-user feeds** — Currently there is only a global `/feed.xml`. Add per-user feeds at `/user/:id/feed.xml` so readers can subscribe to individual creators
- [ ] **Merchant feed i18n** — The product feed currently exports only in the default language. Generate per-locale feeds (`/products-de.xml`, `/products-fr.xml`) using the i18n translation system
- [ ] **Preconnect / DNS-prefetch hints** — Add `<link rel="preconnect">` for known external domains (CDN, image proxy, analytics) in the SPA shell
- [ ] **llms.txt expansion** — The current `llms.txt` covers only posts. Extend it to include pages, products, and user profiles for broader AI agent discovery → [content.ts](../server/src/products/serving/content.ts)
- [ ] **WebSub / PubSubHubbub** — Add `<link rel="hub">` to RSS feeds and implement WebSub pings on content publish for real-time feed reader updates

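For the WebSub item, a rough sketch of the two pieces involved: the discovery links placed in the feed's `<channel>`, and the form body of a publish ping. The function names are illustrative. The `hub.mode=publish` form is the convention used by public PubSubHubbub-style hubs rather than something the WebSub recommendation mandates, so check what the chosen hub expects.

```typescript
// Escape text for use in XML attribute values or element content.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Feed-level discovery links. Subscribers read rel="hub" and rel="self".
// The feed's root element must declare xmlns:atom="http://www.w3.org/2005/Atom".
function buildHubLinks(hubUrl: string, feedUrl: string): string {
  return [
    `<atom:link rel="hub" href="${escapeXml(hubUrl)}" />`,
    `<atom:link rel="self" href="${escapeXml(feedUrl)}" type="application/rss+xml" />`,
  ].join("\n");
}

interface PublishPing {
  contentType: string;
  body: string;
}

// Body of the POST sent to the hub after new content is published.
// Assumption: the hub accepts the PubSubHubbub-style publish notification.
function buildPublishPing(feedUrl: string): PublishPing {
  return {
    contentType: "application/x-www-form-urlencoded",
    body: `hub.mode=publish&hub.url=${encodeURIComponent(feedUrl)}`,
  };
}
```

The ping only tells the hub that the feed changed. The hub then fetches the feed itself and fans the update out to subscribers.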
### Low Priority / Nice-to-Have

- [ ] **Core Web Vitals monitoring** — Integrate the CrUX API or the web-vitals library to track LCP, INP (which replaced FID as a Core Web Vital in March 2024), and CLS, and surface the results in the analytics dashboard
- [ ] **Schema.org FAQ / HowTo** — Auto-detect FAQ-style and tutorial page content and inject corresponding structured data
- [ ] **Twitter Cards validation** — Add `twitter:site` and `twitter:creator` meta tags from user profiles for proper attribution
- [ ] **Video schema** — Add `VideoObject` JSON-LD for posts containing video media items
- [ ] **IndexNow** — Implement IndexNow API pings to Bing/Yandex on content publish for near-instant indexing

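As an illustration of the IndexNow item, the sketch below builds the JSON body for a bulk submission. The helper names are hypothetical, and the key file is assumed to live at the site root as `<key>.txt`. The payload would be sent as a `POST` with `Content-Type: application/json; charset=utf-8` to an IndexNow endpoint such as `https://api.indexnow.org/indexnow`.

```typescript
interface IndexNowPayload {
  host: string;
  key: string;
  keyLocation: string;
  urlList: string[];
}

// The protocol accepts up to 10,000 URLs per request.
const INDEXNOW_MAX_URLS = 10000;

// Extract the lowercased host from an absolute http(s) URL, or null if it has none.
function hostOf(url: string): string | null {
  const match = /^https?:\/\/([^\/?#]+)/i.exec(url);
  const host = match ? match[1] : undefined;
  return host ? host.toLowerCase() : null;
}

// Keep only URLs on the submitting host, drop duplicates, and cap the list.
function buildIndexNowPayload(host: string, key: string, urls: string[]): IndexNowPayload {
  const wanted = host.toLowerCase();
  const urlList = urls
    .filter((url) => hostOf(url) === wanted)
    .filter((url, index, all) => all.indexOf(url) === index)
    .slice(0, INDEXNOW_MAX_URLS);
  return {
    host: wanted,
    key,
    // Assumption: the key file is served from the site root.
    keyLocation: `https://${wanted}/${key}.txt`,
    urlList,
  };
}
```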
---

### AEO — Answer Engine Optimization

Optimize content to be **cited as direct answers** by AI answer engines (Google AI Overviews, Bing Copilot, Perplexity, ChatGPT).

- [ ] **Answer-first content blocks** — In XHTML/HTML exports, structure pages with concise 40 to 60 word answer summaries at the top of each section, before the detailed explanation. AI engines extract individual passages, so clarity wins
- [ ] **FAQPage schema injection** — Auto-detect Q&A patterns in page widgets (heading + paragraph pairs) and inject `FAQPage` JSON-LD. This is the #1 schema type cited by answer engines
- [ ] **QAPage schema for posts** — When a post title is phrased as a question, wrap the body in `QAPage` structured data with `acceptedAnswer`
- [ ] **Text fragment identifiers** — Add `#:~:text=` fragment links in sitemaps and llms.txt to guide AI engines to the most relevant passage in long-form pages
- [ ] **Featured snippet optimization** — Ensure XHTML exports use `<table>`, `<ol>`, and `<dl>` for comparison content, definitions, and step-by-step guides. These are the formats Google AI Overviews pulls from
- [ ] **Concise `<meta name="description">` per section** — For long pages with multiple sections, consider generating per-section meta descriptions via anchor-targeted structured data

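The `FAQPage` injection idea can be sketched as below. The `ContentBlock` shape is a hypothetical flattening of page widgets, and the detection rule (a heading ending in a question mark followed directly by a paragraph) is one simple heuristic among many, not the platform's detector.

```typescript
// Hypothetical flattened view of a page's widgets.
interface ContentBlock {
  kind: "heading" | "paragraph";
  text: string;
}

interface FaqPair {
  question: string;
  answer: string;
}

// Pair each heading that ends with "?" with the paragraph that follows it.
function extractFaqPairs(blocks: ContentBlock[]): FaqPair[] {
  const pairs: FaqPair[] = [];
  for (let i = 0; i + 1 < blocks.length; i++) {
    const current = blocks[i];
    const next = blocks[i + 1];
    if (current === undefined || next === undefined) continue;
    if (current.kind === "heading" && next.kind === "paragraph" && /\?\s*$/.test(current.text)) {
      pairs.push({ question: current.text.trim(), answer: next.text.trim() });
    }
  }
  return pairs;
}

// Serialize the pairs as schema.org FAQPage JSON-LD, or null when nothing was found.
function buildFaqPageJsonLd(pairs: FaqPair[]): string | null {
  if (pairs.length === 0) return null;
  const doc = {
    "@context": "https://schema.org",
    "@type": "FAQPage",
    mainEntity: pairs.map((pair) => ({
      "@type": "Question",
      name: pair.question,
      acceptedAnswer: { "@type": "Answer", text: pair.answer },
    })),
  };
  // Escape "<" so the JSON is safe inside a <script type="application/ld+json"> tag.
  return JSON.stringify(doc).replace(/</g, "\\u003c");
}
```

Returning `null` for pages without question-style headings keeps the exporter from emitting an empty `FAQPage`, which would be invalid markup.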
### GEO — Generative Engine Optimization

Optimize content to be **referenced and summarized** by generative AI systems (ChatGPT, Gemini, Claude, Perplexity).

- [ ] **Entity authority via JSON-LD** — Add `Organization`, `Person`, and `WebSite` schema with consistent `@id` URIs across all pages. AI models use entity graphs to determine source authority
- [ ] **E-E-A-T signals** — Inject `author` schema with credentials, link to author profile pages, and add `datePublished` / `dateModified` to all content. Generative engines weight experience and freshness
- [ ] **Comparison and "X vs Y" pages** — Create comparison page templates that AI systems frequently pull from when users ask evaluative questions
- [ ] **Fact-dense content markers** — Add `ClaimReview` or `Dataset` schema where applicable. AI models prioritize statistically backed and verifiable claims
- [ ] **Citation-optimized exports** — In Markdown and JSON exports, include `source_url`, `author`, `published_date`, and `license` fields so AI systems can properly attribute when citing
- [ ] **AI Share of Voice tracking** — Track brand mentions across ChatGPT, Perplexity, and Google AI Overviews to measure GEO effectiveness. Consider building an internal monitoring endpoint or integrating third-party tools

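One way to read "consistent `@id` URIs" is a single `@graph` in which every node is referenced by a stable identifier. The sketch below assumes fragment-style ids (`/#organization`, `/#website`) and a hypothetical `ArticleInput` shape. Both are conventions chosen for illustration, not requirements of schema.org.

```typescript
// Hypothetical input describing one piece of content.
interface ArticleInput {
  url: string;
  headline: string;
  authorName: string;
  authorProfileUrl: string;
  datePublished: string; // ISO 8601
  dateModified: string; // ISO 8601
}

type JsonLdNode = { "@type": string; "@id": string; [key: string]: unknown };

interface JsonLdGraph {
  "@context": string;
  "@graph": JsonLdNode[];
}

// Build Organization, WebSite, Person, and Article nodes that point at each
// other by @id, so the same entity resolves to the same URI on every page.
function buildEntityGraph(origin: string, siteName: string, article: ArticleInput): JsonLdGraph {
  const orgId = `${origin}/#organization`;
  const siteId = `${origin}/#website`;
  const personId = `${article.authorProfileUrl}#person`;
  return {
    "@context": "https://schema.org",
    "@graph": [
      { "@type": "Organization", "@id": orgId, name: siteName, url: origin },
      { "@type": "WebSite", "@id": siteId, name: siteName, url: origin, publisher: { "@id": orgId } },
      { "@type": "Person", "@id": personId, name: article.authorName, url: article.authorProfileUrl },
      {
        "@type": "Article",
        "@id": `${article.url}#article`,
        headline: article.headline,
        url: article.url,
        author: { "@id": personId },
        publisher: { "@id": orgId },
        isPartOf: { "@id": siteId },
        datePublished: article.datePublished,
        dateModified: article.dateModified,
      },
    ],
  };
}
```

Because the ids depend only on the origin and the author profile URL, two different pages produce identical `Organization` and `Person` nodes, which is the property the roadmap item is after.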
### AI Crawler Management

Control and optimize how AI training bots and inference crawlers interact with the platform.

- [ ] **Dynamic `robots.txt` with AI directives** — Serve a `robots.txt` that explicitly manages AI crawlers: allow `GPTBot`, `ClaudeBot`, `PerplexityBot` on content routes, but disallow on admin/API routes. Consider `Google-Extended` for training opt-in/out
- [ ] **`llms.txt` v2** — Expand the current `llms.txt` beyond posts to include pages with summaries, a product catalog overview, author profiles, and a structured capability description. Follow the emerging llms.txt spec with Markdown formatting
- [ ] **`llms-full.txt`** — Generate a comprehensive full-content version at `/llms-full.txt` with all page content flattened into Markdown for deep AI ingestion
- [ ] **AI crawler rate limiting** — Apply custom rate limits for known AI user agents (`GPTBot`, `ClaudeBot`, `CCBot`, `PerplexityBot`) to prevent content scraping from overloading the server while still allowing indexing
- [ ] **AI access analytics** — Track and surface AI bot traffic separately in the analytics dashboard: which bots, how often, which routes, and bandwidth consumed. Use the existing user-agent parsing in [analytics.ts](../server/src/middleware/analytics.ts)
- [ ] **Structured content API for AI** — Create a dedicated `/api/content` endpoint that returns semantically structured content (title, sections, facts, entities) optimized for LLM consumption, distinct from the user-facing API
- [ ] **IETF AI Preferences compliance** — Monitor the IETF "AI Preferences Working Group" (launched 2025) for the standardized machine-readable AI access rules spec. Implement it once finalized, since it will likely supersede or extend `robots.txt` for AI

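A possible shape for the dynamic `robots.txt` generator is sketched below. The option names and the `/admin/` and `/api/` paths in the usage note are placeholders, not the platform's real route prefixes. Note that a crawler matching a named group ignores the `*` group, so the private paths are repeated for the AI bots.

```typescript
// Hypothetical options for the generator.
interface RobotsOptions {
  aiBots: string[]; // user-agent tokens to allow on content routes
  privatePaths: string[]; // path prefixes no crawler should fetch
  blockTraining: boolean; // opt out of Google-Extended when true
  sitemapUrl: string;
}

function buildRobotsTxt(opts: RobotsOptions): string {
  const disallow = opts.privatePaths.map((path) => `Disallow: ${path}`);
  const groups: string[][] = [];

  // Default group for every crawler without a more specific match.
  groups.push(["User-agent: *"].concat(disallow));

  // One shared group for the listed AI crawlers. The longest matching rule
  // wins, so the Disallow lines still override "Allow: /" for private paths.
  if (opts.aiBots.length > 0) {
    const agents = opts.aiBots.map((bot) => `User-agent: ${bot}`);
    groups.push(agents.concat(["Allow: /"], disallow));
  }

  // Google-Extended is a control token for AI training use, not a separate crawler.
  if (opts.blockTraining) {
    groups.push(["User-agent: Google-Extended", "Disallow: /"]);
  }

  groups.push([`Sitemap: ${opts.sitemapUrl}`]);
  return groups.map((group) => group.join("\n")).join("\n\n") + "\n";
}
```

Calling it with `aiBots: ["GPTBot", "ClaudeBot", "PerplexityBot"]` and `privatePaths: ["/admin/", "/api/"]` yields a default group, one shared AI group, an optional `Google-Extended` group, and the sitemap line, each separated by a blank line.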
### AI-Native Content Formats

- [ ] **Markdown-first content pipeline** — Ensure all page widgets can export clean, semantic Markdown. This is the preferred format for LLM ingestion and is used by `llms.txt`, `llms-full.txt`, and AI-friendly feeds
- [ ] **Structured knowledge base export** — Generate a `/knowledge.json` endpoint that exports the entire content catalog as a structured knowledge graph (entities, relationships, facts) for RAG pipelines and enterprise AI integrations
- [ ] **MCP (Model Context Protocol) server** — Expose platform content as an MCP resource so AI assistants (Claude, Cursor, etc.) can directly query posts, pages, and products as context, using the existing REST API as the backend
- [ ] **AI-friendly RSS** — Extend RSS feed items with full content (not just excerpts), structured metadata, and `<media:content>` tags so AI feed consumers get complete context without needing to crawl

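The AI-friendly RSS item could look roughly like the output of the sketch below. The `FeedItem` shape is hypothetical. The `content:encoded` and `media:content` elements require the feed root to declare `xmlns:content="http://purl.org/rss/1.0/modules/content/"` and `xmlns:media="http://search.yahoo.com/mrss/"`.

```typescript
// Hypothetical feed item shape; field names are illustrative.
interface FeedItem {
  title: string;
  link: string;
  publishedAt: Date;
  summary: string;
  contentHtml: string;
  imageUrl?: string;
}

// Escape text for XML element content and attribute values.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Wrap HTML in CDATA, splitting any "]]>" so it cannot close the section early.
function cdata(value: string): string {
  return `<![CDATA[${value.replace(/\]\]>/g, "]]]]><![CDATA[>")}]]>`;
}

// Render one <item> carrying both the short summary and the full content.
function buildRssItem(item: FeedItem): string {
  const lines = [
    "<item>",
    `  <title>${escapeXml(item.title)}</title>`,
    `  <link>${escapeXml(item.link)}</link>`,
    `  <guid isPermaLink="true">${escapeXml(item.link)}</guid>`,
    `  <pubDate>${item.publishedAt.toUTCString()}</pubDate>`,
    `  <description>${escapeXml(item.summary)}</description>`,
    `  <content:encoded>${cdata(item.contentHtml)}</content:encoded>`,
  ];
  if (item.imageUrl !== undefined) {
    lines.push(`  <media:content url="${escapeXml(item.imageUrl)}" medium="image" />`);
  }
  lines.push("</item>");
  return lines.join("\n");
}
```

Keeping `<description>` as the excerpt and putting the full body in `<content:encoded>` lets ordinary feed readers show a summary while AI consumers get the complete text without a second request.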