mono/packages/media/cpp/README.md
2026-04-16 12:42:59 +02:00

pm-image (C++)

CMake-based pm-image binary: CLI, HTTP REST (serve), and line-delimited JSON IPC (ipc). Image processing uses libvips — the same engine as Sharp (Node.js), exposed with a similar option model. Optional .env is loaded from the working directory (laserpants/dotenv-cpp).

Prerequisites

| Requirement | Notes |
| --- | --- |
| CMake | ≥ 3.20 |
| C++ compiler | C++17 |
| libvips | Required: pkg-config (Unix) or third_party/vips-dev-* / VIPS_ROOT (Windows) |
| Git | For FetchContent (CLI11, Asio, httplib, json, libcurl, dotenv, p-ranav/glob, PicoSHA2 for cache keys) |
| Node.js | Optional: npm run test:media (Node 18+) |

Installing libvips

Debian / Ubuntu

sudo apt install libvips-dev pkg-config

macOS (Homebrew)

brew install vips pkg-config

Windows (official dev bundle — recommended)

npm run setup:vips

Downloads build-win64-mxe vips-dev-x64-all into third_party/vips-dev-*. CMake adds that path automatically; DLLs are copied to dist/ on link.

Pin the version with MEDIA_VIPS_VERSION (default 8.18.2) if needed.

Alternatively set VIPS_ROOT or CMAKE_PREFIX_PATH to a tree with include/vips/vips.h and lib/libvips.lib.

Build

cd packages/media/cpp
cmake --preset release
cmake --build --preset release

Binary: dist/pm-image (.exe on Windows).

Windows installer (NSIS)

From packages/media/cpp, after a release build:

npm run build:installer

Produces dist/pm-image-Setup.exe. The installer copies pm-image.exe, the libvips DLLs, vips-modules-8.18, and scripts/explorer-resize.ps1, then prepends the install directory to the user PATH. The uninstaller is registered under Add/Remove Programs.

Explorer context menu (pm-image register-explorer)

Implemented in C++ (Windows registry). The NSIS installer runs pm-image.exe register-explorer after copying files; uninstall runs register-explorer --unregister --no-refresh-shell.

pm-image register-explorer
pm-image register-explorer --dry
pm-image register-explorer --unregister

Registers a PM-Media cascading menu on image extensions (including .avif, .arw, .webp, TIFF, etc.), on folders, and on the folder background (empty area). Resize presets (default widths 1980, 1200, 800, 400): in place or copy as ${SRC_NAME}_${width}${SRC_EXT}. Convert to JPG writes ${SRC_NAME}_converted.jpg via explorer-convert.ps1. Override widths with --widths 1920,800.

Under SystemFileAssociations, each extension is registered as .ext, .EXT, and .Ext so Explorer picks up the menu regardless of filename casing.
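
As a rough illustration of the three casings mentioned above, the sketch below derives the lower-case, upper-case, and capitalised variants of one extension. The function name extensionCaseVariants is hypothetical, and the sketch assumes the extension starts with a dot; the actual registry code in pm-image may be structured differently.

```cpp
#include <cctype>
#include <string>
#include <vector>

// Hypothetical helper: returns {".ext", ".EXT", ".Ext"} for a dotted extension.
// Assumes `ext` begins with '.', e.g. ".avif".
std::vector<std::string> extensionCaseVariants(const std::string& ext) {
    std::string lower = ext;
    std::string upper = ext;
    for (char& c : lower) {
        c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
    }
    for (char& c : upper) {
        c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
    }
    // Capitalised variant: only the first letter after the dot is upper-case.
    std::string title = lower;
    if (title.size() > 1) {
        title[1] = upper[1];
    }
    return {lower, upper, title};
}
```

Each returned string would then be used as one key name under SystemFileAssociations.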

Defaults: --media-bin = this executable; --explorer-script / --explorer-convert-script / --explorer-ui-script = <exe_dir>\scripts\explorer-*.ps1, or packages/media/scripts\… from a dev cpp\dist build.

pm-media register-explorer forwards argv to pm-image.exe for convenience.

Context menus run powershell.exe -WindowStyle Hidden; re-run pm-image register-explorer after updates so registry command lines refresh.

Formats — same idea as Sharp / libvips

Sharp wraps libvips: decode → process → encode. We do the same with vips_image_new_from_file and format-specific savers.

| | Supported |
| --- | --- |
| Resize / geometry | fit, dimensions, crop (cover + position), letterbox (contain + background), rotate, flip, flop, EXIF autorotate |
| Output (first-class in code) | JPEG (Q, strip), PNG (compression), WebP (Q, strip), TIFF |
| AVIF / HEIC | Via format / file extension and quality (libvips HEIF/AVIF saver; needs libheif in your libvips build) |
| Anything else libvips knows | Fallback: vips_image_write_to_file from extension (e.g. GIF, JP2K, … depending on how libvips was built) |

Input types match whatever your libvips build can load (the Windows vips-dev-x64-all bundle includes broad loader support). Set output format with the output path extension or JSON / CLI format (webp, avif, jpg, …).

Sharp-like options (resize / JSON)

| Sharp concept | pm-image / JSON field | Notes |
| --- | --- | --- |
| resize.fit | fit | inside, cover, contain, fill, outside |
| resize.position | position | centre, attention, entropy, … → libvips interesting |
| resize.kernel | kernel | nearest, cubic, mitchell, lanczos2, lanczos3 (default) |
| jpeg / webp / … quality | quality | |
| png compression | png_compression | 0-9 |
| withoutEnlargement | without_enlargement | Default true; CLI --allow-enlargement flips it |
| EXIF orientation | autorotate | Default true; CLI --no-autorotate |
| Strip metadata | strip_metadata | Default true; CLI --no-strip |
| rotate | rotate | 0, 90, 180, 270 (after autorotate) |
| flip / flop | flip / flop | |
| Letterbox | background | #rrggbb for contain |

Windows: pm-image resize --ui opens a native Win32 dialog to choose input/output paths, max dimensions, fit mode, quality, and enlargement / autorotate / strip options. Other CLI flags seed the dialog; if you cancel, the command exits without processing. pm-image resize --ui-next opens the newer Win32++ ribbon UI with drag-drop queue and settings panel; optional --src or positional input paths pre-seed the queue.

REST POST /v1/resize accepts application/json (same keys as the table, plus “Batch paths & cache” below) or multipart/form-data (upload a file; response is the image bytes — see serve). IPC uses the same JSON keys as the table, plus the batch/cache fields.


Batch paths & cache: globs, variables, caching

Use this when one resize invocation (CLI, or a single REST/IPC request) should match many inputs or build per-file output paths. See also docs/Examples.md (TypeScript / Sharp reference).

Summary

| Topic | What it does |
| --- | --- |
| Input glob | * / ? / ** in input expand to a list of files (p-ranav/glob). |
| HTTP(S) URL | input may be http:// or https://; the image is fetched with libcurl (follows redirects; --url-timeout default 5 s; --url-max-redirects default 20). Cache key is URL + options (no local mtime). |
| Omit output (CLI) | If the input resolves to exactly one file or URL, you may omit the second positional argument: the file is written to the current working directory using a sanitized basename (same rules as sanitizeFilename in packages/acl: illegal/control chars, Windows reserved names, trailing dots/spaces, 255-byte UTF-8 cap). URLs without a path extension default to .jpg. --format overrides the output extension when set. |
| Destination variables | output may contain ${SRC_DIR}, ${SRC_NAME}, ${SRC_FILE_EXT} (or &{…}), expanded per matched input. |
| expand_glob | JSON false: treat input / output as literal paths (no glob expansion). Templates still apply if output contains ${SRC_ / &{SRC_. |
| Output cache | SHA-256 key from input path + size + mtime + options; default dir <cwd>/cache/images/. |

Not supported in C++: Bash extglob (e.g. *.+(jpg)). Use *.jpg, **/*.jpg, or separate runs. Bare {SRC_NAME} without $ / & is not a placeholder — use ${SRC_NAME} or &{SRC_NAME}.

Input globs

  • Syntax: *, ?, and ** for recursion. Paths are resolved from the current working directory unless absolute.
  • Multiple files → directory output: output must be an existing directory, or a new directory given with a trailing / or \ (parent dirs are created).
  • Single file from a literal path or a glob that matches one file: output can be a full file path, or a directory (trailing sep) to keep the original filename.
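
The output rules in the list above can be sketched as a small path-resolution function. This is only an illustration under stated assumptions: resolveOutput and hasTrailingSeparator are hypothetical names, and the real implementation may validate or create directories differently.

```cpp
#include <cstddef>
#include <filesystem>
#include <optional>
#include <string>
#include <system_error>

namespace fs = std::filesystem;

// True when the string ends in '/' or '\', i.e. it names a directory.
bool hasTrailingSeparator(const std::string& s) {
    return !s.empty() && (s.back() == '/' || s.back() == '\\');
}

// Hypothetical sketch: picks the output path for one matched input.
// Returns std::nullopt when several inputs are mapped to a plain file path.
std::optional<fs::path> resolveOutput(const std::string& output,
                                      const fs::path& input,
                                      std::size_t matchCount) {
    std::error_code ec;
    if (hasTrailingSeparator(output) || fs::is_directory(output, ec)) {
        // Directory output: keep the original filename.
        std::string dir = output;
        while (dir.size() > 1 && hasTrailingSeparator(dir)) {
            dir.pop_back();
        }
        return fs::path(dir) / input.filename();
    }
    if (matchCount > 1) {
        return std::nullopt;  // many inputs need a directory destination
    }
    return fs::path(output);  // single input: full file path
}
```

For example, resolveOutput("./out/", "/data/a.jpg", 3) keeps the name a.jpg under ./out, while a plain file path combined with a multi-match glob is rejected.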

CLI: use positional input/output or --src / --dst together. You can omit output only when input is a single path or URL (not a multi-match glob).

pm-image resize './photos/**/*.jpg' ./out/
pm-image resize --src './shots/*.png' --dst ./thumbs/
# URL → ./200.jpg under cwd (picsum path segment "200", default extension .jpg)
pm-image resize 'https://picsum.photos/200' --max-width 400 --url-timeout 30
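
To make the URL case concrete, here is a simplified sketch of how a default output name could be derived from a URL. It only covers the last path segment, the .jpg default, and the --format override; it does not reproduce the sanitizeFilename rules. The function name defaultOutputNameForUrl and the "download" fallback for URLs without a path segment are assumptions made for illustration.

```cpp
#include <string>

// Hypothetical sketch: last URL path segment, ".jpg" when it has no extension,
// and an optional format override (e.g. "webp") replacing the extension.
std::string defaultOutputNameForUrl(const std::string& url,
                                    const std::string& formatOverride = "") {
    // Drop the query string and fragment.
    const std::string path = url.substr(0, url.find_first_of("?#"));

    // Skip "scheme://host" and take whatever follows the last '/'.
    const std::size_t scheme = path.find("://");
    const std::size_t hostStart = scheme == std::string::npos ? 0 : scheme + 3;
    const std::size_t firstSlash = path.find('/', hostStart);
    std::string last = firstSlash == std::string::npos
                           ? std::string()
                           : path.substr(path.find_last_of('/') + 1);
    if (last.empty()) {
        last = "download";  // assumption: fallback name, not taken from pm-image
    }

    const std::size_t dot = last.find_last_of('.');
    const bool hasExt = dot != std::string::npos && dot != 0;
    const std::string stem = hasExt ? last.substr(0, dot) : last;
    std::string ext = hasExt ? last.substr(dot) : ".jpg";
    if (!formatOverride.empty()) {
        ext = "." + formatOverride;
    }
    return stem + ext;
}
```

With the picsum URL from the example above, this yields 200.jpg, matching the comment in the command listing.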

JSON: same strings in "input" and "output". When expand_glob is true (default), glob expansion runs when the pattern contains *, ?, or **.

Destination variables (${SRC_*} / &{SRC_*})

Placeholders are expanded after inputs are resolved (glob or single file). Each output path is built from the absolute input file for that row.

| Placeholder | Meaning |
| --- | --- |
| SRC_DIR | Parent directory of the current input (generic path, / separators). |
| SRC_NAME | Filename stem without extension (photo for photo.JPG). |
| SRC_FILE_EXT | Extension with leading dot (e.g. .jpg), or empty if none. |

Use cases: write beside each source (${SRC_DIR}/out/${SRC_NAME}.webp), suffix stems (${SRC_NAME}_thumb.jpg), or change extension via format / path.

pm-image resize --src ./photo.jpg --dst '${SRC_DIR}/${SRC_NAME}_medium.jpg' --max-width 800
pm-image resize --src './shots/*.jpg' --dst '${SRC_DIR}/${SRC_NAME}.webp' --max-width 1920
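
The placeholder expansion described in this section can be sketched with std::filesystem. This is a minimal illustration, not the project's code: expandDestination is a hypothetical name, the input is assumed to be an absolute path already, and unknown or bare {NAME} tokens are copied through unchanged.

```cpp
#include <cstddef>
#include <filesystem>
#include <map>
#include <string>

namespace fs = std::filesystem;

// Hypothetical sketch: expands ${SRC_DIR}, ${SRC_NAME}, ${SRC_FILE_EXT}
// (and the &{...} spelling) for one absolute input path.
std::string expandDestination(const std::string& tmpl, const fs::path& input) {
    const std::map<std::string, std::string> vars = {
        {"SRC_DIR", input.parent_path().generic_string()},
        {"SRC_NAME", input.stem().string()},
        {"SRC_FILE_EXT", input.extension().string()},
    };
    std::string out;
    std::size_t i = 0;
    while (i < tmpl.size()) {
        // A placeholder starts with "${" or "&{" and ends at the next '}'.
        const bool opener = (tmpl[i] == '$' || tmpl[i] == '&') &&
                            i + 1 < tmpl.size() && tmpl[i + 1] == '{';
        const std::size_t close =
            opener ? tmpl.find('}', i + 2) : std::string::npos;
        if (close != std::string::npos) {
            const auto it = vars.find(tmpl.substr(i + 2, close - i - 2));
            if (it != vars.end()) {
                out += it->second;
                i = close + 1;
                continue;
            }
        }
        out += tmpl[i++];  // literal character or unknown placeholder
    }
    return out;
}
```

Calling expandDestination("${SRC_DIR}/out/${SRC_NAME}.webp", "/data/shots/photo.JPG") gives /data/shots/out/photo.webp, which is the "write beside each source" use case.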

REST / IPC — same strings in JSON (escape quotes in shell as needed):

curl -s -X POST http://127.0.0.1:8080/v1/resize \
  -H 'Content-Type: application/json' \
  -d '{"input":"/data/in.png","output":"/out/${SRC_NAME}_thumb.webp","max_width":256}'

Responses: {"ok":true} for a single output; if more than one file is produced, {"ok":true,"count":N,"outputs":["..."]}.

JSON reference (batch + cache)

| Field | Type | Default | Purpose |
| --- | --- | --- | --- |
| input | string | required | Source path, glob, or HTTP(S) URL. |
| output | string | required | File path, directory, or template with ${SRC_*} / &{SRC_*}. |
| expand_glob | bool | true | If false, no glob expansion; paths are literal. |
| cache | bool | true (or server default from serve / ipc flags) | Enable/disable cache for this request. |
| cache_dir | string | empty → <cwd>/cache/images (or server --cache-dir) | Root directory for cached blobs. |
| url_timeout_sec | int | 5 | Total + connect timeout for HTTP(S) fetch (seconds; 0 = libcurl default). |
| url_max_redirects | int | 20 | Max redirects when fetching URLs. |

All resize options (max_width, fit, …) participate in the same JSON body.

REST and IPC JSON requests always require "output" (there is no automatic output path). Multipart upload does not use input / output paths.

Output cache

  • Default: caching is on; root dir cache/images under the process current working directory (override with --cache-dir or JSON cache_dir).
  • Key: SHA-256 (PicoSHA2) over canonical input path, file size, modification time, and a stable encoding of all resize options — change any of these and you get a miss.
  • Storage: <cache_dir>/XX/<hex> (two-letter shard).
  • Hit: copy cached bytes to output; libvips is not initialized for that job.
  • Miss: run resize, then best-effort store into cache (failure to store does not fail the request).

CLI (resize): --no-cache, --cache-dir <path>.

serve / ipc: same flags set defaults for requests that omit cache / cache_dir. Per-request JSON can still set "cache": true or "cache_dir": "/path" to override.

Batch + cache: glob batches run sequentially (one file after another); each file may hit or miss the cache independently.

serve example with cache and glob

pm-image serve --host 127.0.0.1 -p 8080 --cache-dir /var/cache/pm-image
curl -s -X POST http://127.0.0.1:8080/v1/resize \
  -H 'Content-Type: application/json' \
  -d '{"input":"/data/in/*.jpg","output":"/data/out/","max_width":400,"cache":true,"cache_dir":"/var/cache/pm-image"}'

Concurrency

  • HTTP serve: cpp-httplib default thread pool (CPPHTTPLIB_THREAD_POOL_COUNT — see upstream httplib.h).
  • libvips: processing is thread-safe per image; configure process-wide concurrency with VIPS_CONCURRENCY (or vips_concurrency_set in code later if needed).

CLI examples

Paths below use Unix style; on Windows run pm-image.exe and use .\ or full paths as needed.

Help and version

pm-image --help
pm-image resize --help
pm-image -v

resize — fit inside a box (default), write WebP / AVIF by extension

# Max 800×600, stay inside the box, Lanczos3 (default), write JPEG quality 85 (default)
pm-image resize photo.jpg out.jpg --max-width 800 --max-height 600

# Same, explicit quality
pm-image resize photo.jpg out.jpg --max-width 800 --max-height 600 -q 90

# WebP output (quality applies)
pm-image resize photo.jpg thumb.webp --max-width 400 --max-height 400 -q 82

# AVIF output (quality applies; needs HEIF/AVIF support in your libvips build)
pm-image resize photo.png out.avif --max-width 1200 --max-height 1200 -q 50

# Force output format when the path has no extension you trust
pm-image resize in.tif /tmp/out --format webp --max-width 512

resize — square images (1:1)

Use the same --max-width and --max-height (that value is the square side in pixels). Pick --fit:

| fit | Result |
| --- | --- |
| cover | Fills the square; crops overflow (default crop: --position centre, or attention / entropy for smart crop). |
| contain | Full image inside the square; letterboxing on two sides if needed (--background). |
| fill | Stretches to the square (ignores aspect ratio). |

# 512×512 crop-to-square (avatars, thumbnails)
pm-image resize portrait.jpg avatar.jpg --fit cover --max-width 512 --max-height 512

# 1080×1080 WebP, smart crop on subject
pm-image resize product.png grid.webp --fit cover --max-width 1080 --max-height 1080 --position attention -q 85

# Square canvas, no crop — padded bands with a colour
pm-image resize panoramic.jpg square.jpg --fit contain --max-width 800 --max-height 800 --background '#111111'

# Exact square by stretching (rare)
pm-image resize any.jpg out.jpg --fit fill --max-width 256 --max-height 256

REST / IPC JSON: e.g. "max_width": 512, "max_height": 512, "fit": "cover", "position": "attention".
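
The arithmetic behind the fit modes can be sketched as follows. This follows Sharp's documented fit semantics as an assumption; pm-image's exact rounding and its handling of enlargement for cover, contain, and fill are not modelled. Size and scaledSize are hypothetical names, and the result is the size of the scaled image before any crop (cover) or padding (contain).

```cpp
#include <algorithm>
#include <cmath>
#include <string>

struct Size {
    int width;
    int height;
};

// Hypothetical sketch: size of the scaled image for a given fit mode.
// inside / contain use the smaller scale factor, cover / outside the larger;
// fill stretches to the box and ignores aspect ratio.
Size scaledSize(Size src, Size box, const std::string& fit,
                bool withoutEnlargement = true) {
    if (fit == "fill") {
        return box;  // enlargement handling for fill is not modelled here
    }
    const double sx = static_cast<double>(box.width) / src.width;
    const double sy = static_cast<double>(box.height) / src.height;
    double scale = (fit == "cover" || fit == "outside") ? std::max(sx, sy)
                                                        : std::min(sx, sy);
    if (withoutEnlargement) {
        scale = std::min(scale, 1.0);  // never upscale
    }
    return {static_cast<int>(std::lround(src.width * scale)),
            static_cast<int>(std::lround(src.height * scale))};
}
```

For a 4000×2000 source and an 800×600 box, inside gives 800×400, while cover scales to 1200×600 and would then be cropped to 800×600.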

resize — cover (crop), contain (letterbox), rotate / flip

# Cover: fill 1200×630, crop centre (use --position attention for smart crop)
pm-image resize wide.jpg social.jpg --fit cover --max-width 1200 --max-height 630

# Contain: fit inside 800×600 canvas, letterbox with a background
pm-image resize logo.png padded.png --fit contain --max-width 800 --max-height 600 --background '#1a1a1a'

# EXIF autorotate (default), then rotate 90° CCW, vertical flip
pm-image resize img.jpg rotated.jpg --max-width 1024 --rotate 90 --flip

serve — HTTP REST

# Default: http://127.0.0.1:8080 — GET /health, POST /v1/resize
pm-image serve --host 127.0.0.1 -p 8080

| Mode | Content-Type | Request | Response |
| --- | --- | --- | --- |
| Upload | multipart/form-data | Image part file, image, or upload; optional fields: max_width, max_height, format, fit, quality, … | Binary image (Content-Type matches format; default JPEG if format omitted). Content-Disposition: inline; filename="resized…". |
| Paths | application/json | input and output (server-visible paths), plus resize / batch / cache fields | JSON {"ok":true} or batch count / outputs |

Upload (multipart): the response body is the processed image, not JSON. The on-disk output cache is not used for uploads (each request uses temporary files).

curl -s http://127.0.0.1:8080/health
curl -s -o thumb.jpg -X POST http://127.0.0.1:8080/v1/resize \
  -F "file=@/path/in.png" \
  -F "max_width=400" \
  -F "quality=85"

Paths (JSON): same resize options as the CLI, with input and output paths (must be readable/writable by the server process).

curl -s -X POST http://127.0.0.1:8080/v1/resize \
  -H 'Content-Type: application/json' \
  -d '{"input":"/path/in.png","output":"/path/out.webp","max_width":400,"quality":80}'

Optional JSON: "cache":false, "expand_glob":false, "cache_dir":"..." — see Batch paths & cache above.

transform — AI image editing

Uses a generative-AI model (currently Google Gemini) to edit an image based on a text prompt. Reads the input, sends image + prompt to the API, and writes the result.

pm-image transform photo.jpg -p "remove the background"
pm-image transform photo.jpg out.png -p "make it black and white" --model gemini-3-pro-image-preview

| Flag | Default | Notes |
| --- | --- | --- |
| input (positional) | required | Input image path |
| output (positional) | auto from input + prompt | Output path |
| -p, --prompt | required | Editing prompt |
| --provider | google | AI provider |
| --model | gemini-3-pro-image-preview | Model name |
| --api-key | env IMAGE_TRANSFORM_GOOGLE_API_KEY | API key |
| --aspect-ratio | auto | 1:1, 16:9, 4:3, 3:4, 9:16, 21:9 |
| --image-size | 1K | 512, 1K, 2K, 4K |

If --api-key is omitted, the value of the IMAGE_TRANSFORM_GOOGLE_API_KEY environment variable is used.
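
The flag-then-environment fallback can be sketched as below. resolveApiKey is a hypothetical name, and treating an empty flag or empty variable as "not set" is an assumption; only the variable name IMAGE_TRANSFORM_GOOGLE_API_KEY comes from this README.

```cpp
#include <cstdlib>
#include <optional>
#include <string>

// Hypothetical sketch: prefer the --api-key value, otherwise fall back to the
// IMAGE_TRANSFORM_GOOGLE_API_KEY environment variable.
std::optional<std::string> resolveApiKey(
    const std::optional<std::string>& flagValue) {
    if (flagValue && !flagValue->empty()) {
        return flagValue;
    }
    if (const char* env = std::getenv("IMAGE_TRANSFORM_GOOGLE_API_KEY");
        env != nullptr && *env != '\0') {
        return std::string(env);
    }
    return std::nullopt;  // caller should report a missing key
}
```

Because .env is loaded from the working directory at startup, a key placed there would presumably be visible to the same environment lookup.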

ipc — one JSON line per connection (TCP; Unix socket on Linux/macOS)

pm-image ipc --host 127.0.0.1 -p 9333 --cache-dir ./cache/images
# elsewhere: send a single line, read one line back, e.g.
# {"input":"/tmp/a.jpg","output":"/tmp/b.webp","max_width":320,"format":"webp","cache":true}

Same JSON fields as REST (input, output, globs, expand_glob, cache, cache_dir, resize options).

kbot — forward to kbot

Forwards remaining arguments to an external kbot binary. Set the KBOT_EXE environment variable to the kbot binary path (e.g. packages/kbot/cpp/dist/kbot.exe).

KBOT_EXE=./kbot.exe pm-image kbot ai --prompt "hello"

Tests

From packages/media/cpp, after cmake --build --preset release (or npm run build:release):

npm run test:media                  # full suite
npm run test:media:rest             # REST only (JSON + multipart)
npm run test:media:multipart        # multipart upload → image body only
npm run test:media:templates        # `${SRC_*}` REST + IPC + CLI
npm run test:media:glob             # recursive glob + templates (PNG under glob-in/)
npm run test:media:glob:raw         # same for tests/assets/raw/**/*.arw (skips if none)
npm run test:media:ipc              # IPC only
npm run test:media:url              # HTTP(S) URL inputs — needs network

Requires a built dist/pm-image linked against libvips and fixture PNGs (npm run generate:assets if missing).

Each run overwrites tests/test-report-last.md with Markdown from orchestrator/reports.js: host (CPU, RAM, load), Node process (CPU/memory deltas, RSS), timing, plus image rows (fixture and response byte sizes, PNG pixel dimensions when known, and multipart/JSON output file sizes).

The suite covers REST (JSON paths and multipart upload), IPC (TCP), optional Unix socket on non-Windows, destination templates, and recursive glob + ${SRC_DIR} / ${SRC_NAME} (test:media:glob — outputs under tests/assets/glob-in/**/out/, gitignored, for manual inspection). test:media:glob:raw runs the same glob flow against user-supplied .arw files under tests/assets/raw/ (skipped when the folder is missing or has no ARW files).

HTTP URL smoke tests (downloads from picsum.photos, needs network):

npm run test:media:url

License

See LICENSE in this directory when present.

Testing - UI