# polymech-cli

Cross-platform C++ CLI built with CMake.

## Prerequisites

| Tool | Version |
|------|---------|
| CMake | ≥ 3.20 |
| C++ compiler | C++17 (MSVC, GCC, or Clang) |

## Build

```bash
# Debug
cmake --preset dev
cmake --build --preset dev

# Release
cmake --preset release
cmake --build --preset release
```

## Usage

```bash
polymech-cli --help
polymech-cli --version
```

## Worker Mode & Gridsearch

The `worker` subcommand is designed to be spawned by the Node.js frontend orchestrator (`GridSearchUdsManager`) for background gridsearch execution. It accepts length-prefixed JSON frames over a Unix Domain Socket (UDS) or, on Windows, a local TCP port.

```bash
polymech-cli worker --uds <path_or_port> --daemon --user-uid <id> --config <path>
```

### IPC Resiliency and Logging

The C++ worker pipeline incorporates extensive feedback and retry instrumentation:

1. **Watchdog Heartbeats (`ping` / `pong`)**
   - The Node orchestrator sweeps the active worker pool every 15 seconds. It explicitly logs when a `ping` is sent and when a `pong` (or another activity event such as `log`, `job_progress`, or `ack`) is received.
   - If a C++ worker stops responding to IPC events for 60 seconds (for example, because of a hung thread or a deadlock), it is automatically killed (`SIGKILL`) and evicted from the pool.

2. **Socket Traceability**
   - The UDS socket traps unexpected closures and TCP faults (such as `ECONNRESET`). If the pipe breaks mid-job, explicit socket `error` event handlers in the Node orchestrator fail the job immediately and log the stack trace, which prevents indefinite client-side UI hangs, especially during heavy re-runs.

3. **Persistent Crash Logging (`logs/uds.json`)**
   - The C++ worker initializes a multi-sink logger (`logger::init_uds`). It writes standard logs to `stderr` while simultaneously appending a file trace to `server/logs/uds.json`.
   - The file sink flushes to disk aggressively (every second, and immediately on `info` severity). If the worker process vanishes or crashes, `uds.json` acts as a black-box flight recorder for post-mortem debugging.

4. **Job Specification Transparency**
   - Gridsearch payloads (including those from the `retry` and `expand` endpoints) log their input shape (`guided` bounds flag, `enrichers` subset) to the Node console before the work is passed to the C++ worker. This gives clear traceability from UI action → Node submission → C++ execution.

5. **Thread Safety & Frame Synchronization (Mutexes)**
   - The UDS socket carries asynchronous streams in both directions. The background execution graph (powered by Taskflow) emits high-frequency events (`location`, `waypoint-start`) via `GridsearchCallbacks`. Concurrently, the Node.js orchestrator sends periodic commands (`ping`, `cancel`) that the C++ socket loop must acknowledge promptly.
   - To prevent overlapping payload frames (which would corrupt the critical 4-byte `len` header), a global `g_uds_socket_mutex` guards every write. It guarantees that direct UI acknowledgments (`pong`, `cancel_ack`) and background output (`uds_sink` logs and Taskflow events) never interleave their `asio::write` bursts on the pipe.

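
The locking discipline described in item 5 can be sketched as follows. This is a minimal illustration under stated assumptions, not the worker's actual code: `FakeSocket` and `send_frame` are hypothetical names, and an in-memory string stands in for the `asio::write` call so the example builds without Asio. Only the name `g_uds_socket_mutex` comes from the description above.

```cpp
#include <cstdint>
#include <mutex>
#include <string>

// Stand-in for the socket (assumption): an in-memory byte sink.
// The real worker would write to the pipe with asio::write instead.
struct FakeSocket {
    std::string bytes;
};

// Name taken from the README; how it is used here is only a sketch.
std::mutex g_uds_socket_mutex;

// Hypothetical helper: write one complete frame (header + payload)
// while holding the mutex, so frames from different threads never interleave.
void send_frame(FakeSocket& sock, const std::string& json) {
    // Build the 4-byte little-endian length header outside the lock.
    const auto len = static_cast<std::uint32_t>(json.size());
    char header[4];
    for (int i = 0; i < 4; ++i) {
        header[i] = static_cast<char>((len >> (8 * i)) & 0xFF);
    }

    // Header and payload are appended under a single lock acquisition.
    std::lock_guard<std::mutex> lock(g_uds_socket_mutex);
    sock.bytes.append(header, 4);
    sock.bytes.append(json);
}
```

The point of the design is that the header and the payload are written inside one critical section. Locking each write separately would still allow another thread's header to land between a header and its payload.
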
### IPC Framing & Payload Protocol

All communication uses length-prefixed JSON frames. Stream sockets do not preserve message boundaries, so the length prefix lets the receiver reassemble complete messages even when heavy event streams are split or coalesced in transit.

**Binary Frame Format:**

`[4-byte Unsigned Little-Endian Integer (Payload Length)] [UTF-8 JSON Object]`

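
To make the frame format concrete, here is a minimal sketch of encoding and incremental decoding. The helper names `encode_frame` and `try_decode_frame` are hypothetical and do not come from the project; the sketch also assumes payloads fit in 32 bits and does no JSON validation.

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical helper: prefix a UTF-8 JSON payload with its length as a
// 4-byte unsigned little-endian integer.
// Assumption: the payload is smaller than 4 GiB.
std::string encode_frame(const std::string& json) {
    const auto len = static_cast<std::uint32_t>(json.size());
    std::string frame;
    frame.reserve(4 + json.size());
    for (int i = 0; i < 4; ++i) {
        frame.push_back(static_cast<char>((len >> (8 * i)) & 0xFF));
    }
    frame += json;
    return frame;
}

// Hypothetical helper: pop one complete frame from the front of `buffer`.
// Returns std::nullopt and leaves the buffer untouched when the header or
// the announced payload has not fully arrived yet.
std::optional<std::string> try_decode_frame(std::string& buffer) {
    if (buffer.size() < 4) {
        return std::nullopt;  // header incomplete
    }
    std::uint32_t len = 0;
    for (int i = 0; i < 4; ++i) {
        const auto byte = static_cast<unsigned char>(buffer[i]);
        len |= static_cast<std::uint32_t>(byte) << (8 * i);
    }
    if (buffer.size() - 4 < len) {
        return std::nullopt;  // payload incomplete
    }
    std::string payload = buffer.substr(4, len);
    buffer.erase(0, 4 + static_cast<std::size_t>(len));
    return payload;
}
```

A reader would append every chunk received from the socket to `buffer` and call `try_decode_frame` in a loop until it returns `std::nullopt`. That handles both a frame split across reads and several frames arriving in one read.
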
#### Control Commands (Node → C++)

If the JSON object contains an `"action"` field, it is handled synchronously on the socket thread:

- **Health Check:** `{"action": "ping"}`
  → *Replies:* `{"type": "pong", "data": {"memoryMb": 120, "cpuTimeMs": 4500}}`
- **Cancellation:** `{"action": "cancel", "jobId": "job_123"}`
  → The worker sets the atomic cancellation token to safely halt the target `taskflow` and immediately replies `{"type": "cancel_ack", "data": "job_123"}`
- **Daemon Teardown:** `{"action": "stop"}`
  → Flushes all streams and exits cleanly.

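
The cancellation path relies on an atomic token that the running job polls. The sketch below shows the general pattern only; `CancelToken` and `run_cells` are hypothetical names, and how the real worker wires its token into Taskflow is not shown in this README.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical per-job cancellation token. The socket thread calls
// request(); the job thread polls is_set() between units of work.
struct CancelToken {
    std::atomic<bool> cancelled{false};

    void request() { cancelled.store(true, std::memory_order_release); }
    bool is_set() const { return cancelled.load(std::memory_order_acquire); }
};

// Hypothetical job loop: runs `work(i)` for up to `total` units and checks
// the token before each one. Returns how many units actually ran.
template <typename Work>
std::size_t run_cells(std::size_t total, const CancelToken& token, Work&& work) {
    std::size_t done = 0;
    for (; done < total; ++done) {
        if (token.is_set()) {
            break;  // cooperative stop: the unit in progress already finished
        }
        work(done);
    }
    return done;
}
```

Because the check is cooperative, a `cancel_ack` can be sent as soon as the flag is set, while the job itself stops at its next check rather than being interrupted mid-unit.
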
#### Gridsearch Payload (Node → C++)

If no `"action"` field exists, the message is treated as a gridsearch spec and pushed into a lock-free `ConcurrentQueue` for the background execution graph:

```json
{
  "jobId": "run_9a8bc7",
  "configPath": "config/postgres.toml",
  "cacheDir": "../packages/gadm/cache",
  "enrich": true,
  "guided": {
    "areas": [{ "gid": "ESP.6_1", "level": 1 }],
    "settings": { "gridMode": "hex", "cellSize": 5.0 }
  },
  "search": {
    "types": ["restaurant"],
    "limitPerArea": 500
  }
}
```

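
The routing rule (control command when `"action"` is present, job spec otherwise) can be sketched like this. Everything here is illustrative: `IncomingMessage`, `Route`, `JobQueue`, and `route_message` are hypothetical names, JSON parsing is assumed to have happened already, and a plain `std::queue` stands in for the lock-free `ConcurrentQueue` so the example has no third-party dependencies.

```cpp
#include <optional>
#include <queue>
#include <string>

// Hypothetical, already-parsed view of one incoming frame. A real worker
// would populate this with a JSON library; parsing is out of scope here.
struct IncomingMessage {
    std::optional<std::string> action;  // value of "action", if present
    std::string raw_json;               // the full frame payload
};

enum class Route { Control, Job };

// Stand-in for the lock-free ConcurrentQueue (assumption). A plain
// std::queue is NOT thread-safe and is used only to keep the sketch small.
using JobQueue = std::queue<std::string>;

// Messages with an "action" are left to the caller to handle synchronously;
// everything else is enqueued as a gridsearch spec for the background graph.
Route route_message(const IncomingMessage& msg, JobQueue& jobs) {
    if (msg.action.has_value()) {
        return Route::Control;
    }
    jobs.push(msg.raw_json);
    return Route::Job;
}
```
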
#### Event Streaming (C++ → Node)

As the gridsearch pipeline executes, the `GridsearchCallbacks` emit length-prefixed events back to the active UDS socket:

- **`ack`**: Acknowledges that the job was successfully dequeued (`{"type": "ack", "data": {"jobId": "..."}}`).
- **`log`**: Passthrough of all internal C++ `spdlog` messages using the custom `uds_sink` adapter.
- **`location` / `node`**: Raw geolocation geometries and enriched contact details, streamed incrementally.
- **`job_progress`**: Phase updates (Grid Generation → Search → Enrichment).
- **`job_result`**: The final statistical and timer summary (EnumMs, SearchMs, Total Emails, etc.).
- **`error`**: Unrecoverable boundary parsing or database initialization faults.

## License

BSD-3-Clause

## Requirements

- [https://github.com/taskflow/taskflow](https://github.com/taskflow/taskflow)
- [https://github.com/cameron314/concurrentqueue](https://github.com/cameron314/concurrentqueue)
- [https://github.com/chriskohlhoff/asio](https://github.com/chriskohlhoff/asio)