Implement CostObserver that intercepts LlmResponse observer events and
records token usage to the CostTracker with proper cost calculations.
Changes:
- Add src/observability/cost.rs: CostObserver implementation
- Listens for LlmResponse events with token counts
- Looks up model pricing from CostConfig (with fallback defaults)
- Records usage via CostTracker.record_usage()
- Includes model family matching for pricing lookups
- Update src/observability/mod.rs:
- Export CostObserver
- Add create_observer_with_cost_tracking() helper that wraps base
observer with CostObserver when cost tracking is enabled
- Update src/gateway/mod.rs:
- Use create_observer_with_cost_tracking() to wire cost observer
into the gateway observer stack when config.cost.enabled is true
The /api/cost endpoint already exists and will now return accurate
session/daily/monthly cost data populated by the CostObserver.
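A minimal sketch of the observer shape, assuming a simple `Observer` trait with an `on_event` hook; the event fields and pricing helpers (`price_for_model`, `family_or_default`, per-token prices) are illustrative, not the actual API:

```rust
use std::sync::Arc;

// Hypothetical sketch of src/observability/cost.rs. The Observer trait,
// event shape, and pricing helpers here are assumptions, not the real API.
pub struct CostObserver {
    inner: Arc<dyn Observer>,   // base observer being wrapped
    tracker: Arc<CostTracker>,
    config: CostConfig,
}

impl Observer for CostObserver {
    fn on_event(&self, event: &ObserverEvent) {
        if let ObserverEvent::LlmResponse { model, input_tokens, output_tokens, .. } = event {
            // Exact model match first, then family match (e.g.
            // "gpt-4o-mini-2024-07-18" -> "gpt-4o-mini"), then fallback defaults.
            let price = self
                .config
                .price_for_model(model)
                .unwrap_or_else(|| self.config.family_or_default(model));
            let cost = price.input_per_token * *input_tokens as f64
                + price.output_per_token * *output_tokens as f64;
            self.tracker.record_usage(model, *input_tokens, *output_tokens, cost);
        }
        self.inner.on_event(event); // always forward to the wrapped observer
    }
}
```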
Resolves #2111
- Add non-loopback auth guard to /v1/chat/completions (matching /api/chat)
- Fix migration guide references to non-existent files (api_chat.rs,
openai_compat_shim.rs, mod_patch.rs) — endpoints live in openclaw_compat.rs
- Remove phantom `provider` field from /api/chat response docs
- Add TOML string escaping to config converter to handle special chars
- Add proper JSON parse error handling in config converter
- Update deployment checklist and troubleshooting to match actual file layout
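A sketch of the kind of non-loopback guard added to /v1/chat/completions, assuming an axum-style handler where the peer address comes from `ConnectInfo<SocketAddr>` (function name and error shape are illustrative):

```rust
use axum::http::StatusCode;
use std::net::SocketAddr;

// Loopback callers are trusted; any other peer must present the shared
// secret. Token extraction and the exact error type are assumptions.
fn check_remote_auth(peer: SocketAddr, bearer: Option<&str>, expected: &str) -> Result<(), StatusCode> {
    if peer.ip().is_loopback() {
        return Ok(()); // local requests bypass the token check
    }
    if bearer == Some(expected) {
        Ok(())
    } else {
        Err(StatusCode::UNAUTHORIZED)
    }
}
```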
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a complete OpenClaw → ZeroClaw migration toolkit:
- POST /api/chat: ZeroClaw-native endpoint with full agent loop (tools, memory,
context enrichment). Supports session_id scoping and context[] injection for
conversation history. Same code path as Linq/WhatsApp/Nextcloud handlers.
- POST /v1/chat/completions: OpenAI-compatible shim that routes through
run_gateway_chat_with_tools instead of the simple provider.chat_with_history
path. Extracts last user message + up to 10 messages of conversation context
from the messages[] array. Supports streaming (simulated SSE). Drop-in
replacement for OpenClaw callers with zero code changes.
Both endpoints include full observability instrumentation (AgentStart, LlmRequest,
LlmResponse, RequestLatency, AgentEnd), auth (pairing + webhook secret), rate
limiting, auto-save to memory, and response sanitization.
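A hedged sketch of the /api/chat request body as serde types; only `session_id` and `context` are named above, so the remaining field names are assumptions:

```rust
// Hypothetical request shape for POST /api/chat; message/role/content
// field names are assumptions based on the description above.
#[derive(serde::Deserialize)]
struct ApiChatRequest {
    message: String,              // user turn run through the full agent loop
    #[serde(default)]
    session_id: Option<String>,   // scopes memory and history to one session
    #[serde(default)]
    context: Vec<ContextMessage>, // prior turns injected as conversation history
}

#[derive(serde::Deserialize)]
struct ContextMessage {
    role: String,    // "user" | "assistant"
    content: String,
}
```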
Also adds:
- scripts/convert-openclaw-config.py: Converts openclaw.json → config.toml with
provider mapping, channel detection, and migration notes
- docs/migration/openclaw-migration-guide.md: Full migration walkthrough with
endpoint reference, config mapping, callsite examples, and deployment checklist
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
`supports_vision` is currently hardcoded per-provider. The same Ollama instance can run `llava` (vision) or
`codellama` (no vision), but the code fixes vision support at the provider level with no user override.
This adds a top-level `model_support_vision: Option<bool>` config key — tri-state:
- **Unset (default):** provider's built-in value, zero behavior change
- **`true`:** force vision on (e.g. Ollama + llava)
- **`false`:** force vision off
Follows the exact same pattern as `reasoning_enabled`. Override is applied at the wrapper layer (`ReliableProvider` /
`RouterProvider`) — no concrete provider code is touched.
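For example, a minimal config sketch (the key name is from this change; the same override is available via the env vars listed below):

```toml
# Force vision on for an Ollama instance serving llava.
# Omit the key entirely to keep the provider's built-in default.
model_support_vision = true
```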
## Changes
**Config surface:**
- Top-level `model_support_vision` field in `Config` struct with `#[serde(default)]`
- Env override: `ZEROCLAW_MODEL_SUPPORT_VISION` / `MODEL_SUPPORT_VISION`
**Provider wrappers (core logic):**
- `ReliableProvider`: `vision_override` field + `with_vision_override()` builder + `supports_vision()` override
- `RouterProvider`: same pattern
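The override logic itself is small; a self-contained sketch (the real `Provider` trait and wrapper carry more methods and fields):

```rust
// Minimal sketch of the wrapper-level tri-state override; trait and
// struct are simplified to the one relevant method.
trait Provider {
    fn supports_vision(&self) -> bool;
}

struct ReliableProvider {
    inner: Box<dyn Provider>,
    vision_override: Option<bool>,
}

impl ReliableProvider {
    fn with_vision_override(mut self, vision: Option<bool>) -> Self {
        self.vision_override = vision;
        self
    }
}

impl Provider for ReliableProvider {
    fn supports_vision(&self) -> bool {
        // Some(true)/Some(false) force the value; None defers to the
        // wrapped concrete provider, so the default is zero behavior change.
        self.vision_override
            .unwrap_or_else(|| self.inner.supports_vision())
    }
}
```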
**Wiring (1-line each):**
- `ProviderRuntimeOptions` struct + factory functions
- 5 construction sites: `loop_.rs`, `channels/mod.rs`, `gateway/mod.rs`, `tools/mod.rs`, `onboard/wizard.rs`
**Docs (i18n parity):**
- `config-reference.md` — Core Keys table
- `providers-reference.md` — new "Ollama Vision Override" section
- Vietnamese sync: `docs/i18n/vi/` + `docs/vi/` (4 files)
## Non-goals
- Does not change any concrete provider implementation
- Does not auto-detect model vision capability
## Test plan
- [x] `cargo fmt --all -- --check`
- [x] `cargo clippy --all-targets -- -D warnings` (no new errors)
- [x] 5 new tests passing:
- `model_support_vision_deserializes` — TOML parse + default None
- `env_override_model_support_vision` — env var override + invalid value ignored
- `vision_override_forces_true` — ReliableProvider override
- `vision_override_forces_false` — ReliableProvider override
- `vision_override_none_defers_to_provider` — passthrough behavior
## Risk and Rollback
- **Risk:** Low. `None` default = zero behavior change for existing users.
- **Rollback:** Revert commit. Field is `#[serde(default)]` so old configs without it will deserialize fine.
(cherry picked from commit a1b8dee785)
- Add underscores to long numeric literals (1234567890 → 1_234_567_890)
- Allow cast_possible_truncation for rough token estimates
- Replace loop/match with while-let for event stream parsing
- Merge identical match arms for event types
- Add #[allow(clippy::cast_possible_truncation)] on test helper
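For reference, the loop/match → while-let change is the standard rewrite for clippy's `while_let_loop` lint (`events` and `handle` are illustrative names):

```rust
// Before: clippy::while_let_loop flags this shape.
// loop {
//     match events.next() {
//         Some(event) => handle(event),
//         None => break,
//     }
// }

// After: same behavior, idiomatic form.
while let Some(event) = events.next() {
    handle(event);
}
```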
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add an OpenAI-compatible API surface to the gateway so that standard
OpenAI client libraries can interact with ZeroClaw directly.
Endpoints:
- POST /v1/chat/completions — supports both streaming (SSE) and
non-streaming responses, bearer token auth, rate limiting
- GET /v1/models — returns the gateway's configured model
The chat completions endpoint accepts the standard OpenAI request format
(model, messages, temperature, stream) and returns responses in the
OpenAI envelope format. Streaming uses SSE with delta chunks and a
[DONE] sentinel. A 512KB body limit is applied (vs 64KB default) since
chat histories can be large.
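The accepted subset of the request schema, sketched as serde types; field names follow the public OpenAI schema, but these exact structs are assumptions about the codebase:

```rust
// Sketch of the OpenAI-style request envelope accepted by the endpoint.
#[derive(serde::Deserialize)]
struct ChatCompletionRequest {
    model: String,
    messages: Vec<ChatMessage>,
    #[serde(default)]
    temperature: Option<f32>,
    #[serde(default)]
    stream: bool, // true selects the SSE path
}

#[derive(serde::Deserialize, serde::Serialize)]
struct ChatMessage {
    role: String, // "system" | "user" | "assistant"
    content: String,
}
```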
When the underlying provider doesn't support native streaming, the
handler falls back to wrapping the non-streaming response in a single
SSE chunk for transparent compatibility.
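A sketch of that fallback, assuming the handler already has the full completion text in hand (the chunk payload is abbreviated and illustrative):

```rust
// Wrap a non-streaming completion as a single SSE delta chunk plus the
// [DONE] sentinel expected by OpenAI streaming clients.
fn single_chunk_stream(full_text: &str, model: &str) -> Vec<String> {
    let chunk = serde_json::json!({
        "object": "chat.completion.chunk",
        "model": model,
        "choices": [{
            "index": 0,
            "delta": { "content": full_text },
            "finish_reason": "stop"
        }]
    });
    vec![
        format!("data: {chunk}\n\n"),   // one delta carrying the whole reply
        "data: [DONE]\n\n".to_string(), // stream terminator sentinel
    ]
}
```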
Includes 8 unit tests for request/response serialization.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Web chat was calling provider.chat_with_history() directly, bypassing
the agent loop, so tool calls were rendered as raw XML instead of being
executed.
Changes:
- Add tools_registry_exec to AppState for executable tools
- Replace chat_with_history with run_tool_call_loop in ws.rs
- Maintain conversation history per WebSocket session
- Add multimodal and max_tool_iterations config to AppState
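The call-site change in ws.rs, sketched (the argument list and signature are assumptions based on the names above, not the exact ZeroClaw API):

```rust
// Before: direct provider call; tool-call XML came back as plain text.
// let reply = provider.chat_with_history(&history).await?;

// After: run the full agent loop so tool calls actually execute.
let reply = run_tool_call_loop(
    provider.clone(),
    &mut history,                 // per-WebSocket-session conversation history
    &state.tools_registry_exec,   // executable tool registry from AppState
    state.max_tool_iterations,    // config-driven iteration cap
)
.await?;
```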
Closes #1524
Replace line-based TOML masking with structured config masking so secret fields keep their original types (including reliability.api_keys arrays).
Hydrate dashboard PUT payloads with runtime config_path/workspace_dir and restore masked secret placeholders from current config before validation/save.
Also allow GET on /api/doctor for dashboard/client compatibility to avoid 405 responses.
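A sketch of the structured masking approach, assuming a parsed `toml::Value` tree and a hypothetical secret-key heuristic:

```rust
use toml::Value;

const MASK: &str = "********";

// Walk the parsed TOML tree and mask secret leaf values in place, so
// typed fields (e.g. the reliability.api_keys array) keep their shape
// instead of being clobbered by line-based string masking.
fn mask_secrets(value: &mut Value, under_secret_key: bool) {
    match value {
        Value::Table(table) => {
            for (key, child) in table.iter_mut() {
                mask_secrets(child, under_secret_key || looks_secret(key));
            }
        }
        Value::Array(items) => {
            for item in items.iter_mut() {
                mask_secrets(item, under_secret_key);
            }
        }
        Value::String(s) if under_secret_key => *s = MASK.to_string(),
        _ => {} // non-string leaves keep their original type untouched
    }
}

// Hypothetical heuristic; the real secret field list lives in the config code.
fn looks_secret(key: &str) -> bool {
    key.contains("api_key") || key.contains("secret") || key.contains("token")
}
```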