* fix(web-fetch): remove dead feature gates, add noise stripping, add docstrings
The nanohtml2text and fast_html2md providers were both guarded by
cfg(feature) checks for features (web-fetch-plaintext, web-fetch-html2md)
that are never declared in Cargo.toml. This caused every web_fetch call
to silently return an error instead of fetching content.
Changes:
- Add strip_noise_elements() which removes <script>, <style>, <nav>,
<header>, <footer>, <aside>, <noscript>, <form>, <button> blocks
before text extraction, eliminating menu/ad/boilerplate noise.
- Fix fast_html2md path: when the web-fetch-html2md feature is not compiled
in, fall through to nanohtml2text rather than returning an error.
- Fix nanohtml2text path: remove dead cfg(feature = "web-fetch-plaintext")
gate; nanohtml2text is a direct dependency and needs no feature flag.
- Both previously gated tests (html_to_markdown_conversion_preserves_structure,
html_to_plaintext_conversion_removes_html_tags) are now always-on.
Added strip_noise_removes_nav_scripts_footer test.
- Add docstrings to all public/private methods to meet coverage threshold.
Tavily and firecrawl providers are unchanged.
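A minimal std-only sketch of the stripping step (the real strip_noise_elements
may parse HTML properly; this version assumes lowercase tag names):

```rust
/// Sketch of a noise stripper: deletes the listed elements and their
/// contents before text extraction. Assumes lowercase tags.
fn strip_noise_elements(html: &str) -> String {
    const NOISE_TAGS: &[&str] = &[
        "script", "style", "nav", "header", "footer",
        "aside", "noscript", "form", "button",
    ];
    let mut out = html.to_string();
    for tag in NOISE_TAGS {
        let open = format!("<{tag}");
        let close = format!("</{tag}>");
        // Repeatedly delete the first remaining <tag ...>...</tag> block.
        while let Some(start) = out.find(&open) {
            match out[start..].find(&close) {
                Some(rel) => {
                    let end = start + rel + close.len();
                    out.replace_range(start..end, "");
                }
                None => break,
            }
        }
    }
    out
}
```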
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* fix(web-fetch): align default provider to nanohtml2text, remove dead feature
- Change empty-provider default from deprecated 'fast_html2md' to
'nanohtml2text' to match WEB_FETCH_PROVIDER_HELP and PR description.
- Remove dead 'web-fetch-plaintext' feature from Cargo.toml (no code
references it after the feature-gate removal).
- Apply cargo fmt to strip_noise_elements array formatting.
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: xj <gh-xj@users.noreply.github.com>
The Copilot API proxy for Claude models (Opus 4.6, Opus 4.6-1m) splits
text content and tool_calls into separate choices. Previously only
choices[0] was read, causing all tool calls to be silently dropped
when they appeared in choices[1].
Merge text and tool_calls from all choices so tool calling works
regardless of how the proxy splits the response.
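The merge can be sketched with simplified stand-in types (the real structs
follow the OpenAI chat-completion response schema):

```rust
/// Simplified stand-in for a deserialized completion choice.
struct Choice {
    text: Option<String>,
    tool_calls: Vec<String>,
}

/// Merge text and tool_calls across all choices instead of reading only
/// choices[0], so a proxy that splits them across choices is handled.
fn merge_choices(choices: Vec<Choice>) -> (String, Vec<String>) {
    let mut text = String::new();
    let mut tool_calls = Vec::new();
    for choice in choices {
        if let Some(t) = choice.text {
            text.push_str(&t);
        }
        tool_calls.extend(choice.tool_calls);
    }
    (text, tool_calls)
}
```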
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
SQLite WAL mode requires shared memory (mmap/shm), which is unavailable
on many network and virtual shared filesystems (NFS, SMB/CIFS,
UTM/VirtioFS, VirtualBox shared folders), causing xShmMap I/O errors
at startup.
Add `sqlite_journal_mode` config option under `[memory]` that accepts
"wal" (default) or "delete". When set to "delete", SQLite uses the
legacy DELETE journal mode and disables mmap, allowing ZeroClaw to run
with workspaces on shared/network filesystems.
Usage:
[memory]
sqlite_journal_mode = "delete"
Changes:
- config/schema.rs: Add sqlite_journal_mode field to MemoryConfig
- memory/sqlite.rs: Add with_options() supporting journal mode selection
- memory/mod.rs: Pass journal_mode from config to SqliteMemory
- onboard/wizard.rs: Include new field in default MemoryConfig
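A hypothetical helper (the real logic lives in with_options()) mapping the
config value to the PRAGMAs issued at connection setup:

```rust
/// Hypothetical mapping from sqlite_journal_mode to the PRAGMAs that
/// with_options() would issue; "delete" also disables mmap so no
/// shared memory is required on network filesystems.
fn journal_pragmas(mode: &str) -> Vec<&'static str> {
    match mode {
        "delete" => vec!["PRAGMA journal_mode=DELETE;", "PRAGMA mmap_size=0;"],
        _ => vec!["PRAGMA journal_mode=WAL;"],
    }
}
```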
- Register 4 new tools (ManageAuthProfileTool, CheckProviderQuotaTool,
SwitchProviderTool, EstimateQuotaCostTool) in all_tools_with_runtime
- SwitchProviderTool now loads config from disk and calls save() to
persist default_provider/default_model to config.toml
- Inject Provider & Budget Context section into system prompt when
Config is available
- Remove emoji from tool output for cleaner parsing
- Replace format! push_str with std::fmt::Write for consistency
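The push_str replacement looks like this (illustrative function, not the
actual tool-output code):

```rust
use std::fmt::Write;

/// Appending with write!/writeln! reuses the existing buffer instead of
/// allocating an intermediate String via format! for every line.
fn render(items: &[&str]) -> String {
    let mut out = String::new();
    for (i, item) in items.iter().enumerate() {
        // Writing into a String cannot fail, so the Result is ignored.
        let _ = writeln!(out, "{i}: {item}");
    }
    out
}
```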
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Wire QuotaMetadata into ChatResponse for all provider implementations,
enabling quota tracking data to flow from API responses through the
agent loop to quota monitoring tools.
Depends on: circuit breaker (#1842) + quota monitoring (#1904)
Made-with: Cursor
WebDriver's execute() wraps the script as a function body. The snapshot
script used an IIFE without a top-level return, so the IIFE's return
value was discarded and the WebDriver function returned undefined (null).
All other execute() calls in the file (scroll, scrollIntoView, click)
correctly use explicit return statements.
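A hypothetical guard illustrating the rule (this helper does not exist in
the codebase; the actual fix adds the return directly to the script):

```rust
/// Because WebDriver wraps the script as a function body, a
/// value-producing script must begin with a top-level `return`.
fn ensure_top_level_return(script: &str) -> String {
    if script.trim_start().starts_with("return ") {
        script.to_string()
    } else {
        format!("return {script}")
    }
}
```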
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
MIME strings like 'audio/webm; codecs=opus' were incorrectly matched by
the 'opus' branch (contains-check) before reaching the 'webm' branch,
returning 'voice.opus' instead of 'voice.webm'. This could cause the
Groq Whisper API to reject or misidentify the file format.
Fix: split on ';' to extract only the base MIME type, then match
exhaustively. Also add 'audio/x-wav' as a wav alias.
Adds a regression test: audio_mime_to_filename('audio/webm; codecs=opus')
must return 'voice.webm'.
Reported by CodeRabbit in PR review.
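A sketch of the fixed matching; mappings other than webm/opus/x-wav are
illustrative assumptions, as is the fallback extension:

```rust
/// Strip parameters after ';' (e.g. "; codecs=opus") so the match is
/// on the base MIME type, not the codec.
fn audio_mime_to_filename(mime: &str) -> &'static str {
    let base = mime.split(';').next().unwrap_or("").trim();
    match base {
        "audio/ogg" => "voice.ogg",
        "audio/webm" => "voice.webm",
        "audio/opus" => "voice.opus",
        "audio/mpeg" | "audio/mp3" => "voice.mp3",
        "audio/wav" | "audio/x-wav" => "voice.wav",
        _ => "voice.bin",
    }
}
```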
(cherry picked from commit 84861c727a)
Audio/voice messages on the WhatsApp Web channel were silently dropped
because `text_content()` returns an empty string for non-text messages
and no transcription path existed (unlike the Telegram channel which
already uses `transcription::transcribe_audio()`).
Changes:
- **Cargo.toml**: Move `qrcode` and all `wa-rs-*` crates out of the
`[target.'cfg(any(linux|macos|windows))'.dependencies]` section into
the unconditional `[dependencies]` section. All affected crates are
`optional = true`, so they add no compile cost unless
`--features whatsapp-web` is active. The previous placement caused
Cargo to exclude them when targeting `android` (target_os = "android"
does not match the cfg predicate), producing E0433 unresolved-crate
errors for every wa-rs import in `whatsapp_web.rs` and
`whatsapp_storage.rs` on Android cross-compilation.
- **whatsapp_web.rs**:
- Add `transcription: Option<TranscriptionConfig>` field.
- Add `with_transcription()` builder (mirrors `TelegramChannel`).
- Add `audio_mime_to_filename()` helper mapping WhatsApp MIME types
(e.g. `audio/ogg; codecs=opus`) to filenames the Groq Whisper API
accepts.
- Extend `Event::Message` handler: when text is empty, check
`msg.audio_message`; download and decrypt audio via
`client.download(audio_msg.as_ref())` (`.as_ref()` required because
prost boxes nested proto fields as `Box<AudioMessage>`, which does
not itself implement `Downloadable`); forward decrypted bytes to
`transcription::transcribe_audio()`.
- Add three unit tests covering the builder enable/disable guard and MIME mapping.
- **mod.rs**: Chain `.with_transcription(config.transcription.clone())`
onto `WhatsAppWebChannel::new(...)` in the `"web"` factory branch so
transcription is active whenever the global `[transcription]` section
is enabled.
Activation: set `[transcription] enabled = true` and export
`GROQ_API_KEY` in the environment.
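The builder shape can be sketched with a stub config type (the real
TranscriptionConfig has more fields):

```rust
/// Stub standing in for the real TranscriptionConfig.
#[derive(Clone, Default)]
struct TranscriptionConfig {
    enabled: bool,
}

struct WhatsAppWebChannel {
    transcription: Option<TranscriptionConfig>,
}

impl WhatsAppWebChannel {
    fn new() -> Self {
        Self { transcription: None }
    }

    /// Builder mirroring TelegramChannel: transcription stays off
    /// unless a config is supplied.
    fn with_transcription(mut self, cfg: Option<TranscriptionConfig>) -> Self {
        self.transcription = cfg;
        self
    }
}
```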
(cherry picked from commit 325241aeb6)
Register /new, /model, and /models commands with Telegram's Bot API
on startup so they appear in the command menu for users. Registration
is non-fatal — if the API call fails, a warning is logged and the
bot continues listening normally.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Split the catch-all `_` match arm on the deleteMessage result into
separate `Ok(r)` and `Err(e)` arms so that HTTP status codes and
network errors are logged individually. The response body is not
logged (security policy).
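A simplified sketch of the split arms, using a plain Result in place of the
HTTP client's types:

```rust
/// Separate arms for HTTP status vs. transport error so each is logged
/// individually; the response body is never included.
fn describe_delete_result(result: Result<u16, String>) -> String {
    match result {
        Ok(status) => format!("deleteMessage returned HTTP {status}"),
        Err(err) => format!("deleteMessage transport error: {err}"),
    }
}
```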
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When editMessageText returns 'message is not modified', the draft
already contains the correct content from update_draft. Detect this
Telegram API response and treat it as success rather than falling
through to the delete+send fallback, which would create a visible
duplicate message.
Also guard the final fallback: only send a new message after
successfully deleting the draft. If deleteMessage fails, the draft
still shows the response text, so sending would create a duplicate.
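A hypothetical classifier showing the detection (the real code inspects the
Telegram API response JSON; names here are illustrative):

```rust
enum EditOutcome {
    Edited,
    AlreadyCurrent,
    Failed(String),
}

/// Treat "message is not modified" as success so the delete+send
/// fallback is never taken for it.
fn classify_edit_response(ok: bool, description: Option<&str>) -> EditOutcome {
    match (ok, description) {
        (true, _) => EditOutcome::Edited,
        (false, Some(d)) if d.contains("message is not modified") => EditOutcome::AlreadyCurrent,
        (false, d) => EditOutcome::Failed(d.unwrap_or("unknown error").to_string()),
    }
}
```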
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add support for Chrome, Firefox, and Edge browsers to the browser_open tool,
which previously only supported Brave. Users can now specify the browser
via the browser_open config option.
Changes:
- Add browser_open config field: "disable" | "brave" | "chrome" | "firefox" | "edge" | "default"
- Implement platform-specific launch commands for Chrome, Firefox, and Edge
- When set to "disable", only the browser automation tool is registered, not the browser_open tool
- Update tool descriptions and error messages to reflect browser selection
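A sketch of the config-to-launcher mapping; the command names below are
hypothetical Linux binaries, and the real code branches per platform:

```rust
/// Map the browser_open config value to a launch command, or None when
/// the browser_open tool should not be registered at all.
fn browser_command(choice: &str) -> Option<&'static str> {
    match choice {
        "disable" => None,
        "brave" => Some("brave-browser"),
        "chrome" => Some("google-chrome"),
        "firefox" => Some("firefox"),
        "edge" => Some("microsoft-edge"),
        // "default" falls back to the system handler.
        _ => Some("xdg-open"),
    }
}
```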
Co-Authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
* chore(ci): retrigger PR checks after intake body update
* fix(ci): stabilize local quality gates on rebased main
---------
Co-authored-by: Chummy <chumyin0912@gmail.com>
Co-authored-by: xj <gh-xj@users.noreply.github.com>
Implement CostObserver that intercepts LlmResponse observer events and
records token usage to the CostTracker with proper cost calculations.
Changes:
- Add src/observability/cost.rs: CostObserver implementation
- Listens for LlmResponse events with token counts
- Looks up model pricing from CostConfig (with fallback defaults)
- Records usage via CostTracker.record_usage()
- Includes model family matching for pricing lookups
- Update src/observability/mod.rs:
- Export CostObserver
- Add create_observer_with_cost_tracking() helper that wraps base
observer with CostObserver when cost tracking is enabled
- Update src/gateway/mod.rs:
- Use create_observer_with_cost_tracking() to wire cost observer
into the gateway observer stack when config.cost.enabled is true
The /api/cost endpoint already exists and will now return accurate
session/daily/monthly cost data populated by the CostObserver.
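The cost calculation per LlmResponse event can be sketched as follows
(Pricing is a hypothetical stand-in for the CostConfig lookup result):

```rust
/// Hypothetical pricing lookup result: USD per million tokens.
struct Pricing {
    input_per_mtok: f64,
    output_per_mtok: f64,
}

/// Cost of one LlmResponse event given its token counts.
fn usage_cost(p: &Pricing, input_tokens: u64, output_tokens: u64) -> f64 {
    (input_tokens as f64 / 1_000_000.0) * p.input_per_mtok
        + (output_tokens as f64 / 1_000_000.0) * p.output_per_mtok
}
```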
Resolves #2111
Phase 2 of ClawWork integration. Implements:
- TaskClassifier with 44 BLS occupations and wage data
- OccupationCategory enum (Tech/Business/Healthcare/Legal)
- Keyword-based classification with confidence scoring
- Hours estimation based on instruction complexity
- Fuzzy matching for occupation lookup
Reference: ClawWork/clawmode_integration/task_classifier.py
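Keyword classification with a confidence score can be sketched like this
(occupation names and keywords are illustrative; the real classifier uses
the 44 BLS occupations):

```rust
use std::collections::HashMap;

/// Confidence = fraction of an occupation's keywords present in the
/// instruction; the highest-scoring occupation wins.
fn classify(instruction: &str, keywords: &HashMap<&str, Vec<&str>>) -> Option<(String, f64)> {
    let text = instruction.to_lowercase();
    keywords
        .iter()
        .map(|(occupation, kws)| {
            let hits = kws.iter().filter(|k| text.contains(*k)).count();
            (occupation.to_string(), hits as f64 / kws.len() as f64)
        })
        .filter(|(_, confidence)| *confidence > 0.0)
        .max_by(|a, b| a.1.partial_cmp(&b.1).unwrap())
}
```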
Add SiliconFlow provider factory support and alias/env handling.
Normalize onboarding UX to volcengine while preserving doubao/ark runtime aliases.
Add integration registry entries and provider resolution coverage tests.
Expand provider and command docs with setup and validation examples.
Addresses 4 findings from CodeRabbit's fourth review that were not
covered by the maintainer's commit 7ef075e:
1. [Major] http_client() per-call allocation: cache reqwest::Client in
FeishuDocTool struct field, return &reqwest::Client. Enables
connection pooling across all API calls.
2. [Major] SSRF bypass via HTTP redirects: download_media now uses a
no-redirect reqwest client (Policy::none()) to prevent attackers
from using a public URL that 301/302-redirects to internal IPs.
3. [Minor] Missing empty-conversion guard in action_upload_image:
added converted.is_empty() check consistent with all other
convert_markdown_blocks callers.
4. [Minor] Schema description for link_share stale: updated from
'default: true' to 'default: false' to match actual behavior.
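A std-only illustration of the address classes an SSRF guard must reject
(the real checks live in the existing url_validation module; the fix above
additionally disables redirects with Policy::none() so a public URL cannot
hop to one of these):

```rust
use std::net::IpAddr;

/// Reject loopback, RFC 1918 private, link-local, and unspecified
/// addresses. IPv6 coverage here is deliberately minimal.
fn is_private_or_local(ip: IpAddr) -> bool {
    match ip {
        IpAddr::V4(v4) => {
            v4.is_loopback() || v4.is_private() || v4.is_link_local() || v4.is_unspecified()
        }
        IpAddr::V6(v6) => v6.is_loopback() || v6.is_unspecified(),
    }
}
```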
Validation:
- cargo check --features channel-lark ✅
- cargo clippy -p zeroclaw --lib --features channel-lark -- -D warnings ✅
- cargo test --features channel-lark -- feishu_doc ✅ (7/7 tests pass)
(cherry picked from commit e846604a13)
Addresses all 5 findings from CodeRabbit's second review on PR #1853:
1. [Major] list_all_blocks: add MAX_PAGES (200) hard cap to prevent
unbounded pagination loops on misbehaving APIs or huge documents.
2. [Major] Empty conversion guard: action_write, action_update_block,
and write_single_cell now bail with explicit error when
convert_markdown_blocks returns empty results, preventing silent
data loss (delete-then-write-nothing scenario).
3. [Minor] action_create: grant_owner_permission failure is now a soft
warning instead of hard error. Document is already created and
verified; permission failure is reported in the response JSON
'warning' field instead of propagating as an error.
4. [Nitpick] extract_ttl_seconds: remove unreachable as_i64 fallback
branch (as_u64 already covers all non-negative integers).
5. [Nitpick] Add unit tests: test_extract_ttl_seconds_defaults_and_clamps
and test_write_rejects_empty_conversion.
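The hard cap in finding 1 can be sketched generically (fetch is a stand-in
for one paginated API call returning (items, has_more)):

```rust
/// Pagination stops after MAX_PAGES even if the API keeps reporting
/// more data, bounding the loop on misbehaving endpoints.
fn fetch_all_pages<F>(mut fetch: F) -> Vec<String>
where
    F: FnMut(usize) -> (Vec<String>, bool),
{
    const MAX_PAGES: usize = 200;
    let mut all = Vec::new();
    for page in 0..MAX_PAGES {
        let (items, has_more) = fetch(page);
        all.extend(items);
        if !has_more {
            break;
        }
    }
    all
}
```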
Validation:
- cargo check --features channel-lark ✅
- cargo clippy -p zeroclaw --lib --features channel-lark -- -D warnings ✅
- cargo test --features channel-lark -- feishu_doc ✅ (7/7 tests pass)
(cherry picked from commit 762e6082ec)
- Reorder convert-before-delete in action_write, action_update_block,
and write_single_cell to prevent data loss if markdown conversion fails
- Separate create POST from verification retry loop in action_create
to prevent duplicate document creation on retry
- Add resolve_doc_token to upload_image and upload_file so wiki
node_token resolution works for upload actions
- Add SSRF protection to download_media: validate URL scheme (http/https
only), block local/private hosts via existing url_validation module
- Guard empty credentials in mod.rs: skip FeishuDocTool registration
when app_id or app_secret are empty/whitespace-only
(cherry picked from commit feb1d46f41)
Summary
- Problem: Agent cannot read DOCX files — file_read returns garbled binary/XML, making Word documents inaccessible to the
agent
- Why it matters: DOCX is the most common business document format; without native extraction, users must manually convert
files, breaking autonomous workflows
- What changed: Added docx_read tool using zip (existing) + quick-xml (new) to extract plain text from DOCX Office Open XML
- What did not change: No changes to file_read, agent loop, security policy, config schema, or any existing tool behavior
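The extraction step can be sketched std-only on already-unzipped XML (the
real tool uses the zip and quick-xml crates; this version only illustrates
pulling text runs out of word/document.xml):

```rust
/// Concatenate the contents of <w:t> elements, which hold the visible
/// text runs in Office Open XML. Not a general XML parser.
fn extract_docx_text(document_xml: &str) -> String {
    let mut out = String::new();
    let mut rest = document_xml;
    while let Some(start) = rest.find("<w:t") {
        let after = &rest[start + 4..];
        // Skip unrelated tags such as <w:tbl> or self-closing <w:t/>.
        if !(after.starts_with('>') || after.starts_with(' ')) {
            rest = after;
            continue;
        }
        let Some(tag_end) = after.find('>') else { break };
        let body = &after[tag_end + 1..];
        let Some(close) = body.find("</w:t>") else { break };
        out.push_str(&body[..close]);
        rest = &body[close + 6..];
    }
    out
}
```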
Label Snapshot (required)
- Risk label: risk: low
- Size label: size: S
- Scope labels: tool
- Module labels: tool: docx_read
- If any auto-label is incorrect: N/A
Change Metadata
- Change type: feature
- Primary scope: tool
Linked Issue
- Closes #(issue number)
Validation Evidence (required)
cargo fmt --all -- --check # pass
cargo clippy --all-targets -- -D warnings # pass (zero new warnings)
cargo test docx_read # 14/14 passed
- Evidence provided: test results, manual verification with zeroclaw agent -m against real DOCX file
Security Impact (required)
- New permissions/capabilities? No (mirrors existing pdf_read security model exactly)
- New external network calls? No
- Secrets/tokens handling changed? No
- File system access scope changed? No
Privacy and Data Hygiene (required)
- Data-hygiene status: pass
- Redaction/anonymization notes: Test fixtures use neutral content ("Hello DOCX", "First", "Second")
- Neutral wording confirmation: Yes
Compatibility / Migration
- Backward compatible? Yes
- Config/env changes? No
- Migration needed? No
i18n Follow-Through
- i18n follow-through triggered? No (no docs or user-facing wording changes)
Human Verification (required)
- Verified scenarios: zeroclaw agent -m "read the file test-test.docx and output the content" — model selected docx_read,
extracted text correctly
- Edge cases checked: invalid ZIP, missing word/document.xml, symlink escape, path traversal, rate limiting, truncation
- What was not verified: encrypted DOCX (out of scope), extremely large files (>50MB)
Side Effects / Blast Radius (required)
- Affected subsystems/workflows: Tool registry only — one new tool added
- Potential unintended effects: None — additive only, no existing behavior changed
- Guardrails/monitoring: Tool follows identical security chain as pdf_read
Rollback Plan (required)
- Fast rollback command/path: git revert <commit>
- Feature flags or config toggles: None needed (always-on, like pdf_read)
- Observable failure symptoms: docx_read tool missing from tool list
Risks and Mitigations
- Risk: quick-xml new dependency adds to compile time
- Mitigation: quick-xml is lightweight pure Rust (~15K LOC), widely used (100M+ downloads), and will be shared when
XLSX/PPTX tools are added later
- Add non-loopback auth guard to /v1/chat/completions (matching /api/chat)
- Fix migration guide references to non-existent files (api_chat.rs,
openai_compat_shim.rs, mod_patch.rs) — endpoints live in openclaw_compat.rs
- Remove phantom `provider` field from /api/chat response docs
- Add TOML string escaping to config converter to handle special chars
- Add proper JSON parse error handling in config converter
- Update deployment checklist and troubleshooting to match actual file layout
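The converter itself is a Python script; for illustration, the escaping
rule it needs can be sketched in Rust: quotes, backslashes, and control
characters must be escaped before a value is embedded in a basic TOML string.

```rust
/// Produce a quoted TOML basic string from an arbitrary value.
fn toml_escape(value: &str) -> String {
    let mut out = String::with_capacity(value.len() + 2);
    out.push('"');
    for c in value.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters use \uXXXX escapes.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04X}", c as u32)),
            c => out.push(c),
        }
    }
    out.push('"');
    out
}
```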
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a complete OpenClaw → ZeroClaw migration toolkit:
- POST /api/chat: ZeroClaw-native endpoint with full agent loop (tools, memory,
context enrichment). Supports session_id scoping and context[] injection for
conversation history. Same code path as Linq/WhatsApp/Nextcloud handlers.
- POST /v1/chat/completions: OpenAI-compatible shim that routes through
run_gateway_chat_with_tools instead of the simple provider.chat_with_history
path. Extracts last user message + up to 10 messages of conversation context
from the messages[] array. Supports streaming (simulated SSE). Drop-in
replacement for OpenClaw callers with zero code changes.
Both endpoints include full observability instrumentation (AgentStart, LlmRequest,
LlmResponse, RequestLatency, AgentEnd), auth (pairing + webhook secret), rate
limiting, auto-save to memory, and response sanitization.
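The message extraction for the shim can be sketched with a simplified
message type (Msg stands in for the OpenAI-style request body entries):

```rust
struct Msg {
    role: String,
    content: String,
}

/// Take the last user message as the prompt, plus up to 10 preceding
/// messages rendered as conversation context.
fn split_prompt_and_context(messages: &[Msg]) -> Option<(String, Vec<String>)> {
    let last_user = messages.iter().rposition(|m| m.role == "user")?;
    let prompt = messages[last_user].content.clone();
    let start = last_user.saturating_sub(10);
    let context = messages[start..last_user]
        .iter()
        .map(|m| format!("{}: {}", m.role, m.content))
        .collect();
    Some((prompt, context))
}
```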
Also adds:
- scripts/convert-openclaw-config.py: Converts openclaw.json → config.toml with
provider mapping, channel detection, and migration notes
- docs/migration/openclaw-migration-guide.md: Full migration walkthrough with
endpoint reference, config mapping, callsite examples, and deployment checklist
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>