When auto_save is enabled and a photo is sent on the first turn of a
Telegram session, the [IMAGE:] marker was duplicated because:
1. auto_save stores the photo message (including the marker) to memory
2. build_memory_context recalls the just-saved entry as relevant context
3. The recalled marker is prepended to the original message content
Fix: skip memory context entries containing [IMAGE:] markers in
should_skip_memory_context_entry so auto-saved photo messages are not
re-injected through memory context enrichment.
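The skip check can be sketched as a plain substring test; the real should_skip_memory_context_entry may apply additional conditions:

```rust
/// Hypothetical sketch: skip recalled memory entries carrying an
/// [IMAGE:] marker so auto-saved photo messages are not re-injected.
fn should_skip_memory_context_entry(entry: &str) -> bool {
    entry.contains("[IMAGE:")
}

fn main() {
    let recalled = "[IMAGE:/tmp/photo.jpg] what is in this picture?";
    assert!(should_skip_memory_context_entry(recalled));
    assert!(!should_skip_memory_context_entry("plain text memory"));
    println!("ok");
}
```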
Closes #2403
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When workspace_only=true, is_path_allowed() blanket-rejected all
absolute paths. This blocked legitimate tool calls that referenced
files inside the workspace using an absolute path (e.g. saving a
screenshot to /home/user/.zeroclaw/workspace/images/example.png).
The fix checks whether an absolute path falls within workspace_dir or
any configured allowed_root before rejecting it, mirroring the priority
order already used by is_resolved_path_allowed(). Paths outside the
workspace and allowed roots are still blocked, and the forbidden-paths
list continues to apply to all other absolute paths.
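A minimal sketch of the priority check (function and parameter names are hypothetical; the real code also consults the forbidden-paths list):

```rust
use std::path::Path;

/// An absolute path is allowed when it falls under workspace_dir or
/// any allowed_root; other absolute paths stay rejected under
/// workspace_only. Path::starts_with compares whole components, so
/// "/home/user/.zeroclaw/workspace-evil" does not match.
fn absolute_path_allowed(path: &Path, workspace_dir: &Path, allowed_roots: &[&Path]) -> bool {
    path.starts_with(workspace_dir) || allowed_roots.iter().any(|root| path.starts_with(root))
}

fn main() {
    let ws = Path::new("/home/user/.zeroclaw/workspace");
    assert!(absolute_path_allowed(
        Path::new("/home/user/.zeroclaw/workspace/images/example.png"),
        ws,
        &[]
    ));
    assert!(!absolute_path_allowed(Path::new("/etc/passwd"), ws, &[]));
    assert!(absolute_path_allowed(Path::new("/opt/data/x"), ws, &[Path::new("/opt/data")]));
    println!("ok");
}
```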
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Discord gateway event loop silently dropped websocket Ping frames
(via a catch-all `_ => continue`) without responding with Pong. After
splitting the websocket stream into read/write halves, automatic
Ping/Pong handling is disabled, so the server-side (Cloudflare/Discord)
eventually considers the client unresponsive and stops sending events.
Additionally, websocket read errors (`Some(Err(_))`) were silently
swallowed by the same catch-all, preventing reconnection on transient
failures.
This patch:
- Responds to `Message::Ping` with `Message::Pong` to maintain the
websocket keepalive contract
- Breaks the event loop on `Some(Err(_))` with a warning log, allowing
the supervisor to reconnect
Closes #2896
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The Electric Blue dashboard (PR #2804) sends the pairing token as a
?token= query parameter, but the WS handler only checked that single
source. Earlier PR #2193 had established a three-source precedence
chain (header > subprotocol > query param) which was lost.
Add extract_ws_token() with the documented precedence:
1. Authorization: Bearer <token> header
2. Sec-WebSocket-Protocol: bearer.<token> subprotocol
3. ?token=<token> query parameter
This ensures browser-based clients (which cannot set custom headers)
can authenticate via query param or subprotocol, while non-browser
clients can use the standard Authorization header.
Includes 9 unit tests covering each source, precedence ordering,
and empty-value fallthrough.
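The precedence chain can be sketched over pre-extracted header values (the real extract_ws_token reads them from the request; empty values fall through to the next source):

```rust
/// Three-source token lookup: Authorization header, then the
/// bearer.<token> subprotocol, then the ?token= query parameter.
fn extract_ws_token(
    auth_header: Option<&str>,
    subprotocol: Option<&str>,
    query_token: Option<&str>,
) -> Option<String> {
    if let Some(h) = auth_header {
        if let Some(t) = h.strip_prefix("Bearer ") {
            if !t.is_empty() {
                return Some(t.to_string());
            }
        }
    }
    if let Some(p) = subprotocol {
        if let Some(t) = p.strip_prefix("bearer.") {
            if !t.is_empty() {
                return Some(t.to_string());
            }
        }
    }
    query_token.filter(|t| !t.is_empty()).map(str::to_string)
}

fn main() {
    // Header wins over the query param.
    assert_eq!(extract_ws_token(Some("Bearer abc"), None, Some("xyz")), Some("abc".into()));
    // Empty header value falls through to the subprotocol.
    assert_eq!(extract_ws_token(Some("Bearer "), Some("bearer.def"), None), Some("def".into()));
    assert_eq!(extract_ws_token(None, None, Some("xyz")), Some("xyz".into()));
    println!("ok");
}
```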
Closes #2884
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Rename shadowed `histories` and `store` bindings in three test functions
to eliminate variable shadows that are flagged under stricter lint
configurations (clippy::shadow_unrelated). The initial bindings are
consumed by struct initialization; the second bindings that lock the
mutex guard are now named distinctly (`locked_histories`, `cleanup_store`).
Closes #2443
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The provider HTTP request timeout was hardcoded at 120 seconds in
`OpenAiCompatibleProvider::http_client()`. This makes it configurable
via the `provider_timeout_secs` config key and the
`ZEROCLAW_PROVIDER_TIMEOUT_SECS` environment variable, defaulting
to 120s for backward compatibility.
Changes:
- Add `provider_timeout_secs` field to Config with serde default
- Add `ZEROCLAW_PROVIDER_TIMEOUT_SECS` env var override
- Add `timeout_secs` field and `with_timeout_secs()` builder on
`OpenAiCompatibleProvider`
- Add `provider_timeout_secs` to `ProviderRuntimeOptions`
- Thread config value through agent loop, channels, gateway, and tools
- Use `compat()` closure in provider factory to apply timeout to all
compatible providers
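The resolution order can be sketched as env var override, then config key, then the 120s default (a simplification; the real value is threaded through Config and ProviderRuntimeOptions):

```rust
use std::env;

/// ZEROCLAW_PROVIDER_TIMEOUT_SECS > provider_timeout_secs > 120s default.
fn resolve_provider_timeout(config_value: Option<u64>) -> u64 {
    env::var("ZEROCLAW_PROVIDER_TIMEOUT_SECS")
        .ok()
        .and_then(|v| v.parse().ok())
        .or(config_value)
        .unwrap_or(120)
}

fn main() {
    env::remove_var("ZEROCLAW_PROVIDER_TIMEOUT_SECS");
    assert_eq!(resolve_provider_timeout(None), 120); // backward-compatible default
    assert_eq!(resolve_provider_timeout(Some(300)), 300);
    env::set_var("ZEROCLAW_PROVIDER_TIMEOUT_SECS", "45");
    assert_eq!(resolve_provider_timeout(Some(300)), 45); // env wins
    println!("ok");
}
```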
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add `agent.tool_call_dedup_exempt` config key (list of tool names) to
allow specific tools to bypass the within-turn identical-signature
deduplication check in run_tool_call_loop. This fixes the browser
snapshot polling use case where repeated calls with identical arguments
are legitimate and should not be suppressed.
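The exemption can be sketched as a guard in front of the signature set (signature format hypothetical; the real check lives in run_tool_call_loop):

```rust
use std::collections::HashSet;

/// A repeated (tool, args) signature within one turn is suppressed
/// unless the tool name appears in the configured exempt list.
fn is_duplicate_call(
    tool: &str,
    args: &str,
    seen: &mut HashSet<String>,
    exempt: &HashSet<String>,
) -> bool {
    if exempt.contains(tool) {
        return false; // exempt tools may repeat identical calls
    }
    !seen.insert(format!("{tool}:{args}"))
}

fn main() {
    let mut seen = HashSet::new();
    let exempt: HashSet<String> = ["browser_snapshot".to_string()].into();
    assert!(!is_duplicate_call("read_file", "{\"path\":\"a\"}", &mut seen, &exempt));
    assert!(is_duplicate_call("read_file", "{\"path\":\"a\"}", &mut seen, &exempt));
    // Polling with identical arguments is never deduplicated.
    assert!(!is_duplicate_call("browser_snapshot", "{}", &mut seen, &exempt));
    assert!(!is_duplicate_call("browser_snapshot", "{}", &mut seen, &exempt));
    println!("ok");
}
```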
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a `zeroclaw channel send` subcommand that sends a one-off message
through a configured channel without starting the full agent loop.
This enables hardware sensor triggers (e.g., range sensors on
Raspberry Pi) to push notifications to Telegram and other platforms.
Usage:
zeroclaw channel send 'Alert!' --channel-id telegram --recipient <chat_id>
Supported channels: telegram, discord, slack.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When running through a PTY chain (kubectl exec, SSH, remote terminals),
the transport layer may split data frames at space (0x20) boundaries,
interrupting multi-byte UTF-8 characters mid-sequence. Rust's
BufRead::read_line requires valid UTF-8 and returns InvalidData
immediately, crashing the interactive agent loop.
Replace stdin().read_line() with byte-level read_until(b'\n') followed
by String::from_utf8_lossy() in both the main input loop and the
/clear confirmation prompt. This reads raw bytes without UTF-8
validation during transport, then does lossy conversion (replacing any
truly invalid bytes with U+FFFD instead of crashing).
Also set ENV LANG=C.UTF-8 in both Dockerfile stages as defense-in-depth
to ensure the container locale defaults to UTF-8.
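The byte-level read can be sketched with a Cursor standing in for stdin:

```rust
use std::io::{BufRead, BufReader, Cursor};

/// read_until(b'\n') accepts arbitrary bytes during transport; lossy
/// conversion then maps any invalid sequence to U+FFFD instead of
/// returning InvalidData like read_line would.
fn read_line_lossy<R: BufRead>(reader: &mut R) -> std::io::Result<String> {
    let mut buf = Vec::new();
    reader.read_until(b'\n', &mut buf)?;
    Ok(String::from_utf8_lossy(&buf).trim_end().to_string())
}

fn main() {
    // 0xC3 begins a two-byte UTF-8 sequence that the transport cut short.
    let mut input = BufReader::new(Cursor::new(b"caf\xC3 broken\n".to_vec()));
    let line = read_line_lossy(&mut input).unwrap();
    assert_eq!(line, "caf\u{FFFD} broken");
    println!("ok");
}
```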
Closes #2984
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The credential leak detector's check_high_entropy_tokens would
false-positive on URL path segments (e.g. long alphanumeric filenames)
because extract_candidate_tokens included '/' in the token character
set, creating long mixed-alpha-digit tokens that exceeded the Shannon
entropy threshold.
Fix: strip URLs from content before extracting candidate tokens for
entropy analysis. Structural pattern checks (API keys, JWTs, AWS
credentials) use dedicated regexes and are unaffected.
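A much-simplified sketch of the pre-pass, dropping whitespace-delimited URL tokens before candidate extraction (the real fix strips URL spans from arbitrary content):

```rust
/// Remove URL tokens so long path segments are never fed to the
/// Shannon-entropy scorer as candidate secrets.
fn strip_urls(content: &str) -> String {
    content
        .split_whitespace()
        .filter(|word| !word.starts_with("http://") && !word.starts_with("https://"))
        .collect::<Vec<_>>()
        .join(" ")
}

fn main() {
    let text = "see https://cdn.example.com/aB3dE9fG2hJ4kL6mN8pQ.png for details";
    assert_eq!(strip_urls(text), "see for details");
    println!("ok");
}
```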
Closes #3064
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
WebSearchTool previously stored the Brave API key once at boot and never
re-read it. This caused three failures: (1) keys set after boot via
web_search_config were ignored, (2) encrypted keys passed as raw enc2:
blobs to the Brave API, and (3) keys absent at startup left the tool
permanently broken.
The fix adds lazy key resolution at execution time. A fast path returns
the boot-time key when it is plaintext and non-empty. When the boot key
is missing or still encrypted, the tool re-reads config.toml, decrypts
the value through SecretStore, and uses the result. This also means
runtime config updates (e.g. `web_search_config set brave_api_key=...`)
are picked up on the next search invocation.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace byte-level `&val[..4]` slice with `char_indices().nth(4)` to
prevent a panic when the captured credential value contains multi-byte
UTF-8 characters (e.g. Chinese text). Adds a regression test with
CJK input.
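The boundary-safe pattern can be sketched in isolation:

```rust
/// char_indices().nth(n) yields the byte offset of the (n+1)-th char,
/// so the slice always lands on a char boundary.
fn prefix_chars(val: &str, n: usize) -> &str {
    match val.char_indices().nth(n) {
        Some((byte_idx, _)) => &val[..byte_idx],
        None => val, // fewer than n chars: keep everything
    }
}

fn main() {
    // &val[..4] would panic here: each CJK char occupies 3 bytes.
    assert_eq!(prefix_chars("密码是秘密", 4), "密码是秘");
    assert_eq!(prefix_chars("abc", 4), "abc");
    println!("ok");
}
```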
Closes #3024
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Extract `self.bot_handle.lock().take()` into a separate `let` binding
so the parking_lot::MutexGuard is dropped before the `.await`, making
the listen future Send again.
Closes #3312
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace the `use_feishu: bool` field on `LarkChannel` with a
`platform: LarkPlatform` enum field, add `mention_only` to the
`new_with_platform` constructor, and introduce `from_lark_config` /
`from_feishu_config` factory methods so the channel factory in
`mod.rs` and the existing tests compile.
Resolves #3302
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Centralize cron shell command validation so all entrypoints enforce the
same security policy (allowlist + risk gate + approval) before
persistence and execution.
Changes:
- Add validate_shell_command() and validate_shell_command_with_security()
as the single validation gate for all cron shell paths
- Add add_shell_job_with_approval() and update_shell_job_with_approval()
that validate before persisting
- Add add_once_validated() and add_once_at_validated() for one-shot jobs
- Make raw add_shell_job/add_job/add_once/add_once_at pub(crate) to
prevent unvalidated writes from outside the cron module
- Route gateway API through validated creation path
- Route schedule tool through validated helpers (single validation)
- Route cron_add/cron_update tools through validated helpers
- Unify scheduler execution validation via validate_shell_command_with_security
- CLI update handler uses full validate_command_execution instead of
just is_command_allowed
- Add focused tests for validation parity across entrypoints
- Standardize error format to "blocked by security policy: {reason}"
Closes #2741
Closes #2742
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* fix(cli): honor config default_temperature in agent command
Fixes #3033
The agent command was using a hardcoded default_value of 0.7 for the
--temperature parameter, which ignored the default_temperature setting
in the config file.
Changes:
- Changed temperature from f64 to Option<f64>
- Removed hardcoded default_value
- Use config.default_temperature when --temperature is not provided
Users can now set default_temperature in config.toml and have it
honored when running 'zeroclaw agent' without --temperature.
Risk: low (behavior change: now honors config instead of hardcoded value)
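The fallback chain can be sketched with a minimal Config stand-in:

```rust
/// An explicit --temperature wins; otherwise the config's
/// default_temperature applies (no more hardcoded 0.7).
struct Config {
    default_temperature: f64,
}

fn effective_temperature(cli_temperature: Option<f64>, config: &Config) -> f64 {
    cli_temperature.unwrap_or(config.default_temperature)
}

fn main() {
    let config = Config { default_temperature: 0.2 };
    assert_eq!(effective_temperature(None, &config), 0.2);
    assert_eq!(effective_temperature(Some(0.9), &config), 0.9);
    println!("ok");
}
```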
* fix: resolve moved value error for config in agent command
Extract final_temperature before passing config to agent::run() to
avoid use-of-moved-value error (config is moved as the first argument
while config.default_temperature was being accessed in the same call).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add vision support to Anthropic provider to enable image understanding:
- Add ImageSource struct for Anthropic's image content block format
- Add Image variant to NativeContentOut enum
- Implement capabilities() returning vision: true
- Update convert_messages() to parse [IMAGE:...] markers and convert
them to Anthropic's native image content blocks
- Support both data URIs and local file paths
- Add comprehensive tests for vision functionality
Fixes #3163
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
Add default_subject field to EmailConfig to allow users to customize
the default subject line for outgoing emails. Previously hardcoded as
"ZeroClaw Message".
- Add default_subject field with serde default
- Update send() method to use configured default
- Add tests for new functionality
Fixes #2878
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
When [web_fetch] section was specified in config without explicit
allowed_domains, serde used Vec::default() (empty vector) instead of
the wildcard ["*"] default. This caused all web fetch requests to be
rejected unexpectedly.
Fix by adding an explicit serde default function that returns vec!["*"].
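The distinction can be demonstrated without serde itself; the fix amounts to pointing `#[serde(default = "default_allowed_domains")]` at an explicit function instead of relying on the bare `#[serde(default)]` attribute:

```rust
/// Bare #[serde(default)] on a Vec field uses Vec::default(), i.e. an
/// empty list; an explicit default fn restores the wildcard.
fn default_allowed_domains() -> Vec<String> {
    vec!["*".to_string()]
}

fn main() {
    assert!(Vec::<String>::default().is_empty()); // what serde fell back to
    assert_eq!(default_allowed_domains(), vec!["*".to_string()]);
    println!("ok");
}
```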
Fixes #2941
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
Add support for openai-responses, open-ai-responses, openai-chat-completions,
and open-ai-chat-completions as aliases for wire_api configuration values.
This aligns the parser with documented values that users expect to work.
Fixes #2735
* feat(onboard): add --reinit flag to prevent accidental config overwrite
Add --reinit flag to onboard command that:
- Backs up existing ~/.zeroclaw directory with timestamp
- Starts fresh initialization after backup
- Requires --interactive mode to work
- Prevents accidental configuration loss
This addresses issue #3013 where onboard could accidentally
overwrite all configuration without warning.
Closes #3013
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix(ci): SHA-pin all third-party GitHub Actions
Replace mutable version tags with immutable commit SHAs to prevent
tag-hijacking supply chain attacks (P1 finding).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* chore: retrigger CI after startup_failure
* fix(onboard): address PR #3102 review issues for --reinit flag
- Use resolve_runtime_dirs_for_onboarding() instead of hardcoded ~/.zeroclaw
- Remove unsafe relative path fallback, bail instead
- Add user confirmation prompt before reinitializing config
- Update docs/reference/cli/commands-reference.md with --reinit docs
* style: fix cargo fmt and clippy violations
- Fix import ordering in src/config/mod.rs (rustfmt)
- Collapse single-arg encrypt/decrypt calls in src/config/schema.rs (rustfmt)
- Box::pin large onboard futures to fix clippy::large_futures in src/main.rs
These violations were blocking CI lint checks.
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Simian Astronaut 7 <simianastronaut7@gmail.com>
The daemon previously only handled SIGINT (Ctrl+C), ignoring SIGTERM
which is the standard termination signal used by Docker, Kubernetes,
and systemd. This caused containers to wait for the grace period
then be force-killed with SIGKILL.
Now the daemon handles both SIGINT and SIGTERM for graceful shutdown.
Fixes #2529
- Strip `<think>...</think>` blocks in parse_tool_calls(), XmlToolDispatcher,
and OllamaProvider before processing tool-call XML
- Add effective_content() fallback: when content is empty after stripping
think tags, check the thinking field for tool-call XML
- Add strip_think_tags() to ollama.rs, loop_.rs, and dispatcher.rs
- Add comprehensive tests for think-tag stripping and tool-call parsing
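A sketch of strip_think_tags without regex (the real implementation may differ; unterminated blocks are treated as dropping the tail):

```rust
/// Drop every <think>...</think> span, keeping surrounding text.
fn strip_think_tags(content: &str) -> String {
    let mut out = String::new();
    let mut rest = content;
    while let Some(start) = rest.find("<think>") {
        out.push_str(&rest[..start]);
        match rest[start..].find("</think>") {
            Some(end) => rest = &rest[start + end + "</think>".len()..],
            None => return out, // unterminated block: drop the tail
        }
    }
    out.push_str(rest);
    out.trim().to_string()
}

fn main() {
    let raw = "<think>reasoning...</think><tool_call>{\"name\":\"ls\"}</tool_call>";
    assert_eq!(strip_think_tags(raw), "<tool_call>{\"name\":\"ls\"}</tool_call>");
    println!("ok");
}
```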
Fixes #3079
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- #3009: Add handle_api_integrations_settings endpoint returning JSON with
per-integration enabled/category/status so /api/integrations/settings no
longer falls through to the SPA static handler.
- #3010: Extract Sec-WebSocket-Protocol header in handle_ws_chat and echo
back "zeroclaw.v1" via ws.protocols() when the client requests it.
- #3038: Generate a persistent session_id (crypto.randomUUID stored in
sessionStorage) in the web WS client and pass it as a query parameter.
Add session_id: Option<String> to WsQuery on the server side so the
backend can key conversations by session across reconnects.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add a test verifying Telegram bot_token is encrypted on save and
decrypted on load, and add .gitignore entries for local state backups.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Wraps the WhatsApp Web listen() in a reconnect loop. When Event::LoggedOut
fires, the bot is torn down, stale session DB files are deleted, and after
exponential backoff (3s base, 300s cap, max 10 retries) the loop restarts
triggering fresh QR pairing.
- Broadcast channel for logout signaling from event handler to listen loop
- Session file cleanup (primary + WAL + SHM) only on explicit LoggedOut
- Proper resource ordering: client lock, abort handle, drop bot/device
- Tests for retry delay, counter, session purge, and file paths helpers
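The retry delay can be sketched as a pure function; the doubling factor is an assumption, the commit only states the 3s base, 300s cap, and 10-retry limit:

```rust
use std::time::Duration;

/// Exponential backoff: 3s base doubling per attempt, capped at 300s,
/// None once the retry budget (10) is exhausted.
fn reconnect_delay(attempt: u32) -> Option<Duration> {
    const BASE_SECS: u64 = 3;
    const CAP_SECS: u64 = 300;
    const MAX_RETRIES: u32 = 10;
    if attempt >= MAX_RETRIES {
        return None;
    }
    let secs = BASE_SECS.saturating_mul(1u64 << attempt.min(63)).min(CAP_SECS);
    Some(Duration::from_secs(secs))
}

fn main() {
    assert_eq!(reconnect_delay(0), Some(Duration::from_secs(3)));
    assert_eq!(reconnect_delay(1), Some(Duration::from_secs(6)));
    assert_eq!(reconnect_delay(7), Some(Duration::from_secs(300))); // 384 capped
    assert_eq!(reconnect_delay(10), None);
    println!("ok");
}
```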
Adds symmetric encrypt/decrypt calls for all channel secret fields in
Config::save() and Config::load_or_init(). Previously only nostr.private_key
was handled, leaving all other channel secrets (bot_token, app_token,
access_token, api_token, password, etc.) and gateway.paired_tokens stored
as plaintext when secrets.encrypt = true.
Closes #3175, closes #3173.
Co-authored-by: jameslcowan
Adds opencode-go as a first-class provider with dedicated API endpoint,
env var, onboarding wizard wiring, and test coverage.
CI failures are pre-existing on master (Rust 1.94 formatting/lint changes per #3207).
The live tool call notifications feature sends extra messages to the
channel when tools are invoked. Update 5 tests that assumed exactly
1 sent message to instead check the last message in the list.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Enable a single Matrix bot instance to respond in multiple rooms:
- Disable the room_id filter so messages from all joined rooms are processed
- Embed room_id in reply_target as "user||room_id" for routing replies
- Include room_id in channel field for per-room conversation isolation
- Extract room_id from recipient in send() for correct message routing
The configured room_id still serves as a fallback for direct sends
without a "||" separator in the recipient.
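The routing described above can be sketched as:

```rust
/// Split "user||room_id" to recover the target room, falling back to
/// the configured room when no separator is present.
fn resolve_room<'a>(recipient: &'a str, configured_room: &'a str) -> (&'a str, &'a str) {
    match recipient.split_once("||") {
        Some((user, room_id)) => (user, room_id),
        None => (recipient, configured_room),
    }
}

fn main() {
    assert_eq!(
        resolve_room("@alice:example.org||!room123:example.org", "!default:example.org"),
        ("@alice:example.org", "!room123:example.org")
    );
    // Direct send without separator: configured room_id fallback.
    assert_eq!(
        resolve_room("@bob:example.org", "!default:example.org"),
        ("@bob:example.org", "!default:example.org")
    );
    println!("ok");
}
```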
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Implement pin_message and unpin_message for the Matrix channel using
the m.room.pinned_events state event. Adds default no-op trait methods
to the Channel trait so other channel implementations are unaffected.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Implement add_reaction/remove_reaction for Matrix channel using
ReactionEventContent and redaction. Add threading support via
Relation::Thread in send() and thread_ts extraction from incoming
messages, enabling threaded conversations.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
Add ChannelNotifyObserver that wraps the observer to forward tool-call
events as real-time threaded messages on messaging channels. Include
tool arguments (truncated) in ToolCallStart events for better
visibility into what tools are doing. Auto-thread final replies when
tools were used.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
The system prompt is built once at daemon startup and cached. The
"Current Date & Time" section becomes stale immediately. This patch
replaces it with a fresh timestamp every time build_channel_system_prompt
is called (i.e. per incoming message).
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
When embedding_provider differs from default_provider (e.g. default=gemini,
embedding=openai), the caller-supplied api_key belongs to the chat provider.
Passing it to the embedding endpoint causes 401 Unauthorized (gemini key
sent to api.openai.com/v1/embeddings).
Add embedding_provider_env_key() which looks up OPENAI_API_KEY,
OPENROUTER_API_KEY, or COHERE_API_KEY before falling back to the
caller-supplied key. This matches the provider-specific env var resolution
in providers/mod.rs without introducing cross-module coupling.
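The lookup can be sketched as follows (the exact match arms in embedding_provider_env_key may differ):

```rust
use std::env;

/// Prefer the embedding provider's own env var; fall back to the
/// caller-supplied (chat-provider) key only when it is absent.
fn embedding_provider_env_key(provider: &str, caller_key: &str) -> String {
    let var = match provider {
        "openai" => "OPENAI_API_KEY",
        "openrouter" => "OPENROUTER_API_KEY",
        "cohere" => "COHERE_API_KEY",
        _ => return caller_key.to_string(),
    };
    env::var(var).unwrap_or_else(|_| caller_key.to_string())
}

fn main() {
    env::set_var("OPENAI_API_KEY", "sk-embed");
    // The gemini chat key is no longer sent to api.openai.com.
    assert_eq!(embedding_provider_env_key("openai", "gemini-chat-key"), "sk-embed");
    env::remove_var("OPENAI_API_KEY");
    assert_eq!(embedding_provider_env_key("openai", "fallback"), "fallback");
    println!("ok");
}
```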
Also add config_secrets_survive_save_load_roundtrip test: full save→load
cycle with channel credentials (telegram, discord, slack bot_token,
slack app_token) and gateway paired_tokens, verifying that enc2: values
are correctly decrypted by Config::load_or_init(). Regression guard for
issues #3173 and #3175.
Closes #3083
Co-authored-by: ZeroClaw Bot <zeroclaw_bot@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
- `/model <name>` now auto-resolves provider from configured model_routes
by matching model name or hint, fixing 404 when switching to models on
different providers (e.g. `/model kimi-k2.5` with anthropic default)
- Conversation history is no longer cleared on `/model` or `/models` —
users can explicitly reset via `/new`
- Matrix channel now supports `/model`, `/models`, and `/new` commands
- `/model` (no args) lists configured model routes with hints
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The onboard wizard futures exceed clippy's large_futures threshold
(16KB+). Wrap in Box::pin to heap-allocate and fix the lint.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Import ordering in config/mod.rs and line-wrapping in config/schema.rs
were left unformatted by PR #2994. Run cargo fmt to fix.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
HTTP clients had no timeouts and could hang forever; add 30s total and
10s connect timeouts. Replace clock-nanos-based jitter with rand::random
for proper randomness. Add a 1000-entry cap to the user display name
cache with expired-entry pruning. Fix truncate_text to avoid scanning
the full string twice when checking for truncation.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Google TTS was passing the API key as a URL query parameter, which can
appear in logs and proxy access records. Move it to the x-goog-api-key
header instead. Add input validation for ElevenLabs voice IDs (reject
non-alphanumeric/dash/underscore characters) and restrict Edge TTS
binary_path to allowed basenames (edge-tts, edge-playback).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Slack private file fetching was sending the bot token on every redirect
hop. Since Slack CDN redirects use pre-signed URLs, sending the bearer
token to CDN hosts is unnecessary credential exposure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Admin endpoints (/admin/shutdown, /admin/paircode, /admin/paircode/new) were
completely unauthenticated, allowing any network client to shut down the gateway
or read/generate pairing codes. Add require_localhost() guard that returns 403
for non-loopback IPs.
Replace std::process::exit(0) in shutdown handler with a tokio watch channel
for graceful shutdown, allowing proper destructor cleanup and connection
draining. Replace the 500ms sleep race in the restart command with a poll loop
that waits for the port to actually become free.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add --new flag to GetPaircode command in src/lib.rs
- Update main.rs to handle GetPaircode { new } parameter
- Add /admin/paircode/new POST endpoint in gateway/mod.rs
- Enhance documentation for constant_time_eq security function
Refs: #3015
The bitwise & operator is intentional in constant_time_eq() to prevent
timing side-channel attacks. Both comparisons must always execute to
ensure constant-time behavior regardless of the first comparison result.
- Revert logical && back to bitwise &
- Add #[allow(clippy::needless_bitwise_bool)] annotation
- Add explanatory comment documenting the intentional use
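A sketch of the pattern being defended (not the project's exact implementation):

```rust
/// Bitwise & (not &&) forces both operands to be evaluated, so the
/// comparison takes the same time whether or not the lengths match.
#[allow(clippy::needless_bitwise_bool)]
fn constant_time_eq(a: &str, b: &str) -> bool {
    let (a, b) = (a.as_bytes(), b.as_bytes());
    let mut diff = 0u8;
    // XOR-accumulate instead of early-exiting on the first mismatch.
    for i in 0..a.len().min(b.len()) {
        diff |= a[i] ^ b[i];
    }
    (a.len() == b.len()) & (diff == 0)
}

fn main() {
    assert!(constant_time_eq("paircode-1234", "paircode-1234"));
    assert!(!constant_time_eq("paircode-1234", "paircode-9999"));
    assert!(!constant_time_eq("short", "longer-token"));
    println!("ok");
}
```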
- Add security warning for 0.0.0.0 binding in help text
- Implement proper gateway shutdown before restart via /admin/shutdown endpoint
- Fetch live pairing code from running gateway via /admin/paircode endpoint
- Extract duplicate code into helper functions
- Fix clippy warnings
- Fix Critical: Split illegal or-pattern (Some(...) | None) into separate match arms
- Fix Major: Implement restart command with graceful shutdown check
- Fix Major: Improve get-paircode to check gateway status and provide clear instructions
- Fix Minor: Update help text to document public-bind precondition
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add GatewayCommands enum with three subcommands:
- start: Start the gateway server (default behavior preserved)
- restart: Restart the gateway server
- get-paircode: Show current pairing status without restarting
This improves gateway management by allowing users to:
1. Restart gateway without manual stop/start
2. Check pairing status without disrupting running gateway
Closes #3014
Closes #3015
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Modern macOS (Ventura+) stores iMessage content in the attributedBody
column as a binary typedstream blob rather than the text column. The
existing SQL filter `AND m.text IS NOT NULL` silently dropped all
incoming messages on affected systems.
Add a length-prefix extractor for the typedstream format and fall back
to attributedBody when text is NULL or empty. Includes real captured
blob fixtures and 14 new parser/integration tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduce a new function `check_api_key_prefix` that validates API key
prefixes against their associated providers, catching mismatches early.
Add unit tests covering known and unknown key formats. This improves
error handling and user guidance when a key for the wrong provider is
supplied.
Apply cargo fmt to fix formatting diffs in openrouter.rs and serial.rs.
Add web/dist placeholder step to lint, test, and build jobs so
RustEmbed compiles without the gitignored frontend assets.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Align src/peripherals/ and docs/hardware/ with the firmware directory renames
- Covers both compiled references (include_str!, constants) and documentation
Adds pluggable Text-to-Speech subsystem with TtsProvider trait,
TtsManager for provider selection, and per-provider config structs.
Includes secret encryption for TTS API keys.
- CI now builds across all 5 targets (linux x86/arm64, macOS x86/arm64,
Windows) matching the release matrix
- Fix chat_fails_without_credentials test to accept "builder error"
which occurs in CI environments without native TLS
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a Telegram message originates from a forum topic, the thread_id was
extracted and used for reply routing but never stored in ChannelMessage.thread_ts.
This caused all messages from the same sender to share conversation history
regardless of which topic they were posted in.
Changes:
- Set thread_ts to the extracted thread_id in parse_update_message,
try_parse_voice_message, and try_parse_attachment_message
- Use 'ref' in if-let patterns to avoid moving thread_id before it's assigned
- Update conversation_history_key() to include thread_ts when present,
producing keys like 'telegram_<thread_id>_<sender>' for forum topics
- Update conversation_memory_key() to also include thread_ts for memory isolation
This enables proper per-topic session isolation in Telegram forum groups while
preserving existing behavior for regular groups and DMs (where thread_ts is None).
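The key shape described above can be sketched as:

```rust
/// Include thread_ts in the history key when present, so each forum
/// topic gets its own conversation; DMs and regular groups keep the
/// old key shape.
fn conversation_history_key(channel: &str, sender: &str, thread_ts: Option<&str>) -> String {
    match thread_ts {
        Some(thread_id) => format!("{channel}_{thread_id}_{sender}"),
        None => format!("{channel}_{sender}"),
    }
}

fn main() {
    assert_eq!(conversation_history_key("telegram", "42", Some("7")), "telegram_7_42");
    assert_eq!(conversation_history_key("telegram", "42", None), "telegram_42");
    println!("ok");
}
```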
Closes #1532
Replace line-based TOML masking with structured config masking so secret
fields keep their original types (including reliability.api_keys arrays).
Hydrate dashboard PUT payloads with runtime config_path/workspace_dir and
restore masked secret placeholders from current config before
validation/save.
Also allow GET on /api/doctor for dashboard/client compatibility to avoid
405 responses.
- security: honor explicit command paths in allowed_commands list
- security: respect workspace_only=false in resolved path checks
- config: enforce 0600 permissions on every config save (unix)
- config: reject temp-directory paths in active workspace marker
- provider: preserve reasoning_content in tool-call conversation history
- provider: add allow_user_image_parts parameter for minimax compatibility
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The supports_native_tools() method was hardcoded to return true,
but it should return the value of self.native_tool_calling to
properly disable native tool calling for providers like MiniMax.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(channels,providers): remap Docker /workspace paths and enable vision for custom provider
Two fixes:
1. Telegram channel: when a Docker-containerised runtime writes a file to
/workspace/<path>, the host-side sender couldn't find it because the
container mount point differs from the host workspace dir. Remap
/workspace/<rel> → <host_workspace_dir>/<rel> in send_attachment before
the path-exists check so generated media is delivered correctly.
2. Provider factory: custom: provider was created with vision disabled,
causing all image messages to be rejected with a capability error even
though the underlying OpenAI-compatible endpoint supports vision. Switch
to new_with_vision(..., true) so image inputs are forwarded correctly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(memory): restore Qdrant vector database backend
Re-adds the Qdrant memory backend that was removed from main in a
recent upstream merge. Restores:
- src/memory/qdrant.rs — full QdrantMemory implementation with lazy
init, HTTP REST client, embeddings, and Memory trait
- src/memory/backend.rs — Qdrant variant in MemoryBackendKind, profile,
classify and profile dispatch
- src/memory/mod.rs — module export, factory routing with build_qdrant_memory
- src/config/schema.rs — QdrantConfig struct and qdrant field on MemoryConfig
- src/config/mod.rs — re-export QdrantConfig
- src/onboard/wizard.rs — qdrant field in MemoryConfig initializer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
copilot is the only provider that performs a device-code flow automatically on
first run. openai-codex and gemini (when OAuth-backed) require an explicit
`zeroclaw auth login --provider <name>` step. Split the device-flow next-steps
block to reflect this distinction.
Addresses Copilot review comment on PR #1509.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hardcoded OPENROUTER_API_KEY hint with provider-aware logic:
- keyless local providers (ollama, llamacpp, etc.) show chat/gateway/status hints
- device-flow providers (copilot, gemini, openai-codex) show OAuth/first-run hint
- all other providers show the correct provider-specific env var via provider_env_var()
Also adds canonical alias "github-copilot" -> "copilot" in canonical_provider_name(),
and a new provider_supports_device_flow() helper with accompanying test.
Additionally fixes pre-existing compile blockers that prevented CI from running:
- fix(security): correct raw string literals in leak_detector.rs that terminated
early due to unescaped " inside r"..." (use r#"..."# instead)
- fix(gateway): add missing wati: None in two test AppState initializations
- fix(gateway): use serde::Deserialize path on WatiVerifyQuery struct
- fix(security): add #[allow(unused_imports)] on new pub use re-exports in mod.rs
- fix(security): remove unused serde::{Deserialize, Serialize} import
- chore: apply cargo fmt to files that had pending formatting diffs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scheduled jobs created via channel conversations (Discord, Telegram, etc.)
never delivered output back to the channel because:
1. The agent had no channel context (channel name + reply_target) in its
system prompt, so it could not populate the delivery config.
2. The schedule tool only creates shell jobs with no delivery support,
and the cron_add tool's delivery schema was opaque.
3. OpenAiCompatibleProvider was missing the native_tool_calling field,
causing a compile error.
Changes:
- Inject channel context (channel name + reply_target) into the system
prompt so the agent knows how to address delivery when scheduling.
- Improve cron_add tool description and delivery parameter schema to
guide the agent toward correct delivery config.
- Update schedule tool description to warn that output is only logged
and redirect to cron_add for channel delivery.
- Fix missing native_tool_calling field in OpenAiCompatibleProvider.
Co-authored-by: Cursor <cursoragent@cursor.com>
* ci(homebrew): prefer HOMEBREW_UPSTREAM_PR_TOKEN with fallback
* ci(homebrew): handle existing upstream remote and main base
* fix: always emit toolResult blocks for tool_use responses
The Bedrock Converse API requires that every toolUse block in an
assistant message has a corresponding toolResult block in the
subsequent user message. Two bugs caused violations of this contract:
1. When parse_tool_result_message failed (e.g. malformed JSON or
missing tool_call_id), the fallback emitted a plain text user
message instead of a toolResult block, causing Bedrock to reject
the request with "Expected toolResult blocks at messages.N.content
for the following Ids: ..."
2. When the assistant made multiple tool calls in a single turn, each
tool result was pushed as a separate ConverseMessage with role
"user". Bedrock expects all toolResult blocks for a turn to appear
in a single user message.
Fix (1) by making the fallback construct a toolResult with status
"error" containing the raw content, and attempting to extract the
tool_use_id from the previous assistant message if JSON parsing fails.
Fix (2) by merging consecutive tool-result user messages into a single
ConverseMessage during convert_messages.
Also accept alternate field names (tool_use_id, toolUseId) in addition
to tool_call_id when parsing tool result messages.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Will Sarg <12886992+willsarg@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
The MiniMax API does not support OpenAI-style native tool definitions
(`tools` parameter in chat completions). Sending them causes a 500
Internal Server Error with "unknown error (1000)" on every request.
Add a `native_tool_calling` field to `OpenAiCompatibleProvider` so each
constructor can declare its tool-calling capability independently.
MiniMax (via `new_merge_system_into_user`) now sets this to `false`,
causing the agent loop to inject tool instructions into the system
prompt as text instead of sending native JSON tool definitions.
Closes #1387
(cherry picked from commit 2b92a774fb)
(cherry picked from commit 1816e8a829)
Co-authored-by: keiten arch <tang.zhengliang@ivis-sh.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace 🙌 and 💪 with 🔥 and 👍 in the TELEGRAM_ACK_REACTIONS pool.
The removed emojis are not in Telegram's allowed reaction set, causing
~40% of ACK reactions to fail with REACTION_INVALID (400 Bad Request).
All replacements verified against the Telegram Bot API setMessageReaction
endpoint in a live private chat.
Closes #1475
* feat(composio): fix v3 compatibility with parameter discovery, NLP text execution, and error enrichment
Three-layer fix for the Composio v3 API compatibility issue where the LLM
agent cannot discover parameter schemas, leading to repeated guessing and
execution failures.
Layer 1 – Surface parameter hints in list output:
- Add input_parameters field to ComposioV3Tool and ComposioAction structs
- Pass through input_parameters from v3 list response via map_v3_tools_to_actions
- Add format_input_params_hint() to show required/optional param names in list output
Layer 2 – Support natural-language text execution:
- Add text parameter to tool schema (mutually exclusive with params)
- Thread text through execute handler → execute_action → execute_action_v3
- Update build_execute_action_v3_request to send text instead of arguments
- Skip v2 fallback when text-mode is used (v2 has no NLP support)
Layer 3 – Enrich execute errors with parameter schema:
- Add get_tool_schema() to fetch full tool metadata from GET /api/v3/tools/{slug}
- Add format_schema_hint() to render parameter names, types, and descriptions
- On execute failure, auto-fetch schema and append to error message
Root cause: The v3 API returns input_parameters in list responses but
ComposioV3Tool was silently discarding them. The LLM had no way to discover
parameter schemas before calling execute, and error messages provided no
remediation guidance — creating an infinite guessing loop.
Co-Authored-By: unknown <>
(cherry picked from commit fd92cc5eb0)
* fix(composio): use floor_char_boundary for safe UTF-8 truncation in format_schema_hint
Co-Authored-By: unknown <>
(cherry picked from commit 18e72b6344)
* fix(composio): restore coherent v3 execute flow after replay
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
- Problem: The existing http_request tool returns raw HTML/JSON, which makes it difficult for LLMs to extract
meaningful content from web pages.
- Why it matters: All mainstream AI agents (Claude Code, Gemini CLI, Aider) have dedicated web content extraction
tools. ZeroClaw lacks this capability, limiting its ability to research and gather information from the web.
- What changed: Added a new web_fetch tool that fetches web pages and converts HTML to clean plain text using
nanohtml2text. Includes domain allowlist/blocklist, SSRF protection, redirect following, and content-type aware
processing.
- What did not change (scope boundary): http_request tool is untouched. No shared code extracted between http_request
and web_fetch (DRY rule-of-three: only 2 callers). No changes to existing tool behavior or defaults.
Label Snapshot (required)
- Risk label: risk: medium
- Size label: size: M
- Scope labels: tool, config
- Module labels: tool: web_fetch
- If any auto-label is incorrect, note requested correction: N/A
Change Metadata
- Change type: feature
- Primary scope: tool
Linked Issue
- Closes #
- Related #
- Depends on #
- Supersedes #
Supersede Attribution (required when Supersedes # is used)
N/A
Validation Evidence (required)
cargo fmt --all -- --check # pass
cargo clippy --all-targets -- -D warnings # no new warnings (pre-existing warnings only)
cargo test --lib -- web_fetch # 26/26 passed
cargo test --lib -- tools::tests # 12/12 passed
cargo test --lib -- config::schema::tests # 134/134 passed
- Evidence provided: unit test results (26 new tests), manual end-to-end test with Ollama + qwen2.5:72b
- If any command is intentionally skipped, explain why: Full cargo clippy --all-targets has 43 pre-existing errors
unrelated to this PR (e.g. await_holding_lock, format! appended to String). Zero errors from web_fetch code.
Security Impact (required)
- New permissions/capabilities? Yes — new web_fetch tool can make outbound HTTP GET requests
- New external network calls? Yes — fetches web pages from allowed domains
- Secrets/tokens handling changed? No
- File system access scope changed? No
- If any Yes, describe risk and mitigation:
- Deny-by-default: enabled = false by default; tool is not registered unless explicitly enabled
- Domain filtering: allowed_domains (default ["*"] = all public hosts) + blocked_domains (takes priority).
Blocklist always wins over allowlist.
- SSRF protection: Blocks localhost, private IPs (RFC 1918), link-local, multicast, reserved ranges, IPv4-mapped
IPv6, .local TLD — identical coverage to http_request
- Rate limiting: can_act() + record_action() enforce autonomy level and rate limits
- Read-only mode: Blocked when autonomy is ReadOnly
- Response size cap: 500KB default truncation prevents context window exhaustion
- Proxy support: Honors [proxy] config via tool.web_fetch service key
Privacy and Data Hygiene (required)
- Data-hygiene status: pass
- Redaction/anonymization notes: No personal data in code, tests, or fixtures
- Neutral wording confirmation: All test identifiers use neutral project-scoped labels
Compatibility / Migration
- Backward compatible? Yes — new tool, no existing behavior changed
- Config/env changes? Yes — new [web_fetch] section in config.toml (all fields have defaults)
- Migration needed? No — #[serde(default)] on all fields; existing configs without [web_fetch] section work unchanged
i18n Follow-Through (required when docs or user-facing wording changes)
- i18n follow-through triggered? No — no docs or user-facing wording changes
Human Verification (required)
- Verified scenarios:
- End-to-end test: zeroclaw agent with Ollama qwen2.5:72b successfully called web_fetch to fetch
https://github.com/zeroclaw-labs/zeroclaw, returned clean plain text with project description, features, star count
- Tool registration: tool_count increased from 22 to 23 when enabled = true
- Config: enabled = false (default) → tool not registered; enabled = true → tool available
- Edge cases checked:
- Missing [web_fetch] section in existing config.toml → works (serde defaults)
- Blocklist priority over allowlist
- SSRF with localhost, private IPs, IPv6
- What was not verified:
- Proxy routing (no proxy configured in test environment)
- Very large page truncation with real-world content
Side Effects / Blast Radius (required)
- Affected subsystems/workflows: all_tools_with_runtime() signature gained one parameter (web_fetch_config); all 5
call sites updated
- Potential unintended effects: None — new tool only, existing tools unchanged
- Guardrails/monitoring for early detection: enabled = false default; tool_count in debug logs
Agent Collaboration Notes (recommended)
- Agent tools used: Claude Code (Opus 4.6)
- Workflow/plan summary: Plan mode → approval → implementation → validation
- Verification focus: Security (SSRF, domain filtering, rate limiting), config compatibility, tool registration
- Confirmation: naming + architecture boundaries followed (CLAUDE.md + CONTRIBUTING.md): Yes — trait implementation +
factory registration pattern, independent security helpers (DRY rule-of-three), deny-by-default config
Rollback Plan (required)
- Fast rollback command/path: git revert <commit>
- Feature flags or config toggles: [web_fetch] enabled = false (default) disables completely
- Observable failure symptoms: tool_count in debug logs drops by 1; LLM cannot call web_fetch
Risks and Mitigations
- Risk: SSRF bypass via DNS rebinding (attacker-controlled domain resolving to private IP)
- Mitigation: Pre-request host validation blocks known private/local patterns. Same defense level as existing
http_request tool. Full DNS-level protection would require async DNS resolution before connect, which is out of scope
for this PR.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit 04597352cc)
Addresses the unbounded-map gap left by #951: entries below the lockout
threshold (count < MAX_PAIR_ATTEMPTS, lockout = None) were never evicted,
allowing distributed brute-force (>1024 unique IPs, <5 attempts each) to
permanently fill the tracking map and disable accounting for new attackers.
Hardening delta on top of #951:
- Replace raw tuple with typed FailedAttemptState (count, lockout_until,
last_attempt) for clarity and to enable retention-based sweep.
- Bump MAX_TRACKED_CLIENTS from 1024 to 10_000.
- Add 15-min retention sweep (prune_failed_attempts) on 5-min interval.
- Switch lockout from relative (locked_at + elapsed) to absolute
(lockout_until) for simpler and monotonic comparison.
- Add LRU eviction fallback when map is at capacity after pruning.
- Add normalize_client_key() to sanitize whitespace/empty client IDs.
- Add 3 focused tests: per-client reset isolation, bounded map capacity,
and sweep pruning of stale entries.
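The retention sweep can be sketched as below, assuming a state shape matching the typed FailedAttemptState from this commit; the retention constant and function body are illustrative:

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

// Hypothetical shape of the typed state described in the commit text.
struct FailedAttemptState {
    count: u32,
    lockout_until: Option<Instant>,
    last_attempt: Instant,
}

const RETENTION: Duration = Duration::from_secs(15 * 60);

// Drop entries whose last attempt is older than the retention window,
// unless an absolute lockout is still in force.
fn prune_failed_attempts(map: &mut HashMap<String, FailedAttemptState>, now: Instant) {
    map.retain(|_, s| {
        let lockout_active = s.lockout_until.map_or(false, |t| t > now);
        lockout_active || now.duration_since(s.last_attempt) < RETENTION
    });
}

fn main() {
    let start = Instant::now();
    let now = start + Duration::from_secs(20 * 60); // simulate 20 minutes later
    let mut map: HashMap<String, FailedAttemptState> = HashMap::new();
    map.insert("stale".to_string(), FailedAttemptState {
        count: 2, lockout_until: None, last_attempt: start,
    });
    map.insert("fresh".to_string(), FailedAttemptState {
        count: 1, lockout_until: None, last_attempt: now,
    });
    prune_failed_attempts(&mut map, now);
    assert!(!map.contains_key("stale")); // below-threshold entry swept
    assert!(map.contains_key("fresh"));
}
```

The absolute `lockout_until` comparison is what makes the sweep safe: a locked-out client is retained even when its last attempt is stale.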
Supersedes:
- #670 by @fettpl (original hardening branch, rebased as delta)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Prepends [YYYY-MM-DD HH:MM:SS TZ] to each user message before it
reaches the model. This gives the agent accurate temporal context
on every turn, not just session start.
Previously DateTimeSection only injected the time once when the
system prompt was built. Long conversations or cron jobs had
stale timestamps. Now every message carries the real time.
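The prefixing rule itself is small. A minimal sketch, assuming the timestamp string has already been formatted (the real code formats the current local time):

```rust
// Prepend an already-formatted timestamp to a user message so every
// turn carries temporal context.
fn prepend_timestamp(ts: &str, user_message: &str) -> String {
    format!("[{ts}] {user_message}")
}

fn main() {
    let m = prepend_timestamp("2025-01-15 09:30:00 UTC", "what time is it?");
    assert_eq!(m, "[2025-01-15 09:30:00 UTC] what time is it?");
}
```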
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* ci(homebrew): prefer HOMEBREW_UPSTREAM_PR_TOKEN with fallback
* ci(homebrew): handle existing upstream remote and main base
* feat(tools): Use system default browser instead of hard-coded Brave Browser
---------
Co-authored-by: Will Sarg <12886992+willsarg@users.noreply.github.com>
Adds a `/new` runtime chat command for Telegram and Discord that clears
the sender's conversation history without changing provider or model.
Useful for starting a fresh session when stale context causes issues.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Return output string from 'execute_and_persist_job' alongside job id and success flag.
- Include failure reason in 'tracing::warn' when a scheduler job fails.
- Makes failed cron job errors visible in logs without inspecting the database.
Gemini CLI oauth_creds.json can omit client_id/client_secret, causing refresh
requests to fail with HTTP 400 invalid_request ("could not determine client ID").
Parse id_token claims (aud/azp) as a client_id fallback, preserve env/file
overrides, and keep the refresh-form logic explicit. Also add camelCase
deserialization aliases and regression tests for refresh-form and id_token
parsing edge cases.
Refs #1424
The previous emoji set included unsupported reactions (🦀, 👣) that the
Telegram API rejects with a REACTION_INVALID error in some chat contexts.
Remove these while keeping the working emojis.
Before: ["⚡️", "🦀", "🙌", "💪", "👌", "👀", "👣"]
After: ["⚡️", "🙌", "💪", "👌", "👀"]
Fixes warning: REACTION_INVALID 400 Bad Request
When max_response_size is set to 0, the condition `text.len() > 0` is
true for any non-empty response, causing all responses to be truncated
to empty strings. The conventional meaning of 0 for size limits is
"no limit" (matching ulimit, nginx client_max_body_size, curl, etc.).
Add an early return when max_response_size == 0 and update the doc
comment to document this behavior.
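A minimal sketch of the guard, assuming a helper of roughly this shape exists (names are illustrative):

```rust
// Truncate a response to max_response_size bytes; 0 conventionally means
// "no limit" (matching ulimit, nginx client_max_body_size, curl).
fn truncate_response(text: &str, max_response_size: usize) -> &str {
    if max_response_size == 0 {
        return text; // 0 = unlimited: early return before the length check
    }
    if text.len() > max_response_size {
        // Note: real code should cut on a UTF-8 char boundary, not a raw byte index.
        &text[..max_response_size]
    } else {
        text
    }
}

fn main() {
    assert_eq!(truncate_response("hello", 0), "hello"); // previously became ""
    assert_eq!(truncate_response("hello", 3), "hel");
    assert_eq!(truncate_response("hi", 10), "hi");
}
```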
Fix OpenAI Codex vision support by converting file paths to data URIs
before sending requests to the API.
## Problem
OpenAI Codex API was rejecting vision requests with 400 error:
"Invalid 'input[0].content[1].image_url'. Expected a valid URL,
but got a value with an invalid format."
Root cause: provider was sending raw file paths (e.g. `/tmp/test.png`)
instead of data URIs (e.g. `data:image/png;base64,...`).
## Solution
Add image normalization in both `chat_with_system` and `chat_with_history`:
- Call `multimodal::prepare_messages_for_provider()` before building request
- Converts file paths to base64 data URIs
- Validates image size and MIME type
- Works with both local files and remote URLs
## Changes
- `src/providers/openai_codex.rs`:
- Normalize images in `chat_with_system()`
- Normalize images in `chat_with_history()`
- Simplify `ResponsesInputContent.image_url` from nested object to String
- Fix unit test assertion for flat image_url structure
- `tests/openai_codex_vision_e2e.rs`:
- Add E2E test for second profile vision support
- Validates capabilities, request success, and response content
## Verification
✅ Unit tests pass: `cargo test --lib openai_codex`
✅ E2E test passes: `cargo test openai_codex_second_vision -- --ignored`
✅ Second profile accepts vision requests (200 OK)
✅ Returns correct image descriptions
## Impact
- Enables vision support for all OpenAI Codex profiles
- Second profile works without rate limits
- Fallback chain: default → second → gemini
- No breaking changes to existing non-vision flows
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add vision capability declaration (vision: true)
- Extend ResponsesInputContent to support image_url field
- Update build_responses_input() to parse [IMAGE:...] markers
- Add ImageUrlContent structure for data URI images
- Maintain backward compatibility with text-only messages
- Add comprehensive unit tests for image handling
Enables multimodal input for gpt-5.3-codex and similar models.
Image markers are parsed and sent as separate input_image content items.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add automatic refresh of expired Gemini OAuth tokens on warmup().
## Problem
When Gemini is used as a fallback provider, its OAuth tokens can expire while the daemon is running. This causes errors when trying to switch from OpenAI Codex to Gemini.
Scenario:
1. The daemon is running but makes no requests to Gemini
2. The Gemini OAuth tokens expire (TTL = 1 hour)
3. An error occurs on OpenAI Codex → fallback to Gemini
4. The Gemini provider uses the expired tokens → the request fails
## Solution
### Changes in `GeminiProvider::warmup()`
Added a token check and refresh for `ManagedOAuth`:
- Calls `AuthService::get_valid_gemini_access_token()`, which refreshes the tokens automatically when needed
- For `OAuthToken` (CLI): skipped (existing behavior)
- For API keys: validated via the public API (existing behavior)
### Tests
**Unit tests** (`src/providers/gemini.rs`):
- `warmup_managed_oauth_requires_auth_service()`: verifies that ManagedOAuth requires auth_service
- `warmup_cli_oauth_skips_validation()`: verifies that CLI OAuth skips validation
**E2E tests** (`tests/gemini_fallback_oauth_refresh.rs`):
- `gemini_warmup_refreshes_expired_oauth_token()`: live test with an expired token and a real refresh
- `gemini_warmup_with_valid_credentials()`: simple check that warmup works with valid credentials
### Dependencies
Added the dev dependency `scopeguard = "1.2"` for safe file restoration in tests.
## Verification
Verified on a live daemon with a Telegram bot:
- OpenAI Codex failed with a 429 rate limit
- Fallback to Gemini succeeded
- The bot replied through Gemini without errors
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When using streaming mode with Telegram, the finalize_draft function
would only edit the message text and never send actual image attachments
marked with [IMAGE:path] syntax.
This fix:
- Parses attachment markers in finalize_draft
- Deletes the draft message when attachments are present
- Sends text and attachments as separate messages
- Maintains backward compatibility for text-only messages
Fixes: Telegram finalize_draft edit failed; falling back to sendMessage
Issue: #1420
Some LLM providers (e.g., xAI grok) output tool calls in the format:
```tool file_write
{"path": "...", "content": "..."}
```
Previously, ZeroClaw only matched:
- ```tool_call
- ```tool-call
- ```toolcall
- ```invoke
This caused silent failures where:
1. Tool calls were not parsed
2. Agent reported success but no tools executed
3. LLM hallucinated tool execution results
Fix:
1. Added new regex `MD_TOOL_NAME_RE` to match ` ```tool <name>` format
2. Parse the tool name from the code block header
3. Parse JSON arguments from the block content
4. Updated `detect_tool_call_parse_issue()` to include this format
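The header parsing can be sketched without the regex machinery. An illustrative stdlib-only version (the real fix uses MD_TOOL_NAME_RE over the full message):

```rust
// Parse a ```tool <name> fenced block into (tool_name, raw_json_args).
// The trailing space in the prefix means the already-supported
// ```tool_call / ```tool-call forms do not match here.
fn parse_tool_fence(block: &str) -> Option<(String, String)> {
    let mut lines = block.lines();
    let header = lines.next()?.trim();
    let rest = header.strip_prefix("```tool ")?;
    let name = rest.trim().to_string();
    if name.is_empty() {
        return None;
    }
    let mut body = String::new();
    for line in lines {
        if line.trim() == "```" {
            break; // closing fence
        }
        body.push_str(line);
        body.push('\n');
    }
    Some((name, body.trim().to_string()))
}

fn main() {
    let block = "```tool file_write\n{\"path\": \"a.txt\"}\n```";
    let (name, args) = parse_tool_fence(block).unwrap();
    assert_eq!(name, "file_write");
    assert_eq!(args, "{\"path\": \"a.txt\"}");
}
```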
Added 3 tests:
- parse_tool_calls_handles_tool_name_fence_format
- parse_tool_calls_handles_tool_name_fence_shell
- parse_tool_calls_handles_multiple_tool_name_fences
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* ci(homebrew): prefer HOMEBREW_UPSTREAM_PR_TOKEN with fallback
* ci(homebrew): handle existing upstream remote and main base
* fix(skills): allow cross-skill references in open-skills audit
Issue: #1391
The skill audit was too strict when validating markdown links in
open-skills, causing many skills to fail loading with errors like:
- "absolute markdown link paths are not allowed (../other-skill/SKILL.md)"
- "markdown link points to a missing file (skill-name.md)"
Root cause:
1. `looks_like_absolute_path()` rejected paths starting with ".."
before canonicalization could validate they stay within root
2. Missing file errors were raised for cross-skill references that
are valid but point to skills not installed locally
Fix:
1. Allow ".." paths to pass through to canonicalization check which
properly validates they resolve within the skill root
2. Treat cross-skill references (parent dir traversal or bare .md
filenames) as non-fatal when pointing to missing files
Cross-skill references are identified by:
- Parent directory traversal: `../other-skill/SKILL.md`
- Bare skill filename: `other-skill.md`
- Explicit relative: `./other-skill.md`
Added 6 new tests to cover edge cases for cross-skill references.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(config): warn on unknown config keys to prevent silent misconfig
Issue: #1304
When users configure `[providers.ollama]` with `api_url`, the setting is
silently ignored because `[providers.*]` sections don't exist in the
config schema. This causes Ollama to always use localhost:11434 regardless
of the configured URL.
Fix: Use serde_ignored to detect and warn about unknown config keys at
load time. This helps users identify misconfigurations like:
- `[providers.ollama]` (should be top-level `api_url`)
- Typos in section names
- Deprecated/removed options
The warning is non-blocking - config still loads, but users see:
```
WARN Unknown config key ignored: "providers". Check config.toml...
```
This follows the fail-fast/explicit errors principle (CLAUDE.md §3.5).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Will Sarg <12886992+willsarg@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add a debug-level log line confirming when the startup probe succeeds
and the main long-poll loop is entered. Aids diagnostics when
troubleshooting persistent 409s (e.g. from an external competing poller).
Note: persistent 409 despite the startup probe and 35s backoff indicates
an external process is actively polling the same bot token from another
host. In that case, rotating the bot token via @BotFather is the fix.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Every daemon restart produced a flood of 409 Telegram polling conflicts for
up to several minutes. Two changes fix this:
1. **Startup probe (retry loop):** Before entering the long-poll loop,
repeatedly issue `getUpdates?timeout=0` until a 200 OK is received.
This claims the Telegram getUpdates slot before the 30-second long-poll
starts, preventing the first long-poll from racing a stale server-side
session left by the previous daemon. The probe retries every 5 seconds
until the slot is confirmed free.
2. **Extended 409 backoff:** Increased from 2 s → 35 s (> the 30-second
poll timeout). If a 409 still occurs despite the probe (e.g. in a genuine
dual-instance scenario), the retry now waits long enough for the competing
session to expire naturally before the next attempt, instead of hammering
Telegram with ~15 retries per minute.
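The probe loop in (1) can be sketched abstractly. This version takes a status-returning closure so the retry logic is testable without the network; the real probe issues `getUpdates?timeout=0` and sleeps 5 s between attempts:

```rust
// Retry a short probe request until it returns 200 OK, meaning the
// Telegram getUpdates slot is free and long-polling can start safely.
fn probe_until_ok<F: FnMut() -> u16>(mut request: F, max_attempts: u32) -> bool {
    for _ in 0..max_attempts {
        if request() == 200 {
            return true;
        }
        // Real code: tokio::time::sleep(Duration::from_secs(5)).await
    }
    false
}

fn main() {
    // Simulate a stale server-side session that clears on the third attempt.
    let mut statuses = [409u16, 409, 200].into_iter();
    assert!(probe_until_ok(|| statuses.next().unwrap_or(200), 10));
}
```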
Fixes #1281.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Issue: #1391
The skill audit was too strict when validating markdown links in
open-skills, causing many skills to fail loading with errors like:
- "absolute markdown link paths are not allowed (../other-skill/SKILL.md)"
- "markdown link points to a missing file (skill-name.md)"
Root cause:
1. `looks_like_absolute_path()` rejected paths starting with ".."
before canonicalization could validate they stay within root
2. Missing file errors were raised for cross-skill references that
are valid but point to skills not installed locally
Fix:
1. Allow ".." paths to pass through to canonicalization check which
properly validates they resolve within the skill root
2. Treat cross-skill references (parent dir traversal or bare .md
filenames) as non-fatal when pointing to missing files
Cross-skill references are identified by:
- Parent directory traversal: `../other-skill/SKILL.md`
- Bare skill filename: `other-skill.md`
- Explicit relative: `./other-skill.md`
Added 6 new tests to cover edge cases for cross-skill references.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
NVIDIA's NIM API (integrate.api.nvidia.com) does not support the
OpenAI Responses API endpoint. When chat completions returns a
non-success status, the fallback to /v1/responses also fails with
404, producing a confusing double-failure error.
Use `new_no_responses_fallback()` for the NVIDIA provider, matching
the approach already used for GLM and other chat-completions-only
providers.
Fixes #1282
Add a new WATI channel for WhatsApp Business API integration via the
WATI managed platform. WATI simplifies WhatsApp integration with its
own REST API and webhook system.
- New WatiChannel implementation (webhook mode, REST send)
- WatiConfig with api_token, api_url, tenant_id, allowed_numbers
- Gateway routes: GET/POST /wati for webhook verification and messages
- Flexible webhook parsing handles WATI's variable field names
- 15 unit tests covering parsing, allowlist, timestamps, phone normalization
- Register Novita AI in provider factory with NOVITA_API_KEY env var
- Add to integrations registry with active/available status detection
- Configure onboarding wizard with default model and API endpoint
- Add to PR labeler provider keyword hints
- Update providers reference documentation
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Thinking/reasoning models (Kimi K2.5, GLM-4.7, DeepSeek-R1) return a
reasoning_content field in assistant messages containing tool calls.
ZeroClaw was silently dropping this field when constructing conversation
history, causing provider APIs to reject follow-up requests with 400
errors: "thinking is enabled but reasoning_content is missing in
assistant tool call message".
Add reasoning_content: Option<String> as an opaque pass-through at every
layer of the pipeline: ChatResponse, ConversationMessage, NativeMessage
structs, parse/convert/build functions, and dispatcher. The field is
skip_serializing_if = None so it is invisible for non-thinking models.
Closes #1327
Add file extension validation before generating [IMAGE:] markers for
incoming Telegram attachments. Non-image files (e.g. .md, .txt, .pdf)
now always use [Document:] format regardless of how Telegram classifies
them, preventing false vision capability errors.
Extract format_attachment_content() and is_image_extension() helpers
to centralize the logic and make it testable.
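A minimal sketch of the two helpers. The function names match the commit text, but the extension set and marker formats here are assumptions:

```rust
// Case-insensitive check on the file extension; the real helper may
// cover a different extension set.
fn is_image_extension(filename: &str) -> bool {
    let ext = filename.rsplit('.').next().unwrap_or("").to_ascii_lowercase();
    matches!(ext.as_str(), "jpg" | "jpeg" | "png" | "gif" | "webp" | "bmp")
}

// Emit [IMAGE:] only for real image extensions; .md/.txt/.pdf always
// become [Document:] regardless of how Telegram classifies them.
fn format_attachment_content(filename: &str, path: &str) -> String {
    if is_image_extension(filename) {
        format!("[IMAGE:{path}]")
    } else {
        format!("[Document:{path}]")
    }
}

fn main() {
    assert_eq!(format_attachment_content("notes.md", "/tmp/notes.md"), "[Document:/tmp/notes.md]");
    assert_eq!(format_attachment_content("pic.PNG", "/tmp/pic.png"), "[IMAGE:/tmp/pic.png]");
}
```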
Fixes #1274
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Models like GLM-4.7 emit malformed tool call formats that the existing
parser cannot handle: cross-alias close tags (e.g. <tool_call>...</invoke>),
shortened bodies (tool>value), YAML-style multi-line, and attribute-style
(tool key="value"). This adds defense-in-depth parsing for these formats
so tool calls are not silently dropped.
Changes:
- Add TOOL_CALL_CLOSE_TAGS constant for cross-alias close tag matching
- Add default_param_for_tool() for shortened body parameter inference
- Add parse_glm_shortened_body() for 3 GLM sub-formats inside tags
- Extend parse_tool_calls() with cross-alias resolution and GLM fallbacks
- Merge duplicate match arms in map_tool_name_alias() for clippy compliance
- Add 13 focused tests covering all new parsing paths
Fixes two related issues with Gemini OAuth:
1. CLI command `zeroclaw auth refresh --provider gemini` was hardcoded to
only support OpenAI Codex, making manual token refresh impossible for
Gemini profiles. Extended the CLI handler to support both providers.
2. GeminiProvider.build_generate_content_request() was missing bearer token
for ManagedOAuth auth type. The method applied OAuth bearer token only
for CLI OAuth (GeminiAuth::OAuthToken), but not for managed profiles
(GeminiAuth::ManagedOAuth), causing 401 Unauthorized errors even after
successful token refresh.
Changes:
- src/main.rs: AuthCommands::Refresh now handles both openai-codex and
gemini providers via pattern match
- src/providers/gemini.rs: Extended OAuth bearer token handling to include
GeminiAuth::ManagedOAuth case (line 837)
Verification:
- Manual test: zeroclaw auth refresh --provider gemini --profile second
- E2E test: echo "hello" | zeroclaw agent --provider gemini --model gemini-2.5-pro
- Unit tests: cargo test providers::gemini (38 passed)
Risk: Low (isolated auth flow changes, no API contract changes)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Replace hardcoded `version = "0.1.0"` in clap command attribute with
`version` (no value), which makes clap read from CARGO_PKG_VERSION
automatically. This ensures `zeroclaw -V` always reflects the version
defined in Cargo.toml.
Port the progress streaming code from the fork's 75fdeb0 commit.
The upstream run_tool_call_loop only uses on_delta for final response
streaming, missing real-time feedback during tool execution.
Added progress sends at 4 points in the tool loop:
- "Thinking..." / "Thinking (round N)..." before each LLM call
- "Got N tool call(s) (Xs)" after LLM responds with tool calls
- Tool start: "⏳ tool_name: hint..." before each tool execution
- Tool complete: "✅ tool_name (Xs)" or "❌ tool_name (Xs)" after
Also added DRAFT_CLEAR_SENTINEL handling in the channel draft updater
so progress lines are cleared before the final answer streams in.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
opentelemetry-otlp 0.31 does not automatically append /v1/traces
and /v1/metrics to the endpoint URL when configured via code,
causing telemetry data to be sent to / instead of the correct paths.
Manually construct full endpoint URLs for both traces and metrics
exporters to ensure telemetry reaches the collector properly.
Add a complete web management panel for ZeroClaw, served directly from
the binary via rust-embed. The dashboard provides real-time monitoring,
agent chat, configuration editing, and system diagnostics — all
accessible at http://localhost:5555/ after pairing.
Backend (Rust):
- Add 15+ REST API endpoints under /api/* with bearer token auth
- Add WebSocket agent chat at /ws/chat with query param auth
- Add SSE event stream at /api/events via BroadcastObserver
- Add rust-embed static file serving at /_app/* with SPA fallback
- Extend AppState with tools_registry, cost_tracker, event_tx
- Extract doctor::diagnose() for structured diagnostic results
- Add Serialize derives to IntegrationStatus, CliCategory, DiscoveredCli
Frontend (React + Vite + Tailwind CSS):
- 10 dashboard pages: Dashboard, AgentChat, Tools, Cron, Integrations,
Memory, Config, Cost, Logs, Doctor
- WebSocket client with auto-reconnect for agent chat
- SSE client (fetch-based, supports auth headers) for live events
- Full EN/TR internationalization (~190 translation keys)
- Dark theme with responsive layouts
- Auth flow via 6-digit pairing code, token stored in localStorage
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(gateway): switch default port to 42617 across runtime and docs
* docs(changelog): record 42617 default port migration
* chore(release): bump crate version to 0.1.1
* fix(build): sync Cargo.lock with v0.1.1 manifest
Add `autonomy.allowed_roots` config option that lets the agent
read/write files under additional directory roots outside the
workspace (e.g. shared skills directories, project repos).
Resolved (canonical) paths under any allowed root pass
`is_resolved_path_allowed` alongside the workspace itself.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Osaurus (https://github.com/dinoki-ai/osaurus) as a named provider,
following the established LM Studio / vLLM pattern with
OpenAiCompatibleProvider and Bearer auth.
Osaurus is a unified AI edge runtime for macOS (Apple Silicon) that goes
beyond traditional local inference servers:
- Local MLX inference (Llama, Qwen, Gemma, GLM, Phi, Nemotron, etc.)
- Cloud provider proxying through a single endpoint
- Multi-API: OpenAI, Anthropic, Ollama, and Open Responses simultaneously
- Built-in MCP (Model Context Protocol) support for tool/context servers
Provider wiring:
- Provider ID: "osaurus", default endpoint: http://localhost:1337/v1
- API key defaults to "osaurus" but is fully optional (keyless access)
- Credential env var: OSAURUS_API_KEY
- Registered as local provider in list_providers()
Onboard wizard:
- Added to all 10 wizard functions (auth, models, endpoints, env vars)
- Curated model list: qwen3-30b-a3b, gemma-3n-e4b, phi-4-mini-reasoning
- Tier 4 local provider with interactive endpoint/key prompts
Tests:
- factory_osaurus, factory_osaurus_uses_default_key_when_none
- factory_osaurus_custom_url, resolve_provider_credential_osaurus_env
- resilient_fallback_includes_osaurus
- Added to factory_all_providers_create_successfully array
Documentation:
- providers-reference.md: table row + Osaurus Server Notes section
- README.md: Osaurus Server Endpoint section
Daemon heartbeat and cron tasks called agent::run() which hardcoded
channel_name as "cli" and always created an ApprovalManager, causing
[Y]es / [N]o / [A]lways stdin prompts on the unattended daemon terminal.
Add interactive parameter to agent::run(): CLI passes true (preserving
approval flow), daemon/cron pass false (no ApprovalManager, channel
marked as "daemon").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix 'Current Date & Time' section only emitting timezone string (e.g. 'Timezone: +08:00'), omitting actual date and time values.
- Caused AI to hallucinate incorrect dates when asked about current time.
- Emit full datetime in format 'YYYY-MM-DD HH:MM:SS (TZ)' instead.
SecurityPolicy::default() includes "date" in its allowed_commands list
(policy.rs:114), but AutonomyConfig::default() omits it (schema.rs:1809-1822).
Since SecurityPolicy::from_config() copies allowed_commands from AutonomyConfig,
the "date" command is effectively blocked at runtime despite appearing allowed
in the SecurityPolicy unit tests.
Add "date" to AutonomyConfig::default() to restore parity between the two
default lists.
When a channel message triggers an LLM error or idle timeout, the user
turn was already appended to conversation history (line 1517) but no
assistant turn was recorded. This orphan user turn caused the LLM to
treat the failed request as unfinished context on subsequent messages,
leading to unrelated replies (e.g., re-executing a timed-out GitHub
search when the user asked about WAL checkpoints).
Append a short assistant marker ("[Task failed — not continuing this
request]" or "[Task timed out — ...]") in the error and timeout
branches so the conversation history stays properly alternating and the
LLM sees the prior request as closed.
The cancel and context-overflow paths are intentionally left unchanged:
cancel is superseded by a newer message, and context-overflow prompts
the user to resend.
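The repair can be sketched as follows (the history representation is illustrative; only the failed-task marker text is taken from this change):

```rust
#[derive(Debug, PartialEq)]
enum Role {
    User,
    Assistant,
}

// After an error or timeout, push a short assistant turn so the
// history keeps alternating and the failed request reads as closed.
fn close_failed_turn(history: &mut Vec<(Role, String)>, marker: &str) {
    history.push((Role::Assistant, marker.to_string()));
}
```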
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add markdown_to_telegram_html() to TelegramChannel: converts **bold**,
*italic*, `code`, ```blocks```, [text](url) links, and ## headers
to Telegram HTML tags (<b>, <i>, <code>, <pre>, <a href>)
- Switch send_text_chunks() and finalize_draft() from parse_mode=Markdown
to parse_mode=HTML — more reliable and supports richer formatting
- Update channel_delivery_instructions() for Telegram: guide model to use
bold, emoji, and concise style (mirrors OpenClaw SOUL.md approach)
- Add wildcard support to http_request allowlist: allowed_domains=["*"]
now bypasses domain filtering entirely
- Expand system prompt URL fetching guidance: jina.ai reader-mode proxy
as fallback for paywalled/403 content
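As a flavor of the conversion, a deliberately minimal sketch handling only **bold** (not the real markdown_to_telegram_html): escape HTML first, then map paired markers to tags.

```rust
// Minimal sketch: HTML-escape, then turn paired ** markers into <b>
// tags. (Unbalanced markers would leave an unclosed tag; the real
// converter also handles italics, code, links, and headers.)
fn bold_to_html(text: &str) -> String {
    let escaped = text
        .replace('&', "&amp;")
        .replace('<', "&lt;")
        .replace('>', "&gt;");
    let mut out = String::new();
    for (i, part) in escaped.split("**").enumerate() {
        if i > 0 {
            out.push_str(if i % 2 == 1 { "<b>" } else { "</b>" });
        }
        out.push_str(part);
    }
    out
}
```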
Upstream main now derives schemars::JsonSchema on all config structs.
Our HooksConfig and BuiltinHooksConfig were missing it, causing CI
Build (Smoke) failure when the merge commit was compiled.
- C1: Use real tool success boolean instead of starts_with("Error")
heuristic in after_tool_call hook
- C2: Wire HookRunner from config into ChannelRuntimeContext so hooks
actually fire in daemon/channel mode (was hardcoded to None)
- I1: Suppress unused_imports warning on HookHandler public API re-export
- I3: Remove session_memory and boot_script config fields that had no
backing implementation (YAGNI); keep only command_logger which is wired
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a built-in hook that logs tool calls for auditing, recording
tool name, duration, and success status with timestamps.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Thread Option<&HookRunner> into run_tool_call_loop with hook fire points
for LLM input, before/after tool calls. Add hooks field to
ChannelRuntimeContext for message received/sending interception.
Build HookRunner from config in run_gateway and fire gateway_start.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add HooksConfig and BuiltinHooksConfig structs to src/config/schema.rs
with serde defaults for backward compatibility. Wire hooks field into
Config struct and all explicit Config constructors (Default impl,
wizard, test helpers).
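Assuming the defaults follow the project's usual config conventions, the resulting TOML shape might look like this (field names other than command_logger, which is named in a nearby change, are illustrative):

```toml
# Hypothetical sketch: omitting the whole section must also work,
# since every field carries a serde default.
[hooks]
enabled = true

[hooks.builtin]
command_logger = true
```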
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a full NostrChannel implementation enabling ZeroClaw to send and
receive private messages over the Nostr protocol via user-configured
relay WebSocket connections.
Key design decisions:
- Implements the Channel trait in src/channels/nostr.rs; registered via
the existing factory in channels/mod.rs
- Supports both NIP-04 (legacy encrypted DMs) and NIP-17 (gift-wrapped
private messages); replies automatically mirror the sender's protocol
- Deny-by-default allowlist (allowed_pubkeys = [] denies all)
- Private key encrypted at rest via SecretStore (ChaCha20-Poly1305 AEAD)
when secrets.encrypt = true (the default)
- nostr-sdk added with default-features = false and only nip04 + nip59
features to minimise binary size impact
- health_check() returns true if any relay reports is_connected()
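The deny-by-default check is the simplest piece to sketch (function name is illustrative):

```rust
// Empty allowlist means every sender is rejected; there is no
// "empty means allow everyone" special case.
fn is_allowed(allowed_pubkeys: &[String], sender_pubkey: &str) -> bool {
    allowed_pubkeys.iter().any(|p| p.as_str() == sender_pubkey)
}
```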
Wiring:
- New NostrConfig struct and optional field in ChannelsConfig
- has_supervised_channels() in daemon updated to include nostr
- Onboarding wizard extended with a dedicated Nostr step (key
validation, relay selection, allowlist configuration)
Docs compliance:
- channels-reference.md: channel matrix, delivery modes table, allowlist
field names, numbered config section (4.12), log keyword table (7.2),
and log filter command all updated
- config-reference.md: [channels_config.nostr] sub-section with key
table and security notes added
- network-deployment.md and README.md updated
- .github/pull_request_template.md: resolved stale conflict markers from
chore/labeler-spacing-trusted-tier
Add cascading fallback to file_read tool: UTF-8 → PDF text extraction
(via pdf-extract) → lossy UTF-8 conversion. Binary files no longer
produce errors; PDFs return extracted text, other binaries get lossy
output with U+FFFD replacement characters.
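The cascade can be sketched as follows (PDF detection by magic bytes and the stubbed extraction are assumptions here; the real code calls pdf-extract via a try_extract_pdf_text helper):

```rust
fn read_file_text(bytes: &[u8]) -> String {
    match std::str::from_utf8(bytes) {
        // 1. Valid UTF-8: return as-is.
        Ok(s) => s.to_string(),
        // 2. PDF magic bytes: hand off to PDF text extraction
        //    (stubbed here; real code uses the pdf-extract crate).
        Err(_) if bytes.starts_with(b"%PDF-") => "<extracted PDF text>".to_string(),
        // 3. Anything else: lossy conversion with U+FFFD replacements.
        Err(_) => String::from_utf8_lossy(bytes).into_owned(),
    }
}
```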
Changes:
- Cargo.toml: add rag-pdf to default features
- file_read.rs: cascading fallback logic + try_extract_pdf_text helper
- file_read.rs: update tool description
- test_document.pdf: replace empty fixture with PDF containing "Hello PDF"
- Tests: remove file_read_rejects_binary_pdf, add unit + e2e tests for
PDF extraction and lossy binary reads (including live OpenAI Codex e2e)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract d.attachments from MESSAGE_CREATE payloads and fetch text/*
content from Discord CDN URLs, appending it to ChannelMessage.content
before the agent loop receives the message.
- Add process_attachments() async helper: fetches text/* attachments,
skips all other MIME types with debug log, warns on fetch errors
- Reuse existing build_runtime_proxy_client HTTP client (no new deps)
- Format inlined content as [filename]\n<content>, joined by ---
- Add unit tests: empty list, unsupported MIME type skip
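The inlining format can be sketched as follows (whether the --- separator sits on its own line is an assumption):

```rust
// Each fetched text attachment becomes "[filename]\n<content>";
// multiple attachments are joined by a --- separator line.
fn format_attachments(attachments: &[(&str, &str)]) -> String {
    attachments
        .iter()
        .map(|(name, body)| format!("[{name}]\n{body}"))
        .collect::<Vec<_>>()
        .join("\n---\n")
}
```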
Closes#1169
Move strip_tool_call_tags to channels/mod.rs as shared utility and
call it in Discord's send method. Telegram already stripped these tags
but Discord sent raw LLM output including <tool_call>...</tool_call>
XML, which leaked internal protocol to end users.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the 200-char truncation of quoted reply text in Telegram
channel. The agent benefits from seeing the complete original message
when replying to a conversation thread.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Photos now use [IMAGE:/path] format instead of [Photo] /path, so the
existing multimodal pipeline validates vision capability and rejects
unsupported providers (Groq, OpenAI-compatible) with a user-facing
error before calling the LLM.
Tests added (all offline, no API keys required):
- attachment_photo_content_uses_image_marker
- attachment_document_content_uses_document_label
- photo_image_marker_detected_by_multimodal
- photo_image_marker_with_caption
- e2e_attachment_saves_file_and_formats_content
- groq_provider_rejects_photo_with_vision_error
- e2e_photo_attachment_rejected_by_non_vision_provider
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add an ignored integration test that exercises the full voice
transcription pipeline: load a pre-recorded MP3 fixture, transcribe via
Groq Whisper API, verify the result contains "hello", cache it in
TelegramChannel.voice_transcriptions, and assert extract_reply_context
returns "[Voice] <transcription>" instead of the fallback placeholder.
The test gracefully skips when GROQ_API_KEY is not set.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a user swipes to reply to a specific message, the agent now
receives the quoted original message as a blockquote prefix, e.g.:
> @alice:
> original message text
translate this
This makes reply-to-voice ("translate this" → previous transcription)
and other reply-aware interactions work correctly.
Changes:
- Extract `extract_sender_info` helper (DRY: was duplicated in
parse_update_message and try_parse_voice_message)
- Add `extract_reply_context` helper: parses reply_to_message,
handles text/voice/photo/document/video/sticker, truncates >200
chars, falls back from username to first_name
- Wire reply context into both parse_update_message and
try_parse_voice_message
- Add 8 unit tests covering all branches
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add voice-to-text transcription for Telegram voice/audio messages using
any Whisper-compatible API (Groq by default, configurable endpoint).
- New TranscriptionConfig in config schema (enabled, api_url, model,
language, max_duration_secs) with serde defaults
- New transcription module: MIME detection, .oga→.ogg normalization,
size/format validation, Whisper API client
- Telegram: voice download pipeline (getFile → CDN download → transcribe),
listen loop fallback for voice messages, [Voice] prefix on transcribed text
- Proxy support via "transcription.groq" service key
- 18 new tests (MIME mapping, normalization, config roundtrip, voice
metadata parsing, builder wiring, format/size rejection)
Disabled by default (enabled: false). Fail-fast validation order:
size → format → API key.
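The .oga → .ogg normalization is the easiest piece to sketch (Telegram delivers voice notes with a .oga extension, which Whisper-compatible endpoints may not accept):

```rust
// Rename Telegram's .oga voice files to .ogg before upload;
// everything else passes through untouched.
fn normalize_audio_filename(name: &str) -> String {
    match name.strip_suffix(".oga") {
        Some(stem) => format!("{stem}.ogg"),
        None => name.to_string(),
    }
}
```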
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parse "provider:profile" entries (e.g. "openai-codex:second") in the
fallback chain so multiple OAuth profiles of the same provider can be
rotated on 429. The profile override is propagated via
auth_profile_override in ProviderRuntimeOptions.
Entries prefixed with "custom:" or "anthropic-custom:" are left
untouched since the colon is part of the URL scheme.
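The entry split might look like this (function name and return shape are illustrative):

```rust
// "openai-codex:second" -> ("openai-codex", Some("second")),
// while custom endpoint entries keep their URL colon intact.
fn split_profile(entry: &str) -> (&str, Option<&str>) {
    if entry.starts_with("custom:") || entry.starts_with("anthropic-custom:") {
        return (entry, None);
    }
    match entry.split_once(':') {
        Some((provider, profile)) => (provider, Some(profile)),
        None => (entry, None),
    }
}
```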
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add `add_reaction` and `remove_reaction` methods to the Channel trait
with default no-op implementations, and implement them for Discord using
the REST API (PUT/DELETE reactions/@me endpoints).
Wire reactions into the channel message processing loop:
- React with 👀 when a message is received (acknowledgement)
- Swap to ✅ on success or ⚠️ on error after processing completes
Includes emoji URL-encoding helper, unit tests for encoding, trait
defaults, and an integration test verifying the full reaction flow.
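The URL-encoding helper can be sketched as follows (escaping every byte is an assumption, but it is safe for the emoji-only input used here):

```rust
// Percent-encode each UTF-8 byte for use in the Discord REST path
// .../reactions/{emoji}/@me. Escaping every byte is conservative but
// valid for the emoji-only strings this is called with.
fn percent_encode(s: &str) -> String {
    s.bytes().map(|b| format!("%{b:02X}")).collect()
}
```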
Co-authored-by: Cursor <cursoragent@cursor.com>
When a user sends multiple messages before the assistant replies,
normalize_cached_channel_turns now concatenates them with \n\n
instead of silently dropping later turns. Memory-context enrichment
is also fixed to replace only the current message suffix, preserving
earlier merged segments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address clippy lints (redundant continue, as-cast, match arms, elided
lifetimes, format vs write!) and reformat long cfg attributes and assert
macros to pass `cargo fmt --check` and `cargo clippy -D warnings`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive tool name alias mapping:
- fileread -> file_read
- filewrite -> file_write
- memoryrecall -> memory_recall
- bash/sh/cmd -> shell
- etc.
Apply to all new parsers (XML attribute, Perl, FunctionCall).
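A sketch of the alias lookup (only the aliases listed above are from this change; the table shape is illustrative):

```rust
use std::collections::HashMap;

// Map common misspelled/abbreviated tool names to canonical ones;
// unknown names pass through unchanged.
fn canonical_tool_name(name: &str) -> String {
    let aliases: HashMap<&str, &str> = [
        ("fileread", "file_read"),
        ("filewrite", "file_write"),
        ("memoryrecall", "memory_recall"),
        ("bash", "shell"),
        ("sh", "shell"),
        ("cmd", "shell"),
    ]
    .into_iter()
    .collect();
    aliases.get(name).copied().unwrap_or(name).to_string()
}
```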
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add parser for <FunctionCall> style that MiniMax also uses:
<FunctionCall>
file_read
<code>path>/Users/.../file.md</code>
</FunctionCall>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add parsers for two additional tool call formats that MiniMax LLM uses:
- XML attribute style: <minimax:toolcall><invoke name="shell"><parameter name="command">ls</parameter></invoke></minimax:toolcall>
- Perl/hash-ref style: {tool => "shell", args => { --command "ls" }}
Previously these were sent as plain text to Telegram channel instead of
being executed as tool calls.
Also fixes build warnings:
- Add #[allow(unused_imports)] to cost/mod.rs and onboard/mod.rs re-exports
- Change channels::handle_command visibility to pub(crate)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the Telegram Bot API rejects a sendDocument/sendPhoto/etc. call
by URL (e.g. "wrong type of the web page content" or "failed to get
HTTP URL content"), the entire reply was lost because the error
propagated immediately via `?` with no fallback.
Now when any send-media-by-URL call fails, the channel logs a warning
and falls back to sending the URL as a plain text link. This ensures
the user always receives the agent's response, even when Telegram
can't fetch the linked content.
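The fallback shape, sketched with closures standing in for the real Telegram API calls:

```rust
// If the send-by-URL call fails, warn and fall back to sending the
// URL as a plain text link instead of propagating the error via `?`.
fn send_media_with_fallback(
    send_by_url: impl Fn(&str) -> Result<(), String>,
    send_text: impl Fn(&str) -> Result<(), String>,
    url: &str,
) -> Result<(), String> {
    match send_by_url(url) {
        Ok(()) => Ok(()),
        Err(err) => {
            eprintln!("warning: media send failed ({err}); sending plain link");
            send_text(url)
        }
    }
}
```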
Also makes `api_base` configurable via `with_api_base()` for local
Bot API server support and testability.