- `/model <name>` now auto-resolves provider from configured model_routes
by matching model name or hint, fixing 404 when switching to models on
different providers (e.g. `/model kimi-k2.5` with anthropic default)
- Conversation history is no longer cleared on `/model` or `/models` —
users can explicitly reset via `/new`
- Matrix channel now supports `/model`, `/models`, and `/new` commands
- `/model` (no args) lists configured model routes with hints
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The onboard wizard futures exceed clippy's large_futures threshold
(16KB+). Wrap in Box::pin to heap-allocate and fix the lint.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Import ordering in config/mod.rs and line-wrapping in config/schema.rs
were left unformatted by PR #2994. Run cargo fmt to fix.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
HTTP clients had no timeouts and could hang forever; add 30s total and
10s connect timeouts. Replace clock-nanos-based jitter with rand::random
for proper randomness. Add a 1000-entry cap to the user display name
cache with expired-entry pruning. Fix truncate_text to avoid scanning
the full string twice when checking for truncation.
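A minimal sketch of the two reliability changes, assuming the clients are built with reqwest and the rand crate is available (helper names are illustrative):

```rust
use std::time::Duration;

/// Build an HTTP client with bounded timeouts so a stalled server can no
/// longer hang the caller indefinitely.
fn build_http_client() -> reqwest::Result<reqwest::Client> {
    reqwest::Client::builder()
        .timeout(Duration::from_secs(30))         // total request timeout
        .connect_timeout(Duration::from_secs(10)) // TCP connect timeout
        .build()
}

/// Backoff jitter drawn from rand::random (uniform in [0, 1)) instead of the
/// clock's nanosecond field.
fn backoff_jitter(base_ms: u64) -> u64 {
    base_ms + (rand::random::<f64>() * 250.0) as u64
}
```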
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Google TTS was passing the API key as a URL query parameter, which can
appear in logs and proxy access records. Move it to the x-goog-api-key
header instead. Add input validation for ElevenLabs voice IDs (reject
non-alphanumeric/dash/underscore characters) and restrict Edge TTS
binary_path to allowed basenames (edge-tts, edge-playback).
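A rough sketch of the header change and the voice ID check, assuming a reqwest client with the json feature (endpoint wiring is illustrative):

```rust
/// Accept only voice IDs made of ASCII alphanumerics, dashes, and underscores.
fn is_valid_voice_id(id: &str) -> bool {
    !id.is_empty()
        && id.chars().all(|c| c.is_ascii_alphanumeric() || c == '-' || c == '_')
}

/// Send the Google TTS request with the API key in a header instead of the
/// URL, so the key never lands in access logs or proxy records.
async fn synthesize(
    client: &reqwest::Client,
    api_key: &str,
    body: &serde_json::Value,
) -> reqwest::Result<reqwest::Response> {
    client
        .post("https://texttospeech.googleapis.com/v1/text:synthesize")
        .header("x-goog-api-key", api_key)
        .json(body)
        .send()
        .await
}
```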
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Slack private file fetching was sending the bot token on every redirect
hop. Since Slack CDN redirects use pre-signed URLs, sending the bearer
token to CDN hosts is unnecessary credential exposure.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Admin endpoints (/admin/shutdown, /admin/paircode, /admin/paircode/new) were
completely unauthenticated, allowing any network client to shut down the gateway
or read/generate pairing codes. Add require_localhost() guard that returns 403
for non-loopback IPs.
Replace std::process::exit(0) in shutdown handler with a tokio watch channel
for graceful shutdown, allowing proper destructor cleanup and connection
draining. Replace the 500ms sleep race in the restart command with a poll loop
that waits for the port to actually become free.
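A rough sketch of both mechanisms, assuming an axum-style gateway (handler wiring and state plumbing omitted):

```rust
use std::net::{IpAddr, SocketAddr};
use axum::http::StatusCode;

/// Reject admin requests that do not originate from the loopback interface.
fn require_localhost(peer: SocketAddr) -> Result<(), StatusCode> {
    match peer.ip() {
        IpAddr::V4(ip) if ip.is_loopback() => Ok(()),
        IpAddr::V6(ip) if ip.is_loopback() => Ok(()),
        _ => Err(StatusCode::FORBIDDEN),
    }
}

/// Instead of std::process::exit(0), flip a watch channel that the server's
/// graceful-shutdown future is waiting on, so destructors run and in-flight
/// connections can drain.
async fn shutdown_handler(tx: tokio::sync::watch::Sender<bool>) -> StatusCode {
    let _ = tx.send(true);
    StatusCode::OK
}
```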
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add --new flag to GetPaircode command in src/lib.rs
- Update main.rs to handle GetPaircode { new } parameter
- Add /admin/paircode/new POST endpoint in gateway/mod.rs
- Enhance documentation for constant_time_eq security function
Refs: #3015
The bitwise & operator is intentional in constant_time_eq() to prevent
timing side-channel attacks. Both comparisons must always execute to
ensure constant-time behavior regardless of the first comparison result.
- Revert logical && back to bitwise &
- Add #[allow(clippy::needless_bitwise_bool)] annotation
- Add explanatory comment documenting the intentional use
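A simplified sketch of the pattern being preserved (the real function may differ in shape, but the non-short-circuiting `&` is the point):

```rust
/// Compare two byte strings without short-circuiting. Logical `&&` would stop
/// at the first false operand and leak timing; bitwise `&` evaluates both.
#[allow(clippy::needless_bitwise_bool)]
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    let same_len = a.len() == b.len();
    let mut diff = 0u8;
    for i in 0..a.len().min(b.len()) {
        diff |= a[i] ^ b[i];
    }
    same_len & (diff == 0)
}
```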
- Add security warning for 0.0.0.0 binding in help text
- Implement proper gateway shutdown before restart via /admin/shutdown endpoint
- Fetch live pairing code from running gateway via /admin/paircode endpoint
- Extract duplicate code into helper functions
- Fix clippy warnings
- Fix Critical: Split illegal or-pattern (Some(...) | None) into separate match arms
- Fix Major: Implement restart command with graceful shutdown check
- Fix Major: Improve get-paircode to check gateway status and provide clear instructions
- Fix Minor: Update help text to document public-bind precondition
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Add GatewayCommands enum with three subcommands:
- start: Start the gateway server (default behavior preserved)
- restart: Restart the gateway server
- get-paircode: Show current pairing status without restarting
This improves gateway management by allowing users to:
1. Restart gateway without manual stop/start
2. Check pairing status without disrupting running gateway
Closes #3014
Closes #3015
🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Modern macOS (Ventura+) stores iMessage content in the attributedBody
column as a binary typedstream blob rather than the text column. The
existing SQL filter `AND m.text IS NOT NULL` silently dropped all
incoming messages on affected systems.
Add a length-prefix extractor for the typedstream format and fall back
to attributedBody when text is NULL or empty. Includes real captured
blob fixtures and 14 new parser/integration tests.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Introduced a new function `check_api_key_prefix` to validate API key prefixes against their associated providers. This helps catch mismatches early in the process. Added unit tests to ensure correct functionality for various scenarios, including known and unknown key formats. This enhancement improves error handling and user guidance when incorrect provider keys are used.
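A sketch of what such a check might look like (the prefix table below is illustrative, not the actual mapping in the code):

```rust
/// Return a hint when a key's well-known prefix does not match the provider it
/// was configured for; keys with unknown formats are never flagged.
fn check_api_key_prefix(provider: &str, key: &str) -> Option<String> {
    let expected = match provider {
        "anthropic" => "sk-ant-",
        "openrouter" => "sk-or-",
        _ => return None, // no known prefix for this provider
    };
    if key.starts_with(expected) {
        None
    } else {
        Some(format!(
            "API key does not look like a {provider} key (expected prefix \"{expected}\")"
        ))
    }
}
```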
Apply cargo fmt to fix formatting diffs in openrouter.rs and serial.rs.
Add web/dist placeholder step to lint, test, and build jobs so
RustEmbed compiles without the gitignored frontend assets.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Align src/peripherals/ and docs/hardware/ with the firmware directory renames
- Covers both compiled references (include_str!, constants) and documentation
Adds pluggable Text-to-Speech subsystem with TtsProvider trait,
TtsManager for provider selection, and per-provider config structs.
Includes secret encryption for TTS API keys.
- CI now builds across all 5 targets (linux x86/arm64, macOS x86/arm64,
Windows) matching the release matrix
- Fix chat_fails_without_credentials test to accept "builder error"
which occurs in CI environments without native TLS
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a Telegram message originates from a forum topic, the thread_id was
extracted and used for reply routing but never stored in ChannelMessage.thread_ts.
This caused all messages from the same sender to share conversation history
regardless of which topic they were posted in.
Changes:
- Set thread_ts to the extracted thread_id in parse_update_message,
try_parse_voice_message, and try_parse_attachment_message
- Use 'ref' in if-let patterns to avoid moving thread_id before it's assigned
- Update conversation_history_key() to include thread_ts when present,
producing keys like 'telegram_<thread_id>_<sender>' for forum topics
- Update conversation_memory_key() to also include thread_ts for memory isolation
This enables proper per-topic session isolation in Telegram forum groups while
preserving existing behavior for regular groups and DMs (where thread_ts is None).
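A minimal sketch of the key shape described above (illustrative, not the exact function body):

```rust
/// Conversation-history key: include the forum topic id when present so each
/// topic gets isolated history; DMs and regular groups keep the old key shape.
fn conversation_history_key(channel: &str, sender: &str, thread_ts: Option<&str>) -> String {
    match thread_ts {
        Some(thread_id) => format!("{channel}_{thread_id}_{sender}"), // e.g. telegram_<thread_id>_<sender>
        None => format!("{channel}_{sender}"),
    }
}
```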
Closes #1532
Replace line-based TOML masking with structured config masking so secret fields keep their original types (including reliability.api_keys arrays).
Hydrate dashboard PUT payloads with runtime config_path/workspace_dir and restore masked secret placeholders from current config before validation/save.
Also allow GET on /api/doctor for dashboard/client compatibility to avoid 405 responses.
- security: honor explicit command paths in allowed_commands list
- security: respect workspace_only=false in resolved path checks
- config: enforce 0600 permissions on every config save (unix)
- config: reject temp-directory paths in active workspace marker
- provider: preserve reasoning_content in tool-call conversation history
- provider: add allow_user_image_parts parameter for minimax compatibility
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The supports_native_tools() method was hardcoded to return true,
but it should return the value of self.native_tool_calling to
properly disable native tool calling for providers like MiniMax.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(channels,providers): remap Docker /workspace paths and enable vision for custom provider
Two fixes:
1. Telegram channel: when a Docker-containerised runtime writes a file to
/workspace/<path>, the host-side sender couldn't find it because the
container mount point differs from the host workspace dir. Remap
/workspace/<rel> → <host_workspace_dir>/<rel> in send_attachment before
the path-exists check so generated media is delivered correctly.
2. Provider factory: custom: provider was created with vision disabled,
causing all image messages to be rejected with a capability error even
though the underlying OpenAI-compatible endpoint supports vision. Switch
to new_with_vision(..., true) so image inputs are forwarded correctly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
* feat(memory): restore Qdrant vector database backend
Re-adds the Qdrant memory backend that was removed from main in a
recent upstream merge. Restores:
- src/memory/qdrant.rs — full QdrantMemory implementation with lazy
init, HTTP REST client, embeddings, and Memory trait
- src/memory/backend.rs — Qdrant variant in MemoryBackendKind, profile,
classify and profile dispatch
- src/memory/mod.rs — module export, factory routing with build_qdrant_memory
- src/config/schema.rs — QdrantConfig struct and qdrant field on MemoryConfig
- src/config/mod.rs — re-export QdrantConfig
- src/onboard/wizard.rs — qdrant field in MemoryConfig initializer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
copilot is the only provider that performs a device-code flow automatically on
first run. openai-codex and gemini (when OAuth-backed) require an explicit
`zeroclaw auth login --provider <name>` step. Split the device-flow next-steps
block to reflect this distinction.
Addresses Copilot review comment on PR #1509.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hardcoded OPENROUTER_API_KEY hint with provider-aware logic:
- keyless local providers (ollama, llamacpp, etc.) show chat/gateway/status hints
- device-flow providers (copilot, gemini, openai-codex) show OAuth/first-run hint
- all other providers show the correct provider-specific env var via provider_env_var()
Also adds canonical alias "github-copilot" -> "copilot" in canonical_provider_name(),
and a new provider_supports_device_flow() helper with accompanying test.
Additionally fixes pre-existing compile blockers that prevented CI from running:
- fix(security): correct raw string literals in leak_detector.rs that terminated
early due to unescaped " inside r"..." (use r#"..."# instead)
- fix(gateway): add missing wati: None in two test AppState initializations
- fix(gateway): use serde::Deserialize path on WatiVerifyQuery struct
- fix(security): add #[allow(unused_imports)] on new pub use re-exports in mod.rs
- fix(security): remove unused serde::{Deserialize, Serialize} import
- chore: apply cargo fmt to files that had pending formatting diffs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scheduled jobs created via channel conversations (Discord, Telegram, etc.)
never delivered output back to the channel because:
1. The agent had no channel context (channel name + reply_target) in its
system prompt, so it could not populate the delivery config.
2. The schedule tool only creates shell jobs with no delivery support,
and the cron_add tool's delivery schema was opaque.
3. OpenAiCompatibleProvider was missing the native_tool_calling field,
causing a compile error.
Changes:
- Inject channel context (channel name + reply_target) into the system
prompt so the agent knows how to address delivery when scheduling.
- Improve cron_add tool description and delivery parameter schema to
guide the agent toward correct delivery config.
- Update schedule tool description to warn that output is only logged
and redirect to cron_add for channel delivery.
- Fix missing native_tool_calling field in OpenAiCompatibleProvider.
Co-authored-by: Cursor <cursoragent@cursor.com>
* ci(homebrew): prefer HOMEBREW_UPSTREAM_PR_TOKEN with fallback
* ci(homebrew): handle existing upstream remote and main base
* fix: always emit toolResult blocks for tool_use responses
The Bedrock Converse API requires that every toolUse block in an
assistant message has a corresponding toolResult block in the
subsequent user message. Two bugs caused violations of this contract:
1. When parse_tool_result_message failed (e.g. malformed JSON or
missing tool_call_id), the fallback emitted a plain text user
message instead of a toolResult block, causing Bedrock to reject
the request with "Expected toolResult blocks at messages.N.content
for the following Ids: ..."
2. When the assistant made multiple tool calls in a single turn, each
tool result was pushed as a separate ConverseMessage with role
"user". Bedrock expects all toolResult blocks for a turn to appear
in a single user message.
Fix (1) by making the fallback construct a toolResult with status
"error" containing the raw content, and attempting to extract the
tool_use_id from the previous assistant message if JSON parsing fails.
Fix (2) by merging consecutive tool-result user messages into a single
ConverseMessage during convert_messages.
Also accept alternate field names (tool_use_id, toolUseId) in addition
to tool_call_id when parsing tool result messages.
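A rough sketch of the merge step for fix (2), using simplified stand-in types (the real ConverseMessage and content-block types differ):

```rust
struct ConverseMessage {
    role: String,               // "user" or "assistant"
    content: Vec<ContentBlock>,
}

enum ContentBlock {
    Text(String),
    ToolResult { tool_use_id: String, body: String, is_error: bool },
}

fn is_tool_result_only(m: &ConverseMessage) -> bool {
    m.role == "user"
        && m.content.iter().all(|c| matches!(c, ContentBlock::ToolResult { .. }))
}

/// Fold consecutive tool-result user messages into one, since Bedrock expects
/// all toolResult blocks for an assistant turn in a single user message.
fn merge_tool_result_messages(messages: Vec<ConverseMessage>) -> Vec<ConverseMessage> {
    let mut merged: Vec<ConverseMessage> = Vec::new();
    for msg in messages {
        let can_merge =
            is_tool_result_only(&msg) && merged.last().map_or(false, is_tool_result_only);
        if can_merge {
            let blocks = msg.content;
            merged.last_mut().expect("checked above").content.extend(blocks);
        } else {
            merged.push(msg);
        }
    }
    merged
}
```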
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Will Sarg <12886992+willsarg@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
MiniMax API does not support OpenAI-style native tool definitions
(`tools` parameter in chat completions). Sending them causes a 500
Internal Server Error with "unknown error (1000)" on every request.
Add a `native_tool_calling` field to `OpenAiCompatibleProvider` so each
constructor can declare its tool-calling capability independently.
MiniMax (via `new_merge_system_into_user`) now sets this to `false`,
causing the agent loop to inject tool instructions into the system
prompt as text instead of sending native JSON tool definitions.
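The shape of the change, sketched with the surrounding fields elided:

```rust
struct OpenAiCompatibleProvider {
    /// Declared per constructor: MiniMax sets this to false so the agent loop
    /// injects tool instructions as text instead of sending the `tools` param.
    native_tool_calling: bool,
    // ... endpoint, key, and other fields elided
}

impl OpenAiCompatibleProvider {
    fn supports_native_tools(&self) -> bool {
        self.native_tool_calling
    }
}
```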
Closes #1387
(cherry picked from commit 2b92a774fb)
(cherry picked from commit 1816e8a829)
Co-authored-by: keiten arch <tang.zhengliang@ivis-sh.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace 🙌 and 💪 with 🔥 and 👍 in the TELEGRAM_ACK_REACTIONS pool.
The removed emojis are not in Telegram's allowed reaction set, causing
~40% of ACK reactions to fail with REACTION_INVALID (400 Bad Request).
All replacements verified against the Telegram Bot API setMessageReaction
endpoint in a live private chat.
Closes #1475
* feat(composio): fix v3 compatibility with parameter discovery, NLP text execution, and error enrichment
Three-layer fix for the Composio v3 API compatibility issue where the LLM
agent cannot discover parameter schemas, leading to repeated guessing and
execution failures.
Layer 1 – Surface parameter hints in list output:
- Add input_parameters field to ComposioV3Tool and ComposioAction structs
- Pass through input_parameters from v3 list response via map_v3_tools_to_actions
- Add format_input_params_hint() to show required/optional param names in list output
Layer 2 – Support natural-language text execution:
- Add text parameter to tool schema (mutually exclusive with params)
- Thread text through execute handler → execute_action → execute_action_v3
- Update build_execute_action_v3_request to send text instead of arguments
- Skip v2 fallback when text-mode is used (v2 has no NLP support)
Layer 3 – Enrich execute errors with parameter schema:
- Add get_tool_schema() to fetch full tool metadata from GET /api/v3/tools/{slug}
- Add format_schema_hint() to render parameter names, types, and descriptions
- On execute failure, auto-fetch schema and append to error message
Root cause: The v3 API returns input_parameters in list responses but
ComposioV3Tool was silently discarding them. The LLM had no way to discover
parameter schemas before calling execute, and error messages provided no
remediation guidance — creating an infinite guessing loop.
Co-Authored-By: unknown <>
(cherry picked from commit fd92cc5eb0)
* fix(composio): use floor_char_boundary for safe UTF-8 truncation in format_schema_hint
Co-Authored-By: unknown <>
(cherry picked from commit 18e72b6344)
* fix(composio): restore coherent v3 execute flow after replay
---------
Co-authored-by: Devin AI <158243242+devin-ai-integration[bot]@users.noreply.github.com>
- Problem: The existing http_request tool returns raw HTML/JSON, which is nearly unusable for LLMs to extract
meaningful content from web pages.
- Why it matters: All mainstream AI agents (Claude Code, Gemini CLI, Aider) have dedicated web content extraction
tools. ZeroClaw lacks this capability, limiting its ability to research and gather information from the web.
- What changed: Added a new web_fetch tool that fetches web pages and converts HTML to clean plain text using
nanohtml2text. Includes domain allowlist/blocklist, SSRF protection, redirect following, and content-type aware
processing.
- What did not change (scope boundary): http_request tool is untouched. No shared code extracted between http_request
and web_fetch (DRY rule-of-three: only 2 callers). No changes to existing tool behavior or defaults.
Label Snapshot (required)
- Risk label: risk: medium
- Size label: size: M
- Scope labels: tool, config
- Module labels: tool: web_fetch
- If any auto-label is incorrect, note requested correction: N/A
Change Metadata
- Change type: feature
- Primary scope: tool
Linked Issue
- Closes #
- Related #
- Depends on #
- Supersedes #
Supersede Attribution (required when Supersedes # is used)
N/A
Validation Evidence (required)
cargo fmt --all -- --check # pass
cargo clippy --all-targets -- -D warnings # no new warnings (pre-existing warnings only)
cargo test --lib -- web_fetch # 26/26 passed
cargo test --lib -- tools::tests # 12/12 passed
cargo test --lib -- config::schema::tests # 134/134 passed
- Evidence provided: unit test results (26 new tests), manual end-to-end test with Ollama + qwen2.5:72b
- If any command is intentionally skipped, explain why: Full cargo clippy --all-targets has 43 pre-existing errors
unrelated to this PR (e.g. await_holding_lock, format! appended to String). Zero errors from web_fetch code.
Security Impact (required)
- New permissions/capabilities? Yes — new web_fetch tool can make outbound HTTP GET requests
- New external network calls? Yes — fetches web pages from allowed domains
- Secrets/tokens handling changed? No
- File system access scope changed? No
- If any Yes, describe risk and mitigation:
- Deny-by-default: enabled = false by default; tool is not registered unless explicitly enabled
- Domain filtering: allowed_domains (default ["*"] = all public hosts) + blocked_domains (takes priority).
Blocklist always wins over allowlist.
- SSRF protection: Blocks localhost, private IPs (RFC 1918), link-local, multicast, reserved ranges, IPv4-mapped
IPv6, .local TLD — identical coverage to http_request
- Rate limiting: can_act() + record_action() enforce autonomy level and rate limits
- Read-only mode: Blocked when autonomy is ReadOnly
- Response size cap: 500KB default truncation prevents context window exhaustion
- Proxy support: Honors [proxy] config via tool.web_fetch service key
Privacy and Data Hygiene (required)
- Data-hygiene status: pass
- Redaction/anonymization notes: No personal data in code, tests, or fixtures
- Neutral wording confirmation: All test identifiers use neutral project-scoped labels
Compatibility / Migration
- Backward compatible? Yes — new tool, no existing behavior changed
- Config/env changes? Yes — new [web_fetch] section in config.toml (all fields have defaults)
- Migration needed? No — #[serde(default)] on all fields; existing configs without [web_fetch] section work unchanged
i18n Follow-Through (required when docs or user-facing wording changes)
- i18n follow-through triggered? No — no docs or user-facing wording changes
Human Verification (required)
- Verified scenarios:
- End-to-end test: zeroclaw agent with Ollama qwen2.5:72b successfully called web_fetch to fetch
https://github.com/zeroclaw-labs/zeroclaw, returned clean plain text with project description, features, star count
- Tool registration: tool_count increased from 22 to 23 when enabled = true
- Config: enabled = false (default) → tool not registered; enabled = true → tool available
- Edge cases checked:
- Missing [web_fetch] section in existing config.toml → works (serde defaults)
- Blocklist priority over allowlist
- SSRF with localhost, private IPs, IPv6
- What was not verified:
- Proxy routing (no proxy configured in test environment)
- Very large page truncation with real-world content
Side Effects / Blast Radius (required)
- Affected subsystems/workflows: all_tools_with_runtime() signature gained one parameter (web_fetch_config); all 5
call sites updated
- Potential unintended effects: None — new tool only, existing tools unchanged
- Guardrails/monitoring for early detection: enabled = false default; tool_count in debug logs
Agent Collaboration Notes (recommended)
- Agent tools used: Claude Code (Opus 4.6)
- Workflow/plan summary: Plan mode → approval → implementation → validation
- Verification focus: Security (SSRF, domain filtering, rate limiting), config compatibility, tool registration
- Confirmation: naming + architecture boundaries followed (CLAUDE.md + CONTRIBUTING.md): Yes — trait implementation +
factory registration pattern, independent security helpers (DRY rule-of-three), deny-by-default config
Rollback Plan (required)
- Fast rollback command/path: git revert <commit>
- Feature flags or config toggles: [web_fetch] enabled = false (default) disables completely
- Observable failure symptoms: tool_count in debug logs drops by 1; LLM cannot call web_fetch
Risks and Mitigations
- Risk: SSRF bypass via DNS rebinding (attacker-controlled domain resolving to private IP)
- Mitigation: Pre-request host validation blocks known private/local patterns. Same defense level as existing
http_request tool. Full DNS-level protection would require async DNS resolution before connect, which is out of scope
for this PR.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit 04597352cc)
Addresses the unbounded-map gap left by #951: entries below the lockout
threshold (count < MAX_PAIR_ATTEMPTS, lockout = None) were never evicted,
allowing distributed brute-force (>1024 unique IPs, <5 attempts each) to
permanently fill the tracking map and disable accounting for new attackers.
Hardening delta on top of #951:
- Replace raw tuple with typed FailedAttemptState (count, lockout_until,
last_attempt) for clarity and to enable retention-based sweep.
- Bump MAX_TRACKED_CLIENTS from 1024 to 10_000.
- Add 15-min retention sweep (prune_failed_attempts) on 5-min interval.
- Switch lockout from relative (locked_at + elapsed) to absolute
(lockout_until) for simpler and monotonic comparison.
- Add LRU eviction fallback when map is at capacity after pruning.
- Add normalize_client_key() to sanitize whitespace/empty client IDs.
- Add 3 focused tests: per-client reset isolation, bounded map capacity,
and sweep pruning of stale entries.
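A sketch of the typed state and the retention sweep (field names follow the description above; the surrounding map is assumed to be a HashMap keyed by client id):

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

const RETENTION: Duration = Duration::from_secs(15 * 60);

struct FailedAttemptState {
    count: u32,
    lockout_until: Option<Instant>, // absolute deadline, monotonic comparison
    last_attempt: Instant,
}

/// Drop entries that are neither locked out nor recently active, so
/// sub-threshold attempts from many distinct IPs cannot fill the map forever.
fn prune_failed_attempts(map: &mut HashMap<String, FailedAttemptState>, now: Instant) {
    map.retain(|_, state| {
        let locked = state.lockout_until.map_or(false, |until| until > now);
        locked || now.duration_since(state.last_attempt) < RETENTION
    });
}
```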
Supersedes:
- #670 by @fettpl (original hardening branch, rebased as delta)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Prepends [YYYY-MM-DD HH:MM:SS TZ] to each user message before it
reaches the model. This gives the agent accurate temporal context
on every turn, not just session start.
Previously DateTimeSection only injected the time once when the
system prompt was built. Long conversations or cron jobs had
stale timestamps. Now every message carries the real time.
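A minimal sketch of the per-message prefix, assuming chrono is available (the format mirrors the [YYYY-MM-DD HH:MM:SS TZ] shape above):

```rust
use chrono::Local;

/// Prefix each user message with the current local time so the model gets
/// fresh temporal context on every turn, not just at session start.
fn prepend_timestamp(message: &str) -> String {
    let now = Local::now();
    format!("[{}] {}", now.format("%Y-%m-%d %H:%M:%S %Z"), message)
}
```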
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* ci(homebrew): prefer HOMEBREW_UPSTREAM_PR_TOKEN with fallback
* ci(homebrew): handle existing upstream remote and main base
* feat(tools): Use system default browser instead of hard-coded Brave Browser
---------
Co-authored-by: Will Sarg <12886992+willsarg@users.noreply.github.com>
Adds a `/new` runtime chat command for Telegram and Discord that clears
the sender's conversation history without changing provider or model.
Useful for starting a fresh session when stale context causes issues.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Return output string from 'execute_and_persist_job' alongside job id and success flag.
- Include failure reason in 'tracing::warn' when a scheduler job fails.
- Makes failed cron job errors visible in logs without inspecting the database.
Gemini CLI oauth_creds.json can omit client_id/client_secret, causing refresh requests to fail with HTTP 400 invalid_request (could not determine client ID).
Parse id_token claims (aud/azp) as a client_id fallback, preserve env/file overrides, and keep refresh form logic explicit. Also add camelCase deserialization aliases and regression tests for refresh-form and id_token parsing edge cases.
Refs #1424
The previous emoji set included unsupported reactions (🦀, 👣) that Telegram API
rejects with REACTION_INVALID error in some chat contexts. Remove these while
keeping the working emojis.
Before: ["⚡️", "🦀", "🙌", "💪", "👌", "👀", "👣"]
After: ["⚡️", "🙌", "💪", "👌", "👀"]
Fixes warning: REACTION_INVALID 400 Bad Request
When max_response_size is set to 0, the truncation condition degenerates to
`text.len() > 0`, which holds for any non-empty response, so every response
gets truncated to an empty string. The conventional meaning of 0 for size
limits is "no limit" (matching ulimit, nginx client_max_body_size, curl, etc.).
Add an early return when max_response_size == 0 and update the doc
comment to document this behavior.
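A sketch of the guard with a hypothetical helper name:

```rust
/// Truncate a response body to `max_response_size` bytes; 0 means "no limit",
/// matching the convention used by ulimit, nginx, and curl.
fn truncate_response(text: &str, max_response_size: usize) -> &str {
    if max_response_size == 0 || text.len() <= max_response_size {
        return text; // no limit configured, or already within bounds
    }
    // Walk back to a char boundary so slicing never panics mid-UTF-8.
    let mut end = max_response_size;
    while !text.is_char_boundary(end) {
        end -= 1;
    }
    &text[..end]
}
```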
Fix OpenAI Codex vision support by converting file paths to data URIs
before sending requests to the API.
## Problem
OpenAI Codex API was rejecting vision requests with 400 error:
"Invalid 'input[0].content[1].image_url'. Expected a valid URL,
but got a value with an invalid format."
Root cause: provider was sending raw file paths (e.g. `/tmp/test.png`)
instead of data URIs (e.g. `data:image/png;base64,...`).
## Solution
Add image normalization in both `chat_with_system` and `chat_with_history`:
- Call `multimodal::prepare_messages_for_provider()` before building request
- Converts file paths to base64 data URIs
- Validates image size and MIME type
- Works with both local files and remote URLs
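A sketch of the path-to-data-URI conversion at the heart of the fix, assuming the base64 0.21+ engine API (the real normalization lives in the multimodal module):

```rust
use base64::Engine;
use std::path::Path;

/// Turn a local image path into a `data:` URI the Responses API will accept.
fn file_path_to_data_uri(path: &Path) -> std::io::Result<String> {
    let bytes = std::fs::read(path)?;
    let mime = match path.extension().and_then(|e| e.to_str()) {
        Some("png") => "image/png",
        Some("jpg") | Some("jpeg") => "image/jpeg",
        Some("gif") => "image/gif",
        Some("webp") => "image/webp",
        _ => "application/octet-stream",
    };
    let encoded = base64::engine::general_purpose::STANDARD.encode(&bytes);
    Ok(format!("data:{mime};base64,{encoded}"))
}
```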
## Changes
- `src/providers/openai_codex.rs`:
- Normalize images in `chat_with_system()`
- Normalize images in `chat_with_history()`
- Simplify `ResponsesInputContent.image_url` from nested object to String
- Fix unit test assertion for flat image_url structure
- `tests/openai_codex_vision_e2e.rs`:
- Add E2E test for second profile vision support
- Validates capabilities, request success, and response content
## Verification
✅ Unit tests pass: `cargo test --lib openai_codex`
✅ E2E test passes: `cargo test openai_codex_second_vision -- --ignored`
✅ Second profile accepts vision requests (200 OK)
✅ Returns correct image descriptions
## Impact
- Enables vision support for all OpenAI Codex profiles
- Second profile works without rate limits
- Fallback chain: default → second → gemini
- No breaking changes to existing non-vision flows
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
- Add vision capability declaration (vision: true)
- Extend ResponsesInputContent to support image_url field
- Update build_responses_input() to parse [IMAGE:...] markers
- Add ImageUrlContent structure for data URI images
- Maintain backward compatibility with text-only messages
- Add comprehensive unit tests for image handling
Enables multimodal input for gpt-5.3-codex and similar models.
Image markers are parsed and sent as separate input_image content items.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add automatic refresh of expired Gemini OAuth tokens when warmup() is called.
## Problem
When Gemini is used as a fallback provider, its OAuth tokens can expire while the daemon is running. This leads to errors when switching from OpenAI Codex to Gemini.
Scenario:
1. The daemon is running but makes no requests to Gemini
2. The Gemini OAuth tokens expire (TTL = 1 hour)
3. An error occurs on OpenAI Codex → fallback to Gemini
4. The Gemini provider uses the expired tokens → the request fails
## Solution
### Changes in `GeminiProvider::warmup()`
Add token validation and refresh for `ManagedOAuth`:
- Call `AuthService::get_valid_gemini_access_token()`, which refreshes tokens automatically when needed
- For `OAuthToken` (CLI): skipped (existing behavior)
- For API key: validated via the public API (existing behavior)
### Tests
**Unit tests** (`src/providers/gemini.rs`):
- `warmup_managed_oauth_requires_auth_service()`: verifies that ManagedOAuth requires auth_service
- `warmup_cli_oauth_skips_validation()`: verifies that CLI OAuth skips validation
**E2E test** (`tests/gemini_fallback_oauth_refresh.rs`):
- `gemini_warmup_refreshes_expired_oauth_token()`: live test with an expired token and a real refresh
- `gemini_warmup_with_valid_credentials()`: basic test that warmup works with valid credentials
### Dependencies
Add dev dependency `scopeguard = "1.2"` for safely restoring files in tests.
## Verification
Verified against a live daemon with a Telegram bot:
- OpenAI Codex failed with a 429 rate limit
- Fallback to Gemini kicked in successfully
- The bot responded via Gemini without errors
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
When using streaming mode with Telegram, the finalize_draft function
would only edit the message text and never send actual image attachments
marked with [IMAGE:path] syntax.
This fix:
- Parses attachment markers in finalize_draft
- Deletes the draft message when attachments are present
- Sends text and attachments as separate messages
- Maintains backward compatibility for text-only messages
Fixes: Telegram finalize_draft edit failed; falling back to sendMessage
Issue: #1420
Some LLM providers (e.g., xAI grok) output tool calls in the format:
```tool file_write
{"path": "...", "content": "..."}
```
Previously, ZeroClaw only matched:
- ```tool_call
- ```tool-call
- ```toolcall
- ```invoke
This caused silent failures where:
1. Tool calls were not parsed
2. Agent reported success but no tools executed
3. LLM hallucinated tool execution results
Fix:
1. Added new regex `MD_TOOL_NAME_RE` to match ` ```tool <name>` format
2. Parse the tool name from the code block header
3. Parse JSON arguments from the block content
4. Updated `detect_tool_call_parse_issue()` to include this format
Added 3 tests:
- parse_tool_calls_handles_tool_name_fence_format
- parse_tool_calls_handles_tool_name_fence_shell
- parse_tool_calls_handles_multiple_tool_name_fences
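A sketch of the new fence matcher (the regex shape is illustrative; the real MD_TOOL_NAME_RE may differ):

```rust
use regex::Regex;
use serde_json::Value;

/// Parse triple-backtick "tool <name>" fences into (tool_name, json_args) pairs.
fn parse_tool_name_fences(text: &str) -> Vec<(String, Value)> {
    // capture 1 = tool name, capture 2 = JSON argument body
    let re = Regex::new(r"(?s)`{3}tool[ \t]+([A-Za-z0-9_\-]+)\s*\n(.*?)`{3}").unwrap();
    re.captures_iter(text)
        .filter_map(|cap| {
            let name = cap[1].to_string();
            serde_json::from_str(cap[2].trim()).ok().map(|args| (name, args))
        })
        .collect()
}
```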
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* ci(homebrew): prefer HOMEBREW_UPSTREAM_PR_TOKEN with fallback
* ci(homebrew): handle existing upstream remote and main base
* fix(skills): allow cross-skill references in open-skills audit
Issue: #1391
The skill audit was too strict when validating markdown links in
open-skills, causing many skills to fail loading with errors like:
- "absolute markdown link paths are not allowed (../other-skill/SKILL.md)"
- "markdown link points to a missing file (skill-name.md)"
Root cause:
1. `looks_like_absolute_path()` rejected paths starting with ".."
before canonicalization could validate they stay within root
2. Missing file errors were raised for cross-skill references that
are valid but point to skills not installed locally
Fix:
1. Allow ".." paths to pass through to canonicalization check which
properly validates they resolve within the skill root
2. Treat cross-skill references (parent dir traversal or bare .md
filenames) as non-fatal when pointing to missing files
Cross-skill references are identified by:
- Parent directory traversal: `../other-skill/SKILL.md`
- Bare skill filename: `other-skill.md`
- Explicit relative: `./other-skill.md`
Added 6 new tests to cover edge cases for cross-skill references.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* feat(config): warn on unknown config keys to prevent silent misconfig
Issue: #1304
When users configure `[providers.ollama]` with `api_url`, the setting is
silently ignored because `[providers.*]` sections don't exist in the
config schema. This causes Ollama to always use localhost:11434 regardless
of the configured URL.
Fix: Use serde_ignored to detect and warn about unknown config keys at
load time. This helps users identify misconfigurations like:
- `[providers.ollama]` (should be top-level `api_url`)
- Typos in section names
- Deprecated/removed options
The warning is non-blocking - config still loads, but users see:
```
WARN Unknown config key ignored: "providers". Check config.toml...
```
This follows the fail-fast/explicit errors principle (CLAUDE.md §3.5).
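A sketch of the detection, assuming the toml and serde_ignored crates (the warning text above is the observable result):

```rust
use serde::Deserialize;

/// Deserialize the config while warning about any key the schema ignores.
fn load_config<T: for<'de> Deserialize<'de>>(raw: &str) -> anyhow::Result<T> {
    let de = toml::Deserializer::new(raw);
    let config = serde_ignored::deserialize(de, |path: serde_ignored::Path| {
        tracing::warn!("Unknown config key ignored: \"{path}\". Check config.toml for typos.");
    })?;
    Ok(config)
}
```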
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Will Sarg <12886992+willsarg@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Add a debug-level log line confirming when the startup probe succeeds
and the main long-poll loop is entered. Aids diagnostics when
troubleshooting persistent 409s (e.g. from an external competing poller).
Note: persistent 409 despite the startup probe and 35s backoff indicates
an external process is actively polling the same bot token from another
host. In that case, rotating the bot token via @BotFather is the fix.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Every daemon restart produced a flood of 409 Telegram polling conflicts for
up to several minutes. Two changes fix this:
1. **Startup probe (retry loop):** Before entering the long-poll loop,
repeatedly issue `getUpdates?timeout=0` until a 200 OK is received.
This claims the Telegram getUpdates slot before the 30-second long-poll
starts, preventing the first long-poll from racing a stale server-side
session left by the previous daemon. The probe retries every 5 seconds
until the slot is confirmed free.
2. **Extended 409 backoff:** Increased from 2 s → 35 s (> the 30-second
poll timeout). If a 409 still occurs despite the probe (e.g. in a genuine
dual-instance scenario), the retry now waits long enough for the competing
session to expire naturally before the next attempt, instead of hammering
Telegram with ~15 retries per minute.
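A sketch of the startup probe loop (request handling is illustrative; the real client wraps the Bot API):

```rust
/// Claim the getUpdates slot before entering the 30-second long-poll loop.
/// Retries every 5 seconds until Telegram answers with success instead of 409.
async fn wait_for_polling_slot(client: &reqwest::Client, token: &str) -> reqwest::Result<()> {
    let url = format!("https://api.telegram.org/bot{token}/getUpdates?timeout=0");
    loop {
        let status = client.get(&url).send().await?.status();
        if status.is_success() {
            return Ok(()); // slot is free, safe to start long-polling
        }
        tracing::warn!("getUpdates probe returned {status}; retrying in 5s");
        tokio::time::sleep(std::time::Duration::from_secs(5)).await;
    }
}
```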
Fixes #1281.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
NVIDIA's NIM API (integrate.api.nvidia.com) does not support the
OpenAI Responses API endpoint. When chat completions returns a
non-success status, the fallback to /v1/responses also fails with
404, producing a confusing double-failure error.
Use `new_no_responses_fallback()` for the NVIDIA provider, matching
the approach already used for GLM and other chat-completions-only
providers.
Fixes #1282
Add a new WATI channel for WhatsApp Business API integration via the
WATI managed platform. WATI simplifies WhatsApp integration with its
own REST API and webhook system.
- New WatiChannel implementation (webhook mode, REST send)
- WatiConfig with api_token, api_url, tenant_id, allowed_numbers
- Gateway routes: GET/POST /wati for webhook verification and messages
- Flexible webhook parsing handles WATI's variable field names
- 15 unit tests covering parsing, allowlist, timestamps, phone normalization
- Register Novita AI in provider factory with NOVITA_API_KEY env var
- Add to integrations registry with active/available status detection
- Configure onboarding wizard with default model and API endpoint
- Add to PR labeler provider keyword hints
- Update providers reference documentation
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Thinking/reasoning models (Kimi K2.5, GLM-4.7, DeepSeek-R1) return a
reasoning_content field in assistant messages containing tool calls.
ZeroClaw was silently dropping this field when constructing conversation
history, causing provider APIs to reject follow-up requests with 400
errors: "thinking is enabled but reasoning_content is missing in
assistant tool call message".
Add reasoning_content: Option<String> as an opaque pass-through at every
layer of the pipeline: ChatResponse, ConversationMessage, NativeMessage
structs, parse/convert/build functions, and dispatcher. The field is
skip_serializing_if = None so it is invisible for non-thinking models.
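A sketch of the pass-through field on one of the message structs (the other layers follow the same pattern):

```rust
use serde::{Deserialize, Serialize};

#[derive(Serialize, Deserialize)]
struct NativeMessage {
    role: String,
    content: Option<String>,
    /// Opaque pass-through for thinking models; omitted entirely when absent
    /// so non-thinking providers never see the field.
    #[serde(skip_serializing_if = "Option::is_none", default)]
    reasoning_content: Option<String>,
    // ... tool_calls and other fields elided
}
```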
Closes #1327
Add file extension validation before generating [IMAGE:] markers for
incoming Telegram attachments. Non-image files (e.g. .md, .txt, .pdf)
now always use [Document:] format regardless of how Telegram classifies
them, preventing false vision capability errors.
Extract format_attachment_content() and is_image_extension() helpers
to centralize the logic and make it testable.
Fixes #1274
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Models like GLM-4.7 emit malformed tool call formats that the existing
parser cannot handle: cross-alias close tags (e.g. <tool_call>...</invoke>),
shortened bodies (tool>value), YAML-style multi-line, and attribute-style
(tool key="value"). This adds defense-in-depth parsing for these formats
so tool calls are not silently dropped.
Changes:
- Add TOOL_CALL_CLOSE_TAGS constant for cross-alias close tag matching
- Add default_param_for_tool() for shortened body parameter inference
- Add parse_glm_shortened_body() for 3 GLM sub-formats inside tags
- Extend parse_tool_calls() with cross-alias resolution and GLM fallbacks
- Merge duplicate match arms in map_tool_name_alias() for clippy compliance
- Add 13 focused tests covering all new parsing paths
Fixes two related issues with Gemini OAuth:
1. CLI command `zeroclaw auth refresh --provider gemini` was hardcoded to
only support OpenAI Codex, making manual token refresh impossible for
Gemini profiles. Extended the CLI handler to support both providers.
2. GeminiProvider.build_generate_content_request() was missing bearer token
for ManagedOAuth auth type. The method applied OAuth bearer token only
for CLI OAuth (GeminiAuth::OAuthToken), but not for managed profiles
(GeminiAuth::ManagedOAuth), causing 401 Unauthorized errors even after
successful token refresh.
Changes:
- src/main.rs: AuthCommands::Refresh now handles both openai-codex and
gemini providers via pattern match
- src/providers/gemini.rs: Extended OAuth bearer token handling to include
GeminiAuth::ManagedOAuth case (line 837)
Verification:
- Manual test: zeroclaw auth refresh --provider gemini --profile second
- E2E test: echo "hello" | zeroclaw agent --provider gemini --model gemini-2.5-pro
- Unit tests: cargo test providers::gemini (38 passed)
Risk: Low (isolated auth flow changes, no API contract changes)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Replace hardcoded `version = "0.1.0"` in clap command attribute with
`version` (no value), which makes clap read from CARGO_PKG_VERSION
automatically. This ensures `zeroclaw -V` always reflects the version
defined in Cargo.toml.
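In clap derive terms the change is just this (sketch):

```rust
use clap::Parser;

#[derive(Parser)]
#[command(name = "zeroclaw", version)] // bare `version` reads CARGO_PKG_VERSION at compile time
struct Cli {
    // ... subcommands and flags elided
}
```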
Port the progress streaming code from the fork's 75fdeb0 commit.
The upstream run_tool_call_loop only uses on_delta for final response
streaming, missing real-time feedback during tool execution.
Added progress sends at 4 points in the tool loop:
- "Thinking..." / "Thinking (round N)..." before each LLM call
- "Got N tool call(s) (Xs)" after LLM responds with tool calls
- Tool start: "⏳ tool_name: hint..." before each tool execution
- Tool complete: "✅ tool_name (Xs)" or "❌ tool_name (Xs)" after
Also added DRAFT_CLEAR_SENTINEL handling in the channel draft updater
so progress lines are cleared before the final answer streams in.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
opentelemetry-otlp 0.31 does not automatically append /v1/traces
and /v1/metrics to the endpoint URL when configured via code,
causing telemetry data to be sent to / instead of correct paths.
Manually construct full endpoint URLs for both traces and metrics
exporters to ensure telemetry reaches the collector properly.
Add a complete web management panel for ZeroClaw, served directly from
the binary via rust-embed. The dashboard provides real-time monitoring,
agent chat, configuration editing, and system diagnostics — all
accessible at http://localhost:5555/ after pairing.
Backend (Rust):
- Add 15+ REST API endpoints under /api/* with bearer token auth
- Add WebSocket agent chat at /ws/chat with query param auth
- Add SSE event stream at /api/events via BroadcastObserver
- Add rust-embed static file serving at /_app/* with SPA fallback
- Extend AppState with tools_registry, cost_tracker, event_tx
- Extract doctor::diagnose() for structured diagnostic results
- Add Serialize derives to IntegrationStatus, CliCategory, DiscoveredCli
Frontend (React + Vite + Tailwind CSS):
- 10 dashboard pages: Dashboard, AgentChat, Tools, Cron, Integrations,
Memory, Config, Cost, Logs, Doctor
- WebSocket client with auto-reconnect for agent chat
- SSE client (fetch-based, supports auth headers) for live events
- Full EN/TR internationalization (~190 translation keys)
- Dark theme with responsive layouts
- Auth flow via 6-digit pairing code, token stored in localStorage
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(gateway): switch default port to 42617 across runtime and docs
* docs(changelog): record 42617 default port migration
* chore(release): bump crate version to 0.1.1
* fix(build): sync Cargo.lock with v0.1.1 manifest
Add `autonomy.allowed_roots` config option that lets the agent
read/write files under additional directory roots outside the
workspace (e.g. shared skills directories, project repos).
Resolved (canonical) paths under any allowed root pass
`is_resolved_path_allowed` alongside the workspace itself.
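A sketch of the resolved-path check with the extra roots (simplified; the real check also honors workspace_only and other policy flags):

```rust
use std::path::{Path, PathBuf};

/// A canonicalized path is allowed when it sits under the workspace or under
/// any explicitly configured `autonomy.allowed_roots` entry.
fn is_resolved_path_allowed(resolved: &Path, workspace: &Path, allowed_roots: &[PathBuf]) -> bool {
    resolved.starts_with(workspace)
        || allowed_roots.iter().any(|root| resolved.starts_with(root))
}
```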
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add Osaurus (https://github.com/dinoki-ai/osaurus) as a named provider,
following the established LM Studio / vLLM pattern with
OpenAiCompatibleProvider and Bearer auth.
Osaurus is a unified AI edge runtime for macOS (Apple Silicon) that goes
beyond traditional local inference servers:
- Local MLX inference (Llama, Qwen, Gemma, GLM, Phi, Nemotron, etc.)
- Cloud provider proxying through a single endpoint
- Multi-API: OpenAI, Anthropic, Ollama, and Open Responses simultaneously
- Built-in MCP (Model Context Protocol) support for tool/context servers
Provider wiring:
- Provider ID: "osaurus", default endpoint: http://localhost:1337/v1
- API key defaults to "osaurus" but is fully optional (keyless access)
- Credential env var: OSAURUS_API_KEY
- Registered as local provider in list_providers()
Onboard wizard:
- Added to all 10 wizard functions (auth, models, endpoints, env vars)
- Curated model list: qwen3-30b-a3b, gemma-3n-e4b, phi-4-mini-reasoning
- Tier 4 local provider with interactive endpoint/key prompts
Tests:
- factory_osaurus, factory_osaurus_uses_default_key_when_none
- factory_osaurus_custom_url, resolve_provider_credential_osaurus_env
- resilient_fallback_includes_osaurus
- Added to factory_all_providers_create_successfully array
Documentation:
- providers-reference.md: table row + Osaurus Server Notes section
- README.md: Osaurus Server Endpoint section
Daemon heartbeat and cron tasks called agent::run() which hardcoded
channel_name as "cli" and always created an ApprovalManager, causing
[Y]es / [N]o / [A]lways stdin prompts on the unattended daemon terminal.
Add interactive parameter to agent::run(): CLI passes true (preserving
approval flow), daemon/cron pass false (no ApprovalManager, channel
marked as "daemon").
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Fix 'Current Date & Time' section only emitting timezone string (e.g. 'Timezone: +08:00'), omitting actual date and time values.
- Caused AI to hallucinate incorrect dates when asked about current time.
- Emit full datetime in format 'YYYY-MM-DD HH:MM:SS (TZ)' instead.
SecurityPolicy::default() includes "date" in its allowed_commands list
(policy.rs:114), but AutonomyConfig::default() omits it (schema.rs:1809-1822).
Since SecurityPolicy::from_config() copies allowed_commands from AutonomyConfig,
the "date" command is effectively blocked at runtime despite appearing allowed
in the SecurityPolicy unit tests.
Add "date" to AutonomyConfig::default() to restore parity between the two
default lists.
When a channel message triggers an LLM error or idle timeout, the user
turn was already appended to conversation history (line 1517) but no
assistant turn was recorded. This orphan user turn caused the LLM to
treat the failed request as unfinished context on subsequent messages,
leading to unrelated replies (e.g., re-executing a timed-out GitHub
search when the user asked about WAL checkpoints).
Append a short assistant marker ("[Task failed — not continuing this
request]" or "[Task timed out — ...]") in the error and timeout
branches so the conversation history stays properly alternating and the
LLM sees the prior request as closed.
The cancel and context-overflow paths are intentionally left unchanged:
cancel is superseded by a newer message, and context-overflow prompts
the user to resend.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Add markdown_to_telegram_html() to TelegramChannel: converts **bold**,
*italic*, `code`, ```blocks```, [text](url) links, and ## headers
to Telegram HTML tags (<b>, <i>, <code>, <pre>, <a href>)
- Switch send_text_chunks() and finalize_draft() from parse_mode=Markdown
to parse_mode=HTML — more reliable and supports richer formatting
- Update channel_delivery_instructions() for Telegram: guide model to use
bold, emoji, and concise style (mirrors OpenClaw SOUL.md approach)
- Add wildcard support to http_request allowlist: allowed_domains=["*"]
now bypasses domain filtering entirely
- Expand system prompt URL fetching guidance: jina.ai reader-mode proxy
as fallback for paywalled/403 content
Upstream main now derives schemars::JsonSchema on all config structs.
Our HooksConfig and BuiltinHooksConfig were missing it, causing CI
Build (Smoke) failure when the merge commit was compiled.
- C1: Use real tool success boolean instead of starts_with("Error")
heuristic in after_tool_call hook
- C2: Wire HookRunner from config into ChannelRuntimeContext so hooks
actually fire in daemon/channel mode (was hardcoded to None)
- I1: Suppress unused_imports warning on HookHandler public API re-export
- I3: Remove session_memory and boot_script config fields that had no
backing implementation (YAGNI); keep only command_logger which is wired
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add a built-in hook that logs tool calls for auditing, recording
tool name, duration, and success status with timestamps.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Thread Option<&HookRunner> into run_tool_call_loop with hook fire points
for LLM input, before/after tool calls. Add hooks field to
ChannelRuntimeContext for message received/sending interception.
Build HookRunner from config in run_gateway and fire gateway_start.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add HooksConfig and BuiltinHooksConfig structs to src/config/schema.rs
with serde defaults for backward compatibility. Wire hooks field into
Config struct and all explicit Config constructors (Default impl,
wizard, test helpers).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds a full NostrChannel implementation enabling ZeroClaw to send and
receive private messages over the Nostr protocol via user-configured
relay WebSocket connections.
Key design decisions:
- Implements the Channel trait in src/channels/nostr.rs; registered via
the existing factory in channels/mod.rs
- Supports both NIP-04 (legacy encrypted DMs) and NIP-17 (gift-wrapped
private messages); replies automatically mirror the sender's protocol
- Deny-by-default allowlist (allowed_pubkeys = [] denies all)
- Private key encrypted at rest via SecretStore (ChaCha20-Poly1305 AEAD)
when secrets.encrypt = true (the default)
- nostr-sdk added with default-features = false and only nip04 + nip59
features to minimise binary size impact
- health_check() returns true if any relay reports is_connected()
Wiring:
- New NostrConfig struct and optional field in ChannelsConfig
- has_supervised_channels() in daemon updated to include nostr
- Onboarding wizard extended with a dedicated Nostr step (key
validation, relay selection, allowlist configuration)
Docs compliance:
- channels-reference.md: channel matrix, delivery modes table, allowlist
field names, numbered config section (4.12), log keyword table (7.2),
and log filter command all updated
- config-reference.md: [channels_config.nostr] sub-section with key
table and security notes added
- network-deployment.md and README.md updated
- .github/pull_request_template.md: resolved stale conflict markers from
chore/labeler-spacing-trusted-tier
Add cascading fallback to file_read tool: UTF-8 → PDF text extraction
(via pdf-extract) → lossy UTF-8 conversion. Binary files no longer
produce errors; PDFs return extracted text, other binaries get lossy
output with U+FFFD replacement characters.
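A sketch of the cascade (the pdf-extract call shape is an assumption based on that crate's public API):

```rust
use std::path::Path;

/// Read a file as text: try UTF-8 first, then PDF text extraction, and finally
/// a lossy UTF-8 conversion so binary files never hard-error.
fn read_file_text(path: &Path) -> std::io::Result<String> {
    let bytes = std::fs::read(path)?;
    if let Ok(text) = String::from_utf8(bytes.clone()) {
        return Ok(text);
    }
    if let Ok(text) = pdf_extract::extract_text_from_mem(&bytes) {
        return Ok(text);
    }
    Ok(String::from_utf8_lossy(&bytes).into_owned())
}
```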
Changes:
- Cargo.toml: add rag-pdf to default features
- file_read.rs: cascading fallback logic + try_extract_pdf_text helper
- file_read.rs: update tool description
- test_document.pdf: replace empty fixture with PDF containing "Hello PDF"
- Tests: remove file_read_rejects_binary_pdf, add unit + e2e tests for
PDF extraction and lossy binary reads (including live OpenAI Codex e2e)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Extract d.attachments from MESSAGE_CREATE payloads and fetch text/*
content from Discord CDN URLs, appending it to ChannelMessage.content
before the agent loop receives the message.
- Add process_attachments() async helper: fetches text/* attachments,
skips all other MIME types with debug log, warns on fetch errors
- Reuse existing build_runtime_proxy_client HTTP client (no new deps)
- Format inlined content as [filename]\n<content>, joined by ---
- Add unit tests: empty list, unsupported MIME type skip
Closes #1169
Move strip_tool_call_tags to channels/mod.rs as shared utility and
call it in Discord's send method. Telegram already stripped these tags
but Discord sent raw LLM output including <tool_call>...</tool_call>
XML, which leaked internal protocol to end users.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove the 200-char truncation of quoted reply text in Telegram
channel. The agent benefits from seeing the complete original message
when replying to a conversation thread.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Photos now use [IMAGE:/path] format instead of [Photo] /path, so the
existing multimodal pipeline validates vision capability and rejects
unsupported providers (Groq, OpenAI-compatible) with a user-facing
error before calling the LLM.
Tests added (all offline, no API keys required):
- attachment_photo_content_uses_image_marker
- attachment_document_content_uses_document_label
- photo_image_marker_detected_by_multimodal
- photo_image_marker_with_caption
- e2e_attachment_saves_file_and_formats_content
- groq_provider_rejects_photo_with_vision_error
- e2e_photo_attachment_rejected_by_non_vision_provider
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add an ignored integration test that exercises the full voice
transcription pipeline: load a pre-recorded MP3 fixture, transcribe via
Groq Whisper API, verify the result contains "hello", cache it in
TelegramChannel.voice_transcriptions, and assert extract_reply_context
returns "[Voice] <transcription>" instead of the fallback placeholder.
The test gracefully skips when GROQ_API_KEY is not set.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a user swipes to reply to a specific message, the agent now
receives the quoted original message as a blockquote prefix, e.g.:
> @alice:
> original message text
translate this
This makes reply-to-voice ("translate this" → previous transcription)
and other reply-aware interactions work correctly.
Changes:
- Extract `extract_sender_info` helper (DRY: was duplicated in
parse_update_message and try_parse_voice_message)
- Add `extract_reply_context` helper: parses reply_to_message,
handles text/voice/photo/document/video/sticker, truncates >200
chars, falls back from username to first_name
- Wire reply context into both parse_update_message and
try_parse_voice_message
- Add 8 unit tests covering all branches
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add voice-to-text transcription for Telegram voice/audio messages using
any Whisper-compatible API (Groq by default, configurable endpoint).
- New TranscriptionConfig in config schema (enabled, api_url, model,
language, max_duration_secs) with serde defaults
- New transcription module: MIME detection, .oga→.ogg normalization,
size/format validation, Whisper API client
- Telegram: voice download pipeline (getFile → CDN download → transcribe),
listen loop fallback for voice messages, [Voice] prefix on transcribed text
- Proxy support via "transcription.groq" service key
- 18 new tests (MIME mapping, normalization, config roundtrip, voice
metadata parsing, builder wiring, format/size rejection)
Disabled by default (enabled: false). Fail-fast validation order:
size → format → API key.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parse "provider:profile" entries (e.g. "openai-codex:second") in the
fallback chain so multiple OAuth profiles of the same provider can be
rotated on 429. The profile override is propagated via
auth_profile_override in ProviderRuntimeOptions.
Entries prefixed with "custom:" or "anthropic-custom:" are left
untouched since the colon is part of the URL scheme.
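A sketch of the entry parsing (names are illustrative):

```rust
/// Split a fallback-chain entry like "openai-codex:second" into provider and
/// optional auth profile, leaving "custom:" / "anthropic-custom:" entries
/// untouched because their colon belongs to the URL scheme.
fn parse_fallback_entry(entry: &str) -> (&str, Option<&str>) {
    if entry.starts_with("custom:") || entry.starts_with("anthropic-custom:") {
        return (entry, None);
    }
    match entry.split_once(':') {
        Some((provider, profile)) if !profile.is_empty() => (provider, Some(profile)),
        _ => (entry, None),
    }
}
```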
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add `add_reaction` and `remove_reaction` methods to the Channel trait
with default no-op implementations, and implement them for Discord using
the REST API (PUT/DELETE reactions/@me endpoints).
Wire reactions into the channel message processing loop:
- React with 👀 when a message is received (acknowledgement)
- Swap to ✅ on success or ⚠️ on error after processing completes
Includes emoji URL-encoding helper, unit tests for encoding, trait
defaults, and an integration test verifying the full reaction flow.
Co-authored-by: Cursor <cursoragent@cursor.com>
When a user sends multiple messages before the assistant replies,
normalize_cached_channel_turns now concatenates them with \n\n
instead of silently dropping later turns. Memory-context enrichment
is also fixed to replace only the current message suffix, preserving
earlier merged segments.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address clippy lints (redundant continue, as-cast, match arms, elided
lifetimes, format vs write!) and reformat long cfg attributes and assert
macros to pass `cargo fmt --check` and `cargo clippy -D warnings`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add comprehensive tool name alias mapping:
- fileread -> file_read
- filewrite -> file_write
- memoryrecall -> memory_recall
- bash/sh/cmd -> shell
- etc.
Apply to all new parsers (XML attribute, Perl, FunctionCall).
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add parser for <FunctionCall> style that MiniMax also uses:
<FunctionCall>
file_read
<code>path>/Users/.../file.md</code>
</FunctionCall>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add parsers for two additional tool call formats that MiniMax LLM uses:
- XML attribute style: <minimax:toolcall><invoke name="shell"><parameter name="command">ls</parameter></invoke></minimax:toolcall>
- Perl/hash-ref style: {tool => "shell", args => { --command "ls" }}
Previously these were sent as plain text to the Telegram channel instead of
being executed as tool calls.
Also fixes build warnings:
- Add #[allow(unused_imports)] to cost/mod.rs and onboard/mod.rs re-exports
- Change channels::handle_command visibility to pub(crate)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When the Telegram Bot API rejects a sendDocument/sendPhoto/etc by URL
(e.g. "wrong type of the web page content" or "failed to get HTTP URL
content"), the entire reply was lost because the error propagated
immediately via `?` with no fallback.
Now when any send-media-by-URL call fails, the channel logs a warning
and falls back to sending the URL as a plain text link. This ensures
the user always receives the agent's response, even when Telegram
can't fetch the linked content.
Also makes `api_base` configurable via `with_api_base()` for local
Bot API server support and testability.
Add native vLLM provider support to ZeroClaw
- First-class `vllm` provider with local endpoint defaults (`http://localhost:8000/v1`)
- Optional `VLLM_API_KEY` support
- Onboarding wizard integration (tier menu, endpoint prompt, model discovery, keyless local usage)
- Updated provider/docs references and command documentation
Add input_tokens and output_tokens fields to ObserverEvent::LlmResponse
so per-call token data flows through all observer backends. Prometheus
gains three new counters (llm_requests_total, tokens_input_total,
tokens_output_total) for granular token tracking by provider/model.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Parse provider-specific usage fields from API responses:
- Anthropic: input_tokens/output_tokens from usage object
- Gemini: promptTokenCount/candidatesTokenCount from usageMetadata
- Ollama: prompt_eval_count/eval_count from response root
- Bedrock: inputTokens/outputTokens from camelCase usage object
Gemini required refactoring send_generate_content to return
(String, Option<TokenUsage>) tuple, plus a chat() override to
thread usage into ChatResponse.
Add UsageInfo deserialization structs and wire usage data from API
responses through to ChatResponse for OpenRouter, OpenAI, Compatible,
and Copilot providers. All four share the OpenAI response format with
prompt_tokens/completion_tokens fields.
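For the OpenAI-format providers this is essentially the following deserialization shape (a sketch; the struct names are illustrative, only prompt_tokens/completion_tokens come from the response format):
```rust
// Sketch of the shared OpenAI-style usage fields; envelope fields other than
// `usage` are omitted.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
struct UsageInfo {
    prompt_tokens: u64,
    completion_tokens: u64,
}

#[derive(Debug, Deserialize)]
struct CompletionEnvelope {
    usage: Option<UsageInfo>,
}
```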
Add a lightweight TokenUsage struct to providers::traits with
input_tokens and output_tokens fields. Add usage: Option<TokenUsage>
to ChatResponse and update all construction sites across providers
and agent modules with usage: None.
This is the first step toward capturing token usage data from LLM
API responses. Currently all sites set usage: None — subsequent
commits will parse actual usage from each provider's response format.
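In outline, with ChatResponse reduced to the fields relevant here (the real struct carries more):
```rust
// Sketch of the new usage plumbing; ChatResponse is simplified.
#[derive(Debug, Clone, Copy, Default)]
pub struct TokenUsage {
    pub input_tokens: u64,
    pub output_tokens: u64,
}

#[derive(Debug, Clone, Default)]
pub struct ChatResponse {
    pub text: String,
    pub tool_calls: Vec<String>,   // simplified; the real type is richer
    pub usage: Option<TokenUsage>, // None until a provider parses usage
}
```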
- Remove duplicate `chat` method in reliable.rs (E0201)
- Fix `futures` → `futures_util` imports in agent.rs and loop_.rs (E0433)
- Gate PostgresMemory behind `memory-postgres` feature in cli.rs (E0433)
- Fix regex backreference in XML tool parser (unsupported by regex crate)
- Add missing `skills_prompt_mode` argument in test
- Apply rustfmt to files with formatting issues on main
Lucid memory tests used 500ms/400ms recall/store timeouts for shell
script execution. Under parallel test load, bash process spawning
often exceeded these limits, causing timeout kills before the script
could write to marker files — leading to consistent test failures
when run alongside other tests.
Widen test timeouts to 5s. The scripts themselves complete in <50ms;
the margin absorbs OS scheduling jitter under concurrent test load.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Gateway channels (WhatsApp, Linq, Nextcloud Talk) were returning raw
<tool> tags without executing tools or showing results. The CLI
correctly executed tools and returned results.
Root cause: gateway handlers used run_gateway_chat_with_multimodal which
explicitly disabled tools for simple chat-only mode.
Fix: Create run_gateway_chat_with_tools() which uses process_message()
for full tool support, while keeping run_gateway_chat_simple() for
the webhook endpoint to maintain backward compatibility with tests.
Changes:
- Add run_gateway_chat_with_tools() for channel handlers (uses process_message)
- Keep run_gateway_chat_simple() for webhook endpoint (uses state.provider)
- Remove unused provider_label variables from channel handlers
- Remove unused imports (ChatMessage, ProviderCapabilityError)
- Fix pre-existing test compilation issue (missing SkillsPromptInjectionMode)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Previously, BedrockProvider only read credentials from environment
variables (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY). When running
on EC2 with an IAM instance role, the env vars are not set, causing
all Bedrock calls to fail with 'credentials not set'.
Changes:
- Add AwsCredentials::from_imds(): fetches temporary credentials from
EC2 IMDSv2 (PUT token → get role name → get credentials → get region)
- Add AwsCredentials::resolve(): tries env vars first, falls back to IMDS
- Add BedrockProvider::resolve_credentials(): async method called per
request, so expired instance role tokens are automatically refreshed
- chat() and chat_with_system() now call resolve_credentials() instead
of require_credentials(), enabling seamless EC2 instance role auth
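The resolution order is roughly the following (a sketch; the struct shape is an assumption and the IMDSv2 round-trips are elided):
```rust
// Env-first credential resolution with an IMDS fallback, as described above.
#[derive(Debug, Clone)]
pub struct AwsCredentials {
    pub access_key_id: String,
    pub secret_access_key: String,
    pub session_token: Option<String>,
}

impl AwsCredentials {
    pub async fn resolve() -> anyhow::Result<Self> {
        // Static env credentials win when present (local dev, CI).
        if let (Ok(access_key_id), Ok(secret_access_key)) = (
            std::env::var("AWS_ACCESS_KEY_ID"),
            std::env::var("AWS_SECRET_ACCESS_KEY"),
        ) {
            return Ok(Self {
                access_key_id,
                secret_access_key,
                session_token: std::env::var("AWS_SESSION_TOKEN").ok(),
            });
        }
        // Otherwise fetch temporary credentials from the EC2 instance role.
        Self::from_imds().await
    }

    async fn from_imds() -> anyhow::Result<Self> {
        // IMDSv2 token/role/credential requests elided in this sketch.
        anyhow::bail!("IMDS fetch not shown here")
    }
}
```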
Gemini thinking models (e.g. gemini-3-pro-preview) return response parts
with `thought: true` for internal reasoning and `thoughtSignature` for
opaque signatures. The previous extraction logic blindly took the first
part, which was the thinking part, returning reasoning text instead of the
actual answer.
- Add `thought` field to `ResponsePart` to detect reasoning parts
- Add `effective_text()` on `CandidateContent` to skip thinking/signature
parts and extract only the real answer (falls back to thinking text if
no non-thinking content is available)
- Make `Candidate.content` optional to guard against candidates with no
content (e.g. safety-filtered responses)
- Add 7 focused tests covering thinking, non-thinking, fallback, empty,
multi-part, signature-only, and internal API responses
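A condensed sketch of the extraction; field names follow the description above, but the surrounding response structs are simplified assumptions:
```rust
// Skip thinking/signature parts, falling back to thinking text if nothing
// else is present.
use serde::Deserialize;

#[derive(Debug, Deserialize)]
#[serde(rename_all = "camelCase")]
struct ResponsePart {
    text: Option<String>,
    #[serde(default)]
    thought: bool,
    thought_signature: Option<String>,
}

#[derive(Debug, Deserialize)]
struct CandidateContent {
    parts: Vec<ResponsePart>,
}

impl CandidateContent {
    fn effective_text(&self) -> String {
        let answer: String = self
            .parts
            .iter()
            .filter(|p| !p.thought && p.thought_signature.is_none())
            .filter_map(|p| p.text.as_deref())
            .collect();
        if !answer.is_empty() {
            return answer;
        }
        // Fallback: no non-thinking content available.
        self.parts.iter().filter_map(|p| p.text.as_deref()).collect()
    }
}
```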
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove duplicate chat method in ReliableProvider impl (E0201)
The second chat fn (lines 662-769) was an exact duplicate of the
first (lines 540-647) in the same impl block.
- Gate PostgresMemory usage in memory CLI behind memory-postgres feature (E0433)
super::PostgresMemory is only exported when the feature is enabled;
the Postgres match arm now compiles to an explicit bail when the
feature is off.
- Replace futures::future::join_all with futures_util::future::join_all (E0433)
The crate depends on futures-util, not futures. Fixed in both
agent.rs and loop_.rs.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The previous commit filtered tool_specs (native API tools) but the
system prompt still contained text descriptions like "shell: Execute
terminal commands" which caused the model to generate XML-based
<function_calls> tool invocations in its text response.
Filter tool_descs using the same non_cli_excluded_tools config so
excluded tools are not mentioned anywhere the LLM can see them.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- channels/telegram.rs: support photo messages in parse_update_message;
add resolve_photo_data_uri() to download images via the Telegram getFile
API and resize them to 512px before base64 encoding
- providers/bedrock.rs: add parse_user_content_blocks() to extract
[IMAGE:data:...] markers and build proper Bedrock image content blocks;
apply to both chat() and chat_with_system() paths; set vision: true
in provider capabilities
- Cargo.toml: add image crate v0.25 (jpeg/png) for server-side resize
Update the hardcoded synthetic provider base URL from https://api.synthetic.com
to https://api.synthetic.new/openai/v1 to match the actual API endpoint.
The user verified locally that the old URL doesn't work and confirmed the fix
works by using the custom provider syntax as a workaround:
default_provider = "custom:https://api.synthetic.new/openai/v1"
This change makes the synthetic provider work out of the box without requiring
users to use the custom provider workaround.
- Problem: Agent relies on `shell` + `find` for file search — fragile syntax, raw output, broad permissions
- Why it matters: Structured tool reduces failed tool calls and tightens security boundary
- What changed: New `glob_search` tool in `default_tools` and `all_tools`; searches workspace by glob pattern with
full security checks
- What did **not** change (scope boundary): No changes to security policy, config schema, providers, or agent loop
Two bugs caused Telegram replies to fail with "message is too long":
1. split_message_for_telegram splits at exactly 4096 chars, but send_text_chunks
then appends continuation markers ("(continued)\n\n" / "\n\n(continues...)"),
pushing the actual sent text over Telegram's 4096 limit. Fixed by reserving
30 chars of headroom in the split limit.
2. strip_tool_call_tags did not handle <function_calls> / <function_call> wrapper
tags. When the LLM returns raw XML function calls, the unstripped angle brackets
break Telegram's Markdown parser, and the full XML payload exceeds the length
limit on the plain-text fallback.
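The headroom fix is essentially the following (constants illustrative; the real splitter also avoids breaking mid-word):
```rust
// Reserve room for continuation markers so marker + chunk stays under 4096.
const TELEGRAM_LIMIT: usize = 4096;
const MARKER_HEADROOM: usize = 30;

fn split_for_telegram(text: &str) -> Vec<String> {
    let limit = TELEGRAM_LIMIT - MARKER_HEADROOM;
    text.chars()
        .collect::<Vec<char>>()
        .chunks(limit)
        .map(|chunk| chunk.iter().collect())
        .collect()
}
```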
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
ZeroClaw's memory system powers context injection, auto-save, and long-term agent identity — but until now users had
**zero visibility** into what's stored. No way to list, inspect, audit, or clean up memory outside the agent loop.
`zeroclaw memory` closes this gap with four subcommands:
- **`list`** — browse all entries with `--category`/`--session` filters and `--limit`/`--offset` pagination
- **`get`** — inspect a single entry by key (supports prefix match — no need to copy full UUID)
- **`stats`** — backend health, total count, per-category breakdown at a glance
- **`clear`** — batch delete by `--category`, single delete by `--key`, with confirmation prompt (`--yes` to skip)
| Before | After |
|--------|-------|
| Memory is a black box | `memory stats` shows health + distribution |
| Can't see what auto-save stored | `memory list --category conversation` |
| Can't inspect a specific entry | `memory get <key-or-prefix>` |
| Can't clean stale data without `/clear` in agent | `memory clear --category daily --yes` |
| Must enter agent loop to manage memory | Direct CLI, no LLM invocation needed |
| File | Change |
|------|--------|
| `src/memory/cli.rs` | **New** — CLI handler with list/get/stats/clear + unit tests |
| `src/memory/mod.rs` | Add `pub mod cli` |
| `src/lib.rs` | Add `MemoryCommands` public enum |
| `src/main.rs` | Add private `MemoryCommands`, `Commands::Memory` variant, match arm |
- **Lightweight backend creation**: CLI uses `create_memory_for_migration` (no embedding provider) since
list/get/stats/clear don't need vector search. Postgres handled separately.
- **Prefix matching**: Both `get` and `clear --key` fall back to prefix search when exact match fails — essential
since keys are UUIDs.
- **Confirmation by default**: All destructive operations require `dialoguer::Confirm`; `--yes` for
scripts/automation.
- **Record-style list output**: Full key displayed (no truncation), one entry per block — keys are too long for
tabular layout.
ReliableProvider was missing a chat() override, causing it to fall through
to the default Provider::chat() trait implementation. The default
implementation delegates to chat_with_history() which returns a plain
String and wraps it in ChatResponse with tool_calls: Vec::new() — so
native tool calling was completely broken through the retry/failover
wrapper even though the underlying provider properly supports it.
Changes:
- Add chat() with full retry/backoff/failover logic matching existing
chat_with_system(), chat_with_history(), and chat_with_tools() overrides
- Include context_window_exceeded early-exit matching other method patterns
- Add 7 focused tests: delegation with tool calls, retry recovery,
supports_native_tools propagation, aggregated error reporting,
model failover, non-retryable error skip, and system prompt zero-XML
verification
On non-CLI channels (Telegram, Discord, etc.), tools like shell and
file_write cannot receive interactive approval and are auto-denied,
causing the LLM to see confusing error responses and fabricate answers.
Add a new config option `non_cli_excluded_tools` under `[autonomy]`
that removes specified tools from the tool specs sent to the LLM on
non-CLI channels. This prevents the model from attempting tool calls
that would fail, forcing it to use data already in the system prompt.
The change filters tool_specs in run_tool_call_loop when the
excluded_tools parameter is non-empty. CLI channels are unaffected.
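The filter itself is a simple name check along these lines (a sketch assuming specs expose a top-level `name` field, which may not match the real spec JSON layout):
```rust
// Drop excluded tool specs on non-CLI channels; CLI passes through untouched.
fn filter_tool_specs(
    specs: Vec<serde_json::Value>,
    excluded: &[String],
    is_cli: bool,
) -> Vec<serde_json::Value> {
    if is_cli || excluded.is_empty() {
        return specs;
    }
    specs
        .into_iter()
        .filter(|spec| {
            spec.get("name")
                .and_then(|n| n.as_str())
                .map_or(true, |name| !excluded.iter().any(|e| e.as_str() == name))
        })
        .collect()
}
```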
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Linux managed daemon now falls back to systemd when OpenRC restart probe fails, instead of returning early with no action.
- OpenRC uninstall no longer fails hard if rc-update del fails; it warns and continues to remove the init script.
Switch OpenRC service generation from env exports
(ZEROCLAW_CONFIG_DIR/WORKSPACE) to explicit command_args with
--config-dir flag. Fixes startup crash with 'Permission denied (os error
13)' under OpenRC init system.
Add automatic runtime-state migration to /etc/zeroclaw with secure ownership/permissions. Implement env-based config resolution for service startup, eliminating the need for manual --service-init flags in the happy path.
- Add global --config-dir CLI flag that sets ZEROCLAW_CONFIG_DIR env
- Add ZEROCLAW_CONFIG_DIR override in config resolution (takes precedence)
- Update OpenRC script to use --config-dir and set env vars for config/workspace
- Prefer /usr/local/bin/zeroclaw for OpenRC executable
- Create /etc/zeroclaw/workspace directory with correct ownership on install
- Update docs to reflect --service-init flag order (service-level before subcommand)
- Alpine adduser -S doesn't create a group automatically
- Explicitly create group with addgroup -S zeroclaw first
- Then add user with -G zeroclaw to join the group
- Update error message commands to include group handling
OpenRC service runs as zeroclaw:zeroclaw, so group must exist.
- Detect Alpine Linux via /etc/alpine-release
- Use adduser/deluser on Alpine instead of useradd/userdel
- Auto-create zeroclaw system user during install
- Provide correct commands in error messages
Alpine uses BusyBox which has different user management commands:
- adduser -S -s /sbin/nologin -H -D zeroclaw (Alpine)
- useradd -r -s /sbin/nologin zeroclaw (Debian/RHEL)
- Add chown_to_zeroclaw() helper to change directory ownership
- Log directory /var/log/zeroclaw now owned by zeroclaw:zeroclaw
- Fix docs: config file should be owned by zeroclaw:zeroclaw
(service runs as zeroclaw user, needs read access)
Fixes permission denied error when service tries to write logs.
- Add InitSystem enum with auto-detection (systemd/OpenRC)
- Add --service-init CLI flag to override init system detection
- Generate OpenRC init script with security hardening:
- Runs as zeroclaw:zeroclaw user
- umask 027 for file permissions
- Logs to /var/log/zeroclaw/
- Depends on net and firewall
- Require root for OpenRC install with clear error message
- Warn if binary is in home directory
- Add OpenRC auto-restart support in channels module
- Document OpenRC setup in README and network-deployment.md
Non-goals:
- No changes to systemd behavior
- No user-level OpenRC services
- No other init systems (SysV, runit, s6)
Security: OpenRC install requires root, validates user, creates
directories with proper permissions
When autonomy is set to "supervised", the approval gate only prompted
interactively on CLI. On Telegram and other channels, all tool calls
were silently auto-approved with ApprovalResponse::Yes, including
high-risk tools like shell — completely bypassing supervised mode.
On non-CLI channels where interactive prompting is not possible, deny
tool calls that require approval instead of auto-approving. Users can
expand the auto_approve list in config to explicitly allow specific
tools on non-interactive channels.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Restrict 19 internal-only modules from pub to pub(crate) in lib.rs,
reducing the public API surface of the library crate.
Modules kept pub (used by integration tests, benchmarks, or are
documented extension points per AGENTS.md):
agent, channels, config, gateway, memory, observability,
peripherals, providers, rag, runtime, tools
Modules restricted to pub(crate) (not imported via zeroclaw:: by any
external consumer):
approval, auth, cost, cron, daemon, doctor, hardware, health,
heartbeat, identity, integrations, migration, multimodal, onboard,
security, service, skills, tunnel, util
Also restrict 6 command enums (ServiceCommands, ChannelCommands,
SkillCommands, MigrateCommands, CronCommands, IntegrationCommands)
to pub(crate) — main.rs defines its own copies and does not import
these from the library crate. HardwareCommands and PeripheralCommands
remain pub as main.rs imports them via zeroclaw::.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When parallel_tools is enabled, both code branches in execute_tools()
ran the same sequential for loop. The parallel path was a no-op.
Use futures::future::join_all to execute tool calls concurrently when
parallel_tools is true. The futures crate is already a dependency.
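In sketch form (the real execute_tools() operates on the agent's tool-call types rather than plain strings):
```rust
use futures::future::join_all;

// Toy stand-in for a single tool execution.
async fn run_tool(call: String) -> String {
    format!("ran {call}")
}

async fn execute_tools(calls: Vec<String>, parallel_tools: bool) -> Vec<String> {
    if parallel_tools {
        // Launch all calls and await them together.
        join_all(calls.into_iter().map(run_tool)).await
    } else {
        let mut results = Vec::with_capacity(calls.len());
        for call in calls {
            results.push(run_tool(call).await);
        }
        results
    }
}
```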
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Replace the single shared typing_handle with a HashMap keyed by
recipient channel ID. Previously, concurrent messages would fight
over one handle — starting typing for message B would cancel message
A's indicator, and stopping one would kill the other's.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
rotate_key() selects the next key in the round-robin but never applies
it to the underlying provider (Provider trait has no set_api_key
method). The previous info-level log implied rotation was working.
Change to warn-level and explicitly state the key is not applied,
making the limitation visible to operators instead of silently
pretending rotation works.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Network access (web search via DuckDuckGo) should require explicit user
consent rather than being enabled by default. This aligns with the
least-surprise principle and the project's secure-by-default policy:
users must opt in to external network requests.
Changes:
- WebSearchConfig::default() now sets enabled: false
- Serde default for enabled field changed from default_true to default
(bool defaults to false)
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Three issues prevented the Gemini OAuth path from working end-to-end:
1. Missing `project` field — the internal API returns 500 without it.
Added project field to InternalGenerateContentRequest and
resolve_oauth_project() to fetch it via loadCodeAssist endpoint.
2. No token refresh — stale access_token was read at construction time
and never refreshed. Google OAuth tokens expire after ~1 hour,
breaking long-lived daemon processes. Added runtime token refresh
with OAuthTokenState (Arc<Mutex>) that checks expiry before each
request and refreshes proactively (60s buffer).
3. Wrong response format — internal API nests candidates under a
`response` field. Added InternalGenerateContentResponse wrapper
and conditional deserialization in send_generate_content().
Also fixes OAuth warmup to call resolve_oauth_project() instead of
listing models on the public endpoint (which rejects OAuth tokens).
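The refresh check boils down to something like this (a sketch; the real OAuthTokenState holds more fields and the refresh request itself is elided):
```rust
use std::time::{Duration, SystemTime};

// Proactive expiry check with a 60s buffer, as described above.
struct OAuthTokenState {
    access_token: String,
    expires_at: SystemTime,
}

impl OAuthTokenState {
    fn needs_refresh(&self) -> bool {
        // Refresh a minute early so in-flight requests never race the deadline.
        SystemTime::now() + Duration::from_secs(60) >= self.expires_at
    }
}
```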
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AnthropicProvider declared supports_native_tools() = true but did not
override chat_with_tools(). The default trait implementation drops all
conversation history (sends only system + last user message), breaking
multi-turn conversations on Telegram and other channels.
Changes:
- Override chat_with_tools() in AnthropicProvider: converts OpenAI-format
tool JSON to ToolSpec and delegates to chat() which preserves full
message history
- Skip build_tool_instructions() XML protocol when provider supports
native tools (saves ~12k chars in system prompt)
- Remove duplicate Tool Use Protocol section from build_system_prompt()
for native-tool providers
- Update Your Task section to encourage conversational follow-ups
instead of XML tool_call tags when using native tools
- Add tracing::warn for malformed tool definitions in chat_with_tools
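The conversion step is roughly as follows (a sketch; ToolSpec here is a stand-in for the real type and the JSON layout assumes the OpenAI `function` wrapper):
```rust
use serde_json::Value;

// Simplified ToolSpec stand-in.
#[derive(Debug, Clone)]
struct ToolSpec {
    name: String,
    description: String,
    parameters: Value,
}

// Convert one OpenAI-format tool definition; returns None (so the caller can
// warn) when the definition is malformed.
fn tool_spec_from_openai(tool: &Value) -> Option<ToolSpec> {
    let f = tool.get("function")?;
    Some(ToolSpec {
        name: f.get("name")?.as_str()?.to_string(),
        description: f
            .get("description")
            .and_then(Value::as_str)
            .unwrap_or_default()
            .to_string(),
        parameters: f.get("parameters").cloned().unwrap_or(Value::Null),
    })
}
```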
Two fixes for conversation history quality:
1. Store raw msg.content in ConversationHistoryMap instead of
enriched_message — memory context is ephemeral per-request and
pollutes future turns when persisted.
2. Skip memory recall when conversation history exists — prior turns
already provide context. Memory recall adds noise and can mislead
the model (e.g. old 'seen' entries overshadowing a code variable
named seen in the current conversation).