Narration text from native tool-call providers that doesn't end with a newline now gets one appended before dispatch to the draft updater. Prevents garbled output in Telegram drafts.
Closes#4348
* feat(tools): add cross-channel poll creation tool
Adds a poll tool that enables cross-channel poll creation with voting
support. Changes all_tools_with_runtime return type from 3-tuple to
4-tuple to accommodate the new reaction handle.
Original PR #4243 by rareba.
* ci: retrigger CI
* feat(agent): add loop detection guardrail for repetitive tool calls
Introduces a LoopDetector that monitors a sliding window of recent tool
calls and detects three repetitive patterns:
1. Exact repeat — same tool+args called consecutively (default 3+)
2. Ping-pong — two tools alternating for 4+ cycles
3. No progress — same tool with different args but identical results (5+)
Each pattern escalates through Warning -> Block -> CircuitBreaker.
Configurable via [pacing] section: loop_detection_enabled (default true),
loop_detection_window_size (default 20), loop_detection_max_repeats
(default 3).
Wired into run_tool_call_loop alongside the existing time-gated
identical-output detector. Respects loop_ignore_tools exclusion list.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(agent): fix channel test interaction with loop detector
The max_tool_iterations channel tests use an IterativeToolProvider that
intentionally repeats identical tool calls. The loop detector (enabled by
default) fires its circuit breaker before max_tool_iterations is reached,
causing the test to fail. Disable loop detection in these two tests so
they exercise only the max_tool_iterations boundary.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(agent): address PR #4240 review — loop detector correctness and test precision
Critical fixes:
- Fix result_index/tool_calls misalignment: use enumerate() before
filter_map() so the index stays aligned with tool_calls even when
ordered_results contains None entries from skipped tool calls.
- Fix hash_value JSON key-order sensitivity: canonicalise() recursively
sorts object keys before serialisation so {"a":1,"b":2} and
{"b":2,"a":1} hash identically.
Tightened test assertions:
- ping_pong_escalates_with_more_cycles: assert Block with 5 cycles
(was loose Warning|Block|Break match).
- no_progress_escalates_to_block_and_break: assert Break at 7 calls
(was loose Block|Break match).
- no_progress_not_triggered_when_all_args_identical: assert Warning
specifically (was accepting Ok as alternative).
New tests:
- ping_pong_detects_alternation_with_varying_args (item 3)
- window_eviction_prevents_stale_pattern_detection (item 4)
- hash_value_is_key_order_independent + nested variant (item 2)
- pacing_config_serde_defaults_match_manual_default (item 5)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
---------
Co-authored-by: rareba <rareba@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(agent): add thinking/reasoning level control per message
Users can set reasoning depth via /think:high etc. with resolution
hierarchy (inline > session > config > default). 6 levels from Off
to Max. Adjusts temperature and system prompt.
* fix(agent): prevent thinking level prefix from leaking across interactive turns
system_prompt was mutated in place for the first message's thinking
directive, then used as the "baseline" for restoration after each
interactive turn. This caused the first turn's thinking prefix to
persist across all subsequent turns.
Fix: save the original system_prompt before any thinking modifications
and restore from that saved copy between turns.
Apply exponential time decay (2^(-age/half_life), 7-day half-life) to
memory entry scores post-recall. Core memories are exempt (evergreen).
Consolidate duplicate half-life constants into a single public constant
in the decay module.
Based on PR #4266 by 5queezer with constant consolidation fix.
Skill tools defined in [[tools]] sections are now registered as first-class
callable tool specs via the Tool trait, rather than only appearing as XML
in the system prompt. This enables the LLM to invoke skill tools through
native function calling.
- Add SkillShellTool for shell/script kind skill tools
- Add SkillHttpTool for http kind skill tools
- Add skills_to_tools() conversion and register_skill_tools() wiring
- Wire registration into both CLI and process_message agent paths
- Update prompt rendering to mark registered tools as callable
- Update affected tests across skills, agent/prompt, and channels
When vision_provider is configured in [multimodal] config, messages
containing [IMAGE:] markers are automatically routed to the specified
vision-capable provider instead of failing on the default text provider.
Closes#4119
* feat(gateway): add Live Canvas (A2UI) tool and real-time web viewer
Add a Live Canvas system that enables the agent to push rendered content
(HTML, SVG, Markdown, text) to a web-visible canvas in real time.
Backend:
- src/tools/canvas.rs: CanvasTool with render/snapshot/clear/eval actions,
backed by a shared CanvasStore (Arc<RwLock<HashMap>>) with per-canvas
broadcast channels for real-time updates
- src/gateway/canvas.rs: REST endpoints (GET/POST/DELETE /api/canvas/:id,
GET /api/canvas/:id/history, GET /api/canvas) and WebSocket endpoint
(WS /ws/canvas/:id) for real-time frame delivery
Frontend:
- web/src/pages/Canvas.tsx: Canvas viewer page with WebSocket connection,
iframe sandbox rendering, canvas switcher, frame history panel
Registration:
- CanvasTool registered in all_tools_with_runtime (always available)
- Canvas routes wired into gateway router
- CanvasStore added to AppState
- Canvas page added to App.tsx router and Sidebar navigation
- i18n keys added for en/zh/tr locales
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(config): fix pre-existing test compilation errors in schema.rs
- Remove #[cfg(unix)] gate on `use tempfile::TempDir` import since
TempDir is used unconditionally in bootstrap file tests
- Add explicit type annotations on tokio::fs::* calls to resolve
type inference failures (create_dir_all, write, read_to_string)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(gateway): share CanvasStore between tool and REST API
The CanvasTool and gateway AppState each created their own CanvasStore,
so content rendered via the tool never appeared in the REST API.
Create the CanvasStore once in the gateway, pass it to
all_tools_with_runtime via a new optional parameter, and reuse the
same instance in AppState.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix(gateway): address critical security and reliability bugs in Live Canvas
- Validate content_type in REST POST endpoint against allowed set,
preventing injection of "eval" frames via the REST API
- Enforce MAX_CONTENT_SIZE (256KB) limit on REST POST endpoint,
matching tool-side validation to prevent memory exhaustion
- Add MAX_CANVAS_COUNT (100) limit to prevent unbounded canvas creation
and memory exhaustion from CanvasStore
- Handle broadcast RecvError::Lagged in WebSocket handler gracefully
instead of disconnecting the client
- Make MAX_CONTENT_SIZE and ALLOWED_CONTENT_TYPES pub for gateway reuse
- Update CanvasStore::render and subscribe to return Option for
canvas count enforcement
---------
Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: rareba <rareba@users.noreply.github.com>
For models with small context windows (e.g. glm-4.5-air ~8K tokens),
the system prompt alone can exceed the limit. This adds:
- max_system_prompt_chars config option (default 0 = unlimited)
- compact_context now also compacts the system prompt: skips the
Channel Capabilities section and shows only tool names
- Truncation with marker when prompt exceeds the budget
Users can set `max_system_prompt_chars = 8000` in [agent] config
to cap the system prompt for small-context models.
Closes#4124
Add ReactionTool that exposes Channel::add_reaction and
Channel::remove_reaction as an agent-callable tool. Uses a
late-binding ChannelMapHandle (Arc<RwLock<HashMap>>) pattern
so the tool can be constructed during tool registry init and
populated once channels are available in start_channels.
Parameters: channel, message_id, emoji, action (add/remove).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* feat(config): add configurable pacing controls for slow/local LLM workloads (#2963)
Add a new `[pacing]` config section with four opt-in parameters that
let users tune timeout and loop-detection behavior for local LLMs
(Ollama, llama.cpp, vLLM) without disabling safety features entirely:
- `step_timeout_secs`: per-step LLM inference timeout independent of
the overall message budget, catching hung model responses early.
- `loop_detection_min_elapsed_secs`: time-gated loop detection that
only activates after a configurable grace period, avoiding false
positives on long-running browser/research workflows.
- `loop_ignore_tools`: per-tool loop-detection exclusions so tools
like `browser_screenshot` that structurally resemble loops are not
counted toward identical-output detection.
- `message_timeout_scale_max`: overrides the hardcoded 4x ceiling in
the channel message timeout scaling formula.
All parameters are strictly optional with no effect when absent,
preserving full backwards compatibility.
Closes#2963
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* fix(config): add missing pacing fields in tests and call sites
* fix(config): add pacing arg to remaining cost-tracking test call sites
---------
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
Adopted from #3705 by @fangxueshun with fixes:
- Added input validation for date strings (RFC 3339)
- Used chrono DateTime comparison instead of string comparison
- Added since < until validation
- Updated mem0 backend
Supersedes #3705
Adds per-channel cost tracking via task-local context in the tool call
loop. Budget enforcement blocks further API calls when limits are
exceeded. Resolves merge conflicts with model-switch retry loop,
reply_target parameter, and autonomy level additions on master.
Supersedes #3758
* feat(hardware): add RPi GPIO, Aardvark I2C/SPI/GPIO, and hardware plugin system
Extends the hardware subsystem with three clusters of functionality,
all feature-gated (hardware / peripheral-rpi) with no impact on default builds.
Raspberry Pi native support:
- src/hardware/rpi.rs: board self-discovery (model, serial, revision),
sysfs GPIO pin read/write, and ACT LED control
- scripts/99-act-led.rules: udev rule for non-root ACT LED access
- scripts/deploy-rpi.sh, scripts/rpi-config.toml, scripts/zeroclaw.service:
one-shot deployment helper and systemd service template
Total Phase Aardvark USB adapter (I2C / SPI / GPIO):
- crates/aardvark-sys/: new workspace crate with FFI bindings loaded at
runtime via libloading; graceful stub fallback when .so is absent or
arch mismatches (Rosetta 2 detection)
- src/hardware/aardvark.rs: AardvarkTransport implementing Transport trait
- src/hardware/aardvark_tools.rs: agent tools i2c_scan, i2c_read,
i2c_write, spi_transfer, gpio_aardvark
- src/hardware/datasheet.rs: datasheet search/download for detected devices
- docs/aardvark-integration.md, examples/hardware/aardvark/: guide + examples
Hardware plugin / ToolRegistry system:
- src/hardware/tool_registry.rs: ToolRegistry for hardware module tool sets
- src/hardware/loader.rs, src/hardware/manifest.rs: manifest-driven loader
- src/hardware/subprocess.rs: subprocess execution helper for board I/O
- src/gateway/hardware_context.rs: POST /api/hardware/reload endpoint
- src/hardware/mod.rs: exports all new modules; merge_hardware_tools and
load_hardware_context_prompt helpers
Integration hooks (minimal surface):
- src/hardware/device.rs: DeviceKind::Aardvark, DeviceRuntime::Aardvark,
has_aardvark / resolve_aardvark_device on DeviceRegistry
- src/hardware/transport.rs: TransportKind::Aardvark
- src/peripherals/mod.rs: gate create_board_info_tools behind hardware feature
- src/agent/loop_.rs: TOOL_CHOICE_OVERRIDE task-local for Anthropic provider
- src/providers/anthropic.rs: read TOOL_CHOICE_OVERRIDE; add tool_choice field
- Cargo.toml: add aardvark-sys to workspace and as dependency
- firmware/zeroclaw-nucleo/: update Cargo.toml and Cargo.lock
Non-goals:
- No changes to agent orchestration, channels, providers, or security policy
- No new config keys outside existing [hardware] / [peripherals] sections
- No CI workflow changes
Risk: Low. All new paths are feature-gated; aardvark.so loads at runtime
only when present. No schema migrations or persistent state introduced.
Rollback: revert this single commit.
* fix(hardware): resolve clippy and rustfmt CI failures
- struct_excessive_bools: allow on DeviceCapabilities (7 bool fields needed)
- unnecessary_debug_formatting: use .display() instead of {:?} for paths
- stable_sort_primitive: replace .sort() with .sort_unstable() on &str slices
* fix(hardware): add missing serial/uf2/pico modules declared in mod.rs
cargo fmt was exiting with code 1 because mod.rs declared pub mod serial,
uf2, pico_flash, pico_code but those files were missing from the branch.
Also apply auto-formatting to loader.rs.
* fix(hardware): apply rustfmt 1.92.0 formatting (matches CI toolchain)
* docs(scripts): add RPi deployment and interaction guide
* push
* feat(firmware): add initial Pico firmware and serial device handling
- Introduced main.py for ZeroClaw Pico firmware with a placeholder for MicroPython implementation.
- Added binary UF2 file for Pico deployment.
- Implemented serial device enumeration and validation in the hardware module, enhancing security by restricting allowed serial paths.
- Updated related modules to integrate new serial device functionality.
---------
Co-authored-by: ehushubhamshaw <eshaw1@wpi.edu>
tool_search activates deferred MCP tools into ActivatedToolSet at runtime.
When tool_search runs in parallel with the tools it activates, a race
condition occurs where tool lookups happen before activation completes,
resulting in "Unknown tool" errors.
Force sequential execution in should_execute_tools_in_parallel() whenever
tool_search is present in the tool call batch.
Co-authored-by: Claude Code (claude-opus-4-6) <noreply@anthropic.com>
Preserve assistant text from native tool-call turns in draft updates. Falls back to response_text when parsed_text is empty and native tool calls are present. Relays text through on_delta for draft-capable channels like Telegram.
Supersedes #3976. Closes#3974
Resolve conflict in src/channels/mod.rs Safety section. Keeps the
PR's AutonomyConfig-based prompt construction (build_system_prompt_with_mode_and_autonomy)
while incorporating master's granular safety rules (conditional
destructive-command and ask-before-acting lines based on autonomy level).
Also fixes missing autonomy_level arg in refresh-skills test and removes
duplicate autonomy.level args from auto-merged call sites.
- Channel tool filtering (`non_cli_excluded_tools`) now respects
`autonomy.level = "full"` — full-autonomy agents keep all tools
available regardless of channel.
- Gateway `process_message` now creates and passes an `ApprovalManager`
to `agent_turn`, so `ReadOnly`/`Supervised` policies are enforced
instead of silently skipped.
- Gateway also applies `non_cli_excluded_tools` filtering with the same
full-autonomy bypass.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When autonomy.level is set to "full", the channel/web system prompt no
longer includes instructions telling the model to ask for permission
before executing tools. Previously these safety lines were hardcoded
regardless of autonomy config, causing the LLM to simulate approval
dialogs in channel and web-interface modes even though the
ApprovalManager correctly allowed execution.
The fix adds an autonomy_level parameter to build_system_prompt_with_mode
and conditionally omits the "ask before acting" instructions when the
level is Full. Core safety rules (no data exfiltration, prefer trash)
are always included.
Add a locale-aware tool description system that loads translations from
TOML files in tool_descriptions/. This enables non-English users to see
tool descriptions in their language.
- Add src/i18n.rs module with ToolDescriptions loader, locale detection
(ZEROCLAW_LOCALE, LANG, LC_ALL env vars), and English fallback chain
- Add locale config field to Config struct for explicit locale override
- Create tool_descriptions/en.toml with all 47 tool descriptions
- Create tool_descriptions/zh-CN.toml with Chinese translations
- Integrate with ToolsSection::build() and build_tool_instructions()
to resolve descriptions from locale files before hardcoded fallback
- Add PromptContext.tool_descriptions field for prompt-time resolution
- Add AgentBuilder.tool_descriptions() setter for Agent construction
- Include tool_descriptions/ in Cargo.toml package include list
- Add 8 unit tests covering locale loading, fallback chains, env
detection, and config override
Closes#3901
The seen_tool_signatures HashSet was initialized outside the iteration loop, causing cross-iteration deduplication of legitimate tool calls. This triggered a self-correction spiral where the agent repeatedly attempted skipped calls until hitting max_iterations.
Moving the HashSet inside the loop ensures deduplication only applies within a single iteration, as originally intended.
Fixes#3798
Add support for switching AI models at runtime during a conversation.
The model_switch tool allows users to:
- Get current model state
- List available providers
- List models for a provider
- Switch to a different model
The switch takes effect immediately for the current conversation by
recreating the provider with the new model after tool execution.
Risk: Medium - internal state changes and provider recreation
* feat(runtime): add configurable reasoning effort
* fix(test): add missing reasoning_effort field in live test
Add reasoning_effort: None to ProviderRuntimeOptions construction in
openai_codex_vision_e2e.rs to fix E0063 compile error.
---------
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
* fix(tools): wire ActivatedToolSet into tool dispatch and spec advertisement
When deferred MCP tools are activated via tool_search, they are stored
in ActivatedToolSet but never consulted by the tool call loop.
tool_specs is built once before the iteration loop and never refreshed,
so the provider API tools[] parameter never includes activated tools.
find_tool only searches the static registry, so execution dispatch also
fails silently.
Thread Arc<Mutex<ActivatedToolSet>> from creation sites through to
run_tool_call_loop. Rebuild tool_specs each iteration to merge base
registry specs with activated specs. Add fallback in execute_one_tool
to check the activated set when the static registry lookup misses.
Change ActivatedToolSet internal storage from Box<dyn Tool> to
Arc<dyn Tool> so we can clone the Arc out of the mutex guard before
awaiting tool.execute() (std::sync::MutexGuard is not Send).
* fix(tools): add activated_tools field to new ChannelRuntimeContext test site
* fix(agent): remove bare URL → curl fallback in GLM-style tool call parser
The `parse_glm_style_tool_calls` function had a "Plain URL" fallback
that converted any bare URL line (e.g. `https://example.com`) into a
`shell` tool call running `curl -s '<url>'`. This caused:
- False positives: normal URLs in LLM replies misinterpreted as tool calls
- Swallowed replies: text with URLs not forwarded to the channel
- Unintended shell commands: `curl` executed without user intent
Explicit GLM-format tool calls like `browser_open/url>https://...` and
`shell/command>...` are unaffected — only the bare URL catch-all is
removed.
* style: cargo fmt
---------
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
Add an optional `allowed_tools` parameter that restricts which tools are
available to the agent. When `Some(list)`, only tools whose name appears
in the list are retained; when `None`, all tools remain available
(backward compatible). This enables fine-grained capability control for
cron jobs, heartbeat tasks, and CLI invocations.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Channel-driven runs (Telegram, Matrix, Discord, etc.) previously bypassed
the ApprovalManager entirely — `None` was passed into the tool-call loop,
so `auto_approve`, `always_ask`, and supervised approval checks were
silently skipped for all non-CLI execution paths.
Add a non-interactive mode to ApprovalManager that enforces the same
autonomy config policies but auto-denies tools requiring interactive
approval (since no operator is present on channel runs). Specifically:
- Add `ApprovalManager::for_non_interactive()` constructor that creates
a manager which auto-denies tools needing approval instead of prompting
- Add `is_non_interactive()` method so the tool-call loop can distinguish
interactive (CLI prompt) from non-interactive (auto-deny) managers
- Update tool-call loop: non-interactive managers auto-deny instead of
the previous auto-approve behavior for non-CLI channels
- Wire the non-interactive approval manager into ChannelRuntimeContext
so channel runs enforce the full approval policy
- Add 8 tests covering non-interactive approval behavior
Security implications:
- `always_ask` tools are now denied on channels (previously bypassed)
- Supervised-mode unknown tools are now denied on channels (previously
bypassed)
- `auto_approve` tools continue to work on channels unchanged
- `full` autonomy mode is unaffected (no approval needed regardless)
- `read_only` mode is unaffected (blocks execution elsewhere)
Closes#3487
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When a tool call fails (security policy block, hook cancellation, user
denial, or execution error), the failure reason is now included in the
progress message sent to the chat channel via on_delta. Previously only
a ❌ icon was shown; now users see the actual reason (e.g. "Command not
allowed by security policy") without needing to check `zeroclaw doctor
traces`.
Closes#3628
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comprehensive long-running context upgrades:
- Token-based compaction: replace message-count trigger with token
estimation (~4 chars/token). Compaction fires when estimated tokens
exceed max_context_tokens (default 32K) OR message count exceeds
max_history_messages. Cuts at user-turn boundaries only.
- Persistent sessions: JSONL append-only session files per channel
sender in {workspace}/sessions/. Sessions survive daemon restarts.
Hydrates in-memory history from disk on startup.
- LLM-driven memory consolidation: two-phase extraction after each
conversation turn. Phase 1 writes a timestamped history entry (Daily).
Phase 2 extracts new facts/preferences to Core memory (if any).
Replaces raw message auto-save with semantic extraction.
- New config fields: agent.max_context_tokens (32000),
channels_config.session_persistence (true).
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
* feat(provider): support custom API path suffix for custom: endpoints
Allow users to configure a custom API path for custom/compatible
providers instead of hardcoding /v1/chat/completions. Some self-hosted
LLM servers use different API paths.
Adds an optional `api_path` field to:
- Config (top-level and model_providers profile)
- ProviderRuntimeOptions
- OpenAiCompatibleProvider
When set, the custom path is appended to base_url instead of the
default /chat/completions suffix.
Closes#3125
* fix: add missing api_path field to test ModelProviderConfig initializers
Add deferred MCP tool activation to reduce context window waste.
When mcp.deferred_loading is true (the default), MCP tool schemas
are not eagerly included in the LLM context. Instead, only tool
names appear in an <available-deferred-tools> system prompt section,
and the LLM calls the built-in tool_search tool to fetch full schemas
on demand. Setting deferred_loading to false preserves the existing
eager behavior.
Closes#3095
MCP tools were not visible to delegate subagents because parent_tools
was a static snapshot taken before MCP tool wiring. Switch to interior
mutability (parking_lot::RwLock) so MCP wrappers pushed after
DelegateTool construction are visible at sub-agent execution time.
Closes#3069