Commit Graph

34 Commits

Author SHA1 Message Date
Argenis 1f7c3c99e4 feat(i18n): externalize tool descriptions for translation (#3912)
Add a locale-aware tool description system that loads translations from
TOML files in tool_descriptions/. This enables non-English users to see
tool descriptions in their language.

- Add src/i18n.rs module with ToolDescriptions loader, locale detection
  (ZEROCLAW_LOCALE, LANG, LC_ALL env vars), and English fallback chain
- Add locale config field to Config struct for explicit locale override
- Create tool_descriptions/en.toml with all 47 tool descriptions
- Create tool_descriptions/zh-CN.toml with Chinese translations
- Integrate with ToolsSection::build() and build_tool_instructions()
  to resolve descriptions from locale files before hardcoded fallback
- Add PromptContext.tool_descriptions field for prompt-time resolution
- Add AgentBuilder.tool_descriptions() setter for Agent construction
- Include tool_descriptions/ in Cargo.toml package include list
- Add 8 unit tests covering locale loading, fallback chains, env
  detection, and config override

Closes #3901
2026-03-18 17:01:39 -04:00
Argenis fc2aac7c94 feat(gateway): persist WS chat sessions across restarts (#3813)
Gateway WebSocket chat sessions were in-memory only — conversation
history was lost on gateway restart, macOS sleep/wake, or client
reconnect. This wires up the existing SessionBackend (SQLite) to
the gateway WS handler so sessions survive restarts and reconnections.

Changes:
- Add delete_session() to SessionBackend trait + SQLite implementation
- Add session_persistence and session_ttl_hours to GatewayConfig
- Add Agent::seed_history() to hydrate agent from persisted messages
- Initialize SqliteSessionBackend in run_gateway() when enabled
- Send session_start message on WS connect with session_id + resumed
- Persist user/assistant messages after each turn
- Add GET /api/sessions and DELETE /api/sessions/{id} REST endpoints
- Bump version to 0.5.0

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 14:26:39 -04:00
Argenis f3fbd1b094 fix(web): preserve provider runtime options in ws agent (#3807)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 14:06:22 -04:00
Ericsunsk 83803cef5b fix(memory): filter autosave noise and scope recall/store by session (#3695)
* fix(memory): filter autosave noise and scope memory by session

* style: format rebase-resolved gateway and memory loader

* fix(tests): update memory loader mock for session-aware context

* fix(openai-codex): decode utf-8 safely across stream chunks
2026-03-16 16:36:35 -04:00
argenis de la rosa 98688c61ff feat(cache): wire two-tier response cache, multi-provider token tracking, and cache analytics
- Two-tier response cache: in-memory LRU (hot) + SQLite (warm) with TTL-aware eviction
- Wire response cache into agent turn loop (temp==0.0, text-only responses only)
- Parse Anthropic cache_creation_input_tokens/cache_read_input_tokens
- Parse OpenAI prompt_tokens_details.cached_tokens
- Add cached_input_tokens to TokenUsage, prompt_caching to ProviderCapabilities
- Add CacheHit/CacheMiss observer events with Prometheus counters
- Add response_cache_hot_entries config field (default: 256)
2026-03-16 12:44:48 -04:00
argenis de la rosa fabd35c4ea feat(security): add capability-based tool access control
Add an optional `allowed_tools` parameter that restricts which tools are
available to the agent. When `Some(list)`, only tools whose name appears
in the list are retained; when `None`, all tools remain available
(backward compatible). This enables fine-grained capability control for
cron jobs, heartbeat tasks, and CLI invocations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 19:34:34 -04:00
Argenis 939edf5e86 fix: expose MCP tools to delegate subagents (#3436)
MCP tools were not visible to delegate subagents because parent_tools
was a static snapshot taken before MCP tool wiring. Switch to interior
mutability (parking_lot::RwLock) so MCP wrappers pushed after
DelegateTool construction are visible at sub-agent execution time.

Closes #3069
2026-03-13 16:26:01 -04:00
reidliu41 96700d7952 Summary
- Problem: The existing http_request tool returns raw HTML/JSON, which is nearly unusable for LLMs to extract
  meaningful content from web pages.
- Why it matters: All mainstream AI agents (Claude Code, Gemini CLI, Aider) have dedicated web content extraction
  tools. ZeroClaw lacks this capability, limiting its ability to research and gather information from the web.
- What changed: Added a new web_fetch tool that fetches web pages and converts HTML to clean plain text using
  nanohtml2text. Includes domain allowlist/blocklist, SSRF protection, redirect following, and content-type aware
  processing.
- What did not change (scope boundary): http_request tool is untouched. No shared code extracted between http_request
   and web_fetch (DRY rule-of-three: only 2 callers). No changes to existing tool behavior or defaults.

Label Snapshot (required)

  - Risk label: risk: medium
  - Size label: size: M
  - Scope labels: tool, config
  - Module labels: tool: web_fetch
  - If any auto-label is incorrect, note requested correction: N/A

  Change Metadata

  - Change type: feature
  - Primary scope: tool

  Linked Issue

  - Closes #
  - Related #
  - Depends on #
  - Supersedes #

  Supersede Attribution (required when Supersedes # is used)

  N/A

  Validation Evidence (required)

  cargo fmt --all -- --check   # pass
  cargo clippy --all-targets -- -D warnings  # no new warnings (pre-existing warnings only)
  cargo test --lib -- web_fetch  # 26/26 passed
  cargo test --lib -- tools::tests  # 12/12 passed
  cargo test --lib -- config::schema::tests  # 134/134 passed

  - Evidence provided: unit test results (26 new tests), manual end-to-end test with Ollama + qwen2.5:72b
  - If any command is intentionally skipped, explain why: Full cargo clippy --all-targets has 43 pre-existing errors
  unrelated to this PR (e.g. await_holding_lock, format! appended to String). Zero errors from web_fetch code.

  Security Impact (required)

  - New permissions/capabilities? Yes — new web_fetch tool can make outbound HTTP GET requests
  - New external network calls? Yes — fetches web pages from allowed domains
  - Secrets/tokens handling changed? No
  - File system access scope changed? No
  - If any Yes, describe risk and mitigation:
    - Deny-by-default: enabled = false by default; tool is not registered unless explicitly enabled
    - Domain filtering: allowed_domains (default ["*"] = all public hosts) + blocked_domains (takes priority).
  Blocklist always wins over allowlist.
    - SSRF protection: Blocks localhost, private IPs (RFC 1918), link-local, multicast, reserved ranges, IPv4-mapped
  IPv6, .local TLD — identical coverage to http_request
    - Rate limiting: can_act() + record_action() enforce autonomy level and rate limits
    - Read-only mode: Blocked when autonomy is ReadOnly
    - Response size cap: 500KB default truncation prevents context window exhaustion
    - Proxy support: Honors [proxy] config via tool.web_fetch service key

  Privacy and Data Hygiene (required)

  - Data-hygiene status: pass
  - Redaction/anonymization notes: No personal data in code, tests, or fixtures
  - Neutral wording confirmation: All test identifiers use neutral project-scoped labels

  Compatibility / Migration

  - Backward compatible? Yes — new tool, no existing behavior changed
  - Config/env changes? Yes — new [web_fetch] section in config.toml (all fields have defaults)
  - Migration needed? No — #[serde(default)] on all fields; existing configs without [web_fetch] section work unchanged

  i18n Follow-Through (required when docs or user-facing wording changes)

  - i18n follow-through triggered? No — no docs or user-facing wording changes

  Human Verification (required)

  - Verified scenarios:
    - End-to-end test: zeroclaw agent with Ollama qwen2.5:72b successfully called web_fetch to fetch
  https://github.com/zeroclaw-labs/zeroclaw, returned clean plain text with project description, features, star count
    - Tool registration: tool_count increased from 22 to 23 when enabled = true
    - Config: enabled = false (default) → tool not registered; enabled = true → tool available
  - Edge cases checked:
    - Missing [web_fetch] section in existing config.toml → works (serde defaults)
    - Blocklist priority over allowlist
    - SSRF with localhost, private IPs, IPv6
  - What was not verified:
    - Proxy routing (no proxy configured in test environment)
    - Very large page truncation with real-world content

  Side Effects / Blast Radius (required)

  - Affected subsystems/workflows: all_tools_with_runtime() signature gained one parameter (web_fetch_config); all 5
  call sites updated
  - Potential unintended effects: None — new tool only, existing tools unchanged
  - Guardrails/monitoring for early detection: enabled = false default; tool_count in debug logs

  Agent Collaboration Notes (recommended)

  - Agent tools used: Claude Code (Opus 4.6)
  - Workflow/plan summary: Plan mode → approval → implementation → validation
  - Verification focus: Security (SSRF, domain filtering, rate limiting), config compatibility, tool registration
  - Confirmation: naming + architecture boundaries followed (CLAUDE.md + CONTRIBUTING.md): Yes — trait implementation +
   factory registration pattern, independent security helpers (DRY rule-of-three), deny-by-default config

  Rollback Plan (required)

  - Fast rollback command/path: git revert <commit>
  - Feature flags or config toggles: [web_fetch] enabled = false (default) disables completely
  - Observable failure symptoms: tool_count in debug logs drops by 1; LLM cannot call web_fetch

  Risks and Mitigations

  - Risk: SSRF bypass via DNS rebinding (attacker-controlled domain resolving to private IP)
    - Mitigation: Pre-request host validation blocks known private/local patterns. Same defense level as existing
  http_request tool. Full DNS-level protection would require async DNS resolution before connect, which is out of scope
   for this PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit 04597352cc)
2026-02-24 16:03:00 +08:00
Edvard baa01dab66 feat(agent): inject current datetime into every user message
Prepends [YYYY-MM-DD HH:MM:SS TZ] to each user message before it
reaches the model. This gives the agent accurate temporal context
on every turn, not just session start.

Previously DateTimeSection only injected the time once when the
system prompt was built. Long conversations or cron jobs had
stale timestamps. Now every message carries the real time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 16:03:00 +08:00
argenis de la rosa 055507bd18 feat(agent): log query classification route decisions 2026-02-24 16:02:59 +08:00
Vernon Stinebaker 7e6491142e fix(provider): preserve reasoning_content in tool-call conversation history
Thinking/reasoning models (Kimi K2.5, GLM-4.7, DeepSeek-R1) return a
reasoning_content field in assistant messages containing tool calls.
ZeroClaw was silently dropping this field when constructing conversation
history, causing provider APIs to reject follow-up requests with 400
errors: "thinking is enabled but reasoning_content is missing in
assistant tool call message".

Add reasoning_content: Option<String> as an opaque pass-through at every
layer of the pipeline: ChatResponse, ConversationMessage, NativeMessage
structs, parse/convert/build functions, and dispatcher. The field is
skip_serializing_if = None so it is invisible for non-thinking models.

Closes #1327
2026-02-22 17:40:48 +08:00
s04 6f1cf8bc81 feat(provider): add usage field to ChatResponse
Add a lightweight TokenUsage struct to providers::traits with
input_tokens and output_tokens fields. Add usage: Option<TokenUsage>
to ChatResponse and update all construction sites across providers
and agent modules with usage: None.

This is the first step toward capturing token usage data from LLM
API responses. Currently all sites set usage: None — subsequent
commits will parse actual usage from each provider's response format.
2026-02-21 12:29:02 +08:00
Alex Gorevski 357a938174 fix: resolve three compilation errors breaking release-fast build
- Remove duplicate chat method in ReliableProvider impl (E0201)
  The second chat fn (lines 662-769) was an exact duplicate of the
  first (lines 540-647) in the same impl block.

- Gate PostgresMemory usage in memory CLI behind memory-postgres feature (E0433)
  super::PostgresMemory is only exported when the feature is enabled;
  the Postgres match arm now compiles to an explicit bail when the
  feature is off.

- Replace utures::future::join_all with utures_util::future::join_all (E0433)
  The crate depends on utures-util, not utures. Fixed in both
  agent.rs and loop_.rs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-20 11:38:00 -08:00
Chummy e081010983 feat(skills): add configurable compact skills prompt injection 2026-02-21 00:00:51 +08:00
Will Sarg a9a35d50d1 fix(ci): restore containerized validation on main (#1096) 2026-02-20 07:48:58 -05:00
Edvard Schøyen f35a365d83 fix(agent): implement actual concurrent tool execution (#1001)
When parallel_tools is enabled, both code branches in execute_tools()
ran the same sequential for loop. The parallel path was a no-op.

Use futures::future::join_all to execute tool calls concurrently when
parallel_tools is true. The futures crate is already a dependency.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 05:05:33 -05:00
Chummy a2e9c0d1e1 fix(skills): make open-skills sync opt-in and configurable 2026-02-20 16:45:50 +08:00
Alex Gorevski 22bd03c65a test(quality): replace bare .unwrap() with .expect() in agent and shell tests
Replace bare .unwrap() calls with descriptive .expect() messages in
src/agent/agent.rs and src/tools/shell.rs test modules. Adds meaningful
failure context for memory creation, agent builder, and tool execution
assertions. Addresses audit finding on test assertion quality (§5.2).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-19 13:23:33 -08:00
Chummy d714d3984e fix(memory): stop autosaving assistant summaries and filter legacy entries 2026-02-20 01:14:08 +08:00
Chummy 572aa77c2a feat(memory): add embedding hint routes and upgrade guidance 2026-02-19 20:49:53 +08:00
Chummy 483acccdb7 feat(memory): add configurable postgres storage backend 2026-02-18 20:29:26 +08:00
Edvard 6e53341bb1 feat(agent): add rule-based query classification for automatic model routing
Classify incoming user messages by keyword/pattern and route to the
appropriate model hint automatically, feeding into the existing
RouterProvider. Disabled by default; opt-in via [query_classification]
config section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:41:58 +08:00
Edvard 8a1e7cc7ef fix(agent): use config max_tool_iterations, add memory relevance filtering, rebalance search weights
Three fixes for conversation quality issues:

1. loop_.rs and channels now read max_tool_iterations from AgentConfig
   instead of using a hardcoded constant of 10, making it configurable.

2. Memory recall now filters entries below a configurable
   min_relevance_score threshold (default 0.4), preventing unrelated
   memories from bleeding into conversation context.

3. Default hybrid search weights rebalanced from 70/30 vector/keyword
   to 40/60, reducing cross-topic semantic bleed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:14:33 +08:00
Chummy 2560399423 feat(observability): focus PR 596 on Prometheus backend 2026-02-18 12:06:05 +08:00
argenis de la rosa eba544dbd4 feat(observability): implement Prometheus metrics backend with /metrics endpoint
- Adds PrometheusObserver backend with counters, histograms, and gauges
- Tracks agent starts/duration, tool calls, channel messages, heartbeat ticks, errors, request latency, tokens, sessions, queue depth
- Adds GET /metrics endpoint to gateway for Prometheus scraping
- Adds provider/model labels to AgentStart and AgentEnd events for better observability
- Adds as_any() method to Observer trait for backend-specific downcast

Metrics exposed:
- zeroclaw_agent_starts_total (Counter) with provider/model labels
- zeroclaw_agent_duration_seconds (Histogram) with provider/model labels
- zeroclaw_tool_calls_total (Counter) with tool/success labels
- zeroclaw_tool_duration_seconds (Histogram) with tool label
- zeroclaw_channel_messages_total (Counter) with channel/direction labels
- zeroclaw_heartbeat_ticks_total (Counter)
- zeroclaw_errors_total (Counter) with component label
- zeroclaw_request_latency_seconds (Histogram)
- zeroclaw_tokens_used_last (Gauge)
- zeroclaw_active_sessions (Gauge)
- zeroclaw_queue_depth (Gauge)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:06:05 +08:00
Will Sarg ee05d62ce4 Merge branch 'main' into pr-484-clean 2026-02-17 08:54:24 -05:00
argenis de la rosa 1908af3248 fix(discord): use channel_id instead of sender for replies (fixes #483)
fix(misc): complete parking_lot::Mutex migration (fixes #505)

- DiscordChannel: store actual channel_id in ChannelMessage.channel
  instead of hardcoded "discord" string
- channels/mod.rs: use msg.channel instead of msg.sender for replies
- Migrate all std::sync::Mutex to parking_lot::Mutex:
  * src/security/audit.rs
  * src/memory/sqlite.rs
  * src/memory/response_cache.rs
  * src/memory/lucid.rs
  * src/channels/email_channel.rs
  * src/gateway/mod.rs
  * src/observability/traits.rs
  * src/providers/reliable.rs
  * src/providers/router.rs
  * src/agent/agent.rs
- Remove all .lock().unwrap() and .map_err(PoisonError) patterns
  since parking_lot::Mutex never poisons

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:05:25 -05:00
fettpl ebb78afda4 feat(memory): add session_id isolation to Memory trait (#530)
* feat(memory): add session_id isolation to Memory trait

Add optional session_id parameter to store(), recall(), and list()
methods across the Memory trait and all four backends (sqlite, markdown,
lucid, none). This enables per-session memory isolation so different
agent sessions cannot cross-read each other's stored memories.

Changes:
- traits.rs: Add session_id: Option<&str> to store/recall/list
- sqlite.rs: Schema migration (ALTER TABLE ADD COLUMN session_id),
  index, persist/filter by session_id in all query paths
- markdown.rs, lucid.rs, none.rs: Updated signatures
- All callers pass None for backward compatibility
- 5 new tests: session-filtered recall, cross-session isolation,
  session-filtered list, no-filter returns all, migration idempotency

Closes #518

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(channels): fix discord _channel_id typo and lark missing reply_to

Pre-existing compilation errors on main after reply_to was added to
ChannelMessage: discord.rs used _channel_id (underscore prefix) but
referenced channel_id, and lark.rs was missing the reply_to field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 07:44:05 -05:00
Kieran 808450c48e feat: custom global api_url 2026-02-17 18:48:45 +08:00
Chummy 8371f412f8 feat(observability): propagate optional cost_usd on agent end 2026-02-17 18:16:12 +08:00
mai1015 0e9852ec06 feat: pass a cloned config to all_tools_with_runtime for improved tool initialization 2026-02-17 17:06:28 +08:00
Chummy 413ecfd143 fix(rebase): resolve main drift and restore CI contracts 2026-02-17 01:01:57 +08:00
mai1015 dc5e14d7d2 refactor: improve code formatting and structure across multiple files 2026-02-17 01:01:56 +08:00
mai1015 b341fdb368 feat: add agent structure and improve tooling for provider 2026-02-17 01:01:56 +08:00