Commit Graph

31 Commits

Author SHA1 Message Date
Ericsunsk
b5af73cac6
fix(memory): filter autosave noise and scope recall/store by session (#3695)
* fix(memory): filter autosave noise and scope memory by session

* style: format rebase-resolved gateway and memory loader

* fix(tests): update memory loader mock for session-aware context

* fix(openai-codex): decode utf-8 safely across stream chunks
2026-03-24 15:17:18 +03:00
argenis de la rosa
18cb38b09e
feat(cache): wire two-tier response cache, multi-provider token tracking, and cache analytics
- Two-tier response cache: in-memory LRU (hot) + SQLite (warm) with TTL-aware eviction
- Wire response cache into agent turn loop (temp==0.0, text-only responses only)
- Parse Anthropic cache_creation_input_tokens/cache_read_input_tokens
- Parse OpenAI prompt_tokens_details.cached_tokens
- Add cached_input_tokens to TokenUsage, prompt_caching to ProviderCapabilities
- Add CacheHit/CacheMiss observer events with Prometheus counters
- Add response_cache_hot_entries config field (default: 256)
2026-03-24 15:17:14 +03:00
argenis de la rosa
6fcb64489b
feat(security): add capability-based tool access control
Add an optional `allowed_tools` parameter that restricts which tools are
available to the agent. When `Some(list)`, only tools whose name appears
in the list are retained; when `None`, all tools remain available
(backward compatible). This enables fine-grained capability control for
cron jobs, heartbeat tasks, and CLI invocations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:16:07 +03:00
Argenis
939edf5e86
fix: expose MCP tools to delegate subagents (#3436)
MCP tools were not visible to delegate subagents because parent_tools
was a static snapshot taken before MCP tool wiring. Switch to interior
mutability (parking_lot::RwLock) so MCP wrappers pushed after
DelegateTool construction are visible at sub-agent execution time.

Closes #3069
2026-03-13 16:26:01 -04:00
reidliu41
96700d7952
Summary
- Problem: The existing http_request tool returns raw HTML/JSON, which is nearly unusable for LLMs to extract
  meaningful content from web pages.
- Why it matters: All mainstream AI agents (Claude Code, Gemini CLI, Aider) have dedicated web content extraction
  tools. ZeroClaw lacks this capability, limiting its ability to research and gather information from the web.
- What changed: Added a new web_fetch tool that fetches web pages and converts HTML to clean plain text using
  nanohtml2text. Includes domain allowlist/blocklist, SSRF protection, redirect following, and content-type aware
  processing.
- What did not change (scope boundary): http_request tool is untouched. No shared code extracted between http_request
   and web_fetch (DRY rule-of-three: only 2 callers). No changes to existing tool behavior or defaults.

Label Snapshot (required)

  - Risk label: risk: medium
  - Size label: size: M
  - Scope labels: tool, config
  - Module labels: tool: web_fetch
  - If any auto-label is incorrect, note requested correction: N/A

  Change Metadata

  - Change type: feature
  - Primary scope: tool

  Linked Issue

  - Closes #
  - Related #
  - Depends on #
  - Supersedes #

  Supersede Attribution (required when Supersedes # is used)

  N/A

  Validation Evidence (required)

  cargo fmt --all -- --check   # pass
  cargo clippy --all-targets -- -D warnings  # no new warnings (pre-existing warnings only)
  cargo test --lib -- web_fetch  # 26/26 passed
  cargo test --lib -- tools::tests  # 12/12 passed
  cargo test --lib -- config::schema::tests  # 134/134 passed

  - Evidence provided: unit test results (26 new tests), manual end-to-end test with Ollama + qwen2.5:72b
  - If any command is intentionally skipped, explain why: Full cargo clippy --all-targets has 43 pre-existing errors
  unrelated to this PR (e.g. await_holding_lock, format! appended to String). Zero errors from web_fetch code.

  Security Impact (required)

  - New permissions/capabilities? Yes — new web_fetch tool can make outbound HTTP GET requests
  - New external network calls? Yes — fetches web pages from allowed domains
  - Secrets/tokens handling changed? No
  - File system access scope changed? No
  - If any Yes, describe risk and mitigation:
    - Deny-by-default: enabled = false by default; tool is not registered unless explicitly enabled
    - Domain filtering: allowed_domains (default ["*"] = all public hosts) + blocked_domains (takes priority).
  Blocklist always wins over allowlist.
    - SSRF protection: Blocks localhost, private IPs (RFC 1918), link-local, multicast, reserved ranges, IPv4-mapped
  IPv6, .local TLD — identical coverage to http_request
    - Rate limiting: can_act() + record_action() enforce autonomy level and rate limits
    - Read-only mode: Blocked when autonomy is ReadOnly
    - Response size cap: 500KB default truncation prevents context window exhaustion
    - Proxy support: Honors [proxy] config via tool.web_fetch service key

  Privacy and Data Hygiene (required)

  - Data-hygiene status: pass
  - Redaction/anonymization notes: No personal data in code, tests, or fixtures
  - Neutral wording confirmation: All test identifiers use neutral project-scoped labels

  Compatibility / Migration

  - Backward compatible? Yes — new tool, no existing behavior changed
  - Config/env changes? Yes — new [web_fetch] section in config.toml (all fields have defaults)
  - Migration needed? No — #[serde(default)] on all fields; existing configs without [web_fetch] section work unchanged

  i18n Follow-Through (required when docs or user-facing wording changes)

  - i18n follow-through triggered? No — no docs or user-facing wording changes

  Human Verification (required)

  - Verified scenarios:
    - End-to-end test: zeroclaw agent with Ollama qwen2.5:72b successfully called web_fetch to fetch
  https://github.com/zeroclaw-labs/zeroclaw, returned clean plain text with project description, features, star count
    - Tool registration: tool_count increased from 22 to 23 when enabled = true
    - Config: enabled = false (default) → tool not registered; enabled = true → tool available
  - Edge cases checked:
    - Missing [web_fetch] section in existing config.toml → works (serde defaults)
    - Blocklist priority over allowlist
    - SSRF with localhost, private IPs, IPv6
  - What was not verified:
    - Proxy routing (no proxy configured in test environment)
    - Very large page truncation with real-world content

  Side Effects / Blast Radius (required)

  - Affected subsystems/workflows: all_tools_with_runtime() signature gained one parameter (web_fetch_config); all 5
  call sites updated
  - Potential unintended effects: None — new tool only, existing tools unchanged
  - Guardrails/monitoring for early detection: enabled = false default; tool_count in debug logs

  Agent Collaboration Notes (recommended)

  - Agent tools used: Claude Code (Opus 4.6)
  - Workflow/plan summary: Plan mode → approval → implementation → validation
  - Verification focus: Security (SSRF, domain filtering, rate limiting), config compatibility, tool registration
  - Confirmation: naming + architecture boundaries followed (CLAUDE.md + CONTRIBUTING.md): Yes — trait implementation +
   factory registration pattern, independent security helpers (DRY rule-of-three), deny-by-default config

  Rollback Plan (required)

  - Fast rollback command/path: git revert <commit>
  - Feature flags or config toggles: [web_fetch] enabled = false (default) disables completely
  - Observable failure symptoms: tool_count in debug logs drops by 1; LLM cannot call web_fetch

  Risks and Mitigations

  - Risk: SSRF bypass via DNS rebinding (attacker-controlled domain resolving to private IP)
    - Mitigation: Pre-request host validation blocks known private/local patterns. Same defense level as existing
  http_request tool. Full DNS-level protection would require async DNS resolution before connect, which is out of scope
   for this PR.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
(cherry picked from commit 04597352cc)
2026-02-24 16:03:00 +08:00
Edvard
baa01dab66
feat(agent): inject current datetime into every user message
Prepends [YYYY-MM-DD HH:MM:SS TZ] to each user message before it
reaches the model. This gives the agent accurate temporal context
on every turn, not just session start.

Previously DateTimeSection only injected the time once when the
system prompt was built. Long conversations or cron jobs had
stale timestamps. Now every message carries the real time.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 16:03:00 +08:00
argenis de la rosa
055507bd18
feat(agent): log query classification route decisions 2026-02-24 16:02:59 +08:00
Vernon Stinebaker
7e6491142e fix(provider): preserve reasoning_content in tool-call conversation history
Thinking/reasoning models (Kimi K2.5, GLM-4.7, DeepSeek-R1) return a
reasoning_content field in assistant messages containing tool calls.
ZeroClaw was silently dropping this field when constructing conversation
history, causing provider APIs to reject follow-up requests with 400
errors: "thinking is enabled but reasoning_content is missing in
assistant tool call message".

Add reasoning_content: Option<String> as an opaque pass-through at every
layer of the pipeline: ChatResponse, ConversationMessage, NativeMessage
structs, parse/convert/build functions, and dispatcher. The field is
skip_serializing_if = None so it is invisible for non-thinking models.

Closes #1327
2026-02-22 17:40:48 +08:00
s04
6f1cf8bc81 feat(provider): add usage field to ChatResponse
Add a lightweight TokenUsage struct to providers::traits with
input_tokens and output_tokens fields. Add usage: Option<TokenUsage>
to ChatResponse and update all construction sites across providers
and agent modules with usage: None.

This is the first step toward capturing token usage data from LLM
API responses. Currently all sites set usage: None — subsequent
commits will parse actual usage from each provider's response format.
2026-02-21 12:29:02 +08:00
Alex Gorevski
357a938174 fix: resolve three compilation errors breaking release-fast build
- Remove duplicate chat method in ReliableProvider impl (E0201)
  The second chat fn (lines 662-769) was an exact duplicate of the
  first (lines 540-647) in the same impl block.

- Gate PostgresMemory usage in memory CLI behind memory-postgres feature (E0433)
  super::PostgresMemory is only exported when the feature is enabled;
  the Postgres match arm now compiles to an explicit bail when the
  feature is off.

- Replace utures::future::join_all with utures_util::future::join_all (E0433)
  The crate depends on utures-util, not utures. Fixed in both
  agent.rs and loop_.rs.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-20 11:38:00 -08:00
Chummy
e081010983 feat(skills): add configurable compact skills prompt injection 2026-02-21 00:00:51 +08:00
Will Sarg
a9a35d50d1
fix(ci): restore containerized validation on main (#1096) 2026-02-20 07:48:58 -05:00
Edvard Schøyen
f35a365d83
fix(agent): implement actual concurrent tool execution (#1001)
When parallel_tools is enabled, both code branches in execute_tools()
ran the same sequential for loop. The parallel path was a no-op.

Use futures::future::join_all to execute tool calls concurrently when
parallel_tools is true. The futures crate is already a dependency.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-20 05:05:33 -05:00
Chummy
a2e9c0d1e1 fix(skills): make open-skills sync opt-in and configurable 2026-02-20 16:45:50 +08:00
Alex Gorevski
22bd03c65a test(quality): replace bare .unwrap() with .expect() in agent and shell tests
Replace bare .unwrap() calls with descriptive .expect() messages in
src/agent/agent.rs and src/tools/shell.rs test modules. Adds meaningful
failure context for memory creation, agent builder, and tool execution
assertions. Addresses audit finding on test assertion quality (§5.2).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-02-19 13:23:33 -08:00
Chummy
d714d3984e fix(memory): stop autosaving assistant summaries and filter legacy entries 2026-02-20 01:14:08 +08:00
Chummy
572aa77c2a feat(memory): add embedding hint routes and upgrade guidance 2026-02-19 20:49:53 +08:00
Chummy
483acccdb7 feat(memory): add configurable postgres storage backend 2026-02-18 20:29:26 +08:00
Edvard
6e53341bb1 feat(agent): add rule-based query classification for automatic model routing
Classify incoming user messages by keyword/pattern and route to the
appropriate model hint automatically, feeding into the existing
RouterProvider. Disabled by default; opt-in via [query_classification]
config section.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:41:58 +08:00
Edvard
8a1e7cc7ef fix(agent): use config max_tool_iterations, add memory relevance filtering, rebalance search weights
Three fixes for conversation quality issues:

1. loop_.rs and channels now read max_tool_iterations from AgentConfig
   instead of using a hardcoded constant of 10, making it configurable.

2. Memory recall now filters entries below a configurable
   min_relevance_score threshold (default 0.4), preventing unrelated
   memories from bleeding into conversation context.

3. Default hybrid search weights rebalanced from 70/30 vector/keyword
   to 40/60, reducing cross-topic semantic bleed.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 14:14:33 +08:00
Chummy
2560399423 feat(observability): focus PR 596 on Prometheus backend 2026-02-18 12:06:05 +08:00
argenis de la rosa
eba544dbd4 feat(observability): implement Prometheus metrics backend with /metrics endpoint
- Adds PrometheusObserver backend with counters, histograms, and gauges
- Tracks agent starts/duration, tool calls, channel messages, heartbeat ticks, errors, request latency, tokens, sessions, queue depth
- Adds GET /metrics endpoint to gateway for Prometheus scraping
- Adds provider/model labels to AgentStart and AgentEnd events for better observability
- Adds as_any() method to Observer trait for backend-specific downcast

Metrics exposed:
- zeroclaw_agent_starts_total (Counter) with provider/model labels
- zeroclaw_agent_duration_seconds (Histogram) with provider/model labels
- zeroclaw_tool_calls_total (Counter) with tool/success labels
- zeroclaw_tool_duration_seconds (Histogram) with tool label
- zeroclaw_channel_messages_total (Counter) with channel/direction labels
- zeroclaw_heartbeat_ticks_total (Counter)
- zeroclaw_errors_total (Counter) with component label
- zeroclaw_request_latency_seconds (Histogram)
- zeroclaw_tokens_used_last (Gauge)
- zeroclaw_active_sessions (Gauge)
- zeroclaw_queue_depth (Gauge)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-18 12:06:05 +08:00
Will Sarg
ee05d62ce4
Merge branch 'main' into pr-484-clean 2026-02-17 08:54:24 -05:00
argenis de la rosa
1908af3248 fix(discord): use channel_id instead of sender for replies (fixes #483)
fix(misc): complete parking_lot::Mutex migration (fixes #505)

- DiscordChannel: store actual channel_id in ChannelMessage.channel
  instead of hardcoded "discord" string
- channels/mod.rs: use msg.channel instead of msg.sender for replies
- Migrate all std::sync::Mutex to parking_lot::Mutex:
  * src/security/audit.rs
  * src/memory/sqlite.rs
  * src/memory/response_cache.rs
  * src/memory/lucid.rs
  * src/channels/email_channel.rs
  * src/gateway/mod.rs
  * src/observability/traits.rs
  * src/providers/reliable.rs
  * src/providers/router.rs
  * src/agent/agent.rs
- Remove all .lock().unwrap() and .map_err(PoisonError) patterns
  since parking_lot::Mutex never poisons

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 08:05:25 -05:00
fettpl
ebb78afda4
feat(memory): add session_id isolation to Memory trait (#530)
* feat(memory): add session_id isolation to Memory trait

Add optional session_id parameter to store(), recall(), and list()
methods across the Memory trait and all four backends (sqlite, markdown,
lucid, none). This enables per-session memory isolation so different
agent sessions cannot cross-read each other's stored memories.

Changes:
- traits.rs: Add session_id: Option<&str> to store/recall/list
- sqlite.rs: Schema migration (ALTER TABLE ADD COLUMN session_id),
  index, persist/filter by session_id in all query paths
- markdown.rs, lucid.rs, none.rs: Updated signatures
- All callers pass None for backward compatibility
- 5 new tests: session-filtered recall, cross-session isolation,
  session-filtered list, no-filter returns all, migration idempotency

Closes #518

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(channels): fix discord _channel_id typo and lark missing reply_to

Pre-existing compilation errors on main after reply_to was added to
ChannelMessage: discord.rs used _channel_id (underscore prefix) but
referenced channel_id, and lark.rs was missing the reply_to field.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-17 07:44:05 -05:00
Kieran
808450c48e feat: custom global api_url 2026-02-17 18:48:45 +08:00
Chummy
8371f412f8 feat(observability): propagate optional cost_usd on agent end 2026-02-17 18:16:12 +08:00
mai1015
0e9852ec06 feat: pass a cloned config to all_tools_with_runtime for improved tool initialization 2026-02-17 17:06:28 +08:00
Chummy
413ecfd143 fix(rebase): resolve main drift and restore CI contracts 2026-02-17 01:01:57 +08:00
mai1015
dc5e14d7d2 refactor: improve code formatting and structure across multiple files 2026-02-17 01:01:56 +08:00
mai1015
b341fdb368 feat: add agent structure and improve tooling for provider 2026-02-17 01:01:56 +08:00