Commit Graph

931 Commits

Author SHA1 Message Date
Chummy
c876a03819 feat(gateway): add experimental node-control scaffold API 2026-02-24 22:03:53 +08:00
reidliu41
56ffcd4477 feat(tool): add background process management tool (spawn/list/output/kill) 2026-02-24 21:53:23 +08:00
Chummy
f31a8efd7b supersede: replay changes from #1247
Automated replay on latest dev.
2026-02-24 21:18:50 +08:00
reidliu41
d6d32400fa feat(tool): add session-scoped task_plan tool for multi-step work tracking
- Base branch target: dev
  - Problem: ZeroClaw agents have no structured way to decompose complex tasks into trackable steps, falling behind
  every comparable agent runtime
  - Why it matters: Without task tracking, multi-step work is fragile (lost on context compression), invisible to users
   (no progress signal), and error-prone (agent loses track of what's done vs. pending)
  - What changed: Added a session-scoped task_plan tool with create/add/update/list/delete actions, integrated with
  SecurityPolicy, registered in the tool factory
  - What did not change: No config schema changes, no persistence layer, no CLI subcommand, no changes to agent loop or
   any other subsystem

  Label Snapshot

  - Risk label: risk: low
  - Size label: size: S
  - Scope labels: tool
  - Module labels: tool: task_plan
  - Contributor tier label: (auto-managed)
  - If any auto-label is incorrect: N/A

  Change Metadata

  - Change type: feature
  - Primary scope: tool

  Linked Issue

  - Closes #(issue number)
  - Related: N/A
  - Depends on: N/A
  - Supersedes: N/A

  Supersede Attribution

  N/A — no superseded PRs.

  Validation Evidence

  cargo fmt --all -- --check    # pass (no output)
  cargo clippy --all-targets -- -D warnings  # task_plan.rs: 0 warnings (pre-existing warnings in other files
  unrelated)
  cargo test --lib tools::task_plan  # 15/15 passed

  - Evidence provided: test output (15 passed, 0 failed)
  - If any command is intentionally skipped: cargo clippy reports pre-existing warnings in unrelated files
  (onboard/wizard.rs etc.); task_plan.rs itself has zero clippy warnings

  Security Impact

  - New permissions/capabilities? No — uses existing ToolOperation::Act enforcement
  - New external network calls? No
  - Secrets/tokens handling changed? No
  - File system access scope changed? No

  Privacy and Data Hygiene

  - Data-hygiene status: pass
  - Redaction/anonymization notes: No identity data in code or tests. Test fixtures use neutral strings ("step one",
  "do thing", "first")
  - Neutral wording confirmation: All naming follows ZeroClaw/project-native conventions

  Compatibility / Migration

  - Backward compatible? Yes
  - Config/env changes? No
  - Migration needed? No

  i18n Follow-Through

  - i18n follow-through triggered? No — no docs or user-facing wording changes

  Human Verification

  - Verified scenarios: Ran ./target/debug/zeroclaw agent -m "调用 task_plan 工具,action=list" (in English: "call the
  task_plan tool, action=list"); the agent correctly identified and called task_plan, which returned "No tasks."
  - Edge cases checked: read-only mode blocks mutations, empty task list, invalid action names, missing required
  parameters, create replaces existing list, ID auto-increment after add
  - What was not verified: Behavior with non-CLI channels (Telegram, Discord); behavior with XML-fallback dispatcher
  (non-native-tool providers)

  Side Effects / Blast Radius

  - Affected subsystems/workflows: src/tools/ only — tool factory gains one additional entry
  - Potential unintended effects: Marginally increases tool spec payload size sent to LLM (one more tool definition).
  Could theoretically cause tool name confusion with schedule if LLM descriptions are ambiguous — mitigated by distinct
   naming (task_plan vs schedule) and different description wording.
  - Guardrails/monitoring for early detection: Standard tool dispatch logging. Tool is session-scoped so no persistent
  side effects on failure.

  Agent Collaboration Notes

  - Agent tools used: Claude Code for implementation assistance and review
  - Workflow/plan summary: Implement Tool trait → register in factory → validate with tests → manual agent session test
  - Verification focus: Security policy enforcement, parameter validation edge cases, all 5 action paths
  - Confirmation: naming + architecture boundaries followed (CLAUDE.md §6.3, §6.4, §7.3): Yes

  Rollback Plan

  - Fast rollback command/path: git revert <commit> — removes 3 lines from mod.rs and deletes task_plan.rs
  - Feature flags or config toggles: None needed — tool is stateless and session-scoped
  - Observable failure symptoms: Tool not appearing in agent tool list, or tool returning errors on valid input

  Risks and Mitigations

  - Risk: LLM may occasionally confuse task_plan (action: list) with schedule (action: list) due to similar parameter
  structure
    - Mitigation: Distinct tool names and descriptions; task_plan description emphasizes "session checklist" while
  schedule emphasizes "cron/recurring tasks"
2026-02-24 20:52:31 +08:00
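The commit above describes a session-scoped checklist with create/add/update/list/delete actions. A minimal in-memory sketch of that behavior might look like the following. All type and method names here are hypothetical and are not taken from task_plan.rs. Restarting IDs on create and treating delete as a per-task removal are assumptions. SecurityPolicy enforcement is omitted.

```rust
// Hypothetical sketch of a session-scoped task list; not the ZeroClaw implementation.
#[derive(Debug, Clone, PartialEq)]
pub enum Status {
    Pending,
    Done,
}

#[derive(Debug, Clone, PartialEq)]
pub struct Task {
    pub id: u32,
    pub title: String,
    pub status: Status,
}

#[derive(Debug, Default)]
pub struct TaskPlan {
    tasks: Vec<Task>,
    last_id: u32, // last ID handed out; 0 means none yet
}

impl TaskPlan {
    /// Replaces any existing list (assumption: IDs restart at 1).
    pub fn create(&mut self, titles: &[&str]) {
        self.tasks.clear();
        self.last_id = 0;
        for title in titles {
            self.add(title);
        }
    }

    /// Appends one task and returns its auto-incremented ID.
    pub fn add(&mut self, title: &str) -> u32 {
        self.last_id += 1;
        self.tasks.push(Task {
            id: self.last_id,
            title: title.to_string(),
            status: Status::Pending,
        });
        self.last_id
    }

    /// Changes the status of an existing task.
    pub fn update(&mut self, id: u32, status: Status) -> Result<(), String> {
        match self.tasks.iter_mut().find(|t| t.id == id) {
            Some(task) => {
                task.status = status;
                Ok(())
            }
            None => Err(format!("unknown task id {id}")),
        }
    }

    /// Removes one task by ID (assumption: delete is per task).
    pub fn delete(&mut self, id: u32) -> Result<(), String> {
        let before = self.tasks.len();
        self.tasks.retain(|t| t.id != id);
        if self.tasks.len() == before {
            Err(format!("unknown task id {id}"))
        } else {
            Ok(())
        }
    }

    /// Renders the list, or "No tasks." when it is empty.
    pub fn list(&self) -> String {
        if self.tasks.is_empty() {
            return "No tasks.".to_string();
        }
        self.tasks
            .iter()
            .map(|t| format!("{} [{:?}] {}", t.id, t.status, t.title))
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

Because the state lives only in the struct, dropping it at session end leaves nothing behind, which matches the "no persistence layer" note in the commit.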
guitaripod
bd924a90dd fix(telegram): route image-extension Documents through vision pipeline
format_attachment_content was matching only Photo for [IMAGE:] routing.
Documents with image extensions (jpg, png, gif, webp, bmp) were formatted as
[Document: name] /path, bypassing the multimodal pipeline entirely.

Extend the match arm to cover Document when is_image_extension returns true,
so both Photos and image Documents produce [IMAGE:/path] and reach the provider
as proper vision input blocks.

Adds regression tests covering Document+image extension → [IMAGE:] and
Document+non-image extension → [Document:] paths.
2026-02-24 20:41:34 +08:00
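The routing change in the commit above can be sketched roughly as follows. The function names format_attachment_content and is_image_extension come from the commit message, but their signatures, the AttachmentKind enum, and the case-insensitive extension check are assumptions made for illustration.

```rust
use std::path::Path;

// Hypothetical attachment kinds; the real channel type likely has more variants.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum AttachmentKind {
    Photo,
    Document,
}

/// True for the extensions listed in the commit message (jpg, png, gif, webp, bmp).
/// Case-insensitive matching is an assumption.
pub fn is_image_extension(path: &Path) -> bool {
    path.extension()
        .and_then(|ext| ext.to_str())
        .map(|ext| {
            matches!(
                ext.to_ascii_lowercase().as_str(),
                "jpg" | "png" | "gif" | "webp" | "bmp"
            )
        })
        .unwrap_or(false)
}

/// Photos and image-extension Documents both become [IMAGE:] markers;
/// every other Document keeps the [Document: name] /path form.
pub fn format_attachment_content(kind: AttachmentKind, name: &str, path: &Path) -> String {
    match kind {
        AttachmentKind::Photo => format!("[IMAGE:{}]", path.display()),
        AttachmentKind::Document if is_image_extension(path) => {
            format!("[IMAGE:{}]", path.display())
        }
        AttachmentKind::Document => format!("[Document: {}] {}", name, path.display()),
    }
}
```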
guitaripod
d9c6dc4e04 fix(anthropic): send image content as proper API vision blocks
The Anthropic provider had no Image variant in NativeContentOut, so
[IMAGE:data:image/jpeg;base64,...] markers produced by the multimodal
pipeline were sent to the API as plain text. The API counted every
base64 character as a token, reliably exceeding the 200k token limit
for any real image (a typical Telegram-compressed photo produced
~130k tokens of base64 text alone).

Fix:
- Add ImageSource struct and Image variant to NativeContentOut that
  serializes to the Anthropic Messages API image content block format
- Add parse_inline_image() to decode data URI markers into Image blocks
- Add build_user_content_blocks() to split user message content into
  Text and Image blocks using the existing parse_image_markers helper
- Update convert_messages() user arm to use build_user_content_blocks()
- Handle Image in the apply_cache_to_last_message no-op arm

Fixes #1626
2026-02-24 20:28:15 +08:00
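As an illustration of the fix above, the sketch below decodes the payload of an [IMAGE:data:...] marker into a media type and base64 body, then renders it in the shape of an Anthropic Messages API image content block. The names parse_inline_image and ImageSource appear in the commit message, but the signatures here are assumptions, and the hand-written JSON stands in for whatever serialization the provider actually uses.

```rust
// Hypothetical sketch; field layout mirrors the base64 image source of the Messages API.
#[derive(Debug, PartialEq)]
pub struct ImageSource {
    pub media_type: String,
    pub data: String,
}

/// Parses a data URI such as "data:image/jpeg;base64,AAAA".
/// Returns None for anything that is not a base64 image data URI.
pub fn parse_inline_image(payload: &str) -> Option<ImageSource> {
    let rest = payload.strip_prefix("data:")?;
    let (header, data) = rest.split_once(',')?;
    let media_type = header.strip_suffix(";base64")?;
    if !media_type.starts_with("image/") || data.is_empty() {
        return None;
    }
    Some(ImageSource {
        media_type: media_type.to_string(),
        data: data.to_string(),
    })
}

/// Renders the image content block as JSON. No escaping is applied, which is
/// only safe because base64 text and simple media types contain no quotes.
pub fn to_content_block_json(src: &ImageSource) -> String {
    format!(
        r#"{{"type":"image","source":{{"type":"base64","media_type":"{}","data":"{}"}}}}"#,
        src.media_type, src.data
    )
}
```

Sent this way, the image is handled as vision input rather than being tokenized as base64 text, which is the failure the commit describes.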
guitaripod
b61f7403bf fix(anthropic): implement capabilities() to enable vision support
Set vision: true so image inputs are accepted by the capability gate.
Set native_tool_calling: true to align capabilities() with the existing
supports_native_tools() which always returned true, eliminating the
silent inconsistency between the two.

Adds a unit test that fails if either capability regresses.
2026-02-24 20:08:36 +08:00
Chummy
54dd7a4a9b feat(qq): add webhook receive mode with challenge validation 2026-02-24 19:30:36 +08:00
Chummy
0083aece57 fix(gateway): normalize masked reliability api_keys in config PUT 2026-02-24 19:03:50 +08:00
Chummy
99bf8f29be fix(unsafe-debt): remove runtime unsafe UID check and forbid unsafe code (RMN-37 RMN-38) 2026-02-24 18:30:36 +08:00
reidliu41
8f263cd336 feat(agent): add CLI parameters for runtime config overrides 2026-02-24 18:12:33 +08:00
Chummy
d78a6712ef fix: stabilize UTF-8 truncation and dashboard message IDs (RMN-25 RMN-33) 2026-02-24 16:52:26 +08:00
Chummy
36c4e923f1 chore: suppress strict-delta clippy bool-count lint on compatible provider 2026-02-24 15:59:49 +08:00
Chummy
5505465f93 chore: fix lint gate formatting and codex test runtime options 2026-02-24 15:59:49 +08:00
Chummy
b3b5055080 feat: replay custom provider api mode, route max_tokens, and lark image support 2026-02-24 15:59:49 +08:00
Chummy
3d5a5c3d3c fix(clippy): satisfy strict delta in websocket url mapping 2026-02-24 15:08:03 +08:00
Chummy
57cbb49d65 fix(fmt): align compatible provider websocket changes 2026-02-24 15:08:03 +08:00
Chummy
666f1a7d10 feat(provider): add responses websocket transport fallback 2026-02-24 15:08:03 +08:00
Chummy
ffb5942e60 style(qq): format channel changes 2026-02-24 14:46:42 +08:00
Chummy
f72c87dd26 fix(qq): support passive replies and media image send 2026-02-24 14:46:42 +08:00
Chummy
57f8979df1 fix(test): serialize openai codex env variable tests 2026-02-24 14:32:01 +08:00
Chummy
04e5950020 fix(gateway): remove unused websocket sink import 2026-02-24 14:21:34 +08:00
Chummy
68f1ba1617 chore(fmt): normalize gateway import order for webchat fix 2026-02-24 14:21:34 +08:00
Preventnetworkhacking
35a5815513 fix(gateway): enable tool execution in web chat agent
Web chat was calling provider.chat_with_history() directly, bypassing
the agent loop. Tool calls were rendered as raw XML instead of executing.

Changes:
- Add tools_registry_exec to AppState for executable tools
- Replace chat_with_history with run_tool_call_loop in ws.rs
- Maintain conversation history per WebSocket session
- Add multimodal and max_tool_iterations config to AppState

Closes #1524
2026-02-24 14:21:34 +08:00
Chummy
fb95fc61a0 fix(browser): harden rust_native interactability for click/fill/type 2026-02-24 14:12:08 +08:00
Chummy
0377a35811 chore(fmt): fix loop_ test formatting after #1505 2026-02-24 13:51:43 +08:00
Chummy
8ab75fdda9 test: add regression coverage for provider parser cron and telegram 2026-02-24 13:45:13 +08:00
Chummy
15b54670ff fix: improve tool-call parsing and shell expansion checks 2026-02-24 13:45:13 +08:00
Preventnetworkhacking
82c7fe8d8b fix(telegram): populate thread_ts for per-topic session isolation
When a Telegram message originates from a forum topic, the thread_id was
extracted and used for reply routing but never stored in ChannelMessage.thread_ts.
This caused all messages from the same sender to share conversation history
regardless of which topic they were posted in.

Changes:
- Set thread_ts to the extracted thread_id in parse_update_message,
  try_parse_voice_message, and try_parse_attachment_message
- Use 'ref' in if-let patterns to avoid moving thread_id before it's assigned
- Update conversation_history_key() to include thread_ts when present,
  producing keys like 'telegram_<thread_id>_<sender>' for forum topics
- Update conversation_memory_key() to also include thread_ts for memory isolation

This enables proper per-topic session isolation in Telegram forum groups while
preserving existing behavior for regular groups and DMs (where thread_ts is None).

Closes #1532
2026-02-24 13:40:04 +08:00
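The key format described above ('telegram_<thread_id>_<sender>' for forum topics) could be sketched like this. The ChannelMessage struct is reduced to three fields, and the fallback key shape for messages without a thread is an assumption.

```rust
// Hypothetical, trimmed-down message type; the real ChannelMessage has more fields.
pub struct ChannelMessage {
    pub channel: String,
    pub sender: String,
    pub thread_ts: Option<String>,
}

/// Includes thread_ts in the key when present so each forum topic
/// gets its own conversation history.
pub fn conversation_history_key(msg: &ChannelMessage) -> String {
    match &msg.thread_ts {
        Some(thread) => format!("{}_{}_{}", msg.channel, thread, msg.sender),
        // Assumed shape for regular groups and DMs, where thread_ts is None.
        None => format!("{}_{}", msg.channel, msg.sender),
    }
}
```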
Chummy
ace493b32f chore(fmt): format gateway api after dashboard-save fix 2026-02-24 13:30:43 +08:00
argenis de la rosa
9751433803 fix(gateway): preserve masked config values on dashboard save
Replace line-based TOML masking with structured config masking so secret fields keep their original types (including reliability.api_keys arrays).
Hydrate dashboard PUT payloads with runtime config_path/workspace_dir and restore masked secret placeholders from current config before validation/save.
Also allow GET on /api/doctor for dashboard/client compatibility to avoid 405 responses.
2026-02-24 13:22:07 +08:00
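One way to picture the "restore masked secret placeholders" step for an array field such as reliability.api_keys is the sketch below. The placeholder string, the function name, and the index-based matching are all assumptions; the commit does not say how the gateway pairs masked entries with stored values.

```rust
// Hypothetical placeholder; the real mask string is not given in the commit.
pub const MASK: &str = "***";

/// For each incoming entry that is still the mask, substitute the value stored
/// at the same index in the current config. New or edited entries pass through.
pub fn restore_masked_keys(incoming: &[String], current: &[String]) -> Vec<String> {
    incoming
        .iter()
        .enumerate()
        .map(|(i, value)| {
            if value.as_str() == MASK {
                // Keep the mask if there is nothing to restore at this index.
                current.get(i).cloned().unwrap_or_else(|| value.clone())
            } else {
                value.clone()
            }
        })
        .collect()
}
```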
Chummy
3157867a71 test(file_read): align outside-workspace case with workspace_only=false policy 2026-02-24 13:12:03 +08:00
Chummy
5e581eabfe fix(security): preserve workspace allowlist before forbidden-root checks 2026-02-24 12:58:59 +08:00
Allen Huang
752877051c fix: security, config, and provider hardening
- security: honor explicit command paths in allowed_commands list
- security: respect workspace_only=false in resolved path checks
- config: enforce 0600 permissions on every config save (unix)
- config: reject temp-directory paths in active workspace marker
- provider: preserve reasoning_content in tool-call conversation history
- provider: add allow_user_image_parts parameter for minimax compatibility

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-02-24 12:58:59 +08:00
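The "enforce 0600 permissions on every config save (unix)" item above might be approximated as follows. The function name save_config is hypothetical, and this sketch sets the mode after writing, which leaves a brief window that a hardened implementation would likely close.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Writes the config and, on Unix, resets the mode to owner read/write only.
/// Running the chmod on every save also repairs files whose mode drifted.
pub fn save_config(path: &Path, contents: &str) -> io::Result<()> {
    fs::write(path, contents)?;
    #[cfg(unix)]
    {
        use std::os::unix::fs::PermissionsExt;
        fs::set_permissions(path, fs::Permissions::from_mode(0o600))?;
    }
    Ok(())
}
```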
Chummy
705e5b5a80 fix(ci): align codex tests with provider runtime API 2026-02-24 12:47:26 +08:00
Chummy
f4f6f5f48a test(codex): align provider init with runtime option changes 2026-02-24 12:38:48 +08:00
Chummy
d4f5f2ce95 fix(security): tighten prompt-guard detection thresholds and phrases 2026-02-24 12:38:48 +08:00
argenis de la rosa
09b6a2db0b fix(providers): use native_tool_calling field in supports_native_tools
The supports_native_tools() method was hardcoded to return true,
but it should return the value of self.native_tool_calling to
properly disable native tool calling for providers like MiniMax.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-02-24 12:38:48 +08:00
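A reduced sketch of the change above: supports_native_tools() reads the stored flag instead of returning a constant. The struct shown here is hypothetical and carries only the field named in the commit message.

```rust
// Hypothetical, minimal provider type; only native_tool_calling comes from the commit.
pub struct CompatibleProvider {
    pub native_tool_calling: bool,
}

impl CompatibleProvider {
    /// Previously hardcoded to true; now reflects the configured value so
    /// providers without native tool calling can opt out.
    pub fn supports_native_tools(&self) -> bool {
        self.native_tool_calling
    }
}
```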
Chummy
005cd38d27 fix(onboard): resolve rebase conflict in models command helpers 2026-02-24 12:24:51 +08:00
Chummy
1290b73faa fix: align codex provider runtime options with current interfaces 2026-02-24 12:24:51 +08:00
Chummy
59d4f7d36d feat: stabilize codex oauth and add provider model connectivity workflow 2026-02-24 12:24:51 +08:00
Chummy
fefd0a1cc8 style: apply rustfmt normalization 2026-02-24 12:02:18 +08:00
Dominik Horváth
b8e4f1f803 fix(channels,memory): Docker workspace path remapping, vision support, and Qdrant backend restore (#1)
* fix(channels,providers): remap Docker /workspace paths and enable vision for custom provider

Two fixes:

1. Telegram channel: when a Docker-containerised runtime writes a file to
   /workspace/<path>, the host-side sender couldn't find it because the
   container mount point differs from the host workspace dir. Remap
   /workspace/<rel> → <host_workspace_dir>/<rel> in send_attachment before
   the path-exists check so generated media is delivered correctly.

2. Provider factory: custom: provider was created with vision disabled,
   causing all image messages to be rejected with a capability error even
   though the underlying OpenAI-compatible endpoint supports vision. Switch
   to new_with_vision(..., true) so image inputs are forwarded correctly.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* feat(memory): restore Qdrant vector database backend

Re-adds the Qdrant memory backend that was removed from main in a
recent upstream merge. Restores:

- src/memory/qdrant.rs — full QdrantMemory implementation with lazy
  init, HTTP REST client, embeddings, and Memory trait
- src/memory/backend.rs — Qdrant variant in MemoryBackendKind, profile,
  classify and profile dispatch
- src/memory/mod.rs — module export, factory routing with build_qdrant_memory
- src/config/schema.rs — QdrantConfig struct and qdrant field on MemoryConfig
- src/config/mod.rs — re-export QdrantConfig
- src/onboard/wizard.rs — qdrant field in MemoryConfig initializer

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 12:02:18 +08:00
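The /workspace/<rel> → <host_workspace_dir>/<rel> remapping from the first fix above can be sketched with std::path alone. The function name is hypothetical; the commit only states that the remap happens in send_attachment before the path-exists check.

```rust
use std::path::{Path, PathBuf};

/// Rewrites a container path under /workspace to the matching host path.
/// strip_prefix compares whole components, so /workspace2/... is left alone.
pub fn remap_workspace_path(path: &Path, host_workspace_dir: &Path) -> PathBuf {
    match path.strip_prefix("/workspace") {
        Ok(rel) => host_workspace_dir.join(rel),
        Err(_) => path.to_path_buf(),
    }
}
```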
Mike Johnson-Maxted
d80a653552 fix(onboard): split device-flow hint — copilot auto-prompts, others use auth login
copilot is the only provider that performs a device-code flow automatically on
first run. openai-codex and gemini (when OAuth-backed) require an explicit
`zeroclaw auth login --provider <name>` step. Split the device-flow next-steps
block to reflect this distinction.

Addresses Copilot review comment on PR #1509.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 11:46:49 +08:00
Mike Johnson-Maxted
2f29ec75ef fix(onboard): use provider-aware env var hint in quick setup next steps
Replace hardcoded OPENROUTER_API_KEY hint with provider-aware logic:
- keyless local providers (ollama, llamacpp, etc.) show chat/gateway/status hints
- device-flow providers (copilot, gemini, openai-codex) show OAuth/first-run hint
- all other providers show the correct provider-specific env var via provider_env_var()

Also adds canonical alias "github-copilot" -> "copilot" in canonical_provider_name(),
and a new provider_supports_device_flow() helper with accompanying test.

Additionally fixes pre-existing compile blockers that prevented CI from running:
- fix(security): correct raw string literals in leak_detector.rs that terminated
  early due to unescaped " inside r"..." (use r#"..."# instead)
- fix(gateway): add missing wati: None in two test AppState initializations
- fix(gateway): use serde::Deserialize path on WatiVerifyQuery struct
- fix(security): add #[allow(unused_imports)] on new pub use re-exports in mod.rs
- fix(security): remove unused serde::{Deserialize, Serialize} import
- chore: apply cargo fmt to files that had pending formatting diffs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-02-24 11:46:49 +08:00
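The raw-string fix mentioned above comes down to a Rust lexing rule: r"..." ends at the first double quote, so a pattern containing a quote needs the r#"..."# form. The pattern below is made up for illustration and is not the one in leak_detector.rs.

```rust
/// Returns a hypothetical pattern that contains double quotes.
/// Written as r"api_key\s*=\s*"[^"]+"" it would terminate at the first
/// inner quote and fail to compile; the r#"..."# delimiters avoid that.
pub fn quoted_value_pattern() -> &'static str {
    r#"api_key\s*=\s*"[^"]+""#
}
```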
NB😈
5386414666 fix(cron): enable delivery for crons created from external channels
Scheduled jobs created via channel conversations (Discord, Telegram, etc.)
never delivered output back to the channel because:

1. The agent had no channel context (channel name + reply_target) in its
   system prompt, so it could not populate the delivery config.
2. The schedule tool only creates shell jobs with no delivery support,
   and the cron_add tool's delivery schema was opaque.
3. OpenAiCompatibleProvider was missing the native_tool_calling field,
   causing a compile error.

Changes:
- Inject channel context (channel name + reply_target) into the system
  prompt so the agent knows how to address delivery when scheduling.
- Improve cron_add tool description and delivery parameter schema to
  guide the agent toward correct delivery config.
- Update schedule tool description to warn that output is only logged
  and redirect to cron_add for channel delivery.
- Fix missing native_tool_calling field in OpenAiCompatibleProvider.

Co-authored-by: Cursor <cursoragent@cursor.com>
2026-02-24 11:34:12 +08:00
Adam Singer
388e168158 [bug] Regex build failure 2026-02-24 11:34:12 +08:00
argenis de la rosa
5c63ec380a Merge branch 'main' into dev — consolidate all upstream releases 2026-02-23 14:03:17 -05:00
Ken Yeung
ecc8865cb7 feat: add WATI WhatsApp Business API channel (#1472)
Add a new WATI channel for WhatsApp Business API integration via the
WATI managed platform. WATI simplifies WhatsApp integration with its
own REST API and webhook system.

- New WatiChannel implementation (webhook mode, REST send)
- WatiConfig with api_token, api_url, tenant_id, allowed_numbers
- Gateway routes: GET/POST /wati for webhook verification and messages
- Flexible webhook parsing handles WATI's variable field names
- 15 unit tests covering parsing, allowlist, timestamps, phone normalization
2026-02-23 08:02:00 -05:00
Alex
10dd428de1 feat(providers): add Novita AI as OpenAI-compatible provider (#1496)
- Register Novita AI in provider factory with NOVITA_API_KEY env var
- Add to integrations registry with active/available status detection
- Configure onboarding wizard with default model and API endpoint
- Add to PR labeler provider keyword hints
- Update providers reference documentation

Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-02-23 07:58:49 -05:00