- Base branch target: dev
- Problem: ZeroClaw agents have no structured way to decompose complex tasks into trackable steps, leaving the runtime behind every comparable agent runtime
- Why it matters: Without task tracking, multi-step work is fragile (lost on context compression), invisible to users
(no progress signal), and error-prone (agent loses track of what's done vs. pending)
- What changed: Added a session-scoped task_plan tool with create/add/update/list/delete actions, integrated with
SecurityPolicy, registered in the tool factory
- What did not change: No config schema changes, no persistence layer, no CLI subcommand, no changes to agent loop or
any other subsystem
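A minimal sketch of the action surface described above, assuming illustrative names (`TaskPlan`, `TaskStatus`) rather than the exact ZeroClaw types; only three of the five actions are shown:

```rust
// Illustrative sketch only: type and method names are assumptions,
// not the exact ZeroClaw implementation.
#[derive(Debug, Clone, PartialEq)]
enum TaskStatus { Pending, InProgress, Done }

#[derive(Debug, Clone)]
struct Task { id: u32, text: String, status: TaskStatus }

#[derive(Default)]
struct TaskPlan { tasks: Vec<Task>, next_id: u32 }

impl TaskPlan {
    // `create` replaces any existing list (one of the verified edge cases).
    fn create(&mut self, items: Vec<String>) {
        self.tasks.clear();
        self.next_id = 0;
        for text in items { self.add(text); }
    }

    // IDs auto-increment across the session.
    fn add(&mut self, text: String) -> u32 {
        self.next_id += 1;
        self.tasks.push(Task { id: self.next_id, text, status: TaskStatus::Pending });
        self.next_id
    }

    fn list(&self) -> String {
        if self.tasks.is_empty() {
            return "No tasks.".to_string();
        }
        self.tasks.iter()
            .map(|t| format!("{}. [{:?}] {}", t.id, t.status, t.text))
            .collect::<Vec<_>>()
            .join("\n")
    }
}
```

Being session-scoped, the state lives only for the lifetime of the agent session, which is why no persistence layer was needed.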
Label Snapshot
- Risk label: risk: low
- Size label: size: S
- Scope labels: tool
- Module labels: tool: task_plan
- Contributor tier label: (auto-managed)
- If any auto-label is incorrect: N/A
Change Metadata
- Change type: feature
- Primary scope: tool
Linked Issue
- Closes #(issue number)
- Related: N/A
- Depends on: N/A
- Supersedes: N/A
Supersede Attribution
N/A — no superseded PRs.
Validation Evidence
cargo fmt --all -- --check # pass (no output)
cargo clippy --all-targets -- -D warnings # task_plan.rs: 0 warnings (pre-existing warnings in other files are unrelated)
cargo test --lib tools::task_plan # 15/15 passed
- Evidence provided: test output (15 passed, 0 failed)
- If any command is intentionally skipped: cargo clippy reports pre-existing warnings in unrelated files
(onboard/wizard.rs etc.); task_plan.rs itself has zero clippy warnings
Security Impact
- New permissions/capabilities? No — uses existing ToolOperation::Act enforcement
- New external network calls? No
- Secrets/tokens handling changed? No
- File system access scope changed? No
Privacy and Data Hygiene
- Data-hygiene status: pass
- Redaction/anonymization notes: No identity data in code or tests. Test fixtures use neutral strings ("step one",
"do thing", "first")
- Neutral wording confirmation: All naming follows ZeroClaw/project-native conventions
Compatibility / Migration
- Backward compatible? Yes
- Config/env changes? No
- Migration needed? No
i18n Follow-Through
- i18n follow-through triggered? No — no docs or user-facing wording changes
Human Verification
- Verified scenarios: Ran ./target/debug/zeroclaw agent -m "invoke the task_plan tool, action=list" (message sent in Chinese) — the agent correctly identified and called task_plan, returning "No tasks."
- Edge cases checked: read-only mode blocks mutations, empty task list, invalid action names, missing required
parameters, create replaces existing list, ID auto-increment after add
- What was not verified: Behavior with non-CLI channels (Telegram, Discord); behavior with XML-fallback dispatcher
(non-native-tool providers)
Side Effects / Blast Radius
- Affected subsystems/workflows: src/tools/ only — tool factory gains one additional entry
- Potential unintended effects: Marginally increases tool spec payload size sent to LLM (one more tool definition).
Could theoretically cause tool name confusion with schedule if LLM descriptions are ambiguous — mitigated by distinct
naming (task_plan vs schedule) and different description wording.
- Guardrails/monitoring for early detection: Standard tool dispatch logging. Tool is session-scoped so no persistent
side effects on failure.
Agent Collaboration Notes
- Agent tools used: Claude Code for implementation assistance and review
- Workflow/plan summary: Implement Tool trait → register in factory → validate with tests → manual agent session test
- Verification focus: Security policy enforcement, parameter validation edge cases, all 5 action paths
- Confirmation: naming + architecture boundaries followed (CLAUDE.md §6.3, §6.4, §7.3): Yes
Rollback Plan
- Fast rollback command/path: git revert <commit> — removes 3 lines from mod.rs and deletes task_plan.rs
- Feature flags or config toggles: None needed — tool is stateless and session-scoped
- Observable failure symptoms: Tool not appearing in agent tool list, or tool returning errors on valid input
Risks and Mitigations
- Risk: LLM may occasionally confuse task_plan (action: list) with schedule (action: list) due to similar parameter
structure
- Mitigation: Distinct tool names and descriptions; task_plan description emphasizes "session checklist" while
schedule emphasizes "cron/recurring tasks"
format_attachment_content was matching only Photo for [IMAGE:] routing.
Documents with image extensions (jpg, png, gif, webp, bmp) were formatted as
[Document: name] /path, bypassing the multimodal pipeline entirely.
Extend the match arm to cover Document when is_image_extension returns true,
so both Photos and image Documents produce [IMAGE:/path] and reach the provider
as proper vision input blocks.
Adds regression tests covering Document+image extension → [IMAGE:] and
Document+non-image extension → [Document:] paths.
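The routing described above can be sketched as follows, using simplified stand-ins for the real Attachment type and helper (the actual signatures may differ):

```rust
// Simplified stand-ins for illustration; not the exact channel types.
#[derive(Debug)]
enum Attachment {
    Photo { path: String },
    Document { name: String, path: String },
}

fn is_image_extension(path: &str) -> bool {
    matches!(
        path.rsplit('.').next().map(|e| e.to_ascii_lowercase()).as_deref(),
        Some("jpg" | "jpeg" | "png" | "gif" | "webp" | "bmp")
    )
}

fn format_attachment_content(a: &Attachment) -> String {
    match a {
        Attachment::Photo { path } => format!("[IMAGE:{path}]"),
        // New arm: image Documents now route into the multimodal pipeline too.
        Attachment::Document { path, .. } if is_image_extension(path) => {
            format!("[IMAGE:{path}]")
        }
        Attachment::Document { name, path } => format!("[Document: {name}] {path}"),
    }
}
```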
The Anthropic provider had no Image variant in NativeContentOut, so
[IMAGE:data:image/jpeg;base64,...] markers produced by the multimodal
pipeline were sent to the API as plain text. The API counted every
base64 character as a token, reliably exceeding the 200k token limit
for any real image (a typical Telegram-compressed photo produced
~130k tokens of base64 text alone).
Fix:
- Add ImageSource struct and Image variant to NativeContentOut that
serializes to the Anthropic Messages API image content block format
- Add parse_inline_image() to decode data URI markers into Image blocks
- Add build_user_content_blocks() to split user message content into
Text and Image blocks using the existing parse_image_markers helper
- Update convert_messages() user arm to use build_user_content_blocks()
- Handle Image in the apply_cache_to_last_message no-op arm
Fixes #1626
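A hedged sketch of the data-URI parsing step: this stand-in returns the media type and base64 payload that an Anthropic-style image content block needs, while the real parse_inline_image builds the Image block directly.

```rust
// Simplified stand-in for illustration; the real function constructs an
// Image content block rather than returning a tuple.
fn parse_inline_image(marker: &str) -> Option<(String, String)> {
    // Expected form: data:<media_type>;base64,<payload>
    let rest = marker.strip_prefix("data:")?;
    let (media_type, payload) = rest.split_once(";base64,")?;
    Some((media_type.to_string(), payload.to_string()))
}
```

Anything that fails to parse stays a plain Text block, so non-image markers keep their old behavior.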
Set vision: true so image inputs are accepted by the capability gate.
Set native_tool_calling: true to align capabilities() with the existing
supports_native_tools() which always returned true, eliminating the
silent inconsistency between the two.
Adds a unit test that fails if either capability regresses.
Web chat was calling provider.chat_with_history() directly, bypassing
the agent loop. Tool calls were rendered as raw XML instead of executing.
Changes:
- Add tools_registry_exec to AppState for executable tools
- Replace chat_with_history with run_tool_call_loop in ws.rs
- Maintain conversation history per WebSocket session
- Add multimodal and max_tool_iterations config to AppState
Closes #1524
When a Telegram message originates from a forum topic, the thread_id was
extracted and used for reply routing but never stored in ChannelMessage.thread_ts.
This caused all messages from the same sender to share conversation history
regardless of which topic they were posted in.
Changes:
- Set thread_ts to the extracted thread_id in parse_update_message,
try_parse_voice_message, and try_parse_attachment_message
- Use 'ref' in if-let patterns to avoid moving thread_id before it's assigned
- Update conversation_history_key() to include thread_ts when present,
producing keys like 'telegram_<thread_id>_<sender>' for forum topics
- Update conversation_memory_key() to also include thread_ts for memory isolation
This enables proper per-topic session isolation in Telegram forum groups while
preserving existing behavior for regular groups and DMs (where thread_ts is None).
Closes #1532
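The key derivation described above, sketched with illustrative signatures (the real function may take different types):

```rust
// Illustrative sketch of the per-topic key shape; not the exact signature.
fn conversation_history_key(channel: &str, sender: &str, thread_ts: Option<&str>) -> String {
    match thread_ts {
        // Forum-topic messages get per-topic isolation.
        Some(thread_id) => format!("{channel}_{thread_id}_{sender}"),
        // Regular groups and DMs keep the legacy key shape.
        None => format!("{channel}_{sender}"),
    }
}
```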
Replace line-based TOML masking with structured config masking so secret fields keep their original types (including reliability.api_keys arrays).
Hydrate dashboard PUT payloads with runtime config_path/workspace_dir and restore masked secret placeholders from current config before validation/save.
Also allow GET on /api/doctor for dashboard/client compatibility to avoid 405 responses.
- security: honor explicit command paths in allowed_commands list
- security: respect workspace_only=false in resolved path checks
- config: enforce 0600 permissions on every config save (unix)
- config: reject temp-directory paths in active workspace marker
- provider: preserve reasoning_content in tool-call conversation history
- provider: add allow_user_image_parts parameter for minimax compatibility
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The supports_native_tools() method was hardcoded to return true,
but it should return the value of self.native_tool_calling to
properly disable native tool calling for providers like MiniMax.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
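The fix amounts to the following, shown with a stand-in struct:

```rust
// Stand-in struct for illustration; field/method names mirror the
// description above.
struct Provider { native_tool_calling: bool }

impl Provider {
    fn supports_native_tools(&self) -> bool {
        // Before: always `true`. After: honor per-provider configuration so
        // providers like MiniMax can disable native tool calling.
        self.native_tool_calling
    }
}
```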
* fix(channels,providers): remap Docker /workspace paths and enable vision for custom provider
Two fixes:
1. Telegram channel: when a Docker-containerised runtime writes a file to
/workspace/<path>, the host-side sender couldn't find it because the
container mount point differs from the host workspace dir. Remap
/workspace/<rel> → <host_workspace_dir>/<rel> in send_attachment before
the path-exists check so generated media is delivered correctly.
2. Provider factory: custom: provider was created with vision disabled,
causing all image messages to be rejected with a capability error even
though the underlying OpenAI-compatible endpoint supports vision. Switch
to new_with_vision(..., true) so image inputs are forwarded correctly.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
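Fix 1's remap can be sketched as follows (the function name is illustrative, not the one in send_attachment):

```rust
use std::path::{Path, PathBuf};

// Illustrative sketch of the container-to-host path remap.
fn remap_container_path(path: &Path, host_workspace: &Path) -> PathBuf {
    match path.strip_prefix("/workspace") {
        // /workspace/<rel> → <host_workspace_dir>/<rel>
        Ok(rel) => host_workspace.join(rel),
        // Paths outside the container mount are passed through unchanged.
        Err(_) => path.to_path_buf(),
    }
}
```

Running the remap before the path-exists check is the key ordering: the host-side sender then stats the remapped path instead of the nonexistent container path.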
* feat(memory): restore Qdrant vector database backend
Re-adds the Qdrant memory backend that was removed from main in a
recent upstream merge. Restores:
- src/memory/qdrant.rs — full QdrantMemory implementation with lazy
init, HTTP REST client, embeddings, and Memory trait
- src/memory/backend.rs — Qdrant variant in MemoryBackendKind, profile,
classify and profile dispatch
- src/memory/mod.rs — module export, factory routing with build_qdrant_memory
- src/config/schema.rs — QdrantConfig struct and qdrant field on MemoryConfig
- src/config/mod.rs — re-export QdrantConfig
- src/onboard/wizard.rs — qdrant field in MemoryConfig initializer
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
copilot is the only provider that performs a device-code flow automatically on
first run. openai-codex and gemini (when OAuth-backed) require an explicit
`zeroclaw auth login --provider <name>` step. Split the device-flow next-steps
block to reflect this distinction.
Addresses Copilot review comment on PR #1509.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace hardcoded OPENROUTER_API_KEY hint with provider-aware logic:
- keyless local providers (ollama, llamacpp, etc.) show chat/gateway/status hints
- device-flow providers (copilot, gemini, openai-codex) show OAuth/first-run hint
- all other providers show the correct provider-specific env var via provider_env_var()
Also adds canonical alias "github-copilot" -> "copilot" in canonical_provider_name(),
and a new provider_supports_device_flow() helper with accompanying test.
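A sketch of the two helpers named above; the real alias table and device-flow list live in the provider factory, and only the names mentioned in this change are shown:

```rust
// Partial sketch: only the aliases/providers named in this change.
fn canonical_provider_name(name: &str) -> &str {
    match name {
        "github-copilot" => "copilot",
        other => other,
    }
}

fn provider_supports_device_flow(name: &str) -> bool {
    matches!(canonical_provider_name(name), "copilot" | "gemini" | "openai-codex")
}
```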
Additionally fixes pre-existing compile blockers that prevented CI from running:
- fix(security): correct raw string literals in leak_detector.rs that terminated
early due to unescaped " inside r"..." (use r#"..."# instead)
- fix(gateway): add missing wati: None in two test AppState initializations
- fix(gateway): use serde::Deserialize path on WatiVerifyQuery struct
- fix(security): add #[allow(unused_imports)] on new pub use re-exports in mod.rs
- fix(security): remove unused serde::{Deserialize, Serialize} import
- chore: apply cargo fmt to files that had pending formatting diffs
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Scheduled jobs created via channel conversations (Discord, Telegram, etc.)
never delivered output back to the channel because:
1. The agent had no channel context (channel name + reply_target) in its
system prompt, so it could not populate the delivery config.
2. The schedule tool only creates shell jobs with no delivery support,
and the cron_add tool's delivery schema was opaque.
3. OpenAiCompatibleProvider was missing the native_tool_calling field,
causing a compile error.
Changes:
- Inject channel context (channel name + reply_target) into the system
prompt so the agent knows how to address delivery when scheduling.
- Improve cron_add tool description and delivery parameter schema to
guide the agent toward correct delivery config.
- Update schedule tool description to warn that output is only logged
and redirect to cron_add for channel delivery.
- Fix missing native_tool_calling field in OpenAiCompatibleProvider.
Co-authored-by: Cursor <cursoragent@cursor.com>
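The channel-context injection (first change) can be sketched as follows; the function name and prompt wording are illustrative, not the exact text injected:

```rust
// Illustrative sketch: name and wording are assumptions, not the exact
// ZeroClaw system-prompt text.
fn inject_channel_context(system_prompt: &str, channel: &str, reply_target: &str) -> String {
    format!(
        "{system_prompt}\n\nChannel context: channel={channel}, reply_target={reply_target}. \
Use these values in the delivery config when scheduling output via cron_add."
    )
}
```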
Add a new WATI channel for WhatsApp Business API integration via the
WATI managed platform. WATI simplifies WhatsApp integration with its
own REST API and webhook system.
- New WatiChannel implementation (webhook mode, REST send)
- WatiConfig with api_token, api_url, tenant_id, allowed_numbers
- Gateway routes: GET/POST /wati for webhook verification and messages
- Flexible webhook parsing handles WATI's variable field names
- 15 unit tests covering parsing, allowlist, timestamps, phone normalization
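The phone normalization behind the allowlist check might look like this sketch (assumed logic, not the exact WatiChannel implementation):

```rust
// Assumed normalization for illustration: compare digits only, so
// "+1 (555) 010-0000" and "15550100000" match.
fn normalize_phone(raw: &str) -> String {
    raw.chars().filter(|c| c.is_ascii_digit()).collect()
}

fn is_allowed(raw: &str, allowed_numbers: &[&str]) -> bool {
    allowed_numbers
        .iter()
        .any(|allowed| normalize_phone(allowed) == normalize_phone(raw))
}
```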
- Register Novita AI in provider factory with NOVITA_API_KEY env var
- Add to integrations registry with active/available status detection
- Configure onboarding wizard with default model and API endpoint
- Add to PR labeler provider keyword hints
- Update providers reference documentation
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>