Commit Graph

1944 Commits

Author SHA1 Message Date
李龙 0668001470
a2fdf64a5e
fix(security): block runtime config state edits 2026-03-24 15:29:02 +03:00
Alix-007
d43ce63b4d
feat(skills): add read_skill for compact mode 2026-03-24 15:29:02 +03:00
Alix-007
40e2aaa318
Fix /new regression test lint scope 2026-03-24 15:29:02 +03:00
Alix-007
f9c7d18caf
Refresh skills after new channel sessions 2026-03-24 15:29:02 +03:00
Alix-007
68f4fbd550
fix(tools): normalize workspace-prefixed paths 2026-03-24 15:29:02 +03:00
Alix-007
b19b6822db
style(zai): satisfy rustfmt in tool_stream request 2026-03-24 15:29:02 +03:00
Alix-007
bf4c4d81a0
fix(zai): send tool_stream for tool-capable requests 2026-03-24 15:29:01 +03:00
Alix-007
8275649517
test(claude_code): isolate echo script per test run 2026-03-24 15:29:01 +03:00
Alix-007
50cef2544a
chore(pr): restore merge-base docs file for #3356 2026-03-24 15:29:01 +03:00
李龙 0668001470
7283f3c6fe
test(config): move initialized log regression away from merge hotspot 2026-03-24 15:28:49 +03:00
李龙 0668001470
41f8773a3c
fix(config): avoid clippy used_underscore_binding 2026-03-24 15:28:48 +03:00
Alix-007
4544947a15
fix(config): log existing config as initialized 2026-03-24 15:28:47 +03:00
argenis de la rosa
94ee9746e7
fix(docs): update banner image and add Instagram badge to all READMEs
- Replace zeroclaw.png (broken/outdated) with zeroclaw-banner.png
  across all 30 translated README files
- Add Instagram social badge (@therealzeroclaw) to all translations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:27:55 +03:00
Argenis
39cf6ab867
fix: add Node.js build dependency to Homebrew formula template (#3996)
The Homebrew publish workflow downloads a source tarball but never
ensures Node.js is available during the build. Without it, build.rs
cannot run npm to compile the web frontend, so Homebrew-installed
ZeroClaw ships without the web dashboard.

Add `depends_on "node" => :build` to the formula by inserting it
after the existing `depends_on "rust" => :build` line. This lets
build.rs detect npm and automatically run `npm ci && npm run build`
to produce the web/dist assets.

Fixes #3991
2026-03-24 15:27:55 +03:00
Argenis
296fff7af9
fix(config): enable compact_context by default (#3995)
* fix: change compact_context default to true

Local LLMs with limited context windows immediately run out of context
when compact_context defaults to false. The system prompt alone can
consume 25K+ tokens, exceeding even 55K context windows with history.

Setting compact_context=true by default limits system prompt injection
to 6000 chars and RAG results to 2 chunks, making the agent usable
with smaller models out of the box.

Fixes #3987

* docs: update compact_context default to true in config reference

Update all locale variants (en, zh-CN, vi) to reflect the new default.

* test: update tests to expect compact_context default of true

Update assertions in schema.rs unit tests and config_persistence.rs
component tests to match the new default value.
2026-03-24 15:27:55 +03:00
Alix-007
1866888e9f
build(release): drop stale docker feature args 2026-03-24 15:27:53 +03:00
Alix-007
46b56b1f5f
build(release): ship postgres-capable release artifacts 2026-03-24 15:27:15 +03:00
Argenis
c41009d29f
fix(cron): persist allowed_tools for agent jobs (#3993)
Persist allowed_tools in cron_jobs table, threading it through CLI add/update and cron_add/cron_update tool APIs. Add regression coverage for store, tool, and CLI roundtrip paths.

Fixups over original PR #3929: add allowed_tools to all_overdue_jobs SELECT (merge gap), resolve merge conflicts.

Closes #3920
Supersedes #3929
2026-03-24 15:26:29 +03:00
Alix-007
b6bc332b68
feat(slack): add thread_replies channel option (#3930)
Add a thread_replies option to Slack channel config (default true). When false, replies go to channel root instead of the originating thread.

Closes #3888
2026-03-24 15:26:28 +03:00
Argenis
53802d6d04
fix(anthropic): always apply cache_control to system prompts (#3990)
* fix: always use Blocks format for system prompts with cache_control

System prompts under 3KB were wrapped in SystemPrompt::String which
cannot carry cache_control headers, resulting in 0% cache hit rate
on Haiku 4.5. Always use SystemPrompt::Blocks with ephemeral
cache_control regardless of prompt size.

Fixes #3977

* fix: lower conversation caching threshold from >4 to >1 messages

The previous threshold of >4 non-system messages was too restrictive,
delaying cache benefits until 5+ turns. Lower to >1 so caching kicks
in after the first user+assistant exchange.

Fixes #3977

* test: update anthropic cache tests for new thresholds and Blocks format

- convert_messages_small_system_prompt now expects Blocks with
  cache_control instead of String variant
- should_cache_conversation tests updated for >1 threshold
- backward_compatibility test replaced with blocks-system test
2026-03-24 15:26:28 +03:00
Argenis
1ff411d8f3
fix(security): wire sandbox into shell command execution (#3989)
* fix: add sandbox field to ShellTool struct

Add `sandbox: Arc<dyn Sandbox>` field to `ShellTool` and a
`new_with_sandbox()` constructor so callers can inject the configured
sandbox backend. The existing `new()` constructor defaults to
`NoopSandbox` for backward compatibility.

Ref: #3983

* fix: apply sandbox wrapping in ShellTool::execute()

Call `self.sandbox.wrap_command()` on the underlying std::process::Command
(via `as_std_mut()`) after building the shell command and before clearing
the environment. This ensures every shell command passes through the
configured sandbox backend before execution.

Ref: #3983

* fix: wire up sandbox creation at ShellTool callsites

In `all_tools_with_runtime()`, create a sandbox from
`root_config.security` via `create_sandbox()` and pass it to
`ShellTool::new_with_sandbox()`. The `default_tools_with_runtime()`
path retains `ShellTool::new()` which defaults to `NoopSandbox`.

Ref: #3983

* test: add sandbox integration tests for ShellTool

Verify that ShellTool can be constructed with a sandbox via
`new_with_sandbox()`, that NoopSandbox leaves commands unmodified,
and that command execution works end-to-end with a sandbox attached.

Ref: #3983
2026-03-24 15:26:28 +03:00
Argenis
d33127840b
fix(web): restore accidentally deleted logo file (#3988)
* fix: restore accidentally deleted logo file

The logo.png was removed in commit 48bdbde2 but is still referenced
by the web UI components. Restore it from git history.

Fixes #3984

* fix: copy logo to web/dist for rust-embed

The Rust binary embeds files from web/dist/ via rust-embed, so the
logo must also be present there to be served without a rebuild.

Fixes #3984
2026-03-24 15:26:28 +03:00
Alix-007
6f08b15d8b
fix(cron): default channel delivery to active reply target 2026-03-24 15:26:28 +03:00
Alix-007
ffa1ee4d3b
fix(onboard): warn when Homebrew service uses another workspace 2026-03-24 15:26:28 +03:00
Alix-007
2ca9ca3285
fix(skills): narrow shell shebang detection (#3944)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-24 15:26:27 +03:00
Martin
eae2fac48a
fix(install): fix guided installer in LXC/container environments (#3947)
Replace subshell-based /dev/stdin probing in guided_input_stream with a
file-descriptor approach (guided_open_input) that works reliably in LXC
containers accessed over SSH.

The previous implementation probed /dev/stdin and /proc/self/fd/0 via
subshells before falling back to /dev/tty. In LXC containers these
probes fail even when FD 0 is perfectly usable, causing the guided
installer to exit with "requires an interactive terminal".

The fix:
- When stdin is a terminal (-t 0), assign GUIDED_FD=0 directly without
  any subshell probing — trusting the kernel's own tty check
- Otherwise, open /dev/tty as an explicit fd (exec {GUIDED_FD}</dev/tty)
- guided_read uses `read -u "$GUIDED_FD"` instead of `< "$file_path"`
- Add echo after silent reads (password prompts) for correct line handling

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:26:27 +03:00
Giulio V
49d68e55f2
fix(cron): add startup catch-up and drop login shell flag (#3948)
* fix(cron): add startup catch-up and drop login shell flag

Problems:
1. When ZeroClaw started after downtime (late boot, daemon restart),
   overdue jobs were picked up via `due_jobs()` but limited by
   `max_tasks` per poll cycle — with many overdue jobs, catch-up
   could take many cycles.
2. Cron shell jobs used `sh -lc` (login shell), which loads the
   full user profile on every execution — slow and may cause
   unexpected side effects.

Fixes:
- Add `all_overdue_jobs()` store query without `max_tasks` limit
- Add `catch_up_overdue_jobs()` startup phase that runs ALL overdue
  jobs once before entering the normal polling loop
- Extract `build_cron_shell_command()` helper using `sh -c` (non-login)
- Add structured tracing for catch-up progress
- Add tests for all new functions

* feat(cron): make catch-up configurable via API and control panel

Add `catch_up_on_startup` boolean to `[cron]` config (default: true).
When enabled, the scheduler runs all overdue jobs at startup before
entering the normal polling loop. Users can toggle this from:

- The Cron page toggle switch in the control panel
- PATCH /api/cron/settings { "catch_up_on_startup": false }
- The `[cron]` section of the TOML config editor

Also adds GET /api/cron/settings endpoint to read cron subsystem
settings without parsing the full config.

* fix(config): add catch_up_on_startup to CronConfig test constructors

The CI Lint job failed because the `cron_config_serde_roundtrip` test
constructs CronConfig directly and was missing the new field.
2026-03-24 15:26:27 +03:00
argenis de la rosa
4b6f351a13
fix(ci): harden AUR SSH key setup and add diagnostics (#3952)
The AUR publish step fails with "Permission denied (publickey)".
Root cause is likely key formatting (Windows line endings from
GitHub secrets UI) or missing public key registration on AUR.

Changes:
- Normalize line endings (strip \r) when writing SSH key
- Set correct permissions on ~/.ssh (700) and ~/.ssh/config (600)
- Validate key with ssh-keygen before attempting clone
- Add SSH connectivity test for clearer error diagnostics

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:26:27 +03:00
argenis de la rosa
ba9b41de8a
chore: bump version to 0.5.1
Release highlights:
- Autonomy enforcement in gateway and channel paths (#3952)
- conversational_ai startup warning for unimplemented config (#3958)
- Heartbeat default interval 30→5min (#3938)
- Provider timeout and error handling improvements (#3973, #3978)
- Docker/CI postgres and Lark feature fixes (#3971, #3933)
- Tool path resolution fix (#3937)
- OTP config fix (#3936)
- README: Instagram badge + banner image (#3979)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:26:27 +03:00
argenis de la rosa
c1791eaec3
docs(readme): add Instagram social badge and switch to banner image
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:26:27 +03:00
Argenis
8db392de7c
fix(providers): exempt tool schema errors from non-retryable classification (#3978)
* fix: exempt tool schema validation errors from non-retryable classification

Groq returns 400 "tool call validation failed" which was classified as
non-retryable by is_non_retryable(), preventing the provider-level
fallback in compatible.rs from executing. Add is_tool_schema_error()
to detect these errors and return false from is_non_retryable(), allowing
the retry loop to pass control back to the provider's built-in fallback.

Fixes #3757

* test: add unit tests for tool schema error detection in reliable.rs

Verify is_tool_schema_error detects Groq-style validation failures and
that is_non_retryable returns false for tool schema 400s while still
returning true for other 400 errors like invalid API key.

* fix: escape format braces in test string literals for cargo check

The anyhow::anyhow! macro interprets curly braces as format
placeholders. Use explicit format argument to pass JSON-containing
strings in tests.
2026-03-24 15:26:26 +03:00
argenis de la rosa
64efb72605
fix(agent): enforce autonomy level in gateway and channel paths (#3952)
- Channel tool filtering (`non_cli_excluded_tools`) now respects
  `autonomy.level = "full"` — full-autonomy agents keep all tools
  available regardless of channel.
- Gateway `process_message` now creates and passes an `ApprovalManager`
  to `agent_turn`, so `ReadOnly`/`Supervised` policies are enforced
  instead of silently skipped.
- Gateway also applies `non_cli_excluded_tools` filtering with the same
  full-autonomy bypass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:26:26 +03:00
argenis de la rosa
bbc470fe07
fix(config): warn when conversational_ai.enabled is set (#3958)
The conversational_ai config section is parsed but not yet consumed by
any runtime code. Emit a startup warning so users know the setting is
ignored, and update the doc comment to mark it as reserved for future use.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:26:26 +03:00
Argenis
be3ba4f9a9
fix(ci): include channel-lark feature in precompiled release binaries (#3933) (#3972)
Add channel-lark to the cargo --features flag in all release and
cross-platform build workflows, and to the Docker build-args.  This
gives users Feishu/Lark channel support out of the box without needing
to compile from source.

The channel-lark feature depends only on dep:prost (pure Rust protobuf),
so it is safe to enable on all platforms (Linux, macOS, Windows, Android).
2026-03-24 15:26:26 +03:00
Argenis
8ee3e71e90
fix(openrouter): respect provider_timeout_secs and improve error messages (#3973)
* fix(openrouter): wire provider_timeout_secs through factory

Apply the configured provider_timeout_secs to OpenRouterProvider
in the provider factory, matching the pattern used for compatible
providers.

* fix(openrouter): add timeout_secs field to OpenRouterProvider

Add a configurable timeout_secs field (default 120s) and a
with_timeout_secs() builder method so the HTTP client timeout
can be overridden via provider config instead of being hardcoded.

* refactor(openrouter): improve response decode error messages

Read the response body as text first, then parse with
serde_json::from_str so that decode failures include a truncated
snippet of the raw body for easier debugging.

* test(openrouter): add timeout_secs configuration tests

Verify that the default timeout is 120s and that with_timeout_secs
correctly overrides it.

* style: run rustfmt on openrouter.rs
2026-03-24 15:26:24 +03:00
Alix-007
8f241cd4b7
fix(openrouter): respect provider timeout config 2026-03-24 15:17:35 +03:00
Roman Tataurov
f01ec415a5
Fix models refresh 2026-03-24 15:17:35 +03:00
Argenis
72bf00a409
fix(docker): add memory-postgres feature to Docker and CI builds (#3971)
The GHCR image v0.5.0 fails to start when users configure the postgres
memory backend because the binary was compiled without the
memory-postgres cargo feature. Add it to all build configurations:
Dockerfiles (default ARG), release workflows (cargo build --features),
and Docker push steps (ZEROCLAW_CARGO_FEATURES build-arg).

Fixes #3946
2026-03-24 15:17:35 +03:00
Argenis
96e2a324d1
fix: make channel system prompt respect autonomy.level = full (#3952) (#3970)
When autonomy.level is set to "full", the channel/web system prompt no
longer includes instructions telling the model to ask for permission
before executing tools. Previously these safety lines were hardcoded
regardless of autonomy config, causing the LLM to simulate approval
dialogs in channel and web-interface modes even though the
ApprovalManager correctly allowed execution.

The fix adds an autonomy_level parameter to build_system_prompt_with_mode
and conditionally omits the "ask before acting" instructions when the
level is Full. Core safety rules (no data exfiltration, prefer trash)
are always included.
2026-03-24 15:17:35 +03:00
Argenis
8856c5fa95
fix: omit experimental conversational_ai section from default config (#3969)
The [conversational_ai] config section was serialized into every
freshly-generated config.toml despite the feature being experimental
and not yet wired into the agent runtime. This confused new users who
found an undocumented section in their config.

Add skip_serializing_if = "ConversationalAiConfig::is_disabled" so the
section is only written when a user has explicitly enabled it. Existing
configs that already contain the section continue to deserialize
correctly via #[serde(default)].

Fixes #3958
2026-03-24 15:17:34 +03:00
Argenis
ae98c91596
feat(heartbeat): default interval 30→5min + prune heartbeat from auto-save (#3938)
Lower the default heartbeat interval to 5 minutes to match the renewable
partial wake-lock cadence. Add `[heartbeat task` to the memory auto-save
skip filter so heartbeat prompts (both Phase 1 decision and Phase 2 task
execution) do not pollute persistent conversation memory.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:17:34 +03:00
Argenis
84e4adbcd2
fix(tools): use resolve_tool_path for consistent path resolution (#3937)
Replace workspace_dir.join(path) with resolve_tool_path(path) in
file_write, file_edit, and pdf_read tools to correctly handle absolute
paths within the workspace directory, preventing path doubling.

Closes #3774
2026-03-24 15:17:34 +03:00
Argenis
7a65850d45
fix(config): add missing challenge_max_attempts field to OtpConfig (#3919) (#3936)
The OtpConfig struct uses deny_unknown_fields but was missing the
challenge_max_attempts field, causing zeroclaw config schema to fail
with a TOML parse error when the field appeared in config files.

Add challenge_max_attempts as an Option<u32>-style field with a default
of 3 and a validation check ensuring it is greater than 0.
2026-03-24 15:17:34 +03:00
Argenis
b0cd596f9a
fix: move misplaced include key from [lib] to [package] in Cargo.toml (#3935)
The `include` array was placed after `[lib]` without a section header,
causing Cargo to parse it as `lib.include` — an invalid manifest key.
This triggered a warning during builds and caused lockfile mismatch
errors when building with --locked in Docker (Dockerfile.debian).

Move the `include` key to the `[package]` section where it belongs and
regenerate Cargo.lock to stay in sync.

Fixes #3925
2026-03-24 15:17:34 +03:00
argenis de la rosa
9361f7e7eb
fix(gateway): move pairing code below dashboard URL in terminal banner
Repositions the one-time pairing code display to appear directly below
the dashboard URL for cleaner terminal output, and removes the duplicate
display that was showing at the bottom of the route list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:17:34 +03:00
Argenis
121e8b86a2
feat(skills): autonomous skill creation from multi-step tasks (#3916)
Add SkillCreator module that persists successful multi-step task
executions as reusable SKILL.toml definitions under the workspace
skills directory.

- SkillCreationConfig in [skills.skill_creation] (disabled by default)
- Slug validation, TOML generation, embedding-based deduplication
- LRU eviction when max_skills limit is reached
- Agent loop integration post-success
- Gated behind `skill-creation` compile-time feature flag

Closes #3825.
2026-03-24 15:17:34 +03:00
Argenis
ba23480e01
fix: ensure SOUL.md and IDENTITY.md exist in non-tty sessions (#3915)
When the workspace is created outside of `zeroclaw onboard` (e.g., via
cron, daemon, or `< /dev/null`), SOUL.md and IDENTITY.md were never
scaffolded, causing the agent to activate without identity files.

Added `ensure_bootstrap_files()` in `Config::load_or_init()` that
idempotently creates default SOUL.md and IDENTITY.md if missing.

Closes #3819.
2026-03-24 15:17:33 +03:00
Argenis
cc463c0df4
feat(delegate): make sub-agent timeouts configurable via config.toml (#3909)
Add `timeout_secs` and `agentic_timeout_secs` fields to
`DelegateAgentConfig` so users can tune per-agent timeouts instead
of relying on the hardcoded 120s / 300s defaults.

Validation rejects values of 0 or above 3600s, matching the pattern
used by MCP timeout validation.

Closes #3898
2026-03-24 15:17:33 +03:00
Argenis
81d99f513c
feat(i18n): externalize tool descriptions for translation (#3912)
Add a locale-aware tool description system that loads translations from
TOML files in tool_descriptions/. This enables non-English users to see
tool descriptions in their language.

- Add src/i18n.rs module with ToolDescriptions loader, locale detection
  (ZEROCLAW_LOCALE, LANG, LC_ALL env vars), and English fallback chain
- Add locale config field to Config struct for explicit locale override
- Create tool_descriptions/en.toml with all 47 tool descriptions
- Create tool_descriptions/zh-CN.toml with Chinese translations
- Integrate with ToolsSection::build() and build_tool_instructions()
  to resolve descriptions from locale files before hardcoded fallback
- Add PromptContext.tool_descriptions field for prompt-time resolution
- Add AgentBuilder.tool_descriptions() setter for Agent construction
- Include tool_descriptions/ in Cargo.toml package include list
- Add 8 unit tests covering locale loading, fallback chains, env
  detection, and config override

Closes #3901
2026-03-24 15:17:33 +03:00
argenis de la rosa
3430f9bf1a
fix(test): use PID-scoped script path to prevent ETXTBSY in CI
The echo_provider() test helper writes a fake_claude.sh script to
a shared temp directory. When lib and bin test binaries run in
parallel (separate processes, separate OnceLock statics), one
process can overwrite the script while the other is executing it,
causing "Text file busy" (ETXTBSY). Scope the filename with PID
to isolate each test process.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 15:17:33 +03:00