Compare commits


921 Commits

Argenis
feaca20582
feat: desktop companion app + device-aware installer + CI/CD (#4500)
* feat: add desktop companion app integration and CI/CD pipeline

- Add `zeroclaw desktop` CLI command to launch/install companion app
- Add device-aware installer (desktop/server/mobile/embedded/container)
- Replace from-source Tauri build with pre-built .dmg download flow
- Add `build-desktop` job to beta and stable release workflows
- Build universal macOS .dmg via Tauri on macos-14 runners
- Include .dmg in GitHub Release assets alongside CLI binaries

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: bump version to 0.6.1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: update Cargo.lock for 0.6.1

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-24 03:59:11 -04:00
Nim G
40af505e90
feat(security): wire LeakDetector into outbound message path (#4457)
* feat(security): wire LeakDetector into outbound message path

Wire LeakDetector.scan() into channel response sanitization (via
sanitize_channel_response) and cron job delivery (via
deliver_announcement) to prevent credential leakage to external
channels.

Changes:
- src/channels/mod.rs: Add leak detection to sanitize_channel_response()
  with tracing::warn! on detection. Follows prior art from commit fb3b7b8e.
- src/cron/scheduler.rs: Add leak detection to deliver_announcement()
  before sending output to any channel (Telegram, Discord, Slack,
  Mattermost, Signal, Matrix, WhatsApp, QQ).
- Added unit tests in both modules to verify redaction behavior.

All outbound channel messages and cron job delivery are now scanned for
API keys, AWS credentials, JWTs, database URLs, private keys, and
high-entropy tokens. Detected leaks are redacted before delivery and
logged via tracing::warn! with pattern details.

Ref: SPEC-1.3-leak-detector-outbound.md (gap closure §1.3)
Prior art: commit fb3b7b8e (zeroclaw_homecomming branch)

* fix(security): enforce leak redaction with RedactedOutput newtype

Replace raw String return from scan_and_redact_output with a
RedactedOutput newtype. All channel dispatch in deliver_announcement
must go through this type, so the compiler rejects any path that
skips leak detection. Replaces tests that called LeakDetector in
isolation with tests that exercise scan_and_redact_output through
the RedactedOutput boundary.
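
A minimal sketch of the newtype boundary this commit describes, with illustrative names and a placeholder scan body (the real LeakDetector matches many credential patterns):

```rust
/// Proof-carrying output: the private field means the only way to
/// construct one is scan_and_redact_output below, so any dispatch path
/// that demands RedactedOutput cannot skip leak detection.
pub struct RedactedOutput(String);

impl RedactedOutput {
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Placeholder for LeakDetector::scan() plus redaction; the real
/// detector matches API keys, AWS credentials, JWTs, private keys, and
/// high-entropy tokens.
pub fn scan_and_redact_output(raw: &str) -> RedactedOutput {
    RedactedOutput(raw.replace("AKIA", "[REDACTED]"))
}

/// Channel dispatch accepts only the newtype, so the compiler rejects
/// any caller still holding a raw String.
pub fn deliver_announcement(message: &RedactedOutput) {
    println!("-> {}", message.as_str());
}

fn main() {
    let safe = scan_and_redact_output("aws key: AKIA0123");
    deliver_announcement(&safe);
}
```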
2026-03-24 01:43:49 -04:00
Nim G
f812dbcb85
feat(channels): add message redaction API to Channel trait (#4458)
* feat(channels): add message redaction API to Channel trait

Add redact_message() method to Channel trait with default no-op implementation.
Implement for MatrixChannel using existing room.redact() SDK method.
Add comprehensive test coverage for trait default and Matrix implementation.

Changes:
- src/channels/traits.rs: Add redact_message() default method and test
- src/channels/matrix.rs: Implement redact_message() for MatrixChannel
- tests/integration/channel_matrix.rs: Add RedactMessage event, test channel impl, and lifecycle test

Tests added:
- default_redact_message_returns_success (trait default)
- redact_message_lifecycle (MatrixTestChannel)
- Updated minimal_channel_all_defaults_succeed

All tests pass. No breaking changes.

* test(channels): clarify redact_message test coverage

* fix(channels): use OwnedRoomId for matrix redact_message room lookup

RoomId is unsized (wraps str), so inlining .parse() in
get_room(&target_room.parse()?) fails to compile with the
channel-matrix feature. Use an intermediate OwnedRoomId binding,
matching the pattern used in send() and listen().
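
The compile error comes from borrowing an unsized temporary. A self-contained sketch with a stand-in OwnedRoomId (the real types live in matrix-sdk/ruma):

```rust
use std::str::FromStr;

/// Stand-in for matrix-sdk's OwnedRoomId. The SDK's borrowed RoomId
/// wraps str and is unsized, so a parsed id needs a binding to live in
/// before it can be borrowed.
struct OwnedRoomId(String);

impl FromStr for OwnedRoomId {
    type Err = String;
    fn from_str(s: &str) -> Result<Self, Self::Err> {
        if s.starts_with('!') {
            Ok(OwnedRoomId(s.to_owned()))
        } else {
            Err(format!("not a room id: {s}"))
        }
    }
}

/// Stand-in for Client::get_room(&RoomId).
fn get_room(_id: &OwnedRoomId) -> Option<&'static str> {
    Some("room")
}

fn main() -> Result<(), String> {
    let target_room = "!example:server.org";
    // The fix: bind the owned id first instead of inlining
    // get_room(&target_room.parse()?), which fails to compile against
    // the real unsized RoomId. Matches the pattern in send()/listen().
    let room_id: OwnedRoomId = target_room.parse()?;
    let _room = get_room(&room_id).ok_or("room not found")?;
    Ok(())
}
```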
2026-03-24 01:43:41 -04:00
Nim G
59225d97b3
feat(memory): add bulk memory deletion by namespace and session (#4459)
Add purge_namespace() and purge_session() methods to Memory trait with
default bail! implementations. Implement both in SQLite backend using
spawn_blocking pattern matching forget(). Expose via new memory_purge
tool with security enforcement.

Changes:
- src/memory/traits.rs: Add purge_namespace/purge_session trait methods
- src/memory/sqlite.rs: Implement purge methods + 8 backend tests
- src/tools/memory_purge.rs: New tool following memory_forget pattern
- src/tools/mod.rs: Register memory_purge module and export

Tests:
- SQLite backend: 8 tests (namespace/session purge, preservation, count, noop)
- Tool: 8 tests (name/schema, purge success, noop, missing param, readonly/rate-limit)

All tests pass. Clippy clean. Formatted with cargo fmt.
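
A sketch of the trait shape this implies, assuming anyhow and async-trait as used elsewhere in the codebase; signatures and the SQLite body are illustrative:

```rust
use anyhow::{bail, Result};
use async_trait::async_trait;

/// Defaults bail!, so existing backends compile unchanged and report
/// "unsupported" at runtime; only opted-in backends override them.
#[async_trait]
pub trait Memory: Send + Sync {
    async fn purge_namespace(&self, _namespace: &str) -> Result<u64> {
        bail!("this memory backend does not support bulk namespace purge")
    }

    async fn purge_session(&self, _session_id: &str) -> Result<u64> {
        bail!("this memory backend does not support bulk session purge")
    }
}

pub struct SqliteMemory;

#[async_trait]
impl Memory for SqliteMemory {
    async fn purge_namespace(&self, namespace: &str) -> Result<u64> {
        let ns = namespace.to_owned();
        // spawn_blocking keeps the synchronous SQLite delete off the
        // async executor, matching the forget() pattern. purge_session
        // would be analogous.
        tokio::task::spawn_blocking(move || {
            // e.g. DELETE FROM memories WHERE namespace = ?1
            let _ = ns;
            Ok(0u64) // placeholder deleted-row count
        })
        .await?
    }
}
```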
2026-03-24 01:43:39 -04:00
Nim G
483f773e1d
feat(ci): add per-component path labels and labeler workflow (#4461)
* feat(ci): add per-component labels for channels, providers, and tools

labeler.yml only had base scope labels (channel, provider, tool) matching
entire directories. The PR workflow and ci-map specs already describe
fine-grained module labels (channel:telegram, provider:kimi, tool:shell)
but the labeler config never implemented them.

Adds 27 channel, 13 provider, and 12 tool group entries. Base labels are
preserved for the compaction rule in pr-labeler.yml.

* feat(ci): add pr-path-labeler workflow to invoke actions/labeler

labeler.yml has path-label config but no workflow was invoking it.
Adds pr-path-labeler.yml on pull_request_target to apply path and
per-component labels automatically. Updates actions source policy
doc to reflect the new workflow and action.

* docs(contributing): add label registry as single reference for all PR labels

Label definitions were scattered across labeler.yml, label-policy.json,
pr-workflow.md, and ci-map.md with no consolidated view. Adds a single
registry documenting all 99 labels by category, their definitions, how
each is applied, and which automation exists versus which is specced but missing.

* docs(contributing): correct label-registry automation status

The CI was simplified from 20+ workflows to 4. Size, risk,
contributor tier, and triage label automation (pr-labeler.yml,
pr-auto-response.yml, pr-check-stale.yml) were removed during
that simplification. Update the registry to reflect that these
labels are now applied manually, not "not yet implemented."

See PR #2931 for the upstream docs cleanup.
2026-03-24 01:43:35 -04:00
Argenis
b4bbe820a2
feat(memory): add pgvector support and Postgres knowledge graph (#4028) (#4488)
* feat(memory): add pgvector support and Postgres knowledge graph (#4028)

Enhance PostgresMemory with optional pgvector extension for hybrid
vector+keyword recall. Add namespace and importance columns. Create
PgKnowledgeGraph with recursive CTEs for graph traversal (no AGE
dependency). Extend ConsolidationResult with facts/trend fields for
richer extraction. All behind memory-postgres feature flag.

New file: knowledge_graph_pg.rs (5 unit tests)
Modified: postgres.rs (pgvector init, namespace/importance columns),
consolidation.rs (facts/trend fields), StorageProviderConfig

* fix: update PostgresMemory::new call in CLI with pgvector params
2026-03-24 01:04:01 -04:00
Argenis
1702bb2747
fix: route WebSocket connections through configured proxy (#4408)
* fix: route WebSocket connections through configured proxy (#4408)

tokio_tungstenite::connect_async does not honour proxy settings. Add
proxy-aware WebSocket connect helpers (HTTP CONNECT and SOCKS5) in
config::schema and update all six channel WebSocket connections
(discord, discord_history, slack, dingtalk, lark, qq) to use
ws_connect_with_proxy instead of connect_async.

* fix: update Cargo.lock with tokio-socks dependency
2026-03-24 01:02:46 -04:00
Argenis
b6f661c3c5
feat(agent): add complexity-based eval and auto-classify routing (#3928) (#4486)
Add heuristic complexity estimation (Simple/Standard/Complex) and
post-response quality evaluation to enable cheapest-capable model
routing. When rule-based classification misses, auto_classify falls
back to complexity-tier hints for model selection. Eval gate checks
response quality and suggests retry with higher-tier model if below
threshold.

New file: eval.rs (14 unit tests)
Modified: AgentConfig (eval, auto_classify fields), classify_model()
2026-03-24 00:51:18 -04:00
Argenis
ac543cff20
fix(tools): allow git_operations to run in workspace subdirectories (#4483)
Fixes #4409. The git_operations tool now accepts an optional 'path'
parameter to specify a subdirectory within the workspace. Path
traversal outside the workspace is rejected for security.
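
A sketch of the containment check, under the assumption that both sides are canonicalized before comparison; names are illustrative:

```rust
use std::io::{Error, ErrorKind};
use std::path::{Path, PathBuf};

/// Resolve an optional user-supplied subdirectory against the workspace
/// root, rejecting anything that escapes it.
fn resolve_repo_path(workspace: &Path, sub: Option<&str>) -> std::io::Result<PathBuf> {
    let candidate = match sub {
        Some(s) => workspace.join(s),
        None => workspace.to_path_buf(),
    };
    // Canonicalize both sides so `..` segments and symlinks cannot
    // smuggle the resolved path outside the workspace.
    let root = workspace.canonicalize()?;
    let resolved = candidate.canonicalize()?;
    if resolved.starts_with(&root) {
        Ok(resolved)
    } else {
        Err(Error::new(ErrorKind::PermissionDenied, "path escapes the workspace"))
    }
}
```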
2026-03-24 00:44:41 -04:00
Argenis
c88affa020
fix(cron): treat empty allowed_tools as None and sanitize tool names (#4482)
Fixes #4442. Empty allowed_tools arrays caused all tools to be filtered
out, making cron jobs fail. Also adds tool name validation in the
OpenRouter provider to skip tools with names that violate the OpenAI
naming regex.
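
The empty-array fix amounts to normalizing Some(vec![]) to None before any filtering; a minimal sketch:

```rust
/// Treat a deserialized empty allowed_tools array the same as "not
/// configured": an empty list would otherwise filter out every tool
/// and make the cron job fail.
fn normalize_allowed_tools(allowed: Option<Vec<String>>) -> Option<Vec<String>> {
    match allowed {
        Some(list) if list.is_empty() => None, // [] means no restriction
        other => other,
    }
}

fn main() {
    assert_eq!(normalize_allowed_tools(Some(vec![])), None);
    assert_eq!(normalize_allowed_tools(None), None);
    assert_eq!(
        normalize_allowed_tools(Some(vec!["shell".into()])),
        Some(vec!["shell".into()])
    );
}
```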
2026-03-24 00:44:39 -04:00
Argenis
67998ad702
fix(docker): add [autonomy] section with auto_approve to Docker configs (#4481)
Fixes #4445. Docker/podman containers running in daemon mode had no
[autonomy] section in the baked config.toml, causing file_write and
file_edit tools to be auto-denied in non-interactive mode.
2026-03-24 00:44:32 -04:00
Argenis
c104b23ddb
feat(agent): add token efficiency analyzer layer (#3892) (#4484)
Add history pruning and context-aware tool filtering to reduce token
usage per API call. History pruner collapses old tool call/result pairs
and drops oldest messages when over budget. Context analyzer suggests
relevant tools per iteration based on keyword matching.

New files: history_pruner.rs, context_analyzer.rs
Modified: AgentConfig (new fields), ToolFilterGroup (filter_builtins),
agent loop (pruning integration)
2026-03-24 00:38:15 -04:00
Argenis
698adca707
fix: add channel-lark to default features (#4472)
Resolves #3540. Lark/Feishu channel was not included in default features, causing builds via cargo install, brew, and other standard install methods to produce binaries without Lark support. Also fixes pre-existing lark audio test failures that were previously hidden.
2026-03-24 00:23:43 -04:00
Argenis
50a877b4c1
fix(ci): use shared concurrency group for master push events (#4479)
The concurrency group used github.sha for push events, giving each
merge its own group. Back-to-back merges queued multiple full CI runs
that never cancelled each other, exhausting runner capacity.

Use a fixed 'push-master' group so only the latest push to master
runs CI — older runs are cancelled automatically.
2026-03-24 00:15:26 -04:00
Argenis
fa7b615508
feat(providers): support Bearer token API keys for Amazon Bedrock (#4473)
Add BedrockAuth enum supporting both SigV4 and BearerToken authentication.
BEDROCK_API_KEY env var takes precedence over AWS AKSK credentials.
When a Bearer token is provided, requests use Authorization: Bearer header
instead of SigV4 signing. Existing SigV4 auth remains fully backward
compatible.

Closes #3742
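
A sketch of the auth resolution order described here; the enum and env var come from the commit text, the rest is illustrative:

```rust
/// How a Bedrock request authenticates.
enum BedrockAuth {
    /// Classic AWS credentials; requests are signed with SigV4.
    SigV4,
    /// Pre-issued key sent as `Authorization: Bearer <token>`.
    BearerToken(String),
}

fn resolve_bedrock_auth() -> BedrockAuth {
    // BEDROCK_API_KEY takes precedence over AWS access-key credentials.
    match std::env::var("BEDROCK_API_KEY") {
        Ok(token) if !token.is_empty() => BedrockAuth::BearerToken(token),
        _ => BedrockAuth::SigV4,
    }
}

fn main() {
    match resolve_bedrock_auth() {
        BedrockAuth::BearerToken(_) => println!("using Authorization: Bearer"),
        BedrockAuth::SigV4 => println!("using SigV4 signing"),
    }
}
```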
2026-03-24 00:08:13 -04:00
Argenis
ea9eccfe8b
feat: make shell tool timeout configurable via config.toml (#4468)
Add `[shell_tool]` section to config schema with `timeout_secs` field
(default 60s). The ShellTool struct now carries the timeout as an
instance field instead of relying on the hardcoded constant, and the
full tool registry reads the value from Config at construction time.

Closes #4331
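
A sketch of what the config plumbing could look like with serde defaults; names beyond [shell_tool]/timeout_secs are assumptions:

```rust
use serde::Deserialize;
use std::time::Duration;

/// Sketch of the `[shell_tool]` section; an absent timeout_secs falls
/// back to the previous hardcoded 60s.
#[derive(Debug, Deserialize)]
pub struct ShellToolConfig {
    #[serde(default = "default_shell_timeout_secs")]
    pub timeout_secs: u64,
}

fn default_shell_timeout_secs() -> u64 {
    60
}

/// The tool carries the timeout as an instance field; the registry
/// reads it from Config at construction time.
pub struct ShellTool {
    timeout: Duration,
}

impl ShellTool {
    pub fn new(cfg: &ShellToolConfig) -> Self {
        Self { timeout: Duration::from_secs(cfg.timeout_secs) }
    }

    pub fn timeout(&self) -> Duration {
        self.timeout
    }
}
```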
2026-03-24 00:07:46 -04:00
Argenis
eb036b4d95
feat(skills): add TEST.sh validation/testing framework (#4476)
Add a `zeroclaw skills test [name] [--verbose]` command that discovers
and executes TEST.sh files inside skill directories. Each line in a
TEST.sh follows the format `command | expected_exit_code | pattern` and
is validated against actual exit code and output (substring or regex).

- Add `Test` variant to `SkillCommands` enum in lib.rs
- Create `src/skills/testing.rs` with test runner, parser, and
  pretty-printed results using the console crate
- Wire the new subcommand into `handle_command()` in skills/mod.rs
- Add example TEST.sh for the browser skill
- Include 14 unit tests covering parsing, pattern matching, execution,
  aggregation, and edge cases

Closes #3697
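
A sketch of a parser for the three-field line format; splitting from the right is an assumption here, chosen so the command itself may contain pipes:

```rust
/// One line of a TEST.sh file: `command | expected_exit_code | pattern`.
#[derive(Debug, PartialEq)]
struct TestLine {
    command: String,
    expected_exit: i32,
    pattern: String,
}

/// Split from the right so only the last two `|` fields are metadata.
fn parse_test_line(line: &str) -> Option<TestLine> {
    let mut parts = line.rsplitn(3, '|');
    let pattern = parts.next()?.trim().to_owned();
    let expected_exit = parts.next()?.trim().parse().ok()?;
    let command = parts.next()?.trim().to_owned();
    Some(TestLine { command, expected_exit, pattern })
}

fn main() {
    let t = parse_test_line("echo hello | 0 | hello").unwrap();
    assert_eq!(t.command, "echo hello");
    assert_eq!(t.expected_exit, 0);
    assert_eq!(t.pattern, "hello");
}
```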
2026-03-23 23:58:52 -04:00
Argenis
1c07d5b411
feat(tools): add built-in ask_user tool for interactive prompts (#4474)
Implement a new ask_user tool that sends a question to a messaging
channel and waits for the user's response with configurable timeout.
Supports optional structured choices and works across all channels
(Telegram, Discord, Slack, CLI, Web) via the late-bound channel map
pattern used by PollTool and ReactionTool.

Closes #3698
2026-03-23 23:58:50 -04:00
Argenis
0fe3834349
feat(channels): migrate inline transcription to TranscriptionManager (#4475)
Migrate Telegram, Discord, Slack, WhatsApp Web, and Matrix channels
from inline TranscriptionConfig/transcribe_audio() calls to the
centralized TranscriptionManager pattern for unified provider selection,
fallback, and global cap enforcement.

Each channel now:
- Stores a transcription_manager: Option<Arc<TranscriptionManager>> field
- Builds the manager in with_transcription() from config
- Calls manager.transcribe() instead of the legacy transcribe_audio() shim
- Matrix replaces the whisper-cpp shell-out with the manager API

Closes #4311
2026-03-23 23:58:05 -04:00
Argenis
33f9d66b54
feat(channel): add transcribe_non_ptt_audio config for WhatsApp STT (#4470)
Allow transcription of forwarded/regular audio messages on WhatsApp,
not just push-to-talk voice notes. Controlled by the new
`transcribe_non_ptt_audio` option in `[transcription]` (default: false).

Closes #4343
2026-03-23 23:43:28 -04:00
Argenis
368f39829f
fix: strip [Used tools: xxx] logs from channel responses (#4400)
The tool-context summary (e.g. `[Used tools: browser_open]`) was
prepended to assistant history entries for all non-Telegram channels so
the LLM could retain awareness of prior tool usage.  This caused the
model to learn and reproduce the bracket format in its own output,
delivering raw log lines to end-users instead of meaningful tool results.

Three changes:

1. Stop prepending `[Used tools: …]` to stored history for all channels.
   The LLM already receives tool context through the tool-call/result
   messages built by `run_tool_call_loop`, making the summary redundant.

2. Strip any `[Used tools: …]` prefix from cached history on reload, so
   existing sessions are cleaned up retroactively.

3. Strip the prefix in `sanitize_channel_response` as a safety net, in
   case the model still echoes it from older context.
2026-03-23 23:34:29 -04:00
Argenis
9376c26018
feat(tools): add Claude Code task runner with Slack progress and SSH handoff (#4466)
Implement ClaudeCodeRunner that spawns Claude Code in a tmux session with
HTTP hooks to POST tool execution events back to ZeroClaw's gateway,
updating a Slack message in-place with progress, plus an SSH handoff link.

Components:
- src/tools/claude_code_runner.rs: session lifecycle (tmux spawn, hook
  config generation, event handling, TTL cleanup)
- src/gateway/api.rs: POST /hooks/claude-code endpoint for hook events
- src/channels/slack.rs: chat.update + chat.postMessage with ts return
- src/config/schema.rs: ClaudeCodeRunnerConfig (enabled, ssh_host,
  tmux_prefix, session_ttl)
- src/tools/mod.rs: register runner tool

Closes #4287
2026-03-23 23:24:58 -04:00
Argenis
08e131d7c6
feat(slack): add /config command with Block Kit UI for model switching (#4464)
Add interactive /config command for Slack that renders Block Kit
static_select dropdowns for provider and model selection. Interactive
block_actions payloads from Socket Mode are translated into synthetic
/models and /model commands so existing runtime command handling
applies the selection to the per-sender RouteSelectionMap.

- Enable runtime model switching for the slack channel
- Add ShowConfig variant to ChannelRuntimeCommand
- Parse /config in parse_runtime_command (gated by supports_runtime_model_switch)
- Build Block Kit JSON with provider/model dropdowns and current selection
- Detect __ZEROCLAW_BLOCK_KIT__ prefix in SlackChannel::send() to post raw blocks
- Handle interactive envelope type in Socket Mode for block_actions events
- Non-Slack channels get a plain-text /config fallback

Closes #4286
2026-03-23 23:24:56 -04:00
Argenis
36db977b35
feat: adopt AGENTS.md as the primary agent instruction format (#4462)
Replace the AGENTS.md symlink with a real file containing all cross-tool
agent instructions. Reduce CLAUDE.md to Claude Code-specific directives
only, with a reference back to AGENTS.md for shared conventions.

Closes #4289
2026-03-23 23:24:45 -04:00
Argenis
92b0ebb61a
fix: include channel-lark in Docker build defaults (#4463)
Resolves #3538. Docker builds only included memory-postgres in ZEROCLAW_CARGO_FEATURES, which meant Lark/Feishu long-connection (WebSocket) support was not compiled into container images.
2026-03-23 23:22:36 -04:00
Argenis
9c312180a2
chore: bump version to 0.6.0 (#4455)
Includes macOS desktop menu bar app, media pipeline channels,
SOP engine improvements, and CLI tool integrations.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 21:16:46 -04:00
Argenis
a433c37c53
feat: add macOS desktop menu bar app (Tauri) (#4454)
* feat(tauri): add desktop app scaffolding with Tauri commands

Add Tauri v2 workspace member under apps/tauri with gateway client,
shared state, and Tauri command handlers for status, health, channels,
pairing, and agent messaging. The desktop app communicates with the
ZeroClaw gateway over HTTP and is buildable independently via
`cargo check -p zeroclaw-desktop`.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(tauri): add system tray icon, menu, and events

Add tray module with menu construction (Show Dashboard, Status, Open
Gateway, Quit) and event handlers. Wire tray setup into the Tauri
builder's setup hook. Add icons/ placeholder directory.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(tauri): add mobile entry, platform capabilities, icons, and tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(tauri): add auto-pairing, gradient icons, and WebView integration

Desktop app now auto-pairs with the gateway on startup using localhost
admin endpoints, injects the token into the WebView localStorage, and
loads the dashboard from the gateway URL. Adds blue gradient Z tray
icons (idle/working/error/disconnected states), proper .icns app icon,
health polling with tray updates, and Tauri-aware API/WS/SSE paths in
the web frontend.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(tauri): add dock icon, keep-alive for tray mode, and install preflight

- Set macOS dock icon programmatically via NSApplication API so it shows
  the blue gradient Z even in dev builds without a .app bundle
- Add RunEvent::ExitRequested handler to keep app alive as a menu bar app
  when all windows are closed
- Add desktop app preflight checks to install.sh (Rust, Xcode CLT,
  cargo-tauri, Node.js) with automatic build on macOS

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-23 20:55:22 -04:00
Argenis
af8e805016
fix(tools): correct CLI args for codex and gemini harness tools (#4401)
* feat(tools): add Codex CLI and Gemini CLI harness tools

Add two new ACP harness tools following the same pattern as ClaudeCodeTool:

- CodexCliTool: delegates to `codex -q` for OpenAI Codex CLI
- GeminiCliTool: delegates to `gemini -p` for Google Gemini CLI

Both tools share the same security model (rate limiting, act-policy
enforcement, workspace containment, env sanitisation, kill-on-drop
timeout) and are config-gated via `[codex_cli]` / `[gemini_cli]`
sections with `enabled = true`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(tools): use 'codex exec' instead of invalid '-q' flag

* feat(tools): add OpenCode CLI harness tool

Add opencode_cli tool following the same pattern as codex_cli and
gemini_cli. Uses `opencode run "<prompt>"` for non-interactive
delegation. Includes config struct, factory registration, env
sanitization, timeout, and tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(tools): correct CLI invocation args for codex and gemini tools

- Fix codex_cli to use `codex -q` (quiet mode) instead of incorrect
  `codex exec` subcommand, matching the documented behavior and the
  actual @openai/codex CLI interface.
- Fix gemini_cli install instruction to reference `@google/gemini-cli`
  instead of the incorrect `@anthropic-ai/gemini-cli` package name.

---------

Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: rareba <rareba@users.noreply.github.com>
2026-03-23 17:42:07 -04:00
Argenis
9f127f896d
fix(tools): validate task_id to prevent path traversal in delegate tool (#4405)
* feat(tools): add background and parallel execution to delegate tool

Add three new execution modes to the delegate tool:

- background: when true, spawns the sub-agent in a background tokio task
  and returns a task_id immediately. Results are persisted to
  workspace/delegate_results/{task_id}.json.

- parallel: accepts an array of agent names, runs them all concurrently
  with the same prompt, and returns all results when complete.

- action parameter with check_result/list_results/cancel_task support
  for managing background task lifecycle.

Cascade control: background sub-agents use child CancellationToken
derived from the parent, enabling cancel_all_background_tasks() to
abort all running background agents when the parent session ends.

Existing synchronous delegation flow is fully preserved (opt-in only).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(tools): validate task_id as UUID to prevent path traversal in delegate tool

The check_result and cancel_task actions accept a user-provided task_id
that is used directly in filesystem path construction. A malicious
task_id like "../../etc/passwd" could read or overwrite arbitrary files.

Since task_ids are always generated as UUIDs internally, this adds UUID
format validation before any filesystem operations, rejecting invalid
task_id values with a clear error message.

Also updates existing tests to use valid UUID-format task_ids and adds
dedicated path traversal rejection tests.
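
A sketch of the validation using the uuid crate; re-rendering the parsed UUID guarantees the path component is canonical:

```rust
use std::path::{Path, PathBuf};
use uuid::Uuid;

/// Validate a user-supplied task_id before building a filesystem path
/// from it. Task ids are generated as UUIDs internally, so anything
/// that does not parse (e.g. "../../etc/passwd") is rejected.
fn result_path(results_dir: &Path, task_id: &str) -> Result<PathBuf, String> {
    let id = Uuid::parse_str(task_id)
        .map_err(|_| format!("invalid task_id '{task_id}': expected a UUID"))?;
    // Join the re-rendered UUID, not the raw input, so the component
    // cannot contain separators or parent-directory segments.
    Ok(results_dir.join(format!("{id}.json")))
}
```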

* fix: add missing attachments field to wati ChannelMessage after media pipeline merge

* fix(channels): add missing attachments field to voice_wake and lark

---------

Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 16:57:24 -04:00
Argenis
b98971c635
fix(sop): fix state file leak and add deterministic execution tests (#4404)
* feat(sop): add deterministic execution mode with typed steps and approval checkpoints

Add opt-in deterministic execution to the SOP workflow engine, inspired
by OpenClaw's Lobster engine. Deterministic mode bypasses LLM round-trips
for step transitions, executing steps sequentially with piped outputs.

Key additions:
- SopExecutionMode::Deterministic variant and `deterministic: true` SOP.toml flag
- SopStepKind enum (Execute/Checkpoint) for marking approval pause points
- StepSchema for typed input/output validation (JSON Schema fragments)
- DeterministicRunState for persisting/resuming interrupted workflows
- DeterministicSavings for tracking LLM calls saved
- SopRunAction::DeterministicStep and CheckpointWait action variants
- SopRunStatus::PausedCheckpoint status
- Engine methods: start_deterministic_run, advance_deterministic_step,
  resume_deterministic_run, persist/load_deterministic_state
- SopConfig in config/schema.rs with sops_dir, default_execution_mode,
  max_concurrent_total, approval_timeout_secs, max_finished_runs
- Wire `pub mod sop` in lib.rs (previously dead/uncompiled module)
- Fix pre-existing test issues: TempDir import, async test annotations

All 86 SOP core tests pass (engine: 42, mod: 17, dispatch: 13, types: 14).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(sop): resolve clippy warnings and fix metrics test failures

- Add `ampersona-gates = []` feature declaration to Cargo.toml to fix
  clippy `unexpected cfg condition value` errors in sop/mod.rs,
  sop/audit.rs, and sop/metrics.rs.
- Use dynamic Utc::now() timestamps in metrics test helper `make_run()`
  instead of hardcoded 2026-02-19 dates, which had drifted outside the
  7-day/30-day windowed metric windows causing 7 test failures.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(sop): remove non-functional ampersona-gates feature flag

The ampersona-gates feature flag referenced ampersona_core and
ampersona_engine crates that do not exist, causing cargo check
--all-features to fail. Remove the feature flag and all gated code:

- Remove ampersona-gates from Cargo.toml [features]
- Delete src/sop/gates.rs (entire module behind cfg gate)
- Remove gated methods from audit.rs (log_gate_decision, log_phase_state)
- Remove gated MetricsProvider impl and tests from metrics.rs
- Simplify sop_status.rs gate_eval field and append_gate_status
- Update observability docs (EN + zh-CN)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(sop): fix formatting in metrics.rs

* fix(sop): wire SOP tools into tool registry

The five SOP tools (sop_list, sop_advance, sop_execute, sop_approve,
sop_status) existed as source files but were never registered in
all_tools_with_runtime. They are now conditionally registered when
sop.sops_dir is configured.

Also fixes:
- Add mod sop + SopCommands re-export to main.rs (binary crate)
- Handle new DeterministicStep/CheckpointWait variants in match arms
- Add missing struct fields (deterministic, kind, schema, llm_calls_saved)
  to test constructors

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(sop): fix state file leak and add deterministic execution tests

- Fix persist_deterministic_state fallback: use system temp dir instead
  of current working directory when SOP location is unset, preventing
  state files from polluting the working directory.
- Add comprehensive test coverage for deterministic execution path:
  start, advance, checkpoint pause, completion with savings tracking,
  and rejection of non-deterministic SOPs.
- Add tests for deterministic flag in TOML manifest loading and
  checkpoint kind parsing from SOP.md.
- Add serde roundtrip tests for DeterministicRunState, SopStepKind,
  SopExecutionMode::Deterministic, and SopRunStatus::PausedCheckpoint.

* ci: retrigger CI

* fix: add missing attachments field to wati ChannelMessage after media pipeline merge

* fix(channels): add missing attachments field to voice_wake and lark

---------

Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: rareba <rareba@users.noreply.github.com>
2026-03-23 16:40:26 -04:00
Argenis
2ee0229740
fix(gateway): improve WebSocket chat error handling and diagnostics (#4407)
- Add error codes to WS error messages (AGENT_INIT_FAILED, AUTH_ERROR, PROVIDER_ERROR, INVALID_JSON, UNKNOWN_MESSAGE_TYPE, EMPTY_CONTENT)
- Send Close frame with code 1011 on agent init failure
- Add tracing logs for agent init and turn errors
- Increase MAX_API_ERROR_CHARS from 200 to 500
- Frontend: handle abnormal close codes and show config error hints

Fixes #3681
Supersedes #4366
Original work by mark-linyb
2026-03-23 15:30:18 -04:00
Argenis
0d2b57ee2e
fix(channels): ensure newline between narration and draft status lines (#4394)
Narration text from native tool-call providers that doesn't end with a newline now gets one appended before dispatch to the draft updater. Prevents garbled output in Telegram drafts.

Closes #4348
2026-03-23 15:30:15 -04:00
Argenis
b85a445955
fix(channels): prevent draft streaming hang after tool loop completion (#4393)
Drop delta_tx before awaiting draft_updater so the mpsc channel closes and the updater task can terminate. Without this, draft-capable channels (e.g. Telegram) hang indefinitely after the tool loop completes.

Closes #4300
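
The underlying mpsc rule: a receiver only observes closure once every sender is dropped. A runnable sketch of the hang and the fix, with illustrative names:

```rust
use tokio::sync::mpsc;

#[tokio::main]
async fn main() {
    let (delta_tx, mut delta_rx) = mpsc::channel::<String>(32);

    let draft_updater = tokio::spawn(async move {
        // recv() returns None only after all senders are gone.
        while let Some(chunk) = delta_rx.recv().await {
            println!("draft += {chunk}");
        }
    });

    delta_tx.send("hello".into()).await.unwrap();

    // The fix: drop the sender *before* awaiting the updater. Holding
    // delta_tx across this await keeps the channel open, so the join
    // below would never complete.
    drop(delta_tx);
    draft_updater.await.unwrap();
}
```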
2026-03-23 15:30:12 -04:00
Argenis
dbd8c77519
feat(channels): add automatic media understanding pipeline (#4402)
* feat(channels): add automatic media understanding pipeline for inbound messages

Add MediaPipeline that pre-processes inbound channel message attachments
before the agent sees them:

- Audio: transcribed via existing transcription infrastructure, annotated
  as [Audio transcription: ...]
- Images: annotated with [Image: <file> attached] (vision-aware)
- Video: annotated with [Video: <file> attached] (placeholder for future API)

The pipeline is opt-in via [media_pipeline] config section (default: disabled).
Individual media types can be toggled independently.

Changes:
- New src/channels/media_pipeline.rs with MediaPipeline struct and tests
- New MediaPipelineConfig in config/schema.rs
- Added attachments field to ChannelMessage for media pass-through
- Wired pipeline into process_channel_message after hooks, before agent

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): add attachments field to integration test fixtures

Add missing `attachments: vec![]` to all ChannelMessage struct literals
in channel_matrix.rs and channel_routing.rs after the new attachments
field was added to the struct in traits.rs.

Also fix schema.rs test compilation: make TempDir import unconditional
and add explicit type annotations on tokio::fs calls to resolve type
inference errors in the bootstrap file tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): add missing attachments field to gmail_push and discord_history constructors

These channels were added to master after the media pipeline PR was
originally branched. The ChannelMessage struct now requires an
attachments field, so initialise it to an empty Vec for channels
that do not yet extract attachments.

---------

Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 15:14:11 -04:00
Argenis
34db67428f
fix(gateway): stream ws agent chat responses (#4390)
Add Agent::turn_streamed() that forwards TurnEvent (Chunk, ToolCall,
ToolResult) through an mpsc channel during execution. The WebSocket
gateway uses tokio::join! to relay these events to the client in real
time instead of waiting for the full turn to complete.

Introduce chunk_reset message type so the frontend clears its draft
buffer before the authoritative done message arrives.  Update the React
AgentChat page to render streamed text live in the typing indicator
area, replacing the bounce-dot animation when content is available.

Backward-compatible: the done message still carries the full_response
field unchanged, and providers that do not support streaming fall back
to the non-streaming chat path transparently.

Closes #4372
2026-03-23 15:01:04 -04:00
Argenis
01d0c6b23a
fix(config): add cost tracking configuration to template and example files (#4387)
Add the [cost] section to dev/config.template.toml and
examples/config.example.toml documenting all CostConfig options
(enabled, daily_limit_usd, monthly_limit_usd, warn_at_percent,
allow_override) with their defaults and commented-out model pricing
examples.

Fixes #4373
2026-03-23 15:01:01 -04:00
Argenis
79f0a5ae30
fix(matrix): handle unused Result from backups().disable() (#4386)
Add `let _ =` prefix to explicitly discard the Result from
`client.encryption().backups().disable().await`, suppressing the
`unused_must_use` compiler warning.

Closes #4374
2026-03-23 15:00:58 -04:00
Argenis
5bdeeba213
feat(wati): add audio and voice message transcription (#4391)
* fix(transcription): remove deployment-specific WireGuard references from doc comments

LocalWhisperProvider and LocalWhisperConfig doc comments referenced
WireGuard and specific internal infrastructure. Deployment topology is
operator choice — replace with neutral examples.

* feat(wati): add audio and voice message transcription

* fix(wati): SSRF host validation, early fromMe check, download size cap

Validate that media_url host matches the configured api_url host before
fetching, preventing SSRF with credential leakage.  Move the fromMe
check before the HTTP download to avoid wasting bandwidth on outgoing
messages.  Add MAX_WATI_AUDIO_BYTES (25 MiB) Content-Length pre-check.
Skip empty transcripts in parse_audio_as_message.  Short-circuit
with_transcription() when config.enabled is false.

* fix(wati): remove dead field, enforce streaming size cap, log errors

- Remove unused transcription config field (clippy dead code)
- Use streaming download to enforce size cap without Content-Length
- Log network errors instead of silently swallowing with .ok()?

---------

Co-authored-by: Nim G <theredspoon@users.noreply.github.com>
2026-03-23 14:58:21 -04:00
Argenis
b5447175ff
feat(mattermost): add audio transcription via TranscriptionManager (#4389)
* fix(transcription): remove deployment-specific WireGuard references from doc comments

LocalWhisperProvider and LocalWhisperConfig doc comments referenced
WireGuard and specific internal infrastructure. Deployment topology is
operator choice — replace with neutral examples.

* feat(mattermost): add audio transcription via TranscriptionManager

Add transcription support for Mattermost audio attachments. Routes
audio through TranscriptionManager when configured, with duration
limit enforcement and wiremock-based integration tests.

* fix(mattermost): add download size cap, HTTP status check, warn logging

Replace chained .ok()? calls in try_transcribe_audio_attachment with
explicit error handling that logs warnings on HTTP failure, non-success
status codes, and oversized files.  Add MAX_MATTERMOST_AUDIO_BYTES
(25 MiB) Content-Length pre-check.  Remove mp4 and webm from the
extension-only fallback in is_audio_file().  Short-circuit
with_transcription() when config.enabled is false.

* fix(mattermost): add [Voice] prefix, filter empty, fix config init

- Add [Voice] prefix to transcribed audio matching Slack/Discord
- Filter empty transcription results
- Only store transcription config on successful manager init
- Update test expectation for [Voice] prefix

---------

Co-authored-by: Nim G <theredspoon@users.noreply.github.com>
2026-03-23 14:58:18 -04:00
Argenis
0dc05771ba
feat(lark): add audio message transcription (#4388)
* fix(transcription): remove deployment-specific WireGuard references from doc comments

LocalWhisperProvider and LocalWhisperConfig doc comments referenced
WireGuard and specific internal infrastructure. Deployment topology is
operator choice — replace with neutral examples.

* feat(lark): add audio message transcription

* test(channels): add three missing lark audio transcription tests

- lark_audio_skips_when_manager_none: parse_event_payload_async returns
  empty when transcription_manager is None
- lark_audio_routes_through_transcription_manager: wiremock end-to-end
  test proving file_key → download → whisper → ChannelMessage chain
- lark_audio_token_refresh_on_invalid_token_response: wiremock test
  verifying 401 triggers token invalidation and retry

Adds a #[cfg(test)] api_base_override field to LarkChannel so wiremock
can intercept requests that normally go to hardcoded Lark/Feishu API
base URLs.

* fix(lark): add audio download size cap, event_type guard, skip disabled transcription

Add MAX_LARK_AUDIO_BYTES (25 MiB) Content-Length pre-check before
reading audio response bodies on both the normal and token-refresh
paths.  Add the missing im.message.receive_v1 event_type guard in
parse_event_payload_async so non-message callbacks are rejected before
the audio branch.  Short-circuit with_transcription() when
config.enabled is false.

* fix(lark): address review feedback for audio transcription

- Wire parse_event_payload_async in webhook handler (was dead code)
- Use streaming download with enforced size cap (no Content-Length bypass)
- Fix test: lark_manager_some_when_valid_config asserted is_some with enabled=false
- Fix test: add missing header to lark_audio_skips_when_manager_none payload
- Remove redundant group-mention check in audio arm (shared check covers it)

---------

Co-authored-by: Nim G <theredspoon@users.noreply.github.com>
2026-03-23 14:58:11 -04:00
Argenis
10f9ea3454
fix(security): update blocked_commands_basic test after #4338 (#4399)
* fix(security): update blocked_commands_basic test after allowing python/node

After adding python3 and node to default_allowed_commands in #4338,
the test asserting they are blocked is now wrong. Replace with ruby
and perl which remain correctly blocked.

* fix(security): also fix python3 reference in pipe validation test

The command_with_pipes_validates_all_segments test also referenced
python3 in a blocked assertion. Replace with ruby.
2026-03-23 14:16:42 -04:00
linyibin
3ec532bc29
fix(config): add cost tracking configuration section to template (#4368)
Add [cost] section to dev/config.template.toml with all configuration options and commented per-model pricing examples. Fixes #3679.
2026-03-23 14:14:59 -04:00
Nisit Sirimarnkit
f1688c5910
fix(agent): use Gregorian datetime and prioritize date context in prompts (#4369)
Force explicit Gregorian year/month/day via Datelike/Timelike traits instead of chrono format(), which inherits the system locale (e.g. Buddhist calendar on Thai systems). Prepend datetime before memory context in user messages. Rename the DateTimeSection header to CRITICAL CONTEXT.
2026-03-23 14:14:56 -04:00
Argenis
fd9f140268
feat(tools): add cross-channel poll creation tool (#4396)
* feat(tools): add cross-channel poll creation tool

Adds a poll tool that enables cross-channel poll creation with voting
support. Changes all_tools_with_runtime return type from 3-tuple to
4-tuple to accommodate the new reaction handle.

Original PR #4243 by rareba.

* ci: retrigger CI
2026-03-23 13:58:01 -04:00
Argenis
b2dccf86eb
feat(tools): add Firecrawl fallback for JS-heavy sites (#4395)
* feat(tools): add Firecrawl fallback for JS-heavy and bot-blocked sites

Adds optional Firecrawl API integration as a fallback when standard web
fetching fails due to JavaScript-heavy pages or bot protection. Includes
configurable API key, timeout, and domain allowlist/blocklist.

Original PR #4244 by rareba.

* ci: retrigger CI
2026-03-23 13:57:56 -04:00
Argenis
f0c106a938
fix(update): diagnose arch mismatch in validate_binary before execution (#4379)
When a downloaded binary has the wrong architecture, the previous code
attempted to execute it and surfaced a raw "Exec format error (os error 8)".
Now validate_binary reads the ELF/Mach-O header first, compares the binary
architecture against the host, and reports a clear diagnostic like:
"architecture mismatch: downloaded binary is aarch64 but this host is x86_64".

Closes #4291
2026-03-23 13:36:44 -04:00
Keith_void
e276e66c05
fix(config): resolve macOS test failures from path canonicalization (#4362)
- Prefer HOME env var in default_config_dir() before falling back to UserDirs::new()
- Pre-create temp_default_dir in test so is_temp_directory() can canonicalize /var -> /private/var symlink on macOS
2026-03-23 13:33:19 -04:00
Giulio V
2300f21315
feat(channels): extend /new session reset to all channels (#4361)
Move supports_runtime_model_switch() guard from parse_runtime_command() entry to /model and /models match arms only, so /new is available on every channel.

Closes #4236
2026-03-23 13:33:06 -04:00
Giulio V
2575edb1d2
feat(tools): add LLM task tool for structured JSON-only sub-calls (#4241)
* feat(tools): add LLM task tool for structured JSON-only sub-calls

Add LlmTaskTool — a lightweight tool that runs a single prompt through
an LLM provider with no tool access and optionally validates the
response against a caller-supplied JSON Schema. Ideal for structured
data extraction, classification, and transformation in workflows.

- Parameters: prompt (required), schema (optional JSON Schema),
  model (optional override), temperature (optional override)
- Uses configured default provider/model from root config
- Validates response JSON against schema (required fields, type checks)
- Strips markdown code fences from LLM responses before validation
- Gated by ToolOperation::Act security policy
- Registered in all_tools_with_runtime (always available)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: use non-constant value instead of approximate PI in tests

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: rareba <rareba@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 12:50:22 -04:00
Giulio V
d31f2c2d97
feat(agent): add loop detection guardrail for repetitive tool calls (#4240)
* feat(agent): add loop detection guardrail for repetitive tool calls

Introduces a LoopDetector that monitors a sliding window of recent tool
calls and detects three repetitive patterns:

1. Exact repeat — same tool+args called consecutively (default 3+)
2. Ping-pong — two tools alternating for 4+ cycles
3. No progress — same tool with different args but identical results (5+)

Each pattern escalates through Warning -> Block -> CircuitBreaker.
Configurable via [pacing] section: loop_detection_enabled (default true),
loop_detection_window_size (default 20), loop_detection_max_repeats
(default 3).

Wired into run_tool_call_loop alongside the existing time-gated
identical-output detector. Respects loop_ignore_tools exclusion list.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(agent): fix channel test interaction with loop detector

The max_tool_iterations channel tests use an IterativeToolProvider that
intentionally repeats identical tool calls. The loop detector (enabled by
default) fires its circuit breaker before max_tool_iterations is reached,
causing the test to fail. Disable loop detection in these two tests so
they exercise only the max_tool_iterations boundary.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(agent): address PR #4240 review — loop detector correctness and test precision

Critical fixes:
- Fix result_index/tool_calls misalignment: use enumerate() before
  filter_map() so the index stays aligned with tool_calls even when
  ordered_results contains None entries from skipped tool calls.
- Fix hash_value JSON key-order sensitivity: canonicalise() recursively
  sorts object keys before serialisation so {"a":1,"b":2} and
  {"b":2,"a":1} hash identically.

Tightened test assertions:
- ping_pong_escalates_with_more_cycles: assert Block with 5 cycles
  (was loose Warning|Block|Break match).
- no_progress_escalates_to_block_and_break: assert Break at 7 calls
  (was loose Block|Break match).
- no_progress_not_triggered_when_all_args_identical: assert Warning
  specifically (was accepting Ok as alternative).

New tests:
- ping_pong_detects_alternation_with_varying_args (item 3)
- window_eviction_prevents_stale_pattern_detection (item 4)
- hash_value_is_key_order_independent + nested variant (item 2)
- pacing_config_serde_defaults_match_manual_default (item 5)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: rareba <rareba@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-23 12:49:47 -04:00
Argenis
9d7f6c5aaf
fix(skills): surface actionable warning when skills are skipped due to script policy (#4383)
When a skill directory contains script files and `skills.allow_scripts`
is false (the default), the skill was silently skipped with only a
tracing::warn that most users never see. The LLM then reports "tool is
not available in this environment" with no guidance on how to fix it.

Now the loader emits a user-visible stderr warning that names the
skipped skill and suggests setting `skills.allow_scripts = true`.

Closes #4292
2026-03-23 12:47:32 -04:00
Argenis
f9081fcfa7
fix(update): use exact target triples in find_asset_url to prevent wrong binary selection (#4377)
The previous substring match (e.g. "aarch64-unknown-linux") could match
both gnu and android release assets. Switch to full triples like
"aarch64-unknown-linux-gnu" so find_asset_url only selects the correct
platform binary.

Closes #4293
2026-03-23 12:41:17 -04:00
Argenis
a4d95dec0e
fix(cost): enable cost tracking by default (#3679) (#4382)
Cost tracking was disabled by default (enabled: false), causing the
/api/cost endpoint to always return zeros unless users explicitly
opted in. Change the default to enabled: true so cost tracking works
out of the box for all models and channels.
2026-03-23 12:38:35 -04:00
Argenis
31508b8ec7
fix(build): disable prometheus on 32-bit ARM targets in installer (#3677) (#4384)
The prometheus crate requires AtomicU64 which is unavailable on 32-bit
ARM (armv7l/armv6l). Detect 32-bit ARM in install.sh and build with
--no-default-features, re-enabling only channel-nostr and
skill-creation (excluding observability-prometheus).
2026-03-23 12:35:09 -04:00
Argenis
3d9069552c
fix(gateway): send error responses for unrecognized WebSocket message types (#3681) (#4381)
Previously, the /ws/chat handler silently ignored messages with
unrecognized types, leaving clients waiting for a response that never
comes. Now sends explicit error messages describing the expected format.
2026-03-23 12:33:23 -04:00
Argenis
3b074041bf
fix(ollama): strip /api/chat suffix from user-provided base URL (#4376)
When users configure the Ollama api_url with the full endpoint path
(e.g. http://host:11434/api/chat), normalize_base_url only stripped
/api but not /api/chat, causing the final request URL to become
/api/chat/api/chat which fails on remote servers.

Closes #4342
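
A sketch of the suffix normalization; checking the longer suffix first is the essential ordering:

```rust
/// Strip a trailing endpoint path from a user-provided base URL so the
/// provider does not append /api/chat twice. The previous code
/// stripped /api but not the full /api/chat suffix.
fn normalize_base_url(url: &str) -> String {
    let trimmed = url.trim_end_matches('/');
    // Longer suffix first, so "/api/chat" is not left as "/chat".
    for suffix in ["/api/chat", "/api"] {
        if let Some(stripped) = trimmed.strip_suffix(suffix) {
            return stripped.to_owned();
        }
    }
    trimmed.to_owned()
}

fn main() {
    assert_eq!(normalize_base_url("http://host:11434/api/chat"), "http://host:11434");
    assert_eq!(normalize_base_url("http://host:11434/"), "http://host:11434");
}
```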
2026-03-23 12:19:26 -04:00
Argenis
9a95318b85
fix(security): add python, python3, pip, node to default allowed commands (#4375)
The default_allowed_commands() list was missing common scripting
runtimes. After #3691 added serde defaults to AutonomyConfig, users
who omitted allowed_commands from their config silently fell back to
the restrictive default list, breaking shell tool access to Python.

Closes #4338
2026-03-23 12:19:23 -04:00
Argenis
ccd572b827
fix(matrix): handle unused Result from backups().disable() (#4374)
Add let _ = prefix to suppress unused Result warning on
client.encryption().backups().disable().await call.

Closes #4339
2026-03-23 12:19:10 -04:00
Argenis
41dd23175f
chore(ci): unify release pipeline for full auto-sync (#4283)
- Add tag-push trigger to Release Stable workflow so `git push origin v0.5.9`
  auto-triggers the full pipeline (builds, Docker, crates.io, website,
  Scoop, AUR, Homebrew, tweet) in one shot
- Add Homebrew Core as downstream job (was manual-only, never auto-triggered)
- Add `workflow_call` to pub-homebrew-core.yml so it can be called from
  the stable release workflow
- Skip beta releases on version bump commits (prevents beta/stable race)
- Skip auto crates.io publish when stable tag exists (prevents double-publish)
- Auto-create git tag on manual dispatch so tag always exists for downstream
- Fix cut_release_tag.sh to reference correct workflow name
2026-03-22 20:56:47 -04:00
Argenis
864d754b56
chore: bump version to 0.5.9 (#4282)
Release 0.5.9 includes:
- feat: enable internet access by default (#4270)
- feat: add browser automation skill and VNC setup scripts (#4281)
- feat: add voice message transcription support (#4278)
- feat: add image and file support for Feishu/Lark channel (#4280)
- feat: declarative cron job configuration (#4045)
- feat: add SearXNG search provider support (#4272)
- feat: register skill tools as callable tool specs (#4040)
- feat: named sessions with reconnect and validation (#4275)
- feat: restore time-decay scoring for memory (#4274)
- fix: prevent thinking level prefix leak across turns (#4277)
- fix: link enricher title extraction byte offset bug (#4271)
- fix: WhatsApp Web delivery channel with backend validation (#4273)
2026-03-22 20:23:55 -04:00
Argenis
ccd52f3394
feat: add browser automation skill and VNC setup scripts (#4281)
* feat: add browser automation skill and VNC setup scripts

- Add browser skill template for agent-browser CLI integration
- Add VNC setup scripts for GUI browser access (Xvfb, x11vnc, noVNC)
- Add comprehensive browser setup documentation
- Enables headless browser automation for AI agents

Tested on: Ubuntu 24.04, ZeroClaw 0.5.7, agent-browser 0.21.4

Co-authored-by: OpenClaw Assistant

* fix(docs): fix markdown lint errors and update browser config docs

- SKILL.md: add blank lines around headings (MD022)
- browser-setup.md: wrap bare URLs in angle brackets (MD034)
- browser-setup.md: rename duplicate "Access" heading (MD024)
- Update config examples to reflect browser enabled by default
- Add examples for restricting/disabling browser via config

---------

Co-authored-by: Argenis <theonlyhennygod@users.noreply.github.com>
2026-03-22 19:35:20 -04:00
Argenis
eb01aa451d
feat: add voice message transcription support (#4278)
Closes #4231. Adds voice memo detection and transcription for Slack and Discord channels. Audio files are downloaded, transcribed via the existing transcription module, and passed as text to the LLM.
2026-03-22 19:18:07 -04:00
Argenis
c785b45f2d
feat: add image and file support for Feishu/Lark channel (#4280)
Closes #4235. Adds image and file message type handling in the Lark channel - downloads images/files via Lark API, detects MIME types, and passes content to the model for analysis.
2026-03-22 19:14:43 -04:00
Argenis
ffb8b81f90
fix(agent): prevent thinking level prefix from leaking across turns (#4277)
* feat(agent): add thinking/reasoning level control per message

Users can set reasoning depth via /think:high etc. with resolution
hierarchy (inline > session > config > default). 6 levels from Off
to Max. Adjusts temperature and system prompt.

* fix(agent): prevent thinking level prefix from leaking across interactive turns

system_prompt was mutated in place for the first message's thinking
directive, then used as the "baseline" for restoration after each
interactive turn. This caused the first turn's thinking prefix to
persist across all subsequent turns.

Fix: save the original system_prompt before any thinking modifications
and restore from that saved copy between turns.
2026-03-22 19:09:12 -04:00
Argenis
65f856d710
fix(channels): link enricher title extraction byte offset bug (#4271)
* feat(channels): add automatic link understanding for inbound messages

Auto-detects URLs in inbound messages, fetches content, extracts
title+summary, and enriches the message before the agent sees it.
Includes SSRF protection (rejects private IPs).

* fix(channels): use lowercased string for title extraction to prevent byte offset mismatch

extract_title used byte offsets from the lowercased HTML to index into
the original HTML. Multi-byte characters whose lowercase form has a
different byte length (e.g. İ → i̇) would produce wrong slices or panics.
Fix: extract from the lowercased string directly. Add multibyte test.
2026-03-22 19:09:09 -04:00
Argenis
1682620377
feat(tools): enable internet access by default (#4270)
* feat(tools): enable internet access by default

Enable web_fetch, web_search, http_request, and browser tools by
default so ZeroClaw has internet access out of the box. Security
remains fully toggleable via config (set enabled = false to disable).

- web_fetch: enabled with allowed_domains = ["*"]
- web_search: enabled with DuckDuckGo (free, no API key)
- http_request: enabled with allowed_domains = ["*"]
- browser: enabled with allowed_domains = ["*"], agent_browser backend
- text_browser: remains opt-in (requires external binary)

* fix(tests): update component test for browser enabled by default

Update config_nested_optional_sections_default_when_absent to expect
browser.enabled = true, matching the new default.
2026-03-22 19:07:12 -04:00
Argenis
aa455ae89b
feat: declarative cron job configuration (#4045)
Add support for defining cron jobs directly in the TOML config file via
`[[cron.jobs]]` array entries. Declarative jobs are synced to the SQLite
database at scheduler startup with upsert semantics:

- New declarative jobs are inserted
- Existing declarative jobs are updated to match config
- Stale declarative jobs (removed from config) are deleted
- Imperative jobs (created via CLI/API) are never modified

Each declarative job requires a stable `id` for merge tracking. A new
`source` column (`"imperative"` or `"declarative"`) distinguishes the
two creation paths. Shell jobs require `command`, agent jobs require
`prompt`, validated before any DB writes.
2026-03-22 19:03:00 -04:00
Argenis
a9ffd38912
feat(memory): restore time-decay scoring lost in main→master migration (#4274)
Apply exponential time decay (2^(-age/half_life), 7-day half-life) to
memory entry scores post-recall. Core memories are exempt (evergreen).
Consolidate duplicate half-life constants into a single public constant
in the decay module.

Based on PR #4266 by 5queezer with constant consolidation fix.
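
The scoring rule as a runnable sketch; constant and function names are illustrative:

```rust
/// Exponential time decay: score * 2^(-age / half_life), applied to
/// recall scores after retrieval. Core memories skip decay (evergreen).
pub const DECAY_HALF_LIFE_DAYS: f64 = 7.0;

fn decayed_score(score: f64, age_days: f64, is_core: bool) -> f64 {
    if is_core {
        return score;
    }
    score * 2f64.powf(-age_days / DECAY_HALF_LIFE_DAYS)
}

fn main() {
    // One half-life (7 days) halves the score; two quarter it.
    assert!((decayed_score(1.0, 7.0, false) - 0.5).abs() < 1e-12);
    assert!((decayed_score(1.0, 14.0, false) - 0.25).abs() < 1e-12);
    assert_eq!(decayed_score(1.0, 30.0, true), 1.0);
}
```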
2026-03-22 19:01:40 -04:00
Argenis
86a0584513
feat: add SearXNG search provider support (#4272)
Closes #4152. Adds SearXNG as a search provider option with JSON output support, configurable instance URL, env override support, and 7 new tests.
2026-03-22 19:01:35 -04:00
Argenis
abef4c5719
fix(cron): add WhatsApp Web delivery channel with backend validation (#4273)
Apply PR #4258 changes to add whatsapp/whatsapp-web/whatsapp_web match
arm in deliver_announcement, feature-gated behind whatsapp-web.

Added is_web_config() guard to bail early when the WhatsApp config is
for Cloud API mode (no session_path), preventing a confusing runtime
failure with an empty session path.
2026-03-22 18:58:26 -04:00
Argenis
483b2336c4
feat(gateway): add named sessions with reconnect and validation fixes (#4275)
* fix(cron): add WhatsApp Web delivery channel with backend validation

Apply PR #4258 changes to add whatsapp/whatsapp-web/whatsapp_web match
arm in deliver_announcement, feature-gated behind whatsapp-web.

Added is_web_config() guard to bail early when the WhatsApp config is
for Cloud API mode (no session_path), preventing a confusing runtime
failure with an empty session path.

* feat(gateway): add named sessions with human-readable labels

Apply PR #4267 changes with bug fixes:
- Add get_session_name trait method so WS session_start includes the
  stored name on reconnect (not just when ?name= query param is present)
- Rename API now returns 404 for non-existent sessions instead of
  silently succeeding
- Empty ?name= query param on WS connect no longer clears existing name
2026-03-22 18:58:15 -04:00
Argenis
14cda3bc9a
feat: register skill tools as callable tool specs (#4040)
Skill tools defined in [[tools]] sections are now registered as first-class
callable tool specs via the Tool trait, rather than only appearing as XML
in the system prompt. This enables the LLM to invoke skill tools through
native function calling.

- Add SkillShellTool for shell/script kind skill tools
- Add SkillHttpTool for http kind skill tools
- Add skills_to_tools() conversion and register_skill_tools() wiring
- Wire registration into both CLI and process_message agent paths
- Update prompt rendering to mark registered tools as callable
- Update affected tests across skills, agent/prompt, and channels
2026-03-22 18:51:24 -04:00
Argenis
6e8f0fa43c
docs: add ADR for tool shared state ownership contract (#4057)
Define the contract for long-lived shared state in multi-client tool
execution, covering ownership (handle pattern), identity assignment
(daemon-provided ClientId), lifecycle (validation at registration),
isolation (per-client for security state), and reload semantics
(config hash invalidation).
2026-03-22 18:40:34 -04:00
argenis de la rosa
a965b129f8
chore: bump version to 0.5.8
Release trigger bump after recent fixes.
2026-03-22 16:29:45 -04:00
Argenis
c135de41b7
feat(matrix): add allowed_rooms config for room-level gating (#4230) (#4260)
Add an `allowed_rooms` field to MatrixConfig that controls which rooms
the bot will accept messages from and join invites for. When the list
is non-empty, messages from unlisted rooms are silently dropped and
room invites are auto-rejected. When empty (default), all rooms are
allowed, preserving backward compatibility.

- Config: add `allowed_rooms: Vec<String>` with `#[serde(default)]`
- Message handler: replace disabled room_id filter with allowlist check
- Invite handler: auto-accept allowed rooms, auto-reject others
- Support both canonical room IDs and aliases, case-insensitive
2026-03-22 14:41:43 -04:00
Argenis
2d2c2ac9e6
feat(telegram): support forwarded messages with attribution (#4265)
Parse forward_from, forward_from_chat, and forward_sender_name fields
from Telegram message updates. Prepend forwarding attribution to message
content so the LLM has context about the original sender.

Closes #4118
2026-03-22 14:36:31 -04:00
Argenis
5e774bbd70
feat(multimodal): route image messages to dedicated vision provider (#4264)
When vision_provider is configured in [multimodal] config, messages
containing [IMAGE:] markers are automatically routed to the specified
vision-capable provider instead of failing on the default text provider.

Closes #4119
2026-03-22 14:36:29 -04:00
Argenis
33015067eb
feat(tts): add local Piper TTS provider (#4263)
Add a piper TTS provider that communicates with a local Piper/Coqui TTS
server via an OpenAI-compatible HTTP endpoint. This enables fully offline
voice pipelines: Whisper (STT) → LLM → Piper (TTS).

Closes #4116
2026-03-22 14:36:26 -04:00
Argenis
6b10c0b891
fix(approval): merge default auto_approve entries with user config (#4262)
When a user provides a custom `auto_approve` list in their TOML
config (e.g. to add an MCP tool), serde replaces the built-in
defaults instead of merging. This causes default safe tools like
`weather`, `calculator`, and `file_read` to lose auto-approve
status and get silently denied in non-interactive channel runs.

Add `ensure_default_auto_approve()` which merges built-in entries
into the user's list after deserialization, preserving user
additions while guaranteeing defaults are always present. Users
who want to require approval for a default tool can use
`always_ask`, which takes precedence.
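
A minimal sketch of the merge, assuming illustrative names (the real
default list is longer than the three tools shown):

    fn ensure_default_auto_approve(auto_approve: &mut Vec<String>) {
        const DEFAULTS: &[&str] = &["weather", "calculator", "file_read"];
        for tool in DEFAULTS {
            // Merge rather than replace: keep user additions, re-add defaults.
            if !auto_approve.iter().any(|t| t == tool) {
                auto_approve.push((*tool).to_string());
            }
        }
    }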

Closes #4247
2026-03-22 14:28:09 -04:00
Argenis
bf817e30d2
fix(provider): prevent async runtime panic during model refresh (#4261)
Wrap `fetch_live_models_for_provider` calls in
`tokio::task::spawn_blocking` so the `reqwest::blocking::Client`
is created and dropped on a dedicated thread pool instead of
inside the async Tokio context. This prevents the
"Cannot drop a runtime in a context where blocking is not allowed"
panic when running `models refresh --provider openai`.
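
The wrapping pattern, sketched generically (the real call site wraps
`fetch_live_models_for_provider`; the plain GET here is a stand-in):

    async fn refresh_models(url: String) -> anyhow::Result<String> {
        // The blocking Client is created *and dropped* on the blocking pool,
        // never inside the async runtime context.
        tokio::task::spawn_blocking(move || {
            let client = reqwest::blocking::Client::new();
            Ok(client.get(&url).send()?.text()?)
        })
        .await?
    }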

Closes #4253
2026-03-22 14:22:47 -04:00
Alix-007
0051a0c296
fix(matrix): enforce configured room scope on inbound events (#4251)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-22 14:08:13 -04:00
Canberk Özkan
d753de91f1
fix(skills): prevent panic by ensuring UTF-8 char boundary during truncation (#4252)
Fixed issue #4139.

Previously, slicing a string at exactly 64 bytes could land in the middle of a multi-byte UTF-8 character (e.g., Chinese characters), causing a runtime panic.

Changes:
- Replaced direct byte slicing with a safe boundary lookup using .char_indices().
- Ensures truncation always occurs at a valid character boundary at or before the 64-byte limit.
- Maintained existing hyphen-trimming logic.
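
A minimal sketch of the boundary lookup (illustrative, not the exact
zeroclaw code):

    fn truncate_at_char_boundary(s: &str, max_bytes: usize) -> &str {
        if s.len() <= max_bytes {
            return s;
        }
        // Largest char start index <= max_bytes; slicing there can never
        // split a multi-byte character.
        let cut = s
            .char_indices()
            .take_while(|(i, _)| *i <= max_bytes)
            .last()
            .map(|(i, _)| i)
            .unwrap_or(0);
        &s[..cut]
    }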

Co-authored-by: loriscience <loriscience@gmail.com>
2026-03-22 14:08:01 -04:00
Argenis
f6b2f61a01
fix(matrix): disable automatic key backup when no backup key is configured (#4259)
Explicitly call `client.encryption().backups().disable()` when backups
are not enabled, preventing the matrix_sdk_crypto crate from attempting
room key backups on every sync cycle and spamming the logs with
"Trying to backup room keys but no backup key was found" warnings.

Closes #4227
2026-03-22 13:55:45 -04:00
Argenis
70e7910cb9
fix(web): remove unused import blocking release pipeline (#4234)
fix(web): remove unused import blocking release pipeline
2026-03-22 01:35:26 -04:00
argenis de la rosa
a8868768e8 fix(web): remove unused ChevronsUpDown import blocking release pipeline
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 01:20:51 -04:00
Argenis
67293c50df
chore: bump version to 0.5.7 (#4232)
chore: bump version to 0.5.7
2026-03-22 01:14:08 -04:00
argenis de la rosa
1646079d25 chore: bump version to 0.5.7
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 00:49:41 -04:00
Argenis
25b639435f
fix: merge voice-wake feature (PR #4162) with conflict resolution (#4225)
* feat(channels): add voice wake word detection channel

Add VoiceWakeChannel behind the `voice-wake` feature flag that:
- Captures audio from the default microphone via cpal
- Uses energy-based VAD to detect speech activity
- Transcribes speech via the existing transcription API (Whisper)
- Checks for a configurable wake word in the transcription
- On detection, captures the following utterance and dispatches it
  as a ChannelMessage

State machine: Listening -> Triggered -> Capturing -> Processing -> Listening

Config keys (under [channels_config.voice_wake]):
- wake_word (default: "hey zeroclaw")
- silence_timeout_ms (default: 2000)
- energy_threshold (default: 0.01)
- max_capture_secs (default: 30)

Includes tests for config parsing, state machine, RMS energy
computation, and WAV encoding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(config): fix pre-existing test compilation errors in schema.rs

- Remove #[cfg(unix)] gate on `use tempfile::TempDir` import since
  TempDir is used unconditionally in bootstrap file tests
- Add explicit type annotations on tokio::fs::* calls to resolve
  type inference failures (create_dir_all, write, read_to_string)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): exclude voice-wake from all-features CI check

Add a `ci-all` meta-feature in Cargo.toml that includes every feature
except `voice-wake`, which requires `libasound2-dev` (ALSA) not present
on CI runners. Update the check-all-features CI job to use
`--features ci-all` instead of `--all-features`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 00:49:12 -04:00
Argenis
77779844e5
feat(memory): layered architecture upgrade + remove mem0 backend (#4226)
feat(memory): layered architecture upgrade + remove mem0 backend
2026-03-22 00:47:42 -04:00
Argenis
f658d5806a
fix: honor [autonomy] config section in daemon/channel mode
Fixes #4171
2026-03-22 00:47:32 -04:00
Argenis
7134fe0824
Merge pull request #4223 from zeroclaw-labs/fix/4214-heartbeat-utf8-safety
fix(heartbeat): prevent UTF-8 panic, add memory bounds and path validation
2026-03-22 00:41:47 -04:00
Argenis
263802b3df
Merge pull request #4224 from zeroclaw-labs/fix/4215-thai-i18n-cleanup
fix(i18n): remove extra keys and translate notion in th.toml
2026-03-22 00:41:21 -04:00
Argenis
3c25fddb2a
fix: merge Gmail Pub/Sub push PR #4164 (already integrated via #4200) (#4222)
* feat(channels): add Gmail Pub/Sub push notifications for real-time email

Add GmailPushChannel that replaces IMAP polling with Google's Pub/Sub
push notification system for real-time email-driven automation.

- New channel at src/channels/gmail_push.rs implementing the Channel trait
- Registers Gmail watch subscription (POST /gmail/v1/users/me/watch)
  with automatic renewal before the 7-day expiry
- Handles incoming Pub/Sub notifications at POST /webhook/gmail
- Fetches new messages via Gmail History API (startHistoryId-based)
- Dispatches email messages to the agent with full metadata
- Sends replies via Gmail messages.send API
- Config: gmail_push.enabled, topic, label_filter, oauth_token,
  allowed_senders, webhook_url
- OAuth token encrypted at rest via existing secret store
- Webhook endpoint added to gateway router
- 30+ unit tests covering notification parsing, header extraction,
  body decoding, sender allowlist, and config serialization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(config): fix pre-existing test compilation errors in schema.rs

- Remove #[cfg(unix)] gate on `use tempfile::TempDir` import since
  TempDir is used unconditionally in bootstrap file tests
- Add explicit type annotations on tokio::fs::* calls to resolve
  type inference failures (create_dir_all, write, read_to_string)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): fix extract_body_text_plain test

Gmail API sends base64url without padding. The decode_body function
converted URL-safe chars back to standard base64 but did not restore
the padding, causing the STANDARD decoder to fail and the code to fall
back to the snippet. Add padding restoration before decoding.
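
A minimal sketch of the padding step (the `base64` crate's
`URL_SAFE_NO_PAD` engine is the other common way to sidestep this):

    fn restore_padding(mut b64: String) -> String {
        // base64 length must be a multiple of 4 for strict/padded decoders.
        while b64.len() % 4 != 0 {
            b64.push('=');
        }
        b64
    }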

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 00:40:42 -04:00
Argenis
a6a46bdd25
fix: add weather tool to default auto_approve list
Fixes #4170
2026-03-22 00:21:33 -04:00
Argenis
235d4d2f1c
fix: replace ILIKE substring matching with full-text search in postgres memory recall()
Fixes #4204
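An illustrative query shape (table and column names assumed, not
zeroclaw's actual schema):

    const RECALL_SQL: &str = "\
        SELECT content FROM memories \
        WHERE to_tsvector('english', content) @@ plainto_tsquery('english', $1) \
        ORDER BY ts_rank(to_tsvector('english', content), \
                         plainto_tsquery('english', $1)) DESC \
        LIMIT $2";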
2026-03-22 00:20:11 -04:00
argenis de la rosa
bd1e8c8e1a merge: resolve conflicts with master + remove memory-mem0 from ci-all
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 00:18:09 -04:00
Argenis
f81807bff6
fix: serialize env-dependent codex tests to prevent race (#4210) (#4218)
Add a process-scoped Mutex that all env-var-mutating tests in
openai_codex::tests must hold. This prevents std::env::set_var /
remove_var calls from racing when Rust's test harness runs them on
parallel threads.

Affected tests:
- resolve_responses_url_prefers_explicit_endpoint_env
- resolve_responses_url_uses_provider_api_url_override
- resolve_reasoning_effort_prefers_configured_override
- resolve_reasoning_effort_uses_legacy_env_when_unconfigured
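
The guard pattern, sketched (test name and env var are illustrative):

    use std::sync::{Mutex, MutexGuard, OnceLock};

    static ENV_LOCK: OnceLock<Mutex<()>> = OnceLock::new();

    fn env_guard() -> MutexGuard<'static, ()> {
        ENV_LOCK
            .get_or_init(|| Mutex::new(()))
            .lock()
            // A panicking test poisons the mutex; recover so later tests run.
            .unwrap_or_else(|poisoned| poisoned.into_inner())
    }

    #[test]
    fn endpoint_env_is_preferred() {
        let _guard = env_guard(); // serializes all env-mutating tests
        std::env::set_var("EXAMPLE_ENDPOINT", "http://localhost:9999");
        // ... assertions ...
        std::env::remove_var("EXAMPLE_ENDPOINT");
    }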
2026-03-22 00:14:01 -04:00
argenis de la rosa
bb7006313c feat(memory): layered architecture upgrade + remove mem0 backend
Implement 6-phase memory system improvement:
- Multi-stage retrieval pipeline (cache → FTS → vector)
- Namespace isolation with strict filtering
- Importance scoring (category + keyword heuristics)
- Conflict resolution via Jaccard similarity + superseded_by
- Audit trail decorator (AuditedMemory<M>)
- Policy engine (quotas, read-only namespaces, retention rules)
- Deterministic sort tiebreaker on equal scores

Remove mem0 (OpenMemory) backend — all capabilities now covered
natively with better performance (local SQLite vs external REST API).

46 battle tests + 262 existing tests pass. Backward-compatible:
existing databases auto-migrate, existing configs work unchanged.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 00:09:43 -04:00
Argenis
9a49626376
fix: use POSIX-compatible sh -c instead of dash-specific -lc (#4209) (#4217)
* fix: build web dashboard during install.sh (#4207)

* fix: use POSIX-compatible sh -c instead of dash-specific -lc in cron scheduler (#4209)
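
The portable invocation, sketched:

    use std::process::Command;

    // `-c` is POSIX; the `-l` login flag in `-lc` is not portable across
    // /bin/sh implementations, so the scheduler invokes plain `sh -c`.
    fn run_job(script: &str) -> std::io::Result<std::process::Output> {
        Command::new("sh").arg("-c").arg(script).output()
    }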
2026-03-22 00:07:37 -04:00
Argenis
8b978a721f
fix: build web dashboard during install.sh (#4207) (#4216) 2026-03-22 00:02:54 -04:00
argenis de la rosa
75b4c1d4a4 fix(heartbeat): prevent UTF-8 panic, add memory bounds and path validation in session context
- Use char_indices for safe UTF-8 truncation instead of byte slicing
- Replace unbounded Vec with VecDeque rolling window in load_jsonl_messages
- Add path separator validation for channel/to to prevent directory traversal
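A minimal sketch of the rolling window (illustrative names):

    use std::collections::VecDeque;

    fn last_n_lines(input: &str, n: usize) -> Vec<&str> {
        if n == 0 {
            return Vec::new();
        }
        let mut window: VecDeque<&str> = VecDeque::with_capacity(n);
        for line in input.lines() {
            if window.len() == n {
                window.pop_front(); // evict oldest; memory stays bounded at n
            }
            window.push_back(line);
        }
        window.into_iter().collect()
    }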
2026-03-22 00:01:44 -04:00
argenis de la rosa
b2087e6065 fix(i18n): remove extra keys and translate untranslated notion entry in th.toml 2026-03-21 23:59:46 -04:00
Nisit Sirimarnkit
ad8f81ad76
Merge branch 'master' into i18n/thai-tool-descriptions 2026-03-22 10:28:47 +07:00
ninenox
c58e1c1fb3 i18n: add Thai tool descriptions 2026-03-22 10:09:03 +07:00
Martin Minkus
cb0779d761 feat(heartbeat): add load_session_context to inject conversation history
When `load_session_context = true` in `[heartbeat]`, the daemon loads the
last 20 messages from the target user's JSONL session file and prepends them
to the heartbeat task prompt before calling the LLM.

This gives the companion context — who the user is, what was last discussed —
so outreach messages feel like a natural continuation rather than a blank-slate
ping. Defaults to `false` (opt-in, no change to existing behaviour).

Key behaviours:
- Session context is re-read on every heartbeat tick (not cached at startup)
- Skips context injection if only assistant messages are present (prevents
  heartbeat outputs feeding back in a loop)
- Scans sessions directory for matching JSONL files using flexible filename
  matching: {channel}_{to}.jsonl, {channel}_*_{to}.jsonl, or
  {channel}_{to}_*.jsonl — handles varying session key formats
- Injects file mtime as "last message ~Xh ago" so the LLM knows how long
  the user has been silent

Config example:
  [heartbeat]
  enabled = true
  interval_minutes = 120
  load_session_context = true
  target = "telegram"
  to = "your_username"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-22 02:44:19 +00:00
Chris Hengge
daca2d9354
fix(web/tools): make section headings collapsible (#4180)
Agent Tools and CLI Tools section headings were static divs with no
way to collapse sections the user is not interested in, making the
page unwieldy with a large tool set.

- Convert both section heading divs to button elements toggling
  agentSectionOpen / cliSectionOpen state (both default open)
- Section content renders conditionally on those booleans
- ChevronsUpDown icon added (already in lucide-react bundle) that
  fades in on hover and indicates collapsed/expanded state
- No change to individual tool card parameter schema expand/collapse

Risk: Low — UI state only, no API or logic change.
Does not change: search/filter behaviour, tool card expand/collapse,
CLI tools table structure.
2026-03-21 22:25:18 -04:00
Chris Hengge
3c1e710c38
fix(web/logs): layout, footer status indicator, and empty-state note (#4203)
Three issues addressed:

Empty page: the logs page shows nothing at idle because the SSE stream
only carries ObserverEvent variants (llm_request, tool_call, error,
agent_start, agent_end). Daemon stdout and RUST_LOG tracing output go
to the terminal/log file and are never forwarded to the broadcast
channel — this is correct behaviour, not a misconfiguration. Added a
dismissible informational banner explaining what appears on the stream
and how to access tracing output (RUST_LOG + terminal).

Layout: flex-1 log entries div was missing min-h-0, which can cause
flex children to refuse to shrink below content size in some browsers.

Connection indicator: moved from the toolbar (where it cluttered the
title and controls) to a compact footer strip below the scroll area,
matching the /agent page pattern exactly.

Also added colour rules for llm_request, agent_start, agent_end,
tool_call_start event types which previously fell through to default
grey.

Risk: Low — UI layout and informational copy only, no backend change.
Does not change: SSE connection logic, event parsing, pause/resume,
type filters, or the underlying broadcast observer.
2026-03-21 22:11:20 -04:00
Chris Hengge
0aefde95f2
fix(web/config): fill viewport and add TOML syntax highlighting (#4201)
Two issues with the config editor:

Layout: the page root had no height constraint and the textarea used
min-h-[500px] resize-y, causing independent scrollbars on both the
page and the editor. Fixed by adopting the Memory/Cron flex column
pattern so the editor fills the remaining viewport height with a single
scroll surface.

Highlighting: plain textarea with no visual structure for TOML.
Added a zero-dependency layered pre-overlay technique — no new npm
packages (per CLAUDE.md anti-pattern rules). A pre element sits
absolute behind a transparent textarea; highlightToml() produces HTML
colour-coding sections, keys, strings, booleans, numbers, datetimes,
and comments via per-line regex. onScroll syncs the overlay. Tab key
inserts two spaces instead of leaving focus.

dangerouslySetInnerHTML used on the pre — content is the user's own
local config, not from the network, risk equivalent to any local editor.

Risk: Low-Medium — no API or backend change. New rendering logic
in editor only.
Does not change: save/load API calls, validation, sensitive field
masking behaviour.
2026-03-21 22:11:18 -04:00
Chris Hengge
a84aa60554
fix(web/cron): contain table scroll within viewport (#4186)
The cron page used a block-flow root with no height constraint, causing
the jobs table to grow taller than the viewport and the page itself to
scroll. This was inconsistent with the Memory page pattern.

- Change page root to flex flex-col h-full matching Memory's layout
- Table wrapper gains flex-1 min-h-0 overflow-auto so it fills
  remaining height and scrolls both axes internally
- Table header already has position:sticky so it pins correctly
  inside the scrolling container with no CSS change needed

Risk: Low — layout only, no logic or API change.
Does not change: job CRUD, modal, catch-up toggle, run history panel.
2026-03-21 22:11:15 -04:00
Chris Hengge
edd4b37325
fix(web/dashboard): rename channels card heading and add internal scroll (#4178)
The card heading used the key dashboard.active_channels ("Active Channels")
even though the card has a toggle between Active and All views, making the
static heading misleading. The channel list div had no height cap, causing
tall channel lists to stretch the card and break 3-column grid alignment.

- Change heading to t("dashboard.channels") — key already present in all
  three locales (zh/en/tr), no i18n changes needed
- Add overflow-y-auto max-h-48 pr-1 to the channel list wrapper so it
  scrolls internally instead of stretching the card
2026-03-21 22:09:00 -04:00
Argenis
c5f0155061
Merge pull request #4193 from zeroclaw-labs/fix/reaction-tool
fix(tools): pass platform channel_id to reaction trait
2026-03-21 21:38:32 -04:00
argenis de la rosa
9ee06ed6fc merge: resolve conflicts with master (image_gen + sessions) 2026-03-21 21:18:46 -04:00
Argenis
ac6b43e9f4
fix: remove unused channel_names field from DiscordHistoryChannel (#4199)
* feat: add discord history logging and search tool with persistent channel cache

* fix: remove unused channel_names field from DiscordHistoryChannel

The channel_names HashMap was declared and initialized but never used.
Channel name caching is handled via discord_memory.get()/store() with
the cache:channel_name: prefix. Remove the dead field.

* style: run cargo fmt on discord_history.rs

---------

Co-authored-by: ninenox <nisit15@hotmail.com>
2026-03-21 21:15:23 -04:00
Argenis
6c5573ad96
Merge pull request #4194 from zeroclaw-labs/fix/session-messaging-tools
fix(security): add enforcement and validation to session tools
2026-03-21 21:15:17 -04:00
Argenis
1d57a0d1e5
fix(web/tools): improve a11y in collapsible section headings (#4197)
* fix(web/tools): make section headings collapsible

Agent Tools and CLI Tools section headings were static divs with no
way to collapse sections the user is not interested in, making the
page unwieldy with a large tool set.

- Convert both section heading divs to button elements toggling
  agentSectionOpen / cliSectionOpen state (both default open)
- Section content renders conditionally on those booleans
- ChevronsUpDown icon added (already in lucide-react bundle) that
  fades in on hover and indicates collapsed/expanded state
- No change to individual tool card parameter schema expand/collapse

Risk: Low — UI state only, no API or logic change.
Does not change: search/filter behaviour, tool card expand/collapse,
CLI tools table structure.

* fix(web/tools): improve a11y and fix invalid HTML in collapsible sections

- Replace <h2> inside <button> with <span role="heading" aria-level={2}>
  to fix invalid HTML (heading elements not permitted in interactive content)
- Add aria-expanded attribute to section toggle buttons for screen readers
- Add aria-controls + id linking buttons to their controlled sections
- Replace ChevronsUpDown with ChevronDown icon — ChevronsUpDown is
  visually symmetric so rotating 180deg has no visible effect; ChevronDown
  rotating to -90deg gives a clear directional cue
- Remove unused ChevronsUpDown import

---------

Co-authored-by: WareWolf-MoonWall <chris.hengge@gmail.com>
2026-03-21 21:02:10 -04:00
Argenis
9780c7d797
fix: restrict free command to Linux-only in security policy (#4198)
* fix: resolve claude-code test flakiness and update security policy

* fix: restrict `free` command to Linux-only in security policy

`free` is not available on macOS or other BSDs. Move it behind
a #[cfg(target_os = "linux")] gate so it is only included in the
default allowed commands on Linux systems.

---------

Co-authored-by: ninenox <nisit15@hotmail.com>
2026-03-21 21:02:05 -04:00
Argenis
35a5451a17
fix(channels): address critical security bugs in Gmail Pub/Sub push (#4200)
* feat(channels): add Gmail Pub/Sub push notifications for real-time email

Add GmailPushChannel that replaces IMAP polling with Google's Pub/Sub
push notification system for real-time email-driven automation.

- New channel at src/channels/gmail_push.rs implementing the Channel trait
- Registers Gmail watch subscription (POST /gmail/v1/users/me/watch)
  with automatic renewal before the 7-day expiry
- Handles incoming Pub/Sub notifications at POST /webhook/gmail
- Fetches new messages via Gmail History API (startHistoryId-based)
- Dispatches email messages to the agent with full metadata
- Sends replies via Gmail messages.send API
- Config: gmail_push.enabled, topic, label_filter, oauth_token,
  allowed_senders, webhook_url
- OAuth token encrypted at rest via existing secret store
- Webhook endpoint added to gateway router
- 30+ unit tests covering notification parsing, header extraction,
  body decoding, sender allowlist, and config serialization

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(config): fix pre-existing test compilation errors in schema.rs

- Remove #[cfg(unix)] gate on `use tempfile::TempDir` import since
  TempDir is used unconditionally in bootstrap file tests
- Add explicit type annotations on tokio::fs::* calls to resolve
  type inference failures (create_dir_all, write, read_to_string)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): fix extract_body_text_plain test

Gmail API sends base64url without padding. The decode_body function
converted URL-safe chars back to standard base64 but did not restore
the padding, causing the STANDARD decoder to fail and the code to fall
back to the snippet. Add padding restoration before decoding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): address critical security bugs in Gmail Pub/Sub push

- Add webhook authentication via shared secret (webhook_secret config
  field or GMAIL_PUSH_WEBHOOK_SECRET env var), preventing unauthorized
  message injection through the unauthenticated webhook endpoint
- Add 1MB body size limit on webhook endpoint to prevent memory exhaustion
- Fix race condition in handle_notification: hold history_id lock across
  the read-fetch-update cycle to prevent duplicate message processing
  when concurrent webhook notifications arrive
- Sanitize RFC 2822 headers (To/Subject) to prevent CRLF injection
  attacks that could add arbitrary headers to outgoing emails
- Fix extract_email_from_header panic on malformed angle brackets by
  using rfind('>') and validating bracket ordering
- Add 30s default HTTP client timeout for all Gmail API calls,
  preventing indefinite hangs
- Clone tx sender before message processing loop to avoid holding
  the mutex lock across network calls
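
The locking shape, sketched with assumed types (`fetch_history_since`
stands in for the Gmail History API call):

    use std::sync::Arc;
    use tokio::sync::Mutex;

    struct PushState {
        history_id: Mutex<u64>,
    }

    async fn fetch_history_since(start: u64) -> anyhow::Result<Vec<String>> {
        let _ = start; // hypothetical stand-in for the History API fetch
        Ok(Vec::new())
    }

    async fn handle_notification(state: Arc<PushState>, new_id: u64) -> anyhow::Result<()> {
        // Hold the guard across the whole read-fetch-update cycle so two
        // concurrent webhook deliveries cannot both fetch from the same id.
        let mut guard = state.history_id.lock().await;
        let _messages = fetch_history_since(*guard).await?;
        // ... dispatch messages to the agent ...
        *guard = new_id; // advance only after a successful fetch
        Ok(())
    }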

---------

Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 20:59:56 -04:00
Argenis
8e81d44d54
fix(gateway): address critical security and reliability bugs in Live Canvas (#4196)
* feat(gateway): add Live Canvas (A2UI) tool and real-time web viewer

Add a Live Canvas system that enables the agent to push rendered content
(HTML, SVG, Markdown, text) to a web-visible canvas in real time.

Backend:
- src/tools/canvas.rs: CanvasTool with render/snapshot/clear/eval actions,
  backed by a shared CanvasStore (Arc<RwLock<HashMap>>) with per-canvas
  broadcast channels for real-time updates
- src/gateway/canvas.rs: REST endpoints (GET/POST/DELETE /api/canvas/:id,
  GET /api/canvas/:id/history, GET /api/canvas) and WebSocket endpoint
  (WS /ws/canvas/:id) for real-time frame delivery

Frontend:
- web/src/pages/Canvas.tsx: Canvas viewer page with WebSocket connection,
  iframe sandbox rendering, canvas switcher, frame history panel

Registration:
- CanvasTool registered in all_tools_with_runtime (always available)
- Canvas routes wired into gateway router
- CanvasStore added to AppState
- Canvas page added to App.tsx router and Sidebar navigation
- i18n keys added for en/zh/tr locales

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(config): fix pre-existing test compilation errors in schema.rs

- Remove #[cfg(unix)] gate on `use tempfile::TempDir` import since
  TempDir is used unconditionally in bootstrap file tests
- Add explicit type annotations on tokio::fs::* calls to resolve
  type inference failures (create_dir_all, write, read_to_string)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): share CanvasStore between tool and REST API

The CanvasTool and gateway AppState each created their own CanvasStore,
so content rendered via the tool never appeared in the REST API.

Create the CanvasStore once in the gateway, pass it to
all_tools_with_runtime via a new optional parameter, and reuse the
same instance in AppState.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(gateway): address critical security and reliability bugs in Live Canvas

- Validate content_type in REST POST endpoint against allowed set,
  preventing injection of "eval" frames via the REST API
- Enforce MAX_CONTENT_SIZE (256KB) limit on REST POST endpoint,
  matching tool-side validation to prevent memory exhaustion
- Add MAX_CANVAS_COUNT (100) limit to prevent unbounded canvas creation
  and memory exhaustion from CanvasStore
- Handle broadcast RecvError::Lagged in WebSocket handler gracefully
  instead of disconnecting the client
- Make MAX_CONTENT_SIZE and ALLOWED_CONTENT_TYPES pub for gateway reuse
- Update CanvasStore::render and subscribe to return Option for
  canvas count enforcement
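
The receive loop shape, sketched (frame type simplified to String):

    use tokio::sync::broadcast::{self, error::RecvError};

    async fn forward_frames(mut rx: broadcast::Receiver<String>) {
        loop {
            match rx.recv().await {
                Ok(_frame) => {
                    // ... write the frame to the WebSocket ...
                }
                // Slow subscriber: log the missed frames and keep the
                // connection alive instead of disconnecting the client.
                Err(RecvError::Lagged(skipped)) => {
                    tracing::warn!(skipped, "canvas subscriber lagged; resuming");
                }
                Err(RecvError::Closed) => break,
            }
        }
    }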

---------

Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: rareba <rareba@users.noreply.github.com>
2026-03-21 20:59:18 -04:00
Argenis
86ad0c6a2b
fix(channels): address critical bugs in voice wake word detection (#4191)
* feat(channels): add voice wake word detection channel

Add VoiceWakeChannel behind the `voice-wake` feature flag that:
- Captures audio from the default microphone via cpal
- Uses energy-based VAD to detect speech activity
- Transcribes speech via the existing transcription API (Whisper)
- Checks for a configurable wake word in the transcription
- On detection, captures the following utterance and dispatches it
  as a ChannelMessage

State machine: Listening -> Triggered -> Capturing -> Processing -> Listening

Config keys (under [channels_config.voice_wake]):
- wake_word (default: "hey zeroclaw")
- silence_timeout_ms (default: 2000)
- energy_threshold (default: 0.01)
- max_capture_secs (default: 30)

Includes tests for config parsing, state machine, RMS energy
computation, and WAV encoding.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(config): fix pre-existing test compilation errors in schema.rs

- Remove #[cfg(unix)] gate on `use tempfile::TempDir` import since
  TempDir is used unconditionally in bootstrap file tests
- Add explicit type annotations on tokio::fs::* calls to resolve
  type inference failures (create_dir_all, write, read_to_string)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): exclude voice-wake from all-features CI check

Add a `ci-all` meta-feature in Cargo.toml that includes every feature
except `voice-wake`, which requires `libasound2-dev` (ALSA) not present
on CI runners. Update the check-all-features CI job to use
`--features ci-all` instead of `--all-features`.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(channels): address critical bugs in voice wake word detection

- Replace std::mem::forget(stream) with dedicated thread that holds the
  cpal stream and shuts down cleanly via oneshot channel, preventing
  microphone resource leaks on task cancellation
- Add config validation: energy_threshold must be positive+finite,
  silence_timeout_ms >= 100ms, max_capture_secs clamped to 300
- Guard WAV encoding against u32 overflow for large audio buffers
- Add hard cap on capture_buf size to prevent unbounded memory growth
- Increase audio channel buffer from 4 to 64 slots to reduce chunk
  drops during transcription API calls
- Remove dead WakeState::Processing variant that was never entered
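
The ownership fix, sketched (MicStream stands in for the real cpal
stream, which is not Send):

    use std::sync::mpsc;

    struct MicStream; // stand-in for the non-Send cpal stream

    fn spawn_capture() -> mpsc::Sender<()> {
        let (shutdown_tx, shutdown_rx) = mpsc::channel::<()>();
        std::thread::spawn(move || {
            let _stream = MicStream; // built and owned on this thread only
            // Block until a shutdown signal arrives or the sender is dropped;
            // either way the stream drops cleanly and the mic is released.
            let _ = shutdown_rx.recv();
        });
        shutdown_tx
    }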

---------

Co-authored-by: Giulio V <vannini.gv@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 20:43:19 -04:00
Argenis
6ecf89d6a9
fix(ci): skip release and publish workflows on forks (#4190)
When a fork syncs with upstream, GitHub attributes the push to the fork
owner, causing release-beta-on-push and publish-crates-auto to run
under the wrong identity — leading to confusing notifications and
guaranteed failures (missing secrets).

Add repository guards to root jobs so the entire pipeline is skipped
on forks.
2026-03-21 20:42:55 -04:00
argenis de la rosa
691efa4d8c style: fix cargo fmt formatting in reaction tool 2026-03-21 20:38:24 -04:00
argenis de la rosa
d1e3f435b4 style: fix cargo fmt formatting in session tools 2026-03-21 20:38:08 -04:00
Argenis
44c3e264ad
Merge pull request #4192 from zeroclaw-labs/fix/image-gen-tool
fix(tools): harden image_gen security and model validation
2026-03-21 20:37:27 -04:00
argenis de la rosa
f2b6013329 fix(tools): harden image_gen security enforcement and model validation
- Replace manual can_act()/record_action() with enforce_tool_operation()
  to match the codebase convention used by all other tools (notion,
  memory_forget, claude_code, delegate, etc.), producing consistent
  error messages and avoiding logic duplication.

- Add model parameter validation to prevent URL path traversal attacks
  via crafted model identifiers (e.g. "../../evil-endpoint").

- Add tests for model traversal rejection and filename sanitization.
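
A sketch of the check (the exact allowed charset is an assumption;
fal.ai model ids such as "fal-ai/flux/dev" legitimately contain slashes):

    fn validate_model_id(model: &str) -> Result<(), String> {
        let charset_ok = model.chars().all(|c| {
            c.is_ascii_alphanumeric() || matches!(c, '-' | '_' | '.' | '/')
        });
        // Reject traversal even though '/' itself is allowed.
        if model.is_empty() || model.contains("..") || !charset_ok {
            return Err(format!("invalid model id: {model:?}"));
        }
        Ok(())
    }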
2026-03-21 20:08:51 -04:00
argenis de la rosa
05d3c51a30 fix(security): add security policy enforcement and input validation to session tools
SessionsSendTool was missing security gate enforcement entirely - any agent
could send messages to any session without security policy checks. Similarly,
SessionsHistoryTool had no security enforcement for reading session data.

Changes:
- Add SecurityPolicy field to SessionsHistoryTool (enforces ToolOperation::Read)
- Add SecurityPolicy field to SessionsSendTool (enforces ToolOperation::Act)
- Add session_id validation to reject empty or non-alphanumeric-only IDs
- Pass security policy from all_tools_with_runtime registration
- Add tests for empty session_id, non-alphanumeric session_id validation
2026-03-21 20:04:44 -04:00
argenis de la rosa
2ceda31ce2 fix(tools): pass platform channel_id to reaction trait instead of channel name
The reaction tool was passing the channel adapter name (e.g. "discord",
"slack") as the first argument to Channel::add_reaction() and
Channel::remove_reaction(), but the trait signature expects a
platform-specific channel_id (e.g. Discord channel snowflake, Slack
channel ID like "C0123ABCD"). This would cause all reaction API calls
to fail at the platform level.

Fixes:
- Add required "channel_id" parameter to the tool schema
- Extract and pass channel_id (not channel_name) to trait methods
- Update tool description to mention the new parameter
- Add MockChannel channel_id capture for test verification
- Add test asserting channel_id (not name) reaches the trait
- Update all existing tests to supply channel_id
2026-03-21 20:01:22 -04:00
Argenis
9069bc3c1f
fix(agent): add system prompt budgeting for small-context models (#4185)
For models with small context windows (e.g. glm-4.5-air ~8K tokens),
the system prompt alone can exceed the limit. This adds:

- max_system_prompt_chars config option (default 0 = unlimited)
- compact_context now also compacts the system prompt: skips the
  Channel Capabilities section and shows only tool names
- Truncation with marker when prompt exceeds the budget

Users can set `max_system_prompt_chars = 8000` in [agent] config
to cap the system prompt for small-context models.

Closes #4124
2026-03-21 19:40:21 -04:00
Argenis
9319fe18da
fix(approval): support wildcard * in auto_approve and always_ask (#4184)
auto_approve = ["*"] was doing exact string matching, so only the
literal tool name "*" was matched. Users expecting wildcard semantics
had every tool blocked in supervised mode.

Also adds "prompt exceeds max length" to the context-window error
detection hints (fixes GLM/ZAI error 1261 detection).
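
The matching fix, sketched (per the earlier auto-approve work,
`always_ask` takes precedence):

    fn is_auto_approved(auto_approve: &[String], always_ask: &[String], tool: &str) -> bool {
        // always_ask wins over any auto_approve entry, including "*".
        if always_ask.iter().any(|e| e == "*" || e == tool) {
            return false;
        }
        auto_approve.iter().any(|e| e == "*" || e == tool)
    }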

Closes #4127
2026-03-21 19:38:11 -04:00
Argenis
cc454a86c8
fix(install): remove pairing code display from installer (#4176)
The gateway pairing code is now shown in the dashboard, so displaying
it in the installer output is redundant and cluttered (showed 3 codes).
2026-03-21 19:06:37 -04:00
Argenis
256e8ccebf
chore: bump version to v0.5.6 (#4174)
Update version across all distribution manifests:
- Cargo.toml / Cargo.lock
- dist/aur/PKGBUILD + .SRCINFO
- dist/scoop/zeroclaw.json
2026-03-21 18:03:38 -04:00
Argenis
72c9e6b6ca
fix(publish): publish aardvark-sys dep before main crate (#4172)
* fix(publish): add aardvark-sys version and publish it before main crate

- Add version = "0.1.0" to aardvark-sys path dependency in Cargo.toml
- Update all three publish workflows to publish aardvark-sys first
- Add aardvark-sys COPY to Dockerfile for workspace builds
- Fixes cargo publish failure: "dependency aardvark-sys does not
  specify a version"

* ci: publish aardvark-sys before main crate in all publish workflows

All three crates.io publish workflows now publish aardvark-sys first,
wait for indexing, then publish the main zeroclawlabs crate.
2026-03-21 16:20:50 -04:00
Argenis
755a129ca2
fix(install): use /dev/tty for sudo in curl|bash Xcode license accept (#4169)
When run via `curl | bash`, stdin is the curl pipe, so sudo cannot
prompt for a password. Redirect sudo's stdin from /dev/tty to reach
the real terminal, allowing the password prompt to work in piped
invocations.
2026-03-21 14:15:21 -04:00
Argenis
8b0d3684c5
fix(install): auto-accept Xcode license instead of bailing out (#4165)
Instead of exiting with a manual remediation step, the installer now
attempts to accept the Xcode/CLT license automatically via
`sudo xcodebuild -license accept`. Falls back to a clear error message
only if sudo fails (e.g. no terminal or password).
2026-03-21 13:57:38 -04:00
Giulio V
cdb5ac1471 fix(tools): fix remove_reaction_success test
The output format used "{action}ed" which produced "removeed" for the
remove action. Use explicit past-tense mapping instead.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 18:49:35 +01:00
Giulio V
67acb1a0bb fix(config): fix pre-existing test compilation errors in schema.rs
- Remove #[cfg(unix)] gate on `use tempfile::TempDir` import since
  TempDir is used unconditionally in bootstrap file tests
- Add explicit type annotations on tokio::fs::* calls to resolve
  type inference failures (create_dir_all, write, read_to_string)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 18:10:05 +01:00
Giulio V
9eac6bafef fix(config): fix pre-existing test compilation errors in schema.rs
- Remove #[cfg(unix)] gate on `use tempfile::TempDir` import since
  TempDir is used unconditionally in bootstrap file tests
- Add explicit type annotations on tokio::fs::* calls to resolve
  type inference failures (create_dir_all, write, read_to_string)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 18:09:48 +01:00
Giulio V
a12f2ff439 fix(config): fix pre-existing test compilation errors in schema.rs
- Remove #[cfg(unix)] gate on `use tempfile::TempDir` import since
  TempDir is used unconditionally in bootstrap file tests
- Add explicit type annotations on tokio::fs::* calls to resolve
  type inference failures (create_dir_all, write, read_to_string)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 18:09:36 +01:00
Argenis
a38a4d132e
fix(hardware): drain stdin in subprocess test to prevent broken pipe flake (#4161)
* fix(hardware): drain stdin in subprocess test to prevent broken pipe flake

The test script did not consume stdin, so SubprocessTool's stdin write
raced against the process exit, causing intermittent EPIPE failures.
Add `cat > /dev/null` to drain stdin before producing output.

* style: format subprocess test
2026-03-21 12:19:53 -04:00
Argenis
48aba73d3a
fix(install): always check Xcode license on macOS, not just with --install-system-deps (#4153)
The Xcode license test-compile was inside install_system_deps(), which
only runs when --install-system-deps is passed. On macOS the default
path skipped this entirely, so users hit `cc` exit code 69 deep in
cargo build. Move the check into the unconditional main flow so it
always fires on Darwin.
2026-03-21 11:29:36 -04:00
Argenis
a1ab1e1a11
fix(install): use test-compile instead of xcrun for Xcode license detection (#4151)
xcrun --show-sdk-path can succeed even when the Xcode/CLT license has
not been accepted, so the previous check was ineffective. Replace it
with an actual test-compilation of a trivial C file, which reliably
triggers the exit-code-69 failure when the license is pending.
2026-03-21 11:03:07 -04:00
Giulio V
f394abf35c feat(tools): add standalone image generation tool via fal.ai
Add ImageGenTool that exposes fal.ai Flux model image generation as a
standalone tool, decoupled from the LinkedIn client. The tool accepts a
text prompt, optional filename/size/model parameters, calls the fal.ai
synchronous API, downloads the result, and saves to workspace/images/.

- New src/tools/image_gen.rs with full Tool trait implementation
- New ImageGenConfig in schema.rs (enabled, default_model, api_key_env)
- Config-gated registration in all_tools_with_runtime
- Security: checks can_act() and record_action() before execution
- Comprehensive unit tests (prompt validation, API key, size enum,
  autonomy blocking, tool spec)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:17:28 +01:00
Giulio V
52e0271bd5 feat(tools): add emoji reaction tool for cross-channel reactions
Add ReactionTool that exposes Channel::add_reaction and
Channel::remove_reaction as an agent-callable tool. Uses a
late-binding ChannelMapHandle (Arc<RwLock<HashMap>>) pattern
so the tool can be constructed during tool registry init and
populated once channels are available in start_channels.

Parameters: channel, message_id, emoji, action (add/remove).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:15:25 +01:00
Giulio V
6c0a48efff feat(tools): add session list, history, and send tools for inter-agent messaging
Add three new tools in src/tools/sessions.rs:
- sessions_list: lists active sessions with channel, message count, last activity
- sessions_history: reads last N messages from a session by ID
- sessions_send: appends a message to a session for inter-agent communication

All tools operate on the SessionBackend trait, using the JSONL SessionStore
by default. Registered unconditionally in all_tools_with_runtime when the
sessions directory is accessible.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 15:07:18 +01:00
SimianAstronaut7
87b5bca449
feat(config): add configurable pacing controls for slow/local LLM workloads (#3343)
* feat(config): add configurable pacing controls for slow/local LLM workloads (#2963)

Add a new `[pacing]` config section with four opt-in parameters that
let users tune timeout and loop-detection behavior for local LLMs
(Ollama, llama.cpp, vLLM) without disabling safety features entirely:

- `step_timeout_secs`: per-step LLM inference timeout independent of
  the overall message budget, catching hung model responses early.
- `loop_detection_min_elapsed_secs`: time-gated loop detection that
  only activates after a configurable grace period, avoiding false
  positives on long-running browser/research workflows.
- `loop_ignore_tools`: per-tool loop-detection exclusions so tools
  like `browser_screenshot` that structurally resemble loops are not
  counted toward identical-output detection.
- `message_timeout_scale_max`: overrides the hardcoded 4x ceiling in
  the channel message timeout scaling formula.

All parameters are strictly optional with no effect when absent,
preserving full backwards compatibility.
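
A guess at the section's shape (field names from the list above; all
strictly optional, so an absent `[pacing]` table changes nothing):

    #[derive(Debug, Default, serde::Deserialize)]
    pub struct PacingConfig {
        pub step_timeout_secs: Option<u64>,
        pub loop_detection_min_elapsed_secs: Option<u64>,
        pub loop_ignore_tools: Option<Vec<String>>,
        pub message_timeout_scale_max: Option<f64>,
    }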

Closes #2963

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(config): add missing pacing fields in tests and call sites

* fix(config): add pacing arg to remaining cost-tracking test call sites

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-21 08:54:08 -04:00
Argenis
be40c0c5a5
Merge pull request #4145 from zeroclaw-labs/feat/gateway-path-prefix
feat(gateway): add path_prefix for reverse-proxy deployments
2026-03-21 08:48:56 -04:00
argenis de la rosa
6527871928 fix: add path_prefix to test AppState in gateway/api.rs 2026-03-21 08:14:28 -04:00
argenis de la rosa
0bda80de9c feat(gateway): add path_prefix for reverse-proxy deployments
Adopted from #3709 by @slayer with minor cleanup.
Supersedes #3709
2026-03-21 08:14:28 -04:00
Argenis
02f57f4d98
Merge pull request #4144 from zeroclaw-labs/feat/claude-code-tool
feat(tools): add ClaudeCodeTool for two-tier agent delegation
2026-03-21 08:14:19 -04:00
Argenis
ef83dd44d7
Merge pull request #4146 from zeroclaw-labs/feat/memory-recall-time-range
feat(memory): add time range filter to recall (since/until)
2026-03-21 08:14:12 -04:00
Argenis
a986b6b912
fix(install): detect un-accepted Xcode license + bump to v0.5.5 (#4147)
* fix(install): detect un-accepted Xcode license before build

Add an xcrun check after verifying Xcode CLT is installed. When the
Xcode/CLT license has not been accepted, cc exits with code 69 and
the build fails with a cryptic linker error. This surfaces a clear
message telling the user to run `sudo xcodebuild -license accept`.

* chore(release): bump version to v0.5.5

Update version across all distribution manifests:
- Cargo.toml and Cargo.lock
- dist/aur/PKGBUILD and .SRCINFO
- dist/scoop/zeroclaw.json
2026-03-21 08:09:27 -04:00
SimianAstronaut7
b6b1186e3b
feat(channel): add per-channel proxy_url support for HTTP/SOCKS5 proxies (#3345)
* feat(channel): add per-channel proxy_url support for HTTP/SOCKS5 proxies

Allow each channel to optionally specify a `proxy_url` in its config,
enabling users behind restrictive networks to route channel traffic
through HTTP or SOCKS5 proxies. When set, the per-channel proxy takes
precedence over the global `[proxy]` config; when absent, the channel
falls back to the existing runtime proxy behavior.

Adds `proxy_url: Option<String>` to all 12 channel config structs
(Telegram, Discord, Slack, Mattermost, Signal, WhatsApp, Wati,
NextcloudTalk, DingTalk, QQ, Lark, Feishu) and introduces
`build_channel_proxy_client`, `build_channel_proxy_client_with_timeouts`,
and `apply_channel_proxy_to_builder` helpers that normalize proxy URLs
and integrate with the existing client cache.

Closes #3262

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(channel): add missing proxy_url fields in test initializers

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-21 07:53:20 -04:00
SimianAstronaut7
00dc0c8670
feat(tool): enrich delegate sub-agent system prompt and add skills_directory config key (#3344)
* feat(tool): enrich delegate sub-agent system prompt and add skills_directory config key (#3046)

Sub-agents configured under [agents.<name>] previously received only the
bare system_prompt string. They now receive a structured system prompt
containing: tools section (allowed tools with parameters and invocation
protocol), skills section (from scoped or default directory), workspace
path, current date/time, safety constraints, and shell policy when shell
is in the effective tool list.

Add optional skills_directory field to DelegateAgentConfig for per-agent
scoped skill loading. When unset, falls back to default workspace
skills/ directory.

Closes #3046

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(tools): add missing fields after rebase

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-21 07:53:02 -04:00
argenis de la rosa
43f2a0a815 fix: add ClaudeCodeConfig to config re-exports and fix formatting 2026-03-21 07:51:36 -04:00
argenis de la rosa
50b5bd4d73 ci: retrigger CI after stuck runners 2026-03-21 07:46:34 -04:00
argenis de la rosa
8c074870a1 fix(memory): replace redundant closures with function references
Clippy flagged `.map(|s| chrono::DateTime::parse_from_rfc3339(s))` as
redundant — use `.map(chrono::DateTime::parse_from_rfc3339)` directly.
2026-03-21 07:46:34 -04:00
argenis de la rosa
61d1841ce3 fix: update gateway mock Memory impls with since/until params
Both test mock implementations of Memory::recall() in gateway/mod.rs
were missing the new since/until parameters.
2026-03-21 07:46:34 -04:00
argenis de la rosa
eb396cf38f feat(memory): add time range filter to recall (since/until)
Adopted from #3705 by @fangxueshun with fixes:
- Added input validation for date strings (RFC 3339)
- Used chrono DateTime comparison instead of string comparison
- Added since < until validation
- Updated mem0 backend
Supersedes #3705
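
The validation shape, sketched with chrono (function name illustrative):

    type Ts = chrono::DateTime<chrono::FixedOffset>;

    fn parse_time_range(
        since: Option<&str>,
        until: Option<&str>,
    ) -> anyhow::Result<(Option<Ts>, Option<Ts>)> {
        let parse = |s: &str| {
            chrono::DateTime::parse_from_rfc3339(s)
                .map_err(|e| anyhow::anyhow!("invalid RFC 3339 timestamp {s:?}: {e}"))
        };
        let since = since.map(parse).transpose()?;
        let until = until.map(parse).transpose()?;
        if let (Some(s), Some(u)) = (&since, &until) {
            anyhow::ensure!(s < u, "since must be earlier than until");
        }
        Ok((since, until))
    }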
2026-03-21 07:46:34 -04:00
argenis de la rosa
9f1657b9be fix(tools): use kill_on_drop for ClaudeCodeTool subprocess timeout 2026-03-21 07:46:24 -04:00
argenis de la rosa
8fecd4286c fix(tools): use kill_on_drop for ClaudeCodeTool subprocess timeout
Fixes E0382 borrow-after-move error: wait_with_output() consumed the
child handle, making child.kill() in the timeout branch invalid.
Use kill_on_drop(true) with cmd.output() instead.
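
The timeout shape, sketched (binary name and args are placeholders):

    use std::time::Duration;
    use tokio::process::Command;

    async fn run_with_timeout(secs: u64) -> anyhow::Result<std::process::Output> {
        let mut cmd = Command::new("claude"); // placeholder binary
        cmd.arg("--version")
            // If the timeout branch drops the output future, the child is
            // killed on drop instead of outliving the deadline.
            .kill_on_drop(true);
        match tokio::time::timeout(Duration::from_secs(secs), cmd.output()).await {
            Ok(out) => Ok(out?),
            Err(_) => anyhow::bail!("subprocess timed out after {secs}s"),
        }
    }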
2026-03-21 07:46:24 -04:00
argenis de la rosa
df21d92da3 feat(tools): add ClaudeCodeTool for two-tier agent delegation
Adopted from #3748 by @ilyasubkhankulov with fixes:
- Removed unused _runtime field
- Fixed subprocess timeout handling
- Excluded unrelated Slack threading and Dockerfile changes

Closes #3748 (superseded)
2026-03-21 07:46:24 -04:00
Argenis
8d65924704
fix(channels): add cost tracking and enforcement to all channels (#4143)
Adds per-channel cost tracking via task-local context in the tool call
loop. Budget enforcement blocks further API calls when limits are
exceeded. Resolves merge conflicts with model-switch retry loop,
reply_target parameter, and autonomy level additions on master.

Supersedes #3758
2026-03-21 07:37:15 -04:00
Argenis
756c3cadff
feat(transcription): add LocalWhisperProvider for self-hosted STT (TDD) (#4141)
Self-hosted Whisper-compatible STT provider that POSTs audio to a
configurable HTTP endpoint (e.g. faster-whisper over WireGuard). Audio
never leaves the platform perimeter.

Implemented via red/green TDD cycles:
  Wave 1 — config schema: LocalWhisperConfig struct, local_whisper field
    on TranscriptionConfig + Default impl, re-export in config/mod.rs
  Wave 2 — from_config validation: url non-empty, url parseable, bearer_token
    non-empty, max_audio_bytes > 0, timeout_secs > 0; returns Result<Self>
  Wave 3 — manager integration: registration with ? propagation (not if let Ok
    — credentials come directly from config, no env-var fallback; present
    section with bad values is a hard error, not a silent skip)
  Wave 4 — transcribe(): resolve_audio_format() extracted from validate_audio()
    so LocalWhisperProvider can resolve MIME without the 25 MB cloud cap;
    size check + format resolution before HTTP send
  Wave 5 — HTTP mock tests: success response, bearer auth header, 503 error

33 tests (20 baseline + 13 new), all passing. Clippy clean.

Co-authored-by: Nim G <theredspoon@users.noreply.github.com>
2026-03-21 07:15:36 -04:00
Argenis
ee870028ff
feat(channel): use Slack native markdown blocks for rich formatting (#4142)
Slack's Block Kit supports a native `markdown` block type that accepts
standard Markdown and handles rendering. This removes the need for a
custom Markdown-to-mrkdwn converter. Messages over 12,000 chars fall
back to plain text.

Co-authored-by: Joe Hoyle <joehoyle@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 07:12:27 -04:00
frido22
83183a39a5
feat(status): show service running state in zeroclaw status (#3751) 2026-03-21 06:49:47 -04:00
shiben
7a941fb753
feat(auth): add import functionality for existing OpenAI Codex auth p… (#3762)
* feat(auth): add import functionality for existing OpenAI Codex auth profiles

Introduces a new command-line option to import an existing `auth.json` file for OpenAI Codex, allowing users to bypass the login flow. The import reads and parses the specified JSON file, extracts the authentication tokens, and stores them in the user's profile, so existing Codex users can authenticate without repeating the interactive login.

- Added `import` option to `AuthCommands` enum
- Implemented `import_openai_codex_auth_profile` function to handle the import logic
- Updated `handle_auth_command` to process the import option and validate provider compatibility
- Ensured that the import feature is exclusive to the `openai-codex` provider

* feat(auth): extract expiry from JWT in OpenAI Codex import

Enhances the `import_openai_codex_auth_profile` function by extracting the expiration date from the JWT access token. This change allows for more accurate management of token lifetimes by replacing the hardcoded expiration date with a dynamic value derived from the token itself.

- Added `extract_expiry_from_jwt` function to handle JWT expiration extraction
- Updated `TokenSet` to use the extracted expiration date instead of a static value
2026-03-21 06:49:44 -04:00
Argenis
bcdbce0bee
feat(web): add theme system with CSS variables and settings modal (#4133)
- Add ThemeContext with light/dark/system theme support
- Migrate all hardcoded colors to CSS variables
- Add SettingsModal for theme customization
- Add font loader for dynamic font selection
- Add i18n support for Chinese and Turkish locales
- Fix accessibility: add aria-live to pairing error message

Co-authored-by: nanyuantingfeng <nanyuantingfeng@163.com>
2026-03-21 06:22:30 -04:00
Argenis
abb844d7f8
fix(config): add missing WhatsApp Web policy config keys (#4131)
* fix(config): add missing WhatsApp Web policy config keys (mode, dm_policy, group_policy, self_chat_mode)

* fix(onboard): add missing WhatsApp policy fields to wizard struct literals

The new mode, dm_policy, group_policy, and self_chat_mode fields added
to WhatsAppConfig need default values in the onboard wizard's struct
initializers to avoid E0063 compilation errors.
2026-03-21 06:04:21 -04:00
Argenis
48733d5ee2
feat(cron): add Edit button and modal for updating cron jobs (#4132)
- Backend: add PATCH /api/cron/{id} handler (handle_api_cron_patch)
  using update_shell_job_with_approval with approved=false; validates
  job exists (404 on miss), accepts name/schedule/command patch fields
- Router: register PATCH on /api/cron/{id} alongside existing DELETE
- Frontend API: add patchCronJob(id, patch) calling PATCH /api/cron/{id}
- i18n: add cron.edit, cron.edit_modal_title, cron.edit_error,
  cron.saving, cron.save keys to all 3 locales (zh, en, tr)
- UI: Edit (Pencil) button in Actions column opens a pre-populated modal
  with the job's current name, schedule expression, and command;
  submitting PATCHes the job and updates the table row in-place

Co-authored-by: WareWolf-MoonWall <chris.hengge@gmail.com>
2026-03-21 05:50:23 -04:00
Argenis
2d118af78f
fix(channels): wire model_switch callback into channel inference path (#4130)
The channel path in `src/channels/mod.rs` was passing `None` as the
`model_switch_callback` to `run_tool_call_loop()`, which meant model
switching via the `model_switch` tool was silently ignored in channel
mode.

Wire the callback in following the same pattern as the CLI path:
- Pass `Some(model_switch_callback.clone())` instead of `None`
- Wrap the tool call loop in a retry loop
- Handle `ModelSwitchRequested` errors by re-creating the provider
  with the new model and retrying

Fixes #4107
2026-03-21 05:43:21 -04:00
Argenis
8d7e7e994e
fix(memory): use plain OS threads for postgres operations to avoid nested runtime panic (#4129)
Replace `tokio::task::spawn_blocking()` with plain `std::thread::Builder`
OS threads in all PostgresMemory trait methods. The sync `postgres` crate
(v0.19.x) internally calls `Runtime::block_on()`, which panics when called
from Tokio's blocking pool threads in recent Tokio versions. Plain OS threads
have no runtime context, so the nested `block_on` succeeds.

This matches the pattern already used in `PostgresMemory::initialize_client()`,
which correctly used `std::thread::Builder` and never exhibited this bug.

A new `run_on_os_thread` helper centralizes the pattern: spawn an OS thread,
run the closure, and bridge the result back via a `tokio::sync::oneshot` channel.
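
The helper's likely shape, sketched (signature assumed):

    async fn run_on_os_thread<T, F>(name: &str, f: F) -> anyhow::Result<T>
    where
        T: Send + 'static,
        F: FnOnce() -> T + Send + 'static,
    {
        let (tx, rx) = tokio::sync::oneshot::channel();
        std::thread::Builder::new()
            .name(name.to_string())
            .spawn(move || {
                // No Tokio context on this thread, so the postgres crate's
                // internal block_on cannot observe a nested runtime.
                let _ = tx.send(f());
            })?;
        Ok(rx.await?)
    }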

Fixes #4101
2026-03-21 05:33:55 -04:00
Joe Hoyle
d38d706c8e
feat(channel): add Slack Assistants API status indicators (#4105)
Implement start_typing/stop_typing for Slack using the Assistants API
assistant.threads.setStatus method. Tracks thread context from
assistant_thread_started events and inbound messages, then sets
"is thinking..." status during processing. Status auto-clears when
the bot sends a reply via chat.postMessage.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 05:32:31 -04:00
Chris Hengge
523188da08
feat(tools): add WeatherTool with wttr.in integration (#4104)
Implements the Weather integration as a native Rust Tool trait
implementation, consistent with the existing first-party tool
architecture (no WASM/plugin layer required).

- Add src/tools/weather_tool.rs with full WeatherTool impl
  - Fetches from wttr.in ?format=j1 (no API key, global coverage)
  - Supports city names (any language/script), IATA airport codes,
    GPS coordinates, postal/zip codes, domain-based geolocation
  - Metric (°C, km/h, mm) and imperial (°F, mph, in) units
  - Current conditions + 0-3 day forecast with hourly breakdown
  - Graceful error messages for unknown/invalid locations
  - Respects runtime proxy config via apply_runtime_proxy_to_builder
  - 36 unit tests: schema, URL building, param validation, formatting
- Register WeatherTool unconditionally in all_tools_with_runtime
  (no API key needed, no config gate — same pattern as CalculatorTool)
- Flip integrations registry Weather entry from ComingSoon to Available

Closes #<issue>
2026-03-21 05:32:28 -04:00
Baha Abu Nojaim
82f7fbbe0f
feat(providers): add DeepMyst as OpenAI-compatible provider (#4103)
Register DeepMyst (https://deepmyst.com) as an OpenAI-compatible
provider with Bearer auth and DEEPMYST_API_KEY env var support.
Aliases: "deepmyst", "deep-myst".

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-21 05:32:26 -04:00
Caleb
c1b2fceca5
fix(onboard): make tmux paste safe for text prompts (#4106) 2026-03-21 05:14:37 -04:00
Loc Nguyen Huu
be6e9fca5d
fix(docker): align workspace with Cargo.lock for --locked builds (#4126)
The builder used sed to remove crates/robot-kit from [workspace].members
because that path was not copied into the image. Cargo.lock is still
generated for the full workspace (including zeroclaw-robot-kit), so the
manifest and lockfile disagreed. cargo build --release --locked then
tried to rewrite Cargo.lock and failed with "cannot update the lock file
... because --locked was passed" (commonly hit when
ZEROCLAW_CARGO_FEATURES includes memory-postgres).

Copy crates/robot-kit/ into the image and drop the sed step so the workspace matches the committed lockfile.

Made-with: Cursor

Co-authored-by: lokinh <locnh@uniultra.xyz>
2026-03-21 05:14:35 -04:00
Greg Lamberson
75c11dfb92
fix(config): prevent test suite from clobbering active_workspace.toml (#4121)
* fix(config): prevent test suite from clobbering active_workspace.toml

Refactor persist_active_workspace_config_dir() to accept the default
config directory as an explicit parameter instead of reading HOME
internally. This eliminates a hidden dependency on process-wide
environment state that caused test-suite runs to overwrite the real
user's active_workspace.toml with a stale temp-directory path.

The temp-directory guard is now unconditional (previously gated behind
cfg(not(test))). It rejects writes only when a temp config_dir targets
a non-temp default location, so test-to-test writes within temp dirs
still succeed.
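
A minimal sketch of the refactored shape (the function name is from the commit; the guard logic and file format are assumptions):

```rust
use std::path::Path;

// Sketch: the default config dir is now an explicit parameter, and the
// temp-dir guard runs unconditionally rather than only under cfg(test).
fn persist_active_workspace_config_dir(
    config_dir: &Path,
    default_config_dir: &Path,
) -> anyhow::Result<()> {
    let in_temp = |p: &Path| p.starts_with(std::env::temp_dir());
    // Reject only the dangerous combination: a temp config_dir being
    // persisted into a real (non-temp) default location.
    if in_temp(config_dir) && !in_temp(default_config_dir) {
        anyhow::bail!("refusing to persist a temp workspace into the real config dir");
    }
    std::fs::write(
        default_config_dir.join("active_workspace.toml"),
        format!("config_dir = {:?}\n", config_dir),
    )?;
    Ok(())
}
```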

Closes #4117

* fix: remove needless borrow on default_config_dir parameter

---------

Co-authored-by: lamco-office <office@lamco.io>
2026-03-21 05:14:32 -04:00
tf4fun
48270fbbf3
fix(cron): add qq to supported delivery channel whitelist (#4120)
The channel validation in `validate_announce_delivery` was missing `qq`,
causing API-created cron jobs with QQ delivery to be rejected.

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-21 05:14:30 -04:00
Argenis
18a456b24e
fix(mcp): wire MCP tools into WebSocket chat and gateway /api/tools (#4096)
* fix(mcp): wire MCP tools into WebSocket chat path and gateway /api/tools

Agent::from_config() did not initialize MCP tools because it was
synchronous and MCP connection requires async. The gateway tool
registry built for /api/tools also missed MCP tools for the same
reason.

Changes:
- Make Agent::from_config() async so it can call McpRegistry::connect_all()
- Add MCP tool initialization (both eager and deferred modes) to
  from_config(), following the same pattern used in loop_.rs CLI/webhook paths
- Add MCP tool initialization to the gateway's tool registry so
  /api/tools reflects MCP tools
- Update all three call sites (run(), handle_socket, test) to await
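
A sketch of the signature change described above (helper and field names here are assumptions):

```rust
// Sketch: from_config becomes async solely so MCP connection can run
// before the agent is returned.
impl Agent {
    pub async fn from_config(config: &Config) -> anyhow::Result<Self> {
        let mut agent = Agent::new_sync_parts(config)?; // hypothetical sync setup
        // The async step that forced the signature change: connect all
        // configured MCP servers (eager mode) so their tools are in the
        // registry before the first chat turn.
        let mcp_tools = McpRegistry::connect_all(&config.mcp).await?;
        agent.register_tools(mcp_tools); // hypothetical registration hook
        Ok(agent)
    }
}
```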

Closes #4042

* fix: merge master and fix formatting

* fix: remove underscore prefix from used bindings (clippy)
2026-03-21 05:13:01 -04:00
ehu shubham shaw
71e89801b5
feat(hardware): add RPi GPIO, Aardvark I2C/SPI/GPIO, and hardware plugin system (#4125)
* feat(hardware): add RPi GPIO, Aardvark I2C/SPI/GPIO, and hardware plugin system

Extends the hardware subsystem with three clusters of functionality,
all feature-gated (hardware / peripheral-rpi) with no impact on default builds.

Raspberry Pi native support:
- src/hardware/rpi.rs: board self-discovery (model, serial, revision),
  sysfs GPIO pin read/write, and ACT LED control
- scripts/99-act-led.rules: udev rule for non-root ACT LED access
- scripts/deploy-rpi.sh, scripts/rpi-config.toml, scripts/zeroclaw.service:
  one-shot deployment helper and systemd service template

Total Phase Aardvark USB adapter (I2C / SPI / GPIO):
- crates/aardvark-sys/: new workspace crate with FFI bindings loaded at
  runtime via libloading; graceful stub fallback when .so is absent or
  arch mismatches (Rosetta 2 detection)
- src/hardware/aardvark.rs: AardvarkTransport implementing Transport trait
- src/hardware/aardvark_tools.rs: agent tools i2c_scan, i2c_read,
  i2c_write, spi_transfer, gpio_aardvark
- src/hardware/datasheet.rs: datasheet search/download for detected devices
- docs/aardvark-integration.md, examples/hardware/aardvark/: guide + examples

Hardware plugin / ToolRegistry system:
- src/hardware/tool_registry.rs: ToolRegistry for hardware module tool sets
- src/hardware/loader.rs, src/hardware/manifest.rs: manifest-driven loader
- src/hardware/subprocess.rs: subprocess execution helper for board I/O
- src/gateway/hardware_context.rs: POST /api/hardware/reload endpoint
- src/hardware/mod.rs: exports all new modules; merge_hardware_tools and
  load_hardware_context_prompt helpers

Integration hooks (minimal surface):
- src/hardware/device.rs: DeviceKind::Aardvark, DeviceRuntime::Aardvark,
  has_aardvark / resolve_aardvark_device on DeviceRegistry
- src/hardware/transport.rs: TransportKind::Aardvark
- src/peripherals/mod.rs: gate create_board_info_tools behind hardware feature
- src/agent/loop_.rs: TOOL_CHOICE_OVERRIDE task-local for Anthropic provider
- src/providers/anthropic.rs: read TOOL_CHOICE_OVERRIDE; add tool_choice field
- Cargo.toml: add aardvark-sys to workspace and as dependency
- firmware/zeroclaw-nucleo/: update Cargo.toml and Cargo.lock

Non-goals:
- No changes to agent orchestration, channels, providers, or security policy
- No new config keys outside existing [hardware] / [peripherals] sections
- No CI workflow changes

Risk: Low. All new paths are feature-gated; aardvark.so loads at runtime
only when present. No schema migrations or persistent state introduced.

Rollback: revert this single commit.
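
As a sketch of the runtime-loading-with-stub-fallback pattern described for aardvark-sys above (the symbol name and signature are illustrative, not verbatim from the Total Phase SDK):

```rust
use libloading::{Library, Symbol};

// Sketch: try the vendor .so at runtime; fall back to a stub when it is
// absent or fails to load (e.g. arch mismatch under Rosetta 2).
fn find_devices(max: usize) -> Vec<u16> {
    let mut ids = vec![0u16; max];
    unsafe {
        match Library::new("libaardvark.so") {
            Ok(lib) => {
                let f: Symbol<unsafe extern "C" fn(i32, *mut u16) -> i32> =
                    match lib.get(b"c_aa_find_devices") {
                        Ok(f) => f,
                        Err(_) => return Vec::new(), // symbol missing: stub
                    };
                let n = f(max as i32, ids.as_mut_ptr()).max(0) as usize;
                ids.truncate(n.min(max));
                ids
            }
            Err(_) => Vec::new(), // no .so present: graceful stub fallback
        }
    }
}
```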

* fix(hardware): resolve clippy and rustfmt CI failures

- struct_excessive_bools: allow on DeviceCapabilities (7 bool fields needed)
- unnecessary_debug_formatting: use .display() instead of {:?} for paths
- stable_sort_primitive: replace .sort() with .sort_unstable() on &str slices

* fix(hardware): add missing serial/uf2/pico modules declared in mod.rs

cargo fmt was exiting with code 1 because mod.rs declared pub mod serial,
uf2, pico_flash, pico_code but those files were missing from the branch.
Also apply auto-formatting to loader.rs.

* fix(hardware): apply rustfmt 1.92.0 formatting (matches CI toolchain)

* docs(scripts): add RPi deployment and interaction guide

* push

* feat(firmware): add initial Pico firmware and serial device handling

- Introduced main.py for ZeroClaw Pico firmware with a placeholder for MicroPython implementation.
- Added binary UF2 file for Pico deployment.
- Implemented serial device enumeration and validation in the hardware module, enhancing security by restricting allowed serial paths.
- Updated related modules to integrate new serial device functionality.

---------

Co-authored-by: ehushubhamshaw <eshaw1@wpi.edu>
2026-03-21 04:17:01 -04:00
Argenis
46f6e79557
fix(gateway): improve error message for Docker bridge connectivity (#4095)
When the gateway security guard blocks a public bind address, the error
message now mentions the Docker use case and provides clear instructions
for connecting from Docker containers.

Closes #4086
2026-03-21 00:16:01 -04:00
Argenis
c301b1d4d0
fix(approval): auto-approve read-only tools in non-interactive mode (#4094)
* fix(approval): auto-approve read-only tools in non-interactive mode

Add web_search_tool, web_fetch, calculator, glob_search, content_search,
and image_info to the default auto_approve list. These are read-only tools
with no side effects that were being silently denied in channel mode
(Telegram, Slack, etc.) because the non-interactive ApprovalManager
auto-denies any tool not in auto_approve when autonomy != full.

Closes #4083

* fix: remove duplicate default_otp_challenge_max_attempts function
2026-03-20 23:57:43 -04:00
Argenis
981a93d942
fix(config): remove duplicate default_otp_challenge_max_attempts function (#4098)
The function was defined twice in schema.rs after merging #3921, causing
compilation failures on all downstream branches.
2026-03-20 23:57:27 -04:00
Argenis
34f0b38e42
fix(config): remove duplicate default_otp_challenge_max_attempts function (#4097)
PR #3921 accidentally introduced a duplicate definition of
default_otp_challenge_max_attempts() in config/schema.rs, causing
compilation to fail on master (E0428: name defined multiple times).
2026-03-20 18:49:15 -04:00
khhjoe
00209dd899
feat(memory): add mem0 (OpenMemory) backend integration (#3965)
* feat(memory): add mem0 (OpenMemory) backend integration

- Implement Mem0Memory struct with full Memory trait
- Add history() audit trail, recall_filtered() with time/metadata filters
- Add store_procedural() for conversation trace extraction
- Add ProceduralMessage type to Memory trait with default no-op
- Feature-gated behind `memory-mem0` flag
- 9 unit tests covering edge cases
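
A sketch of the trait-default approach mentioned above (type shapes are assumptions; only the default no-op is from the commit):

```rust
// Sketch: a default no-op keeps every existing Memory backend compiling
// while mem0 opts in to procedural trace storage.
pub struct ProceduralMessage {
    pub role: String,
    pub content: String,
}

#[async_trait::async_trait]
pub trait Memory: Send + Sync {
    // ...existing required methods elided...

    /// Store a conversation trace for procedural extraction.
    async fn store_procedural(&self, _trace: &[ProceduralMessage]) -> anyhow::Result<()> {
        Ok(()) // backends without the concept simply ignore it
    }
}
```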

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style: apply cargo fmt

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* feat(memory): add extraction_prompt config, deploy scripts, and timing instrumentation

- Add `extraction_prompt` field to `Mem0Config` for custom LLM fact
  extraction prompts (e.g. Cantonese/Chinese content), with
  `MEM0_EXTRACTION_PROMPT` env var fallback
- Pass `custom_instructions` in mem0 store requests so the server
  uses the client-supplied prompt over its default
- Add timing instrumentation to channel message pipeline
  (mem_recall_ms, elapsed_before_llm_ms, llm_call_ms, total_ms)
- Add `deploy/mem0/` with self-hosted mem0 + reranker GPU server
  scripts, fully configurable via environment variables
- Update config reference docs (EN, zh-CN, VI) with `[memory.mem0]`
  subsection

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

# Conflicts:
#	src/channels/mod.rs

* chore: remove accidentally staged worktree from index

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 18:22:44 -04:00
caoy
9e8a478254
fix(gemini): use default chat() for prompt-guided tool calling and add vision support (#3932)
* fix(gemini): use default chat() for prompt-guided tool calling and add multimodal vision support

The Gemini provider's chat() override bypassed the default trait
implementation that injects tool definitions into the system prompt for
providers without native tool calling. This caused the agent loop to see
tool definitions in context but never actually invoke them, resulting in
hallucinated tool calls (e.g. claiming "Stored" without calling
memory_store).

Remove the broken chat() override so the default prompt-guided fallback
in the Provider trait handles tool injection correctly. Add an explicit
capabilities() declaration (native_tool_calling: false, vision: true).

Also add multimodal support: convert Part from a plain struct to an
untagged enum with Text and Inline variants, and add build_parts() to
extract [IMAGE:data:...] markers as Gemini inline_data parts.
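
A sketch of the untagged enum described above (field names follow Gemini's REST camelCase shape; exact struct layout in the provider may differ):

```rust
use serde::{Deserialize, Serialize};

// Sketch: one Part serializes either as a text part or as an
// inline_data (image) part, with no tag field in the JSON.
#[derive(Serialize, Deserialize)]
#[serde(untagged)]
enum Part {
    Text {
        text: String,
    },
    Inline {
        #[serde(rename = "inlineData")]
        inline_data: InlineData,
    },
}

#[derive(Serialize, Deserialize)]
struct InlineData {
    #[serde(rename = "mimeType")]
    mime_type: String,
    data: String, // base64 payload extracted from an [IMAGE:data:...] marker
}
```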

Includes 14 new tests covering capabilities, Part serialization,
build_parts edge cases, and role-mapping behavior. Removes unused
ChatResponse import.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(gemini): move capabilities tests to component level and add tool conversion test

Move the 3 capabilities tests (native_tool_calling, vision,
supports_native_tools) from the inline module to
tests/component/gemini_capabilities.rs since they exercise the public
Provider trait contract through the factory. Add a new
convert_tools_returns_prompt_guided test verifying the agent loop will
receive PromptGuided payload for Gemini.

Private internals tests (Part serialization, build_parts, role mapping)
remain inline since those types are not publicly exported.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style(gemini): fix cargo fmt formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(gemini): add prompt_caching field to capabilities declaration

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: myclaw <myclaw@myclaws-MacBook-Air.local>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 18:22:41 -04:00
Anton Markelov
96f25ac701
fix(prompt): respect autonomy level in SafetySection (Agent/gateway WS path) (#3952) (#4037)
The `SafetySection` in `SystemPromptBuilder` always hardcoded
"Do not run destructive commands without asking" and "Do not bypass
oversight or approval mechanisms" regardless of the configured
autonomy level. This caused the gateway WebSocket path (web interface)
to instruct the LLM to simulate approval dialogs even when
`autonomy.level = "full"`.

PRs #3955/#3970/#3975 fixed the channel dispatch path
(`build_system_prompt_with_mode_and_autonomy`) but missed the
`Agent::from_config` → `SystemPromptBuilder` path used by
`gateway/ws.rs`.

Changes:
- Add `autonomy_level` field to `PromptContext`
- Rewrite `SafetySection::build()` to conditionally include/exclude
  approval instructions based on autonomy level, matching the logic
  already present in `build_system_prompt_with_mode_and_autonomy`
- Add `autonomy_level` field to `Agent` struct and `AgentBuilder`
- Pass `config.autonomy.level` through `Agent::from_config`
- Add tests for full/supervised autonomy safety section behavior
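
A sketch of the conditional section build (enum and wording are assumptions; the behavior is from the commit):

```rust
#[derive(PartialEq)]
enum AutonomyLevel {
    Full,
    Supervised,
}

// Sketch: approval language is emitted only when the operator actually
// gates actions; under full autonomy these lines would make the LLM
// simulate approval dialogs that do not exist.
fn build_safety_section(level: AutonomyLevel) -> String {
    let mut lines = vec!["Follow the configured security policy.".to_string()];
    if level != AutonomyLevel::Full {
        lines.push("Do not run destructive commands without asking.".into());
        lines.push("Do not bypass oversight or approval mechanisms.".into());
    }
    lines.join("\n")
}
```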

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 18:22:35 -04:00
Jacobinwwey
fb5c8cb620
feat(tools): route web_search providers with alias fallback (#4038) 2026-03-20 18:22:32 -04:00
ifengqi
9f5543e046
fix(qq): respond to WebSocket Ping frames to prevent connection timeout (#4041)
The QQ channel WebSocket loop did not handle incoming Ping frames,
causing the server to consider the connection dead and drop it. Add a
Ping handler that replies with Pong, keeping the connection alive.
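
A minimal sketch of the fix in a tokio-tungstenite read loop (dispatch details elided):

```rust
use futures_util::{SinkExt, StreamExt};
use tokio_tungstenite::{tungstenite::Message, MaybeTlsStream, WebSocketStream};

type Ws = WebSocketStream<MaybeTlsStream<tokio::net::TcpStream>>;

// Sketch: echo Ping payloads back as Pong so the server keeps the
// connection alive.
async fn read_loop(mut ws: Ws) -> anyhow::Result<()> {
    while let Some(msg) = ws.next().await {
        match msg? {
            Message::Ping(payload) => {
                ws.send(Message::Pong(payload)).await?;
            }
            Message::Text(_text) => { /* dispatch gateway event */ }
            Message::Close(_) => break,
            _ => {}
        }
    }
    Ok(())
}
```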

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 18:22:30 -04:00
luikore
05c9b8180b
Make lark / feishu render markdown (#3866)
Co-authored-by: Luikore <masked>
2026-03-20 18:22:29 -04:00
Thorbjørn Lindeijer
f96a0471b5
fix(providers): clamp unsupported temperatures in Claude Code provider (#3961)
The Claude Code CLI only supports temperatures 0.7 and 1.0, but
internal subsystems (memory consolidation, context summarizer) use
lower values like 0.1 and 0.2. Previously the provider rejected these
with a hard error, triggering retries and WARN-level log noise.

Clamp to the nearest supported value instead, since the CLI ignores
temperature anyway.
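
The clamp itself is a one-liner; a sketch:

```rust
// Sketch: snap any requested temperature to the nearer of the two
// values the Claude Code CLI accepts (0.7 and 1.0).
fn clamp_temperature(requested: f64) -> f64 {
    if (requested - 0.7).abs() <= (requested - 1.0).abs() {
        0.7
    } else {
        1.0
    }
}

// clamp_temperature(0.1) == 0.7; clamp_temperature(0.9) == 1.0
```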

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 18:22:26 -04:00
Artem Chernenko
072f5f1170
fix(transcription): honor configured default provider (#3883)
Co-authored-by: Artem Chernenko <12207348+turboazot@users.noreply.github.com>
2026-03-20 18:22:25 -04:00
Darren.Zeng
28f94ae48c
style: enhance .editorconfig with comprehensive file type settings (#3872)
Expand the minimal .editorconfig to include comprehensive settings for
all file types used in the project:

- Add root = true declaration
- Add standard settings: end_of_line, charset, trim_trailing_whitespace,
  insert_final_newline
- Add Rust-specific settings (indent_size = 4, max_line_length = 100)
  to match rustfmt.toml
- Add Markdown settings (preserve trailing whitespace for hard breaks)
- Add TOML, YAML, Python, Shell script, and JSON settings

This ensures consistent editor behavior across all contributors and
matches the project's formatting standards.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-20 18:22:22 -04:00
RoomWithOutRoof
8a217a77f9
fix(config): add challenge_max_attempts field to OtpConfig (#3921)
* fix(docker): default CMD to daemon instead of gateway

- Change default Docker CMD from gateway to daemon in both dev and release stages
- gateway only starts the HTTP/WebSocket server — channel listeners (Matrix, Telegram, Discord, etc.) are never spawned
- daemon starts the full runtime: gateway + channels + heartbeat + scheduler
- Users who configure channels in config.toml and run the default image get no response because the channel sync loops never start

* fix(config): add challenge_max_attempts field to OtpConfig

Add missing challenge_max_attempts field to support OTP challenge
attempt limiting. This field allows users to configure the maximum
number of OTP challenge attempts before lockout.
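
A sketch of the config addition (the field name is from the commit; the default value is an assumption):

```rust
// Sketch: the new OtpConfig field, with a serde default so existing
// configs without the key keep deserializing.
#[derive(serde::Deserialize)]
pub struct OtpConfig {
    // ...existing OTP fields elided...

    /// Maximum OTP challenge attempts before lockout.
    #[serde(default = "default_otp_challenge_max_attempts")]
    pub challenge_max_attempts: u32,
}

fn default_otp_challenge_max_attempts() -> u32 {
    3 // assumed default
}
```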

Fixes #3919

---------

Co-authored-by: Jah-yee <jah.yee@outlook.com>
Co-authored-by: Jah-yee <jahyee@sparklab.ai>
2026-03-20 18:22:19 -04:00
Chris Hengge
79a7f08b04
fix(web): anchor memory table to viewport with dual scrollbars (#4027)
The /memory data grid grew unboundedly with table rows, pushing the
horizontal scrollbar to the very bottom of a tall page and making it
inaccessible without scrolling all the way down first.

- Layout: change outer shell from min-h-screen to h-screen +
  overflow-hidden, and add min-h-0 to <main> so flex-1 overflow-y-auto
  actually clamps at the viewport boundary instead of growing infinitely.
- Memory page: switch root div to flex-col h-full so it fills the
  bounded main area; give the glass-card table wrapper flex-1 min-h-0
  overflow-auto so it consumes remaining space and exposes both
  scrollbars without any page-level scrolling required.
- index.css: pin .table-electric thead th with position:sticky / top:0
  and a matching opaque background so column headers stay visible
  during vertical scroll inside the bounded card.

The result behaves like a bounded iframe: the table fills the available
screen, rows scroll vertically, wide columns scroll horizontally, and
both scrollbars are always reachable.
2026-03-20 18:22:11 -04:00
Argenis
de12055364
feat(slack): implement reaction support for Slack channel (#4091)
* feat(slack): implement reaction support with sanitized error responses

Add add_reaction() and remove_reaction() for Slack channel, with
unicode-to-Slack emoji mapping, idempotent error handling, and
sanitized API error responses matching the pattern used by
chat.postMessage.

Based on #4089 by @joehoyle, with sanitize_api_error() applied to
reaction error paths for consistency with existing Slack methods.

Supersedes #4089

* chore(deps): bump rustls-webpki to 0.103.10 (RUSTSEC-2026-0049)
2026-03-20 18:18:45 -04:00
Argenis
65c34966bb
Merge pull request #4092 from zeroclaw-labs/docs/aur-ssh-key
docs: remove architecture diagram from all READMEs
2026-03-20 18:02:32 -04:00
Will Sarg
4a2be7c2e5
feat(verifiable_intent): add native verifiable intent lifecycle module (#2938)
* feat(verifiable_intent): add native verifiable intent lifecycle module

Implements a Rust-native Verifiable Intent (VI) subsystem for ZeroClaw,
providing full credential lifecycle support for commerce agent authorization
using SD-JWT layered credentials.

New module: src/verifiable_intent/
- error.rs: ViError/ViErrorKind (25+ variants), implements std::error::Error
- types.rs: JWK, Cnf, Entity, Constraint (8 variants), Immediate/Autonomous
  mandate structs, Fulfillment, Layer1/Layer2/CredentialChain
- crypto.rs: base64url helpers, SD hash, JWS sign/verify, EC P-256 key
  generation/loading, disclosure creation, SD-JWT serialize/parse
- verification.rs: StrictnessMode, ChainVerificationResult,
  ConstraintCheckResult, verify_timestamps, verify_sd_hash_binding,
  verify_l3_cross_reference, verify_checkout_hash_binding, check_constraints
- issuance.rs: create_layer2_immediate, create_layer2_autonomous,
  create_layer3_payment, create_layer3_checkout

New tool: src/tools/verifiable_intent.rs
- VerifiableIntentTool implementing Tool trait (name: vi_verify)
- Operations: verify_binding, evaluate_constraints, verify_timestamps
- Gated behind verifiable_intent.enabled config flag

Wiring:
- src/lib.rs: pub mod verifiable_intent
- src/main.rs: mod verifiable_intent (binary re-declaration)
- src/config/schema.rs: VerifiableIntentConfig struct, field on Config
- src/config/mod.rs: re-exports VerifiableIntentConfig
- src/onboard/wizard.rs: default field in Config literals
- src/tools/mod.rs: conditional tool registration

Uses only existing deps: ring (ECDSA P-256), sha2, base64, serde_json,
chrono, anyhow. No new dependencies added.

Validation: cargo fmt clean, cargo clippy -D warnings clean,
cargo test --lib -- verifiable_intent passes (44 tests)

* chore(verifiable_intent): add Apache 2.0 attribution for VI spec reference

The src/verifiable_intent/ module is a Rust-native reimplementation based
on the Verifiable Intent open specification and reference implementation by
genereda (https://github.com/genereda/verifiable-intent), Apache 2.0.

- Add attribution section to src/verifiable_intent/mod.rs doc comment
- Add third-party attribution entry to NOTICE per Apache 2.0 section 4(d)

* fix(verifiable_intent): correct VI attribution URL and author

Replace hallucinated github.com/genereda/verifiable-intent with the
actual remote: github.com/agent-intent/verifiable-intent

* fix(verifiable_intent): remove unused pub re-exports to fix clippy

Remove unused re-exports of ViError, ViErrorKind, types::*,
ChainVerificationResult, and ConstraintCheckResult from the module
root. Only StrictnessMode is used externally.

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-20 17:52:55 -04:00
argenis de la rosa
2bd141aa07 docs: remove architecture diagram section from all READMEs
Remove the "How it works (short)" ASCII diagram section from
all 31 README files (English + 30 translations).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 17:45:35 -04:00
Argenis
cc470601de
docs(i18n): expand language hubs and add six new locales (#2934)
Expand README/docs navigation to include Korean, Tagalog, German, Arabic, Hindi, and Bengali locale entries. Add canonical locale hub and summary files for each new language under docs/i18n/.

Update i18n index/coverage metadata to reflect hub-level support and keep language discovery consistent across root docs entry points.
2026-03-20 17:40:09 -04:00
Argenis
707ee02d76
chore: bump version to 0.5.4 (#4090) 2026-03-20 16:06:52 -04:00
Argenis
a47a9ee269
fix(skills): improve ClawHub skill installer with zip crate and URL parsing (#4088)
Replace the shell-based unzip extraction with the zip crate for
cross-platform support. Use reqwest::Url for proper URL parsing,
add www.clawhub.ai and clawhub: shorthand support, fix the download
API URL, add ZIP path traversal protection, size limits, rate-limit
handling, and SKILL.toml fallback generation.
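
A sketch of the traversal guard using the zip crate's `enclosed_name()`, which returns `None` for entry paths that would escape the destination (size limits and the download flow elided):

```rust
use std::{fs, io, path::Path};

// Sketch: extract a skill ZIP, rejecting absolute or ".."-containing paths.
fn extract_zip(archive_path: &Path, dest: &Path) -> anyhow::Result<()> {
    let mut archive = zip::ZipArchive::new(fs::File::open(archive_path)?)?;
    for i in 0..archive.len() {
        let mut entry = archive.by_index(i)?;
        let Some(rel) = entry.enclosed_name().map(|p| p.to_path_buf()) else {
            anyhow::bail!("zip entry escapes destination: {}", entry.name());
        };
        let out_path = dest.join(&rel);
        if entry.is_dir() {
            fs::create_dir_all(&out_path)?;
        } else {
            if let Some(parent) = out_path.parent() {
                fs::create_dir_all(parent)?;
            }
            io::copy(&mut entry, &mut fs::File::create(&out_path)?)?;
        }
    }
    Ok(())
}
```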

Supersedes #4043
Closes #4022
2026-03-20 15:46:52 -04:00
Argenis
8bb61fe368
fix(cron): persist delivery for api-created cron jobs (#4087)
Resolves merge conflicts from PR #4064. Uses typed DeliveryConfig in
CronAddBody and passes delivery directly to add_shell_job_with_approval
and add_agent_job instead of post-creation patching. Preserves master's
richer API fields (session_target, model, allowed_tools, delete_after_run).
2026-03-20 15:42:00 -04:00
avianion
38a8e910d0
feat(providers): add Avian as OpenAI-compatible provider (#4076)
* feat(providers): add Avian as a named provider

Add Avian (https://avian.io) as a first-class OpenAI-compatible provider
with bearer auth via AVIAN_API_KEY. Registers the provider in the factory,
credential resolver, provider list, onboard wizard, and docs.

Models: deepseek/deepseek-v3.2, moonshotai/kimi-k2.5, z-ai/glm-5,
minimax/minimax-m2.5.

* style: fix rustfmt formatting in wizard.rs curated models

Collapse short GLM-5 tuple onto single line to satisfy cargo fmt.

* fix: add none-backend test and sync localized docs

- Add memory_backend parameter to scaffold_workspace so it can
  conditionally skip MEMORY.md creation and adjust AGENTS.md guidance
  when memory.backend = "none"
- Add test scaffold_none_backend_disables_memory_guidance_and_skips_memory_md
- Sync Avian provider entry to all 6 localized providers-reference.md
  files (el, fr, ja, ru, vi, zh-CN)
- Bump providers-reference.md date to March 9, 2026

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* test(onboard): add explicit Avian assertions in wizard helper tests

Add targeted assertions for the Avian provider branches to prevent
silent regressions, as requested in CodeRabbit review feedback:

- default_model_for_provider("avian") => "deepseek/deepseek-v3.2"
- curated_models_for_provider("avian") includes all 4 catalog entries
- supports_live_model_fetch("avian") => true
- models_endpoint_for_provider("avian") => Avian API URL
- provider_env_var("avian") => "AVIAN_API_KEY"

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix rustfmt formatting in wizard.rs scaffold_workspace

Break scaffold_workspace function signature and .await.unwrap() chains
across multiple lines to comply with rustfmt max line width.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-20 15:31:59 -04:00
Eddie's AI Agent
916ad490bd
fix: remove BSD stat fallback from Dockerfiles (#3847) (#4077)
Containers always run Linux, so only GNU stat (-c%s) is needed.
The BSD fallback (stat -f%z) caused shell arithmetic errors under
podman when the fallback syntax was evaluated.

Co-authored-by: SpaceLobster <spacelobster@SpaceLobsters-Mac-mini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-20 15:31:56 -04:00
Eddie's AI Agent
2dee42d6e4
fix(daemon): add 8MB stack size for ARM64 Linux targets (#4078)
ARM64 Linux (musl and Android) targets were using the default 2MB stack,
which is insufficient for the 126 JsonSchema derives and causes silent
daemon crashes due to stack overflow. x86_64 and Windows already had 8MB
overrides.

Closes #3537

Co-authored-by: SpaceLobster <spacelobster@SpaceLobsters-Mac-mini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-20 15:28:40 -04:00
dependabot[bot]
6dbc1d7c9c
chore(deps): bump the rust-all group with 2 updates (#4047)
Bumps the rust-all group with 2 updates: [opentelemetry-otlp](https://github.com/open-telemetry/opentelemetry-rust) and [extism](https://github.com/extism/extism).


Updates `opentelemetry-otlp` from 0.31.0 to 0.31.1
- [Release notes](https://github.com/open-telemetry/opentelemetry-rust/releases)
- [Changelog](https://github.com/open-telemetry/opentelemetry-rust/blob/main/docs/release_0.30.md)
- [Commits](https://github.com/open-telemetry/opentelemetry-rust/compare/v0.31.0...opentelemetry-otlp-0.31.1)

Updates `extism` from 1.13.0 to 1.20.0
- [Release notes](https://github.com/extism/extism/releases)
- [Commits](https://github.com/extism/extism/compare/v1.13.0...v1.20.0)

---
updated-dependencies:
- dependency-name: opentelemetry-otlp
  dependency-version: 0.31.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-all
- dependency-name: extism
  dependency-version: 1.20.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: rust-all
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-20 15:16:08 -04:00
Anatolii Fesiuk
2bc2ddfbae
feat(tool): add myself and list_projects actions to jira tool (#4061)
* Sync jira tool description between .rs and en.toml

Replace multi-line operational guide in en.toml with the same one-liner
already in jira_tool.rs description(), matching the pattern used by all
other tools where both sources are in sync.

* Add myself action to jira tool for credential verification

* Add tests for myself action in jira tool

* Review and fix list_projects action added to jira tool

- Fix doc comment: update action count from four to five and add missing
  myself entry
- Remove redundant statuses_url variable (was identical to url)

The list_projects action fetches all projects with their issue types,
statuses, and assignable users by combining /rest/api/3/project,
per-project /statuses, and /user/assignable/multiProjectSearch endpoints.

* Remove hardcoded project-specific statuses from shape_projects

Replace fixed known_order list (which included project-specific statuses
like 'Collecting Intel', 'Design', 'Verification') with a simple
alphabetical sort. Any Jira project can use arbitrary status names so
hardcoding an order is not applicable universally.

* Fix list_projects: bounded concurrency, error surfacing, and output shape

- Use tokio::task::JoinSet with STATUS_CONCURRENCY=5 to fetch per-project
  statuses concurrently instead of sequentially, bounding API blast radius
- Surface user fetch errors: non-2xx and JSON parse failures now bail
  instead of silently falling back to empty vec
- Surface per-project status JSON parse errors instead of swallowing them
  with unwrap_or_else
- Move users to top-level output {projects, users} so they are not
  duplicated across every project entry
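
A sketch of the bounded fan-out described above (`fetch_project_statuses` is a hypothetical helper standing in for the per-project request):

```rust
use tokio::task::JoinSet;

const STATUS_CONCURRENCY: usize = 5;

// Sketch: never more than 5 in-flight per-project status requests.
async fn fetch_all_statuses(
    project_keys: Vec<String>,
) -> anyhow::Result<Vec<serde_json::Value>> {
    let mut set = JoinSet::new();
    let mut results = Vec::new();
    for key in project_keys {
        if set.len() >= STATUS_CONCURRENCY {
            // Wait for one task to finish before spawning the next.
            if let Some(joined) = set.join_next().await {
                results.push(joined??);
            }
        }
        set.spawn(async move { fetch_project_statuses(&key).await });
    }
    while let Some(joined) = set.join_next().await {
        results.push(joined??);
    }
    Ok(results)
}
```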

* fix(tool): apply rustfmt formatting to jira_tool.rs
2026-03-20 15:11:53 -04:00
Moksh Gupta
9fadf50375
Feat/add pinggy tunnel (#4060)
* feat(tunnel): add Pinggy tunnel support with configuration options

* feat(pinggy): update Pinggy tunnel configuration to remove domain field and improve SSH command handling

* feat(pinggy): add encryption and decryption for Pinggy tunnel token in config

* feat(pinggy): enhance region configuration for Pinggy tunnel with detailed options and validation

* feat(pinggy): enhance region validation and streamline output handling in Pinggy tunnel

* fix(pinggy): resolve clippy and fmt warnings

---------

Co-authored-by: moksh gupta <moksh.gupta@linktoany.com>
2026-03-20 15:11:50 -04:00
tf4fun
a047a0d9b8
feat(channel): enhance QQ channel with rich media and cron delivery (#4059)
Add full rich media send/receive support using unified [TYPE:target] markers
(aligned with Telegram). Register QQ as a cron announcement delivery channel.

- Media upload with SHA256-based caching and TTL
- Attachment download to workspace with all types supported
- Voice: prefer voice_wav_url (WAV), inject QQ ASR transcription
- File uploads include file_name for proper display in QQ client
- msg_seq generation and reply rate-limit tracking
- QQ delivery instructions in system prompt
- Register QQ in cron scheduler and tool description

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-20 15:11:47 -04:00
Argenis
43be8e5075
fix: preserve Slack session context when thread_replies=false (#4084)
* fix: preserve session context across Slack messages when thread_replies=false

When thread_replies=false, inbound_thread_ts() was falling back to each
message's own ts, giving every message a unique conversation key and
breaking multi-turn context. Now top-level messages get thread_ts=None
when threading is disabled, so all messages from the same user in the
same channel share one session.
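
A sketch of the corrected key derivation (the function name is from the commit; the message shape is an assumption):

```rust
struct SlackMessage {
    ts: String,
    thread_ts: Option<String>,
}

// Sketch: only fall back to the message's own ts when threading is on.
fn inbound_thread_ts(msg: &SlackMessage, thread_replies: bool) -> Option<String> {
    match (&msg.thread_ts, thread_replies) {
        // Genuine thread reply: keep its thread either way.
        (Some(ts), _) => Some(ts.clone()),
        // Top-level message, threading enabled: anchor a new thread.
        (None, true) => Some(msg.ts.clone()),
        // Threading disabled: no per-message key, so all messages from
        // one user in one channel share a single session.
        (None, false) => None,
    }
}
```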

Closes #4052

* chore: ignore RUSTSEC-2024-0384 (unmaintained instant crate via nostr)
2026-03-20 15:00:31 -04:00
Argenis
ea8fe95b19
fix: normalize 5-field cron weekday numbers to standard crontab semantics (#4082)
* fix: normalize 5-field cron weekday numbers to match standard crontab

The cron crate uses 1=Sun,2=Mon,...,7=Sat while standard crontab uses
0/7=Sun,1=Mon,...,6=Sat. This caused '1-5' to mean Sun-Thu instead of
Mon-Fri. Add a normalization step when converting 5-field expressions
to 6-field so user-facing semantics match standard crontab behavior.
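
The per-number mapping reduces to `(n % 7) + 1`; a sketch (field-level parsing of ranges and lists elided):

```rust
// Sketch: map standard crontab weekdays (0 or 7 = Sun, 1 = Mon, ...,
// 6 = Sat) to the cron crate's numbering (1 = Sun, ..., 7 = Sat).
fn normalize_weekday(crontab_dow: u8) -> u8 {
    (crontab_dow % 7) + 1
}

// normalize_weekday(0) == 1 (Sun), normalize_weekday(7) == 1 (Sun),
// normalize_weekday(1) == 2 (Mon), normalize_weekday(5) == 6 (Fri),
// so a user's "1-5" becomes "2-6": Mon-Fri, as crontab semantics dictate.
```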

Closes #4049

* chore: ignore RUSTSEC-2024-0384 (unmaintained instant crate via nostr)
2026-03-20 15:00:28 -04:00
Argenis
ce8f2133fb
fix(provider): replace overall timeout with per-read timeout for Codex streams (#4081)
* fix(provider): replace overall timeout with per-read timeout for Codex streams

The 120-second overall HTTP timeout was killing SSE streams mid-flight
when GPT-5.4 reasoning responses exceeded that duration. Replace with
a 300-second per-read timeout that only fires when no data arrives,
allowing long-running streams to complete while still detecting stalled
connections.
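
A sketch of the per-read timeout around the SSE byte stream (stream type simplified; event parsing elided):

```rust
use futures_util::{Stream, StreamExt};
use std::time::Duration;

const PER_READ_TIMEOUT: Duration = Duration::from_secs(300);

// Sketch: the timeout wraps each read rather than the whole response,
// so long reasoning streams survive as long as data keeps arriving.
async fn drain_sse<S>(mut stream: S) -> anyhow::Result<()>
where
    S: Stream<Item = reqwest::Result<bytes::Bytes>> + Unpin,
{
    while let Some(chunk) = tokio::time::timeout(PER_READ_TIMEOUT, stream.next())
        .await
        .map_err(|_| anyhow::anyhow!("SSE stream stalled: no data for 300s"))?
    {
        let bytes = chunk?;
        // ...parse SSE events out of `bytes`...
        let _ = bytes;
    }
    Ok(())
}
```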

Closes #3786

* chore(deps): bump aws-lc-rs to fix RUSTSEC-2026-0044/0048

Update aws-lc-rs 1.16.1 → 1.16.2 (aws-lc-sys 0.38.0 → 0.39.0) to
resolve security advisory for X.509 Name Constraints Bypass.
2026-03-20 14:40:27 -04:00
Argenis
19d9c6f32c
Merge pull request #4075 from zeroclaw-labs/feat/testing-agent-loop
chore: bump version to 0.5.3
2026-03-20 13:34:04 -04:00
argenis de la rosa
0bff8d18fd chore: bump version to 0.5.3
Update version across all distribution manifests:
- Cargo.toml + Cargo.lock
- dist/scoop/zeroclaw.json
- dist/aur/PKGBUILD + .SRCINFO (catch up from 0.4.3)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 12:56:19 -04:00
Argenis
6f5b029033
fix(skills): support ClawHub registry URLs in skills install (#4069)
`zeroclaw skills install https://clawhub.ai/owner/skill` previously
failed because is_git_source() treated all https:// URLs as git repos.
ClawHub is a web registry, not a git host.

- Add is_clawhub_source() to detect clawhub.ai URLs
- Add clawhub_slug() to extract the skill name from the URL path
- Add install_clawhub_skill_source() to download via the ClawHub
  download API and extract the ZIP into the skills directory
- Exclude clawhub.ai URLs from git source detection
- Security audit runs on downloaded skills as with git installs

Closes #4022
2026-03-20 12:19:14 -04:00
Argenis
83ee103abb
fix: add interrupt_on_new_message support for Matrix channel (#4070)
Add the missing interrupt_on_new_message field to MatrixConfig and wire
it through InterruptOnNewMessageConfig so Matrix behaves consistently
with Telegram, Slack, Discord, and Mattermost.

Closes #4058
2026-03-20 12:17:16 -04:00
Argenis
d39ba69156
fix(providers): bail immediately on unrecoverable context window overflow (#4068)
When a request exceeds a model's context window and there is no
conversation history to truncate (e.g. system prompt alone is too
large), bail immediately with an actionable error message instead of
wasting all retry attempts on the same unrecoverable error.

Previously, the retry loop would attempt truncation, find nothing to
drop (only system + one user message), then fall through to the normal
retry logic which classified context window errors as retryable. This
caused 3 identical failing API calls for a single "hello" message.

The fix adds an early exit in all three chat methods (chat_with_history,
chat_with_tools, chat) when truncate_for_context returns 0 dropped
messages, matching the existing behavior in chat_with_system.
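
A sketch of the early exit (`truncate_for_context` is the helper named above; its signature here is an assumption):

```rust
// Sketch: fail fast when truncation cannot make the request fit.
fn ensure_recoverable(messages: &mut Vec<Message>, max_tokens: usize) -> anyhow::Result<()> {
    let dropped = truncate_for_context(messages, max_tokens);
    if dropped == 0 {
        // Nothing left to trim (e.g. the system prompt alone exceeds the
        // window): retrying cannot succeed, so bail with guidance instead
        // of burning retry attempts on an identical failing call.
        anyhow::bail!(
            "request exceeds the model's context window and there is no \
             conversation history to truncate; shorten the system prompt"
        );
    }
    Ok(())
}
```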

Fixes #4044
2026-03-20 12:10:24 -04:00
Argenis
22cdb5e2e2
fix(whatsapp): remove duplicate variable declaration causing unused warning (#4066)
* feat(config): add google workspace operation allowlists

* docs(superpowers): link google workspace operation inventory sources

* docs(superpowers): verify starter operation examples

* fix(google_workspace): remove duplicate credential/audit blocks, fix trim in allowlist check, add duplicate-methods test

- Remove the duplicated credentials_path, default_account, and audit_log
  blocks that were copy-pasted into execute() — they were idempotent but
  misleading and would double-append --account args on every call.
- Trim stored service/resource/method values in is_operation_allowed() to
  match the trim applied during Config::validate(), preventing a mismatch
  where a config entry with surrounding whitespace would pass validation but
  never match at runtime.
- Add google_workspace_allowed_operations_reject_duplicate_methods_within_entry
  test to cover the duplicate-method validation path that was implemented but
  untested.

* fix(google_workspace): close sub_resource bypass, trim allowed_services at runtime, mark spec implemented

- HIGH: extract and validate sub_resource before the allowlist check;
  is_operation_allowed() now accepts Option<&str> for sub_resource and
  returns false (fail-closed) when allowed_operations is non-empty and
  a sub_resource is present — prevents nested gws calls such as
  `drive/files/permissions/list` from slipping past a 3-segment policy
- MEDIUM: runtime allowed_services check now uses s.trim() == service,
  matching the trim() applied during config validation
- LOW: spec status updated to Implemented; stale "does not currently
  support method-level allowlists" line removed
- Added test: operation_allowlist_rejects_sub_resource_when_operations_configured

* docs(google_workspace): document sub_resource limitation and add config-reference entries

Spec updates (superpowers/specs):
- Semantics section: note that sub_resource calls are denied fail-closed when
  allowed_operations is configured
- Mental model: show both 3-segment and 4-segment gws command shapes; explain
  that 4-segment commands are unsupported with allowed_operations in this version
- Runtime enforcement: correct the validation order to match the implementation
  (sub_resource extracted before allowlist check, budget charged last)
- New section: Sub-Resource Limitation — documents impact, operator workaround,
  and confirms the deny is intentional for this slice
- Follow-On Work: add sub_resource config model extension as item 1

Config reference updates (all three locales):
- Add [google_workspace] section with top-level keys, [[allowed_operations]]
  sub-table, sub-resource limitation note, and TOML example

* fix(docs): add classroom and events to allowed_services list in all config-reference locales

* feat(google_workspace): extend allowed_operations to support sub_resource for 4-segment gws commands

All Gmail operations use gws gmail users <sub_resource> <method>, not the flat
3-segment shape. Without sub_resource support in allowed_operations, Gmail could
not be scoped at all, making the email-assistant use case impossible.

Config model:
- Add optional sub_resource field to GoogleWorkspaceAllowedOperation
- An entry without sub_resource matches 3-segment calls (Drive, Calendar, etc.)
- An entry with sub_resource matches only calls with that exact sub_resource value
- Duplicate detection updated to (service, resource, sub_resource) key

Runtime:
- Remove blanket sub_resource deny; is_operation_allowed now matches on all four
  dimensions including the optional sub_resource

Tests:
- Add operation_allowlist_matches_gmail_sub_resource_shape
- Add operation_allowlist_matches_drive_3_segment_shape
- Add rejects_operation_with_unlisted_sub_resource
- Add google_workspace_allowed_operations_allow_same_resource_different_sub_resource
- Add google_workspace_allowed_operations_reject_invalid_sub_resource_characters
- Add google_workspace_allowed_operations_deserialize_without_sub_resource
- Update all existing tests to use correct gws command shapes

Docs:
- Spec: correct Gmail examples throughout; remove Sub-Resource Limitation section;
  update data model, validation rules, example use case, and follow-on work
- Config-reference (en, vi, zh-CN): add sub_resource field to allowed_operations
  table; update Gmail examples to correct 4-segment shapes

Platform:
- email-assistant SKILL.md: update allowed_operations paths to gmail/users/* shape

* fix(google_workspace): add classroom and events to service parameter schema description

* fix(google_workspace): cross-validate allowed_operations service against allowed_services

When allowed_services is explicitly configured, each allowed_operations entry's
service must appear in that list. An entry that can never match at runtime is a
misconfigured policy: it looks valid but silently produces a narrower scope than
the operator intended. Validation now rejects it with a clear error message.

Scope: only applies when allowed_services is non-empty. When it is empty, the tool
uses a built-in default list defined in the tool layer; the validator cannot
enumerate that list without duplicating the constant, so the cross-check is skipped.

Also:
- Update allowed_operations field doc-comment from 3-part (service, resource, method)
  to 4-part (service, resource, sub_resource, method) model
- Soften Gmail sub_resource "required" language in config-reference (en, vi, zh-CN)
  from a validation requirement to a runtime matching requirement — the validator
  does not and should not hardcode API shape knowledge for individual services
- Add tests: rejects operation service not in allowed_services; skips cross-check
  when allowed_services is empty

* fix(google_workspace): cross-validate allowed_operations.service against effective service set

When allowed_services is empty the validator was silently skipping the
service cross-check, allowing impossible configs like an unlisted service
in allowed_operations to pass validation and only fail at runtime.

Move DEFAULT_GWS_SERVICES from the tool layer (google_workspace.rs) into
schema.rs so the validator can use it unconditionally. When allowed_services
is explicitly set, validate against that set; when empty, fall back to
DEFAULT_GWS_SERVICES. Remove the now-incorrect "skips cross-check when empty"
test and add two replacement tests: one confirming a valid default service
passes, one confirming an unknown service is rejected even with empty
allowed_services.

* fix(google_workspace): update test assertion for new error message wording

* docs(google_workspace): fix stale 3-segment gmail example in TDD plan

* fix(google_workspace): address adversarial review round 4 findings

- Error message for denied operations now includes sub_resource when
  present, so gmail/users/messages/send and gmail/users/drafts/send
  produce distinct, debuggable errors.
- Audit log now records sub_resource, completing the trail for 4-segment
  Gmail operations.
- Normalize (trim) allowed_services and allowed_operations fields at
  construction time in new(). Runtime comparisons now use plain equality
  instead of .trim() on every call, removing the latent defect where a
  future code path could forget to trim and silently fail to match.
- Unify runtime character validation with schema validation: sub_resource
  and service/resource/method checks now both require lowercase alphanumeric
  plus underscore and hyphen, matching the validator's character set.
- Add positional_cmd_args() test helper and tests verifying 3-segment
  (Drive) and 4-segment (Gmail) argument ordering.
- Add test confirming page_limit without page_all passes validation.
- Add test confirming whitespace in config values is normalized at
  construction, not deferred to comparison time.
- Fix spec Runtime Enforcement section to reflect actual code order.

* fix(google_workspace): wire production helpers to close test coverage gaps

- Remove #[cfg(test)] from positional_cmd_args; execute() now calls the
  same function the arg-ordering tests exercise, so a drift in the real
  command-building path is caught by the existing tests.
- Extract build_pagination_args(page_all, page_limit) as a production
  method used by execute(). Replace the brittle page_limit_without_page_all
  test (which relied on environment-specific execution failure wording)
  with four direct assertions on build_pagination_args covering all
  page_all/page_limit combinations.

* fix(whatsapp): remove duplicate variable declaration causing unused warning

Remove duplicate `let transcription_config = self.transcription.clone()`
(line 626 shadowed by identical line 628). The duplicate caused a
compiler warning during --features whatsapp-web builds.

Note: the reported "hang" at 526/528 crates is expected behavior for
release builds with lto="fat" + codegen-units=1 — the final link step
is slow but does complete.

Closes #4034

---------

Co-authored-by: Nim G <theredspoon@users.noreply.github.com>
2026-03-20 12:07:55 -04:00
Argenis
206d19af11
fix(api): respect job_type and delivery in POST /api/cron (#4063) (#4065) 2026-03-20 12:03:46 -04:00
Nim G
bbd2556861
feat(tool): google_workspace operation-level allowlist (#4010)
* feat(config): add google workspace operation allowlists

* docs(superpowers): link google workspace operation inventory sources

* docs(superpowers): verify starter operation examples

* fix(google_workspace): remove duplicate credential/audit blocks, fix trim in allowlist check, add duplicate-methods test

- Remove the duplicated credentials_path, default_account, and audit_log
  blocks that were copy-pasted into execute() — they were idempotent but
  misleading and would double-append --account args on every call.
- Trim stored service/resource/method values in is_operation_allowed() to
  match the trim applied during Config::validate(), preventing a mismatch
  where a config entry with surrounding whitespace would pass validation but
  never match at runtime.
- Add google_workspace_allowed_operations_reject_duplicate_methods_within_entry
  test to cover the duplicate-method validation path that was implemented but
  untested.

* fix(google_workspace): close sub_resource bypass, trim allowed_services at runtime, mark spec implemented

- HIGH: extract and validate sub_resource before the allowlist check;
  is_operation_allowed() now accepts Option<&str> for sub_resource and
  returns false (fail-closed) when allowed_operations is non-empty and
  a sub_resource is present — prevents nested gws calls such as
  `drive/files/permissions/list` from slipping past a 3-segment policy
- MEDIUM: runtime allowed_services check now uses s.trim() == service,
  matching the trim() applied during config validation
- LOW: spec status updated to Implemented; stale "does not currently
  support method-level allowlists" line removed
- Added test: operation_allowlist_rejects_sub_resource_when_operations_configured

* docs(google_workspace): document sub_resource limitation and add config-reference entries

Spec updates (superpowers/specs):
- Semantics section: note that sub_resource calls are denied fail-closed when
  allowed_operations is configured
- Mental model: show both 3-segment and 4-segment gws command shapes; explain
  that 4-segment commands are unsupported with allowed_operations in this version
- Runtime enforcement: correct the validation order to match the implementation
  (sub_resource extracted before allowlist check, budget charged last)
- New section: Sub-Resource Limitation — documents impact, operator workaround,
  and confirms the deny is intentional for this slice
- Follow-On Work: add sub_resource config model extension as item 1

Config reference updates (all three locales):
- Add [google_workspace] section with top-level keys, [[allowed_operations]]
  sub-table, sub-resource limitation note, and TOML example

* fix(docs): add classroom and events to allowed_services list in all config-reference locales

* feat(google_workspace): extend allowed_operations to support sub_resource for 4-segment gws commands

All Gmail operations use gws gmail users <sub_resource> <method>, not the flat
3-segment shape. Without sub_resource support in allowed_operations, Gmail could
not be scoped at all, making the email-assistant use case impossible.

Config model:
- Add optional sub_resource field to GoogleWorkspaceAllowedOperation
- An entry without sub_resource matches 3-segment calls (Drive, Calendar, etc.)
- An entry with sub_resource matches only calls with that exact sub_resource value
- Duplicate detection updated to (service, resource, sub_resource) key

Runtime:
- Remove blanket sub_resource deny; is_operation_allowed now matches on all four
  dimensions including the optional sub_resource
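
A sketch of the four-dimension match (struct shape assumed; the fail-open-when-empty and exact-sub_resource semantics are from the commits above):

```rust
pub struct AllowedOperation {
    pub service: String,
    pub resource: String,
    pub sub_resource: Option<String>,
    pub methods: Vec<String>,
}

// Sketch: an entry without sub_resource matches only 3-segment calls;
// an entry with one matches only that exact 4-segment shape.
fn is_operation_allowed(
    ops: &[AllowedOperation],
    service: &str,
    resource: &str,
    sub_resource: Option<&str>,
    method: &str,
) -> bool {
    if ops.is_empty() {
        return true; // no allowlist configured: everything permitted
    }
    ops.iter().any(|op| {
        op.service == service
            && op.resource == resource
            && op.sub_resource.as_deref() == sub_resource
            && op.methods.iter().any(|m| m == method)
    })
}
```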

Tests:
- Add operation_allowlist_matches_gmail_sub_resource_shape
- Add operation_allowlist_matches_drive_3_segment_shape
- Add rejects_operation_with_unlisted_sub_resource
- Add google_workspace_allowed_operations_allow_same_resource_different_sub_resource
- Add google_workspace_allowed_operations_reject_invalid_sub_resource_characters
- Add google_workspace_allowed_operations_deserialize_without_sub_resource
- Update all existing tests to use correct gws command shapes

Docs:
- Spec: correct Gmail examples throughout; remove Sub-Resource Limitation section;
  update data model, validation rules, example use case, and follow-on work
- Config-reference (en, vi, zh-CN): add sub_resource field to allowed_operations
  table; update Gmail examples to correct 4-segment shapes

Platform:
- email-assistant SKILL.md: update allowed_operations paths to gmail/users/* shape

* fix(google_workspace): add classroom and events to service parameter schema description

* fix(google_workspace): cross-validate allowed_operations service against allowed_services

When allowed_services is explicitly configured, each allowed_operations entry's
service must appear in that list. An entry that can never match at runtime is a
misconfigured policy: it looks valid but silently produces a narrower scope than
the operator intended. Validation now rejects it with a clear error message.

Scope: only applies when allowed_services is non-empty. When it is empty, the tool
uses a built-in default list defined in the tool layer; the validator cannot
enumerate that list without duplicating the constant, so the cross-check is skipped.

Also:
- Update allowed_operations field doc-comment from 3-part (service, resource, method)
  to 4-part (service, resource, sub_resource, method) model
- Soften Gmail sub_resource "required" language in config-reference (en, vi, zh-CN)
  from a validation requirement to a runtime matching requirement — the validator
  does not and should not hardcode API shape knowledge for individual services
- Add tests: rejects operation service not in allowed_services; skips cross-check
  when allowed_services is empty

* fix(google_workspace): cross-validate allowed_operations.service against effective service set

When allowed_services is empty the validator was silently skipping the
service cross-check, allowing impossible configs like an unlisted service
in allowed_operations to pass validation and only fail at runtime.

Move DEFAULT_GWS_SERVICES from the tool layer (google_workspace.rs) into
schema.rs so the validator can use it unconditionally. When allowed_services
is explicitly set, validate against that set; when empty, fall back to
DEFAULT_GWS_SERVICES. Remove the now-incorrect "skips cross-check when empty"
test and add two replacement tests: one confirming a valid default service
passes, one confirming an unknown service is rejected even with empty
allowed_services.

* fix(google_workspace): update test assertion for new error message wording

* docs(google_workspace): fix stale 3-segment gmail example in TDD plan

* fix(google_workspace): address adversarial review round 4 findings

- Error message for denied operations now includes sub_resource when
  present, so gmail/users/messages/send and gmail/users/drafts/send
  produce distinct, debuggable errors.
- Audit log now records sub_resource, completing the trail for 4-segment
  Gmail operations.
- Normalize (trim) allowed_services and allowed_operations fields at
  construction time in new(). Runtime comparisons now use plain equality
  instead of .trim() on every call, removing the latent defect where a
  future code path could forget to trim and silently fail to match.
- Unify runtime character validation with schema validation: sub_resource
  and service/resource/method checks now both require lowercase alphanumeric
  plus underscore and hyphen, matching the validator's character set.
- Add positional_cmd_args() test helper and tests verifying 3-segment
  (Drive) and 4-segment (Gmail) argument ordering.
- Add test confirming page_limit without page_all passes validation.
- Add test confirming whitespace in config values is normalized at
  construction, not deferred to comparison time.
- Fix spec Runtime Enforcement section to reflect actual code order.

* fix(google_workspace): wire production helpers to close test coverage gaps

- Remove #[cfg(test)] from positional_cmd_args; execute() now calls the
  same function the arg-ordering tests exercise, so a drift in the real
  command-building path is caught by the existing tests.
- Extract build_pagination_args(page_all, page_limit) as a production
  method used by execute(). Replace the brittle page_limit_without_page_all
  test (which relied on environment-specific execution failure wording)
  with four direct assertions on build_pagination_args covering all
  page_all/page_limit combinations.

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-20 11:46:22 -04:00
ryankr
2a1b32f8bf
feat(gateway): make request timeout configurable via env var (#4055)
Add `ZEROCLAW_GATEWAY_TIMEOUT_SECS` environment variable to override the
hardcoded 30-second gateway request timeout at runtime.

Agentic workloads with tool use (web search, MCP tools, sub-agent
delegation) regularly exceed 30 seconds, causing HTTP 408 timeouts at the
gateway layer — even though provider_timeout_secs allows longer LLM calls.

The default remains 30s for backward compatibility.
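
The override reduces to a few lines; a sketch (the env var name is from the commit, the fallback-on-unparsable behavior is an assumption):

```rust
use std::time::Duration;

// Sketch: honor ZEROCLAW_GATEWAY_TIMEOUT_SECS, falling back to the
// long-standing 30s default when unset or unparsable.
fn gateway_timeout() -> Duration {
    let secs = std::env::var("ZEROCLAW_GATEWAY_TIMEOUT_SECS")
        .ok()
        .and_then(|v| v.parse::<u64>().ok())
        .unwrap_or(30);
    Duration::from_secs(secs)
}
```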

Co-authored-by: Claude Code (claude-opus-4-6) <noreply@anthropic.com>
2026-03-20 11:24:13 -04:00
Tomas Ward
505024d290
fix(provider): complete Anthropic OAuth setup-token authentication (#4053)
Setup tokens (sk-ant-oat01-*) from Claude Pro/Max subscriptions require
specific headers and a system prompt to authenticate successfully.
Without these, the API returns 400 Bad Request.

Changes to apply_auth():
- Add claude-code-20250219 and interleaved-thinking-2025-05-14 beta
  headers alongside existing oauth-2025-04-20
- Add anthropic-dangerous-direct-browser-access: true header

New apply_oauth_system_prompt() method:
- Prepends required "You are Claude Code" identity to system prompt
- Handles String, Blocks, and None system prompt variants

Changes to chat_with_system() and chat():
- Inject OAuth system prompt when using setup tokens
- Use NativeChatRequest/NativeChatResponse for proper SystemPrompt
  enum support in chat_with_system

Test updates:
- Updated apply_auth test to verify new beta headers and
  browser-access header

Tested with real OAuth token via `zeroclaw agent -m` — confirmed
working end-to-end.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-20 11:24:10 -04:00
ryankr
a0c12b5a28
fix(agent): force sequential execution when tool_search is in batch (#4054)
tool_search activates deferred MCP tools into ActivatedToolSet at runtime.
When tool_search runs in parallel with the tools it activates, a race
condition occurs where tool lookups happen before activation completes,
resulting in "Unknown tool" errors.

Force sequential execution in should_execute_tools_in_parallel() whenever
tool_search is present in the tool call batch.

Co-authored-by: Claude Code (claude-opus-4-6) <noreply@anthropic.com>
2026-03-20 11:24:07 -04:00
Argenis
b4a2afb04a
feat(tools): add text browser tool for headless environments (#4031)
* feat(tools): add text browser tool for headless environments (#3879)

* fix(tools): remove redundant match arm in text_browser clippy lint

* ci: trigger fresh workflow run
2026-03-20 01:59:43 -04:00
Argenis
8c2122017b
feat(ci): add ARM cross-compilation targets for SBCs (#4033)
Add armv7-unknown-linux-gnueabihf and arm-unknown-linux-gnueabihf targets
to the release build matrix for Raspberry Pi Zero, Orange Pi, and similar
ARM single-board computers. Disable observability-prometheus on 32-bit
targets (requires AtomicU64). Update install.sh to map armv6l to the
correct ARM binary.
2026-03-20 01:59:32 -04:00
Argenis
4f5644cf46
fix(skills): preserve TOML [[tools]] in Compact prompt injection mode (#4032)
In Compact (MetadataOnly) mode, skill tools were omitted from the system
prompt alongside instructions. This meant the LLM had no visibility into
TOML-defined tools when running in Compact mode, defeating the primary
advantage of TOML skills over MD skills (structured tool metadata).

Now Compact mode skips only instructions (loaded on demand via
read_skill) while still inlining tool definitions so the LLM knows
which skill tools are available.

Closes #3702
2026-03-20 01:55:34 -04:00
Argenis
7d958c33ef
feat(ci): add armv7-unknown-linux-gnueabihf release target (#4029)
Add pre-built armv7 (32-bit ARM hard-float) binary to all release
workflows: stable, beta, and cross-platform manual build. Uses
gcc-arm-linux-gnueabihf cross-compiler on ubuntu-22.04 for glibc 2.35
compatibility.

The install script already maps armv7l/armv6l to this target, so
users on Raspberry Pi and similar devices will be able to install
directly via the install script once binaries are published.

Closes #3759
2026-03-20 01:36:15 -04:00
sagit
89df0ab89e
feat(providers): add Bailian (Aliyun) provider support (#3158)
* feat: add Bailian (Aliyun) provider support

- Add bailian provider with coding.dashscope.aliyuncs.com endpoint
- Set User-Agent to 'openclaw' for compatibility with Bailian Coding Plan
- Support aliases: bailian, aliyun-bailian, aliyun
- Enable vision support for multimodal models

This allows users to use Alibaba Cloud's Bailian (百炼) Coding Plan API keys
with zeroclaw, matching the configuration used in OpenClaw.

* fix: address PR review comments for bailian provider

- Use is_bailian_alias() in factory match arm (consistency with other providers)
- Add bailian to canonical_china_provider_name() for onboarding routing
- Add BAILIAN_API_KEY env var resolution with DASHSCOPE_API_KEY fallback
- Add test coverage for bailian aliases in canonical_china_provider_name test

* fix: format bailian provider code

---------

Co-authored-by: Sagit-chu <sagit@example.com>
2026-03-20 01:26:16 -04:00
Chris Hengge
e4439a5c95
feat(web): add Active/All filter toggle to dashboard Channels card (#4026)
- Rename 'Active Channels' card heading to 'Channels'
- Add showAllChannels state (default: active-only view)
- Pill toggle button switches between 'Active' (green) and 'All' (blue)
  views; button label reflects the *current* view so clicking switches to
  the other mode
- When active-only and all channels are inactive, show
  'No active channels' hint instead of an empty list
- Add i18n keys dashboard.channels, dashboard.filter_active,
  dashboard.filter_all, dashboard.no_active_channels to en, zh, tr
  locales
- No backend changes; no new dependencies
2026-03-20 01:20:38 -04:00
Argenis
15d84b2622
fix(copilot): support vision via multi-part content messages (#4021)
* fix(copilot): support vision via multi-part content messages

The Copilot provider sent image markers as plain text in the content
field. The API expects OpenAI-style multi-part content arrays with
separate text and image_url parts for vision input.

Add ApiContent enum (untagged) that serializes as either a plain string
or an array of ContentPart objects. User messages containing [IMAGE:]
markers are converted to multi-part content using the shared
multimodal::parse_image_markers() helper, matching the pattern used by
the OpenRouter and Compatible providers. Non-user messages and messages
without images serialize as plain strings.
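
A sketch of the untagged enum described above, with part shapes modeled on the OpenAI-style schema the commit references; only ApiContent and ContentPart are named in the commit, the field layout is an assumption:

```rust
use serde::Serialize;

// Serializes as either a bare string or an array of typed parts.
#[derive(Serialize)]
#[serde(untagged)]
enum ApiContent {
    Text(String),
    Parts(Vec<ContentPart>),
}

#[derive(Serialize)]
#[serde(tag = "type", rename_all = "snake_case")]
enum ContentPart {
    Text { text: String },
    ImageUrl { image_url: ImageUrl },
}

#[derive(Serialize)]
struct ImageUrl {
    url: String,
}
```

With `untagged`, serde emits whichever JSON shape matches the variant, so messages without images keep serializing as plain strings and stay backward compatible.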

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* fix(copilot): route chat_with_system through multimodal parser and add tests

Fix chat_with_system user message bypassing to_api_content(), which left
[IMAGE:] markers as plain text on that code path. Add unit tests for
to_api_content() covering image-marker, plain-text, and non-user roles.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

* style: fix clippy wildcard match warning in copilot test

* style: apply rustfmt formatting

* style: fix formatting in channels/mod.rs

* ci: re-trigger CI

* fix(test): update conversation history key in new-session test

The conversation_history_key function now includes reply_target in the
key format. Update test assertions to use the correct key format
(telegram_chat-refresh_alice instead of telegram_alice).

---------

Co-authored-by: Tim Stewart <timstewartj@gmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-20 00:07:00 -04:00
Argenis
9b14ddab79
fix(ollama): default to prompt-guided tool calling for local models (#4024)
* fix(ollama): default to prompt-guided tool calling for local models

Ollama's /api/chat native tool-calling parameter is silently ignored by
many local models (llama3, qwen3, phi4, etc.), causing them to emit
tool-call JSON as plain text instead of structured tool calls. When
native_tool_calling was true, the XML tool-use instructions were
suppressed from the system prompt, leaving models with no guidance on
the expected tool protocol.

Default to prompt-guided (XML) tool calling so all Ollama models receive
tool-use instructions in the system prompt and the existing text-based
parser in the agent loop can extract tool calls reliably.

Also fixes a minor rustfmt issue in channels/mod.rs from #2891.

Fixes #3999
Fixes #3982

* fix(channels): update test keys for reply_target in history key

The conversation_history_key now includes reply_target (from #2891),
but the refreshes_available_skills_after_new_session test still used
the old key format "telegram_alice" instead of
"telegram_chat-refresh_alice".
2026-03-19 23:42:03 -04:00
Argenis
0c73dc9427
Merge pull request #4025 from zeroclaw-labs/chore/version-bump-0.5.2
chore: bump version to 0.5.2
2026-03-19 23:41:28 -04:00
argenis de la rosa
19ac6ec2a5 chore: bump version to 0.5.2
Update version in Cargo.toml, Cargo.lock, and Scoop manifest.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 23:35:01 -04:00
Argenis
fda06ed157
Merge pull request #4023 from zeroclaw-labs/docs/architecture-diagram
docs(readme): overhaul all 31 READMEs to match OpenClaw-level depth
2026-03-19 23:33:15 -04:00
argenis de la rosa
169733f8de docs(readme): suppress MD001/MD024 lint in translated Prerequisites sections
The Prerequisites section uses #### headings inside <details> blocks,
which triggers MD001 (heading increment skip) and MD024 (duplicate
heading text). The English README is excluded from the docs quality
gate; add markdownlint-disable/enable comments to match.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 23:22:14 -04:00
argenis de la rosa
eaf23e63a4 docs(readme): overhaul all 31 READMEs to match OpenClaw-level depth
Restructure the English README and all 30 translated READMEs to match
OpenClaw's comprehensive structure while preserving ZeroClaw branding.

Key changes across all files:
- Replace outdated bootstrap.sh references with install.sh
- Add missing sections: Security defaults (DM access), Everything we
  built so far, How it works (ASCII architecture diagram), CLI commands,
  Agent workspace + skills, Channel configuration examples, Subscription
  Auth (OAuth), Migrating from OpenClaw
- Fix comparison image path to docs/assets/zeroclaw-comparison.jpeg
- Update doc links to current paths (docs/reference/api/*, docs/ops/*)
- Standardize social badges (X, Facebook, Discord, Instagram, TikTok,
  RedNote, Reddit) and contributors badge across all locales
- Update benchmark table (binary size ~8.8 MB)
- Full structural parity: all 31 files are 792 lines with identical
  section ordering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 23:09:07 -04:00
Tim
ecbef64a7c
fix(channels): include reply_target in conversation history key (#2891)
conversation_history_key() now includes reply_target to isolate
conversation histories across distinct Discord/Slack/Mattermost
channels for the same sender. Previously all channels produced
the same key {channel}_{sender}, causing cross-channel context bleed.

New key format: {channel}_{reply_target}_{sender} (without thread)
or {channel}_{reply_target}_{thread_ts}_{sender} (with thread).
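
A sketch of the new key construction, assuming these parameter names:

```rust
// {channel}_{reply_target}_{sender}, with thread_ts inserted when present.
fn conversation_history_key(
    channel: &str,
    reply_target: &str,
    thread_ts: Option<&str>,
    sender: &str,
) -> String {
    match thread_ts {
        Some(ts) => format!("{channel}_{reply_target}_{ts}_{sender}"),
        None => format!("{channel}_{reply_target}_{sender}"),
    }
}
```

For example, ("telegram", "chat-refresh", None, "alice") yields "telegram_chat-refresh_alice", the updated test key referenced in #4021 and #4024.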

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-03-19 22:58:06 -04:00
guoraymon
633d711faa
fix(agent): prevent memory context duplication by reordering operations (#2649)
Swap auto_save and load_context order to avoid retrieving the just-stored
user message in memory recall.

Before: auto_save → load_context (recall could retrieve the just-stored message)
After: load_context → auto_save (query first, store after)

This fixes the issue where "[Memory context]\n- user_msg: <current>\n<current>"
would appear, causing the user's input to be duplicated.
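
A minimal sketch of the corrected ordering, assuming an async memory API with these two methods:

```rust
#[allow(async_fn_in_trait)]
trait Memory {
    async fn load_context(&self, query: &str) -> anyhow::Result<String>;
    async fn auto_save(&self, message: &str) -> anyhow::Result<()>;
}

// Query first so recall cannot surface the message being stored,
// then persist the new user message.
async fn recall_then_store(memory: &impl Memory, user_msg: &str) -> anyhow::Result<String> {
    let context = memory.load_context(user_msg).await?;
    memory.auto_save(user_msg).await?;
    Ok(context)
}
```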
2026-03-19 22:54:28 -04:00
Argenis
37594fd520
fix(tool): include descriptions in deferred MCP tools system prompt (#4018)
Add tool descriptions alongside names in the deferred tools section so
the LLM can better identify which tool to activate via tool_search.

Original author: @mark-linyb
2026-03-19 22:35:55 -04:00
Octopus
64ecb76773
feat: upgrade MiniMax default model to M2.7 (#3865)
Add MiniMax-M2.7 and M2.7-highspeed to model selection lists. Set M2.7 as the new default for MiniMax, Novita, and Ollama cloud providers. Retain all previous models (M2.5, M2.1, M2) as alternatives.
2026-03-19 22:33:29 -04:00
Eddie's AI Agent
128b33e717
fix(channel): remove dead AtomicU32 fallback in channels mod (#3521)
The conditional cfg branches for AtomicU32 (32-bit fallback) and
AtomicU64 (64-bit) became dead code after portable_atomic::AtomicU64
was adopted in bd757996. The AtomicU32 branch would fail to compile
on 32-bit targets because the import was removed but the usage remained.

Use portable_atomic::AtomicU64 unconditionally, which works on all
targets.

Fixes #3452

Co-authored-by: SpaceLobster <spacelobster@SpaceLobsters-Mac-mini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:33:05 -04:00
Eddie's AI Agent
aa6e3024d4
fix(channel): use uppercase IMAGE tag in Matrix channel for multimodal handling (#3523)
Closes #3486

Co-authored-by: SpaceLobster <spacelobster@SpaceLobsters-Mac-mini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:32:48 -04:00
Eddie's AI Agent
bc7f797603
fix(channel): align Matrix reply channel key with dispatch key (#3522)
The Matrix channel was setting `channel: format!("matrix:{}", room_id)`
on incoming messages, but the reply router looks up channels by name
using `channels_by_name.get(&msg.channel)` where the key is just
"matrix". This mismatch caused replies to silently drop.

Align with how all other channels (telegram, discord, slack, etc.)
set the channel field to just the channel name.

Closes #3477

Co-authored-by: SpaceLobster <spacelobster@SpaceLobsters-Mac-mini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:32:44 -04:00
Eddie's AI Agent
2d4a1b1ad7
fix(packaging): ensure Homebrew var directory exists on first start (#3524)
* fix(packaging): ensure Homebrew var directory exists on first start

When zeroclaw is installed via Homebrew, the service plist references
/opt/homebrew/var/zeroclaw (or /usr/local/var/zeroclaw on Intel) for
runtime data, but this directory was never created. This caused
`brew services start zeroclaw` to fail on first use.

The fix detects Homebrew installations by checking if the binary lives
under a Homebrew prefix (Cellar path or symlinked bin/ with Cellar
sibling). When detected:

- install_macos() creates the var directory and sets
  ZEROCLAW_CONFIG_DIR + WorkingDirectory in the generated plist
- start() defensively ensures the var directory exists before
  invoking launchctl
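
A hedged sketch of the detection heuristic described above (a Cellar path component, or a bin/ symlink with a Cellar sibling); the actual checks in the codebase may differ:

```rust
use std::path::Path;

fn is_homebrew_install(binary: &Path) -> bool {
    // Installed directly from a Cellar path, e.g. /opt/homebrew/Cellar/...
    let in_cellar = binary.components().any(|c| c.as_os_str() == "Cellar");
    // Or a symlinked bin/ whose prefix contains a Cellar directory,
    // e.g. /opt/homebrew/bin/zeroclaw -> ../Cellar/zeroclaw/...
    let cellar_sibling = binary
        .parent()                // .../bin
        .and_then(Path::parent)  // Homebrew prefix
        .map(|prefix| prefix.join("Cellar").exists())
        .unwrap_or(false);
    in_cellar || cellar_sibling
}
```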

Closes #3464

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* style(service): fix cargo fmt formatting in Homebrew var dir tests

Collapse multi-line assert_eq! macros to single-line form as
required by cargo fmt.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: SpaceLobster <spacelobster@SpaceLobsters-Mac-mini.local>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 22:32:13 -04:00
Darren.Zeng
c7e8f3d235
docs(service): add SAFETY comments to unsafe libc::getuid() calls (#3869)
Add proper SAFETY documentation comments to the two unsafe blocks using
libc::getuid() in src/service/mod.rs:

1. In is_root() function: documents why getuid() is safe to call
2. In is_root_matches_system_uid() test: documents the test's purpose
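
A sketch of what the commented call looks like, assuming the `libc` crate; the exact wording in src/service/mod.rs may differ:

```rust
fn is_root() -> bool {
    // SAFETY: getuid() takes no arguments, only reads kernel-maintained
    // process state, and cannot fail, so calling it is always sound.
    unsafe { libc::getuid() == 0 }
}
```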

This addresses the best practices audit finding about unsafe code without
safety comments. While we could use the nix crate's safe wrapper, adding
safety comments is a minimal change that satisfies the audit requirement
without introducing new dependencies.

As noted in refactor-candidates.md, the nix crate alternative would also
be acceptable, but this change follows the principle of minimal intervention
for documentation-only improvements.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-19 22:12:02 -04:00
Darren.Zeng
ff5c1c6e9b
style: enhance rustfmt.toml with additional stable formatting options (#3870)
Expand rustfmt.toml configuration beyond just 'edition = \"2021\"' to
include additional stable formatting options that improve code consistency:

- max_width = 100: consistent line length
- tab_spaces = 4, hard_tabs = false: standard indentation
- use_field_init_shorthand = true: cleaner struct initialization
- use_try_shorthand = true: cleaner try operator usage
- reorder_imports = true, reorder_modules = true: consistent ordering
- match_arm_leading_pipes = \"Never\": cleaner match syntax

These options are all available on stable Rust and follow common Rust
formatting conventions. This addresses the finding in refactor-candidates.md
about the minimal rustfmt.toml configuration.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-19 22:12:00 -04:00
linyibin
b3e3a0a335
fix(config): handle tilde expansion in non-TTY environments (cron) (#3849)
When HOME environment variable is not set (common in cron jobs or
other non-TTY environments), `shellexpand::tilde` returns the literal
`~` unexpanded. This caused SOUL.md, IDENTITY.md, and other workspace
files to not be loaded because the workspace_dir path was invalid.

This fix adds `expand_tilde_path()` helper that falls back to
`directories::UserDirs` when shellexpand fails to expand tilde.
If both methods fail, a warning is logged advising users to use
absolute paths or set HOME explicitly in cron environments.
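
A hedged sketch of the fallback described above, assuming the `shellexpand` and `directories` crates; the real helper's signature is an assumption:

```rust
use std::path::PathBuf;

fn expand_tilde_path(path: &str) -> Option<PathBuf> {
    let expanded = shellexpand::tilde(path);
    if !expanded.starts_with('~') {
        return Some(PathBuf::from(expanded.into_owned()));
    }
    // shellexpand left the ~ alone: HOME is unset (cron), so ask the
    // platform for the home directory instead.
    directories::UserDirs::new().map(|dirs| {
        let rest = path.trim_start_matches('~').trim_start_matches('/');
        dirs.home_dir().join(rest)
    })
}
```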

Closes #3819
2026-03-19 22:10:59 -04:00
linyibin
4d89f27f21
fix(docker): remove conflicting .dockerignore entries (closes #3836) (#3844)
The Dockerfile requires firmware/ and crates/robot-kit/ for the build:
- crates/robot-kit/Cargo.toml is needed for workspace manifest parsing
- firmware/ is copied as part of the build context

These entries in .dockerignore caused COPY commands to fail with
"file not found" errors, resulting in 100% Docker build failure.

Made-with: Cursor
2026-03-19 22:10:56 -04:00
dependabot[bot]
8d8d010196
chore(deps): bump indicatif from 0.17.11 to 0.18.4 (#3857)
Bumps [indicatif](https://github.com/console-rs/indicatif) from 0.17.11 to 0.18.4.
- [Release notes](https://github.com/console-rs/indicatif/releases)
- [Commits](https://github.com/console-rs/indicatif/compare/0.17.11...0.18.4)

---
updated-dependencies:
- dependency-name: indicatif
  dependency-version: 0.18.4
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-19 22:10:09 -04:00
dependabot[bot]
ff9904c740
chore(deps): bump tokio-tungstenite from 0.28.0 to 0.29.0 (#3858)
Bumps [tokio-tungstenite](https://github.com/snapview/tokio-tungstenite) from 0.28.0 to 0.29.0.
- [Changelog](https://github.com/snapview/tokio-tungstenite/blob/master/CHANGELOG.md)
- [Commits](https://github.com/snapview/tokio-tungstenite/compare/v0.28.0...v0.29.0)

---
updated-dependencies:
- dependency-name: tokio-tungstenite
  dependency-version: 0.29.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2026-03-19 22:10:06 -04:00
Argenis
e411e80acb
fix(provider): disable native tool calling for Venice (#4016)
Venice's API accepts the OpenAI-compatible tool format without error
but models ignore the tool specs and hallucinate tool usage in prose
instead. Disable native tool calling so Venice uses prompt-guided
tools, which work reliably.

Add without_native_tools() builder method to OpenAiCompatibleProvider
for providers that support system messages but not native function
calling.
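
A sketch of such a builder; the provider struct's fields are assumptions:

```rust
struct OpenAiCompatibleProvider {
    native_tool_calling: bool,
}

impl OpenAiCompatibleProvider {
    // Opt a provider out of native function calling so the prompt-guided
    // (XML) tool protocol is used instead.
    fn without_native_tools(mut self) -> Self {
        self.native_tool_calling = false;
        self
    }
}
```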

Closes #4007
2026-03-19 21:48:38 -04:00
Argenis
073c8cccbc
fix(web): add clipboard fallback for HTTP contexts (#4015)
The copy button in the web dashboard silently failed when served over
HTTP because navigator.clipboard requires a secure context (HTTPS).
Add a textarea-based fallback using document.execCommand('copy') and
proper error handling for the clipboard promise.

Closes #4008
2026-03-19 21:48:35 -04:00
Argenis
58be05a6e0
fix(channel): clear persisted JSONL session on /new command (#4014)
The /new command only cleared in-memory conversation history but left
the JSONL session file on disk. On daemon restart, stale history was
rehydrated, negating the user's session reset. This caused context
pollution and degraded tool calling reliability.

Add delete_session() to SessionStore and call it from the /new handler
so both in-memory and persisted state are cleared.

Closes #4009
2026-03-19 21:48:33 -04:00
Argenis
c794d54821
feat(tools): add calculator tool with arithmetic and statistical functions (#4012)
Provides 26 functions (add, subtract, divide, multiply, pow, sqrt, abs,
modulo, round, log, ln, exp, factorial, sum, average, median, mode, min,
max, range, variance, stdev, percentile, count, percentage_change, clamp)
as a single always-enabled tool to avoid LLM miscalculations and reduce
token waste on numeric tasks.

Co-authored-by: Anatolii <anatolii@Anatoliis-MacBook.local>
2026-03-19 21:40:27 -04:00
Argenis
5f0f88de3d
fix(channel): add interruption_scope_id for thread-aware cancellation scoping (#4017)
Add interruption_scope_id to ChannelMessage for thread-aware cancellation. Slack genuine thread replies and Matrix threads get scoped keys, preventing cross-thread cancellation. All other channels preserve existing behavior.

Supersedes #3900. Depends on #3891.
2026-03-19 21:34:04 -04:00
RoomWithOutRoof
776e5947ef
fix(docker): default CMD to daemon instead of gateway (#3897)
Change default Docker CMD from gateway to daemon in both dev and release stages. Gateway only starts the HTTP/WebSocket server — channel listeners are never spawned. Daemon starts the full runtime: gateway + channels + heartbeat + scheduler.

Closes #3893
2026-03-19 21:08:53 -04:00
Darren.Zeng
e2183c89a3
dev: add Justfile for convenient development commands (#3874)
Add a Justfile providing convenient shortcuts for common development
tasks. Just is a modern command runner that serves as a better Make
alternative for project-specific commands.

Commands provided:
- just fmt / just fmt-check - Format code
- just lint - Run clippy
- just test / just test-lib - Run tests
- just ci - Run full CI quality gate locally
- just build / just build-debug - Build the project
- just dev [args] - Run zeroclaw in development mode
- just doc - Generate and open documentation
- just audit / just deny - Security and license checks
- just fmt-toml - Format TOML files with taplo

This complements the existing shell scripts in dev/ and scripts/
by providing a more discoverable command interface.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-19 21:08:16 -04:00
Darren.Zeng
29dc1172c0
style: add taplo.toml for TOML file formatting consistency (#3873)
Add Taplo configuration file to ensure consistent formatting of TOML
files across the project. Taplo is a TOML formatter and linter that
helps maintain consistent style in configuration files.

Configuration includes:
- 2-space indentation to match project style
- Comment alignment for better readability
- Trailing newline enforcement
- Sorting rules for dependencies and features

This complements rustfmt.toml and .editorconfig to provide complete
formatting coverage for all file types in the project.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-19 21:08:08 -04:00
Darren.Zeng
e79e1b88b7
dev: add .gitattributes for consistent line endings and file handling (#3875)
Add comprehensive .gitattributes file to ensure consistent line endings
and proper file handling across different operating systems and editors.

Key features:
- Automatic line ending normalization (LF for most files)
- Language detection hints for GitHub Linguist
- Binary file declarations to prevent text processing
- CRLF handling for Windows-specific files (.ps1, .sln)
- Mark Cargo.lock as generated to hide diff noise in PRs

This helps prevent issues with:
- Mixed line endings in the repository
- Incorrect language statistics on GitHub
- Binary files being corrupted by text processing
- Unnecessary merge conflicts due to line ending differences

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-19 21:08:02 -04:00
Reaster
f886ce47e9
fix(docker): default CMD to daemon instead of gateway (#3893)
The `gateway` command only starts the HTTP/WebSocket server without
channel listeners. Users configuring channels (Matrix, Telegram, etc.)
in config.toml get no response because the channel sync loops are
never spawned. The `daemon` command starts the full runtime including
gateway, channels, heartbeat, and scheduler.

Co-authored-by: reaster <reaster@courrier.dev>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-19 21:07:53 -04:00
Nim G
6a30e24e7b
feat(channel): add interrupt_on_new_message support for Discord (#3918)
* feat(channel): add /stop command to cancel in-flight tasks

Adds an explicit /stop slash command that allows users on any non-CLI
channel (Matrix, Telegram, Discord, Slack, etc.) to cancel an agent
task that is currently running.

Changes:
- is_stop_command(): new helper that detects /stop (case-insensitive,
  optional @botname suffix), not gated on channel type (sketched after
  this list)
- /stop fast path in run_message_dispatch_loop: intercepts /stop before
  semaphore acquisition so the target task is never replaced in the store;
  fires CancellationToken on the running task; sends reply via tokio::spawn
  using the established two-step channel lookup pattern
- register_in_flight separated from interrupt_enabled: all non-CLI tasks
  now enter the in_flight_by_sender store, enabling /stop to reach them;
  auto-cancel-on-new-message remains gated on interrupt_enabled (Telegram/
  Slack only) — this is a deliberate broadening, not a side effect
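
A minimal sketch of the helper named in the first bullet; only the case-insensitive match and optional @botname suffix come from the commit, the normalization is an assumption:

```rust
// Matches /stop, /STOP, or /stop@botname on any non-CLI channel.
fn is_stop_command(text: &str) -> bool {
    let t = text.trim().to_ascii_lowercase();
    t == "/stop" || t.starts_with("/stop@")
}
```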

Deferred to follow-up (feat/matrix-interrupt-on-new-message):
- interrupt_on_new_message config field for Matrix
- thread-aware interruption_scope_key (requires per-channel thread_ts
  semantics analysis; Slack always sets thread_ts, Matrix only for replies)

Supersedes #2855

Tests: 7 new unit tests for is_stop_command; all 4075 tests pass.

* feat(channel): add interrupt_on_new_message support for Discord

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-19 19:33:37 -04:00
Argenis
83587eea4a
Merge pull request #3917 from theredspoon/feat/mattermost-interrupt-on-new-message
feat(channel): add interrupt_on_new_message support for Mattermost
2026-03-19 19:09:16 -04:00
Argenis
226b2282f5
Fix delegate agent timeout config regression (#4004)
* feat(delegate): make delegate timeout configurable via config.toml

Add configurable timeout options for delegate tool:
- timeout_secs: for non-agentic sub-agent calls (default: 120s)
- agentic_timeout_secs: for agentic sub-agent runs (default: 300s)

Previously these were hardcoded constants (DELEGATE_TIMEOUT_SECS
and DELEGATE_AGENTIC_TIMEOUT_SECS). Users can now customize them
in config.toml under [[delegate.agents]] section.

Fixes #3898

* feat(config): make delegate tool timeouts configurable via config.toml

This change makes the hardcoded 120s/300s delegate tool timeouts
configurable through the config file:

- Add [delegate] section to Config with timeout_secs and agentic_timeout_secs
- Add DelegateToolConfig struct for global default timeout values
- Add DEFAULT_DELEGATE_TIMEOUT_SECS (120) and DEFAULT_DELEGATE_AGENTIC_TIMEOUT_SECS (300) constants
- Remove hardcoded constants from delegate.rs
- Update tests to use constant values instead of magic numbers
- Update examples/config.example.toml with documentation

Closes #3898

* fix: keep delegate timeout fields as Option<u64> with global fallback

- Change DelegateAgentConfig.timeout_secs and agentic_timeout_secs from
  u64 to Option<u64> so per-agent overrides are truly optional
- Implement manual Default for DelegateToolConfig with correct values
  (120s and 300s) instead of derive(Default)
- Add DelegateToolConfig to DelegateTool struct and wire through
  constructors so agent timeouts fall back to global [delegate] config
  (fallback sketched after this list)
- Add validation for delegate timeout values in Config::validate()
- Fix example config to use [agents.name] table syntax matching the
  HashMap<String, DelegateAgentConfig> schema
- Add missing timeout fields to all DelegateAgentConfig struct literals
  across codebase (doctor, swarm, model_routing_config, tools/mod)
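
A sketch of the override-with-fallback shape described above; the type and constant names come from the commit, the resolve helper is an assumption:

```rust
const DEFAULT_DELEGATE_TIMEOUT_SECS: u64 = 120;
const DEFAULT_DELEGATE_AGENTIC_TIMEOUT_SECS: u64 = 300;

struct DelegateAgentConfig {
    timeout_secs: Option<u64>, // per-agent override, truly optional
    agentic_timeout_secs: Option<u64>,
}

struct DelegateToolConfig {
    timeout_secs: u64,
    agentic_timeout_secs: u64,
}

impl Default for DelegateToolConfig {
    // Manual impl so defaults are 120s/300s rather than derive's zeros.
    fn default() -> Self {
        Self {
            timeout_secs: DEFAULT_DELEGATE_TIMEOUT_SECS,
            agentic_timeout_secs: DEFAULT_DELEGATE_AGENTIC_TIMEOUT_SECS,
        }
    }
}

// Per-agent override wins; otherwise fall back to the global [delegate] value.
fn effective_timeout(agent: &DelegateAgentConfig, global: &DelegateToolConfig) -> u64 {
    agent.timeout_secs.unwrap_or(global.timeout_secs)
}
```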

* chore: trigger CI

* chore: retrigger CI

* fix: cargo fmt line wrapping in config/mod.rs

* fix: import timeout constants in delegate tests

* style: cargo fmt

---------

Co-authored-by: vincent067 <vincent067@outlook.com>
2026-03-19 18:54:33 -04:00
argenis de la rosa
7bf5f3edde fix: add missing mattermost field to test InterruptOnNewMessageConfig 2026-03-19 18:53:28 -04:00
Argenis
191192a104
add configurable allow_scripts audit option (#4001)
Co-authored-by: wangyingtao.10 <wangyingtao.10@jd.com>
2026-03-19 18:31:31 -04:00
argenis de la rosa
8b942853c4 merge: resolve conflicts with master after #3891 merge
Keep both the PR's Mattermost interrupt_on_new_message additions
and master's new fields (pending_new_sessions, prompt_config,
autonomy_level) from the /stop command PR (#3891).
2026-03-19 18:29:36 -04:00
Argenis
95473d83b5
Merge pull request #4006 from zeroclaw-labs/readme-updates
docs: add OpenClaw migration commands to all translated READMEs
2026-03-19 18:29:04 -04:00
Argenis
b5668acf2f
Merge pull request #3891 from theredspoon/feat/stop-command
feat(channel): add /stop command to cancel in-flight tasks
2026-03-19 18:24:04 -04:00
Argenis
2128c9db5b
Fix Jira tool panics and dedup bug (#4003)
* feat: add Jira tool with get_ticket, search_tickets, and comment_ticket

Implements a new `jira` tool following the existing zeroclaw tool
conventions (Tool trait, SecurityPolicy, config-gated registration).

- get_ticket: configurable detail level (basic/basic_search/full/changelog)
  with response shaping; always in the default allowed_actions list
- search_tickets: JQL-based search with cursor pagination (nextPageToken);
  always returns basic_search shape; gated by allowed_actions
- comment_ticket: posts ADF comments with inline markdown-like syntax —
  @email mentions resolved to Jira accountId, **bold**, bullet lists,
  newlines; gated by allowed_actions and SecurityPolicy Act operation

Config: [jira] section with base_url, email, api_token (encrypted at
rest, falls back to JIRA_API_TOKEN env var), allowed_actions (default:
["get_ticket"]), and timeout_secs. Validated on load.

Tool description in tool_descriptions/en.toml documents all three
actions and the full comment syntax for the AI system prompt.

* fix: address jira tool code review findings

High priority:
- Validate issue_key against ^[A-Z][A-Z0-9]+-\d+$ before URL interpolation
  to prevent path traversal in get_ticket and comment_ticket

Medium priority:
- Add email guard in tool registration (mod.rs) to skip with a warning
  instead of registering a broken tool when jira.email is empty
- Shape comment_ticket response to return only id, author, created —
  avoids exposing internal Jira metadata to the AI
- Replace O(n²) comment matching in shape_basic with a HashMap lookup
  keyed by comment ID for O(1) access
- Add api_token validation in Config::validate() checking both the
  config field and JIRA_API_TOKEN env var when jira.enabled = true

Low priority:
- Make shape_basic_search private (was accidentally pub)
- Extend clean_email to strip leading punctuation (( and [) so that
  @(john@co.com) resolves correctly; fix suffix computation via pointer
  arithmetic to handle the shifted offset
- Clarify tool_descriptions/en.toml: @prefix is required for mentions,
  bare emails without @ are treated as plain text
- Handle unmatched ** in parse_inline: emit as literal text instead of
  silently producing a bold node with no closing marker

* fix(jira): allow lowercase project keys in issue_key validation

Relax validate_issue_key to accept both PROJ-123 and proj-123, since
some Jira instances use lowercase custom project keys. Path traversal
protection is preserved: the key must still be an alphanumeric project
prefix, a dash, and a numeric issue number.
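
A sketch of the relaxed check, assuming the `regex` crate; the original uppercase-only pattern ^[A-Z][A-Z0-9]+-\d+$ is named in the review-fix commit above:

```rust
use regex::Regex;

// PROJ-123 or proj-123: letters/digits, a dash, then the issue number.
// Anything else (slashes, dots, traversal sequences) is rejected before
// the key is interpolated into a URL path.
fn validate_issue_key(key: &str) -> bool {
    Regex::new(r"^[A-Za-z][A-Za-z0-9]+-\d+$")
        .expect("static pattern compiles")
        .is_match(key)
}
```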

* feat(tools): add tool honesty instructions to system prompt

Prevent AI from fabricating tool results by injecting a CRITICAL:
Tool Honesty section into both channel and CLI/agent system prompts.

Rules: never fabricate or guess tool results, report errors as-is,
and ask the user when unsure if a tool call succeeded.

* style: sort JiraConfig import alphabetically in config/mod.rs

* style(jira): fix strict clippy lints in jira_tool

- Derive Default for LevelOfDetails instead of manual impl
- Use char arrays in trim_start_matches/trim_end_matches
- Allow cast_possible_truncation on search_tickets (usize->u32 bounded by max_results)
- Remove needless borrow on &email

* fix(ci): adapt to upstream autonomy_level additions in channels/mod.rs

- Add missing autonomy_level argument to build_system_prompt_with_mode call in test
- Add missing autonomy_level field in ChannelRuntimeContext test initializer
- Allow large_futures in load_or_init test (Config struct growth from JiraConfig)

* fix(ci): resolve duplicate and missing autonomy_level in test initializers

* fix(ci): use TelegramRecordingChannel in telegram-specific test

The test process_channel_message_executes_tool_calls_instead_of_sending_raw_json
sent messages on channel "telegram" but registered RecordingChannel (name:
"test-channel"), causing the channel lookup to return None and no messages to
be sent.

* fix(jira): prevent panics on short dates, fix dedup bug, normalize base_url

- Add date_prefix() helper to safely slice date strings instead of
  panicking on empty or short strings from the Jira API.
- Replace Vec::dedup() with HashSet-based retain in extract_emails()
  to correctly deduplicate non-adjacent duplicates.
- Strip trailing slashes from base_url during construction to prevent
  double-slash URLs.
- Add tests for date_prefix and non-adjacent email dedup.
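
Minimal sketches of the first two fixes; signatures are assumptions:

```rust
use std::collections::HashSet;

// Safe date prefix: at most the first 10 bytes ("YYYY-MM-DD"), falling
// back to the whole string when the API returns something shorter.
fn date_prefix(date: &str) -> &str {
    date.get(..10).unwrap_or(date)
}

// HashSet-backed retain removes non-adjacent duplicates, which
// Vec::dedup() (adjacent-only) missed.
fn dedup_emails(emails: &mut Vec<String>) {
    let mut seen = HashSet::new();
    emails.retain(|e| seen.insert(e.clone()));
}
```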

---------

Co-authored-by: Anatolii <anatolii@Anatoliis-MacBook.local>
Co-authored-by: Anatolii <anatolii.fesiuk@gmail.com>
2026-03-19 18:14:34 -04:00
argenis de la rosa
8d3d14f1e4 docs(readme): add OpenClaw migration commands to all translated READMEs
Every translated README now includes the migrate section:
  zeroclaw migrate openclaw --dry-run
  zeroclaw migrate openclaw

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 18:09:46 -04:00
Argenis
632d513c2e
fix(agent): preserve native tool-call text in draft updates (#4005)
Preserve assistant text from native tool-call turns in draft updates. Falls back to response_text when parsed_text is empty and native tool calls are present. Relays text through on_delta for draft-capable channels like Telegram.

Supersedes #3976. Closes #3974
2026-03-19 18:07:25 -04:00
Argenis
ade588b4ec
feat(security): inject security policy summary into LLM system prompt (#4002)
Inject a human-readable summary of the active SecurityPolicy into the system prompt Safety section. LLM sees allowed commands, forbidden paths, autonomy level, and rate limits.

Supersedes #3968. Closes #2404
2026-03-19 17:54:12 -04:00
Dmitrii Mukhutdinov
f7636ab81c
fix(observability): handle missing OtelObserver match arms and add all-features CI check (#3981)
Add missing CacheHit/CacheMiss match arms to OtelObserver::record_event. Add check-all-features CI job to prevent --all-features compile regressions.
2026-03-19 17:48:35 -04:00
Martin
d76e4e5a86
feat(channels): add TTS voice reply support to Telegram channel (#3942)
Adds text-to-speech output to the Telegram channel, mirroring the
existing WhatsApp Web voice-chat implementation. When a user sends a
voice note (transcribed via STT), the channel enters voice-chat mode
and subsequent agent replies are synthesised into a Telegram voice note
via the configured TTS provider, in addition to the normal text reply.
Sending a text message exits voice-chat mode.

Implementation details:
- Add `tts_config`, `voice_chats`, and `pending_voice` fields to
  `TelegramChannel`
- Add `with_tts()` builder method, gated on `config.enabled`
- Track voice-chat state: enter on successful STT transcription, exit
  on incoming text message
- `synthesize_and_send_voice()` static method: synthesises audio via
  `TtsManager` and uploads to Telegram using `sendVoice` multipart API
- 10-second debounce in `send()` ensures only the final substantive
  reply in a tool chain gets a voice note (skips JSON, code blocks,
  URLs, short status messages)
- Wire `.with_tts(config.tts.clone())` into both Telegram construction
  sites in the channel factory

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 17:35:22 -04:00
Argenis
d9cea87fae
Merge pull request #3998 from zeroclaw-labs/fix/images-2
fix(docs): absolute banner URLs + web dashboard logo update
2026-03-19 16:47:09 -04:00
argenis de la rosa
6213bcab07 fix(docs): use absolute URLs for banner in all READMEs + update web dashboard logo
- Replace relative docs/assets/zeroclaw-banner.png paths with absolute
  raw.githubusercontent.com URLs in all 31 README files so the banner
  renders correctly regardless of where the README is viewed
- Switch web dashboard favicon and logos from logo.png to zeroclaw-trans.png
- Add zeroclaw-trans.png and zeroclaw-banner.png assets
- Update build.rs to track new dashboard asset
- Fix missing autonomy_level in new test + Box::pin large future

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 16:34:32 -04:00
argenis de la rosa
fe9f58f917 fix(docs): use absolute URLs for banner in all READMEs + update web dashboard logo
- Replace relative docs/assets/zeroclaw-banner.png paths with absolute
  raw.githubusercontent.com URLs in all 31 README files so the banner
  renders correctly regardless of where the README is viewed
- Switch web dashboard favicon and logos from logo.png to zeroclaw-trans.png
- Add zeroclaw-trans.png and zeroclaw-banner.png assets
- Update build.rs to track new dashboard asset

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 16:31:21 -04:00
Argenis
04c7ce4488
Merge pull request #3955 from Alix-007/issue-3952-full-autonomy-channel-prompt
fix(prompt): respect autonomy level in channel prompts
2026-03-19 16:12:13 -04:00
Argenis
5eea95ef2a
Merge pull request #3899 from Alix-007/issue-3842-openrouter-timeout
fix(openrouter): respect provider_timeout_secs on slow responses
2026-03-19 16:06:57 -04:00
argenis de la rosa
af1c37c2fb fix: pass autonomy_level through to prompt builder in wrapper function
build_system_prompt_with_mode was discarding the autonomy_level
parameter, passing None to build_system_prompt_with_mode_and_autonomy.
This caused full-autonomy prompts to still include "ask before acting"
instructions. Convert the level to an AutonomyConfig and pass it through.
2026-03-19 15:56:37 -04:00
argenis de la rosa
e3e4aef21c fix: box-pin large future in config init test to satisfy clippy
Config::load_or_init() produces a future >16KB, triggering
clippy::large_futures. Wrap with Box::pin() as recommended.
2026-03-19 15:44:41 -04:00
argenis de la rosa
a48e335be9 fix: box-pin large future in config init test to satisfy clippy
Config::load_or_init() produces a future >16KB, triggering
clippy::large_futures. Wrap with Box::pin() as recommended.
2026-03-19 15:44:19 -04:00
argenis de la rosa
fba15520dc fix: add missing autonomy_level field to test ChannelRuntimeContext
The full_autonomy_prompt test was missing the autonomy_level field
added to ChannelRuntimeContext by a recently merged PR.
2026-03-19 15:32:23 -04:00
argenis de la rosa
7504da1117 fix: add missing autonomy_level arg to test after merge with master
The refresh-skills test was missing the autonomy_level parameter
added to build_system_prompt_with_mode and ChannelRuntimeContext
by a recently merged PR.
2026-03-19 15:31:18 -04:00
argenis de la rosa
6292cdfe1c Merge origin/master into issue-3952-full-autonomy-channel-prompt
Resolve conflict in src/channels/mod.rs Safety section. Keeps the
PR's AutonomyConfig-based prompt construction (build_system_prompt_with_mode_and_autonomy)
while incorporating master's granular safety rules (conditional
destructive-command and ask-before-acting lines based on autonomy level).
Also fixes missing autonomy_level arg in refresh-skills test and removes
duplicate autonomy.level args from auto-merged call sites.
2026-03-19 15:27:43 -04:00
argenis de la rosa
693661b564 Merge origin/master into issue-3842-openrouter-timeout
Resolve merge conflicts keeping the PR's changes:
- timeout_secs parameter in OpenRouterProvider::new()
- read_response_body + parse_response_body pattern
- OPENROUTER_CONNECT_TIMEOUT_SECS and DEFAULT_OPENROUTER_TIMEOUT_SECS constants
- Update master's new tests to use two-arg new() signature
2026-03-19 15:19:32 -04:00
Argenis
4daec8c0df
Merge pull request #3288 from Alix-007/fix-2400-block-config-self-mutation
fix(security): block agent writes to runtime config state
2026-03-19 15:16:48 -04:00
Argenis
3cf609cb38
Merge pull request #3959 from Alix-007/issue-3706-read-skill
feat(skills): add read_skill for compact mode
2026-03-19 15:16:42 -04:00
Argenis
e1b7d29f1b
Merge pull request #3940 from Alix-007/issue-3845-refresh-skills-on-new
channel: refresh available skills after /new
2026-03-19 15:16:35 -04:00
Argenis
fef69a4128
Merge pull request #3787 from Alix-007/issue-3774-path-normalization
fix(tools): normalize workspace-prefixed paths
2026-03-19 15:16:17 -04:00
Argenis
643b683c39
Merge pull request #3954 from Alix-007/issue-2901-zai-tool-stream
fix(zai): enable tool_stream for tool-capable requests
2026-03-19 15:15:59 -04:00
Argenis
74c93b0ebc
Merge pull request #3943 from Alix-007/issue-3902-claude-code-test-race
test(claude_code): isolate echo script per test run
2026-03-19 15:15:52 -04:00
Argenis
a7bf69d279
Merge pull request #3356 from Alix-007/fix/config-load-initialized-state
fix(config): report existing configs as initialized on load
2026-03-19 15:15:47 -04:00
Argenis
f68af9a4c7
Merge pull request #3994 from zeroclaw-labs/fix/images
fix(docs): update banner image and add Instagram to all READMEs
2026-03-19 15:14:02 -04:00
Argenis
cca3d66955
fix: add Node.js build dependency to Homebrew formula template (#3996)
The Homebrew publish workflow downloads a source tarball but never
ensures Node.js is available during the build. Without it, build.rs
cannot run npm to compile the web frontend, so Homebrew-installed
ZeroClaw ships without the web dashboard.

Add `depends_on "node" => :build` to the formula by inserting it
after the existing `depends_on "rust" => :build` line. This lets
build.rs detect npm and automatically run `npm ci && npm run build`
to produce the web/dist assets.

Fixes #3991
2026-03-19 15:13:56 -04:00
Argenis
95bf229225
fix(config): enable compact_context by default (#3995)
* fix: change compact_context default to true

Local LLMs with limited context windows immediately run out of context
when compact_context defaults to false. The system prompt alone can
consume 25K+ tokens, exceeding even 55K context windows with history.

Setting compact_context=true by default limits system prompt injection
to 6000 chars and RAG results to 2 chunks, making the agent usable
with smaller models out of the box.

Fixes #3987

* docs: update compact_context default to true in config reference

Update all locale variants (en, zh-CN, vi) to reflect the new default.

* test: update tests to expect compact_context default of true

Update assertions in schema.rs unit tests and config_persistence.rs
component tests to match the new default value.
2026-03-19 15:13:14 -04:00
argenis de la rosa
ebe19147f2 fix(docs): update banner image and add Instagram badge to all READMEs
- Replace zeroclaw.png (broken/outdated) with zeroclaw-banner.png
  across all 30 translated README files
- Add Instagram social badge (@therealzeroclaw) to all translations

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 14:57:28 -04:00
Argenis
e387f58579
Merge pull request #3951 from Alix-007/issue-3946-release-memory-postgres
build(release): include memory-postgres in published artifacts
2026-03-19 14:42:15 -04:00
Argenis
fa06798926
fix(cron): persist allowed_tools for agent jobs (#3993)
Persist allowed_tools in cron_jobs table, threading it through CLI add/update and cron_add/cron_update tool APIs. Add regression coverage for store, tool, and CLI roundtrip paths.

Fixups over original PR #3929: add allowed_tools to all_overdue_jobs SELECT (merge gap), resolve merge conflicts.

Closes #3920
Supersedes #3929
2026-03-19 14:37:55 -04:00
Alix-007
a4cd4b287e
feat(slack): add thread_replies channel option (#3930)
Add a thread_replies option to Slack channel config (default true). When false, replies go to channel root instead of the originating thread.

Closes #3888
2026-03-19 14:32:02 -04:00
argenis de la rosa
3ce7f2345e merge: resolve conflicts with master, include channel-lark in RELEASE_CARGO_FEATURES
Add channel-lark (merged to master separately) to RELEASE_CARGO_FEATURES
env var. Keep the DRY env-var approach and remove stale Docker build-args.
2026-03-19 14:24:30 -04:00
Argenis
eb9dfc04b4
fix(anthropic): always apply cache_control to system prompts (#3990)
* fix: always use Blocks format for system prompts with cache_control

System prompts under 3KB were wrapped in SystemPrompt::String which
cannot carry cache_control headers, resulting in 0% cache hit rate
on Haiku 4.5. Always use SystemPrompt::Blocks with ephemeral
cache_control regardless of prompt size.

Fixes #3977
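
A sketch of the always-Blocks construction, with shapes modeled loosely on the Anthropic Messages API; the field names here are assumptions:

```rust
use serde::Serialize;

#[derive(Serialize)]
#[serde(untagged)]
enum SystemPrompt {
    String(String), // legacy: cannot carry cache_control
    Blocks(Vec<SystemBlock>),
}

#[derive(Serialize)]
struct SystemBlock {
    r#type: String, // "text"
    text: String,
    #[serde(skip_serializing_if = "Option::is_none")]
    cache_control: Option<CacheControl>,
}

#[derive(Serialize)]
struct CacheControl {
    r#type: String, // "ephemeral"
}

// Always Blocks, regardless of prompt size, so cache_control attaches.
fn system_prompt(text: String) -> SystemPrompt {
    SystemPrompt::Blocks(vec![SystemBlock {
        r#type: "text".into(),
        text,
        cache_control: Some(CacheControl { r#type: "ephemeral".into() }),
    }])
}
```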

* fix: lower conversation caching threshold from >4 to >1 messages

The previous threshold of >4 non-system messages was too restrictive,
delaying cache benefits until 5+ turns. Lower to >1 so caching kicks
in after the first user+assistant exchange.

Fixes #3977

* test: update anthropic cache tests for new thresholds and Blocks format

- convert_messages_small_system_prompt now expects Blocks with
  cache_control instead of String variant
- should_cache_conversation tests updated for >1 threshold
- backward_compatibility test replaced with blocks-system test
2026-03-19 14:21:45 -04:00
Argenis
9cc74a2698
fix(security): wire sandbox into shell command execution (#3989)
* fix: add sandbox field to ShellTool struct

Add `sandbox: Arc<dyn Sandbox>` field to `ShellTool` and a
`new_with_sandbox()` constructor so callers can inject the configured
sandbox backend. The existing `new()` constructor defaults to
`NoopSandbox` for backward compatibility.

Ref: #3983
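
A sketch of the constructor pair; the Sandbox trait surface beyond wrap_command() is an assumption:

```rust
use std::process::Command;
use std::sync::Arc;

trait Sandbox: Send + Sync {
    fn wrap_command(&self, cmd: &mut Command);
}

struct NoopSandbox;
impl Sandbox for NoopSandbox {
    fn wrap_command(&self, _cmd: &mut Command) {} // leaves commands unmodified
}

struct ShellTool {
    sandbox: Arc<dyn Sandbox>,
}

impl ShellTool {
    // Backward-compatible default: no sandboxing.
    fn new() -> Self {
        Self::new_with_sandbox(Arc::new(NoopSandbox))
    }
    // Callers inject the configured backend (e.g. from create_sandbox()).
    fn new_with_sandbox(sandbox: Arc<dyn Sandbox>) -> Self {
        Self { sandbox }
    }
}
```

The Arc lets one configured backend be shared across every tool that needs it.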

* fix: apply sandbox wrapping in ShellTool::execute()

Call `self.sandbox.wrap_command()` on the underlying std::process::Command
(via `as_std_mut()`) after building the shell command and before clearing
the environment. This ensures every shell command passes through the
configured sandbox backend before execution.

Ref: #3983

* fix: wire up sandbox creation at ShellTool callsites

In `all_tools_with_runtime()`, create a sandbox from
`root_config.security` via `create_sandbox()` and pass it to
`ShellTool::new_with_sandbox()`. The `default_tools_with_runtime()`
path retains `ShellTool::new()` which defaults to `NoopSandbox`.

Ref: #3983

* test: add sandbox integration tests for ShellTool

Verify that ShellTool can be constructed with a sandbox via
`new_with_sandbox()`, that NoopSandbox leaves commands unmodified,
and that command execution works end-to-end with a sandbox attached.

Ref: #3983
2026-03-19 14:21:42 -04:00
Argenis
133dc46b41
fix(web): restore accidentally deleted logo file (#3988)
* fix: restore accidentally deleted logo file

The logo.png was removed in commit 48bdbde2 but is still referenced
by the web UI components. Restore it from git history.

Fixes #3984

* fix: copy logo to web/dist for rust-embed

The Rust binary embeds files from web/dist/ via rust-embed, so the
logo must also be present there to be served without a rebuild.

Fixes #3984
2026-03-19 14:21:15 -04:00
Argenis
ad03605cad
Merge pull request #3949 from Alix-007/issue-3817-cron-delivery-context
fix: default cron delivery to the active channel context
2026-03-19 14:20:59 -04:00
Argenis
ae1acf9b9c
Merge pull request #3950 from Alix-007/issue-3466-homebrew-service-workspace
fix(onboard): warn when Homebrew services use another workspace
2026-03-19 14:20:56 -04:00
Alix-007
cc91f22e9b
fix(skills): narrow shell shebang detection (#3944)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-19 14:10:51 -04:00
Martin
030f5fe288
fix(install): fix guided installer in LXC/container environments (#3947)
Replace subshell-based /dev/stdin probing in guided_input_stream with a
file-descriptor approach (guided_open_input) that works reliably in LXC
containers accessed over SSH.

The previous implementation probed /dev/stdin and /proc/self/fd/0 via
subshells before falling back to /dev/tty. In LXC containers these
probes fail even when FD 0 is perfectly usable, causing the guided
installer to exit with "requires an interactive terminal".

The fix:
- When stdin is a terminal (-t 0), assign GUIDED_FD=0 directly without
  any subshell probing — trusting the kernel's own tty check
- Otherwise, open /dev/tty as an explicit fd (exec {GUIDED_FD}</dev/tty)
- guided_read uses `read -u "$GUIDED_FD"` instead of `< "$file_path"`
- Add echo after silent reads (password prompts) for correct line handling

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 14:10:48 -04:00
Giulio V
c47bbcc972
fix(cron): add startup catch-up and drop login shell flag (#3948)
* fix(cron): add startup catch-up and drop login shell flag

Problems:
1. When ZeroClaw started after downtime (late boot, daemon restart),
   overdue jobs were picked up via `due_jobs()` but limited by
   `max_tasks` per poll cycle — with many overdue jobs, catch-up
   could take many cycles.
2. Cron shell jobs used `sh -lc` (login shell), which loads the
   full user profile on every execution — slow and may cause
   unexpected side effects.

Fixes:
- Add `all_overdue_jobs()` store query without `max_tasks` limit
- Add `catch_up_overdue_jobs()` startup phase that runs ALL overdue
  jobs once before entering the normal polling loop
- Extract `build_cron_shell_command()` helper using `sh -c` (non-login)
- Add structured tracing for catch-up progress
- Add tests for all new functions
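
A sketch of the extracted helper from the third fix bullet; the exact signature is an assumption:

```rust
use std::process::Command;

// Non-login shell: `sh -c` skips the profile loading that `sh -lc`
// incurred on every job execution.
fn build_cron_shell_command(job_command: &str) -> Command {
    let mut cmd = Command::new("sh");
    cmd.arg("-c").arg(job_command);
    cmd
}
```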

* feat(cron): make catch-up configurable via API and control panel

Add `catch_up_on_startup` boolean to `[cron]` config (default: true).
When enabled, the scheduler runs all overdue jobs at startup before
entering the normal polling loop. Users can toggle this from:

- The Cron page toggle switch in the control panel
- PATCH /api/cron/settings { "catch_up_on_startup": false }
- The `[cron]` section of the TOML config editor

Also adds GET /api/cron/settings endpoint to read cron subsystem
settings without parsing the full config.

* fix(config): add catch_up_on_startup to CronConfig test constructors

The CI Lint job failed because the `cron_config_serde_roundtrip` test
constructs CronConfig directly and was missing the new field.
2026-03-19 14:10:37 -04:00
Argenis
72fbb22059
Merge pull request #3985 from zeroclaw-labs/fix/aur-ssh-publish
fix(ci): harden AUR SSH key setup and add diagnostics
2026-03-19 13:44:01 -04:00
argenis de la rosa
cbb3d9ae92 fix(ci): harden AUR SSH key setup and add diagnostics (#3952)
The AUR publish step fails with "Permission denied (publickey)".
Root cause is likely key formatting (Windows line endings from
GitHub secrets UI) or missing public key registration on AUR.

Changes:
- Normalize line endings (strip \r) when writing SSH key
- Set correct permissions on ~/.ssh (700) and ~/.ssh/config (600)
- Validate key with ssh-keygen before attempting clone
- Add SSH connectivity test for clearer error diagnostics

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 13:28:58 -04:00
Argenis
8d1eebad4d
Merge pull request #3980 from zeroclaw-labs/version-bump-0.5.1
chore: bump version to 0.5.1
2026-03-19 09:56:36 -04:00
argenis de la rosa
0fdd1ad490 chore: bump version to 0.5.1
Release highlights:
- Autonomy enforcement in gateway and channel paths (#3952)
- conversational_ai startup warning for unimplemented config (#3958)
- Heartbeat default interval 30→5min (#3938)
- Provider timeout and error handling improvements (#3973, #3978)
- Docker/CI postgres and Lark feature fixes (#3971, #3933)
- Tool path resolution fix (#3937)
- OTP config fix (#3936)
- README: Instagram badge + banner image (#3979)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 09:45:32 -04:00
Argenis
86bc60fcd1
Merge pull request #3979 from zeroclaw-labs/readme
docs: add Instagram badge and banner to README
2026-03-19 09:42:00 -04:00
argenis de la rosa
4837e1fe73 docs(readme): add Instagram social badge and switch to banner image
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 09:30:54 -04:00
Argenis
985977ae0c
fix(providers): exempt tool schema errors from non-retryable classification (#3978)
* fix: exempt tool schema validation errors from non-retryable classification

Groq returns 400 "tool call validation failed" which was classified as
non-retryable by is_non_retryable(), preventing the provider-level
fallback in compatible.rs from executing. Add is_tool_schema_error()
to detect these errors and return false from is_non_retryable(), allowing
the retry loop to pass control back to the provider's built-in fallback.

Fixes #3757
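
A sketch of the classification change; the string heuristic is an assumption based on the Groq error quoted above:

```rust
// Groq-style 400s for bad tool schemas must stay retryable so the
// provider-level fallback in compatible.rs gets a chance to run.
fn is_tool_schema_error(message: &str) -> bool {
    message.contains("tool call validation failed")
}

fn is_non_retryable(status: u16, message: &str) -> bool {
    // Handling of other statuses elided; only the 400 carve-out is sketched.
    status == 400 && !is_tool_schema_error(message)
}
```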

* test: add unit tests for tool schema error detection in reliable.rs

Verify is_tool_schema_error detects Groq-style validation failures and
that is_non_retryable returns false for tool schema 400s while still
returning true for other 400 errors like invalid API key.

* fix: escape format braces in test string literals for cargo check

The anyhow::anyhow! macro interprets curly braces as format
placeholders. Use explicit format argument to pass JSON-containing
strings in tests.
2026-03-19 09:25:49 -04:00
Argenis
72b10f12dd
Merge pull request #3975 from zeroclaw-labs/agent-loop
fix: enforce autonomy level in gateway/channel paths + conversational_ai warning
2026-03-19 09:20:21 -04:00
Argenis
3239f5ea07
fix(ci): include channel-lark feature in precompiled release binaries (#3933) (#3972)
Add channel-lark to the cargo --features flag in all release and
cross-platform build workflows, and to the Docker build-args. This
gives users Feishu/Lark channel support out of the box without needing
to compile from source.

The channel-lark feature depends only on dep:prost (pure Rust protobuf),
so it is safe to enable on all platforms (Linux, macOS, Windows, Android).
2026-03-19 09:15:10 -04:00
Argenis
3353729b01
fix(openrouter): respect provider_timeout_secs and improve error messages (#3973)
* fix(openrouter): wire provider_timeout_secs through factory

Apply the configured provider_timeout_secs to OpenRouterProvider
in the provider factory, matching the pattern used for compatible
providers.

* fix(openrouter): add timeout_secs field to OpenRouterProvider

Add a configurable timeout_secs field (default 120s) and a
with_timeout_secs() builder method so the HTTP client timeout
can be overridden via provider config instead of being hardcoded.

* refactor(openrouter): improve response decode error messages

Read the response body as text first, then parse with
serde_json::from_str so that decode failures include a truncated
snippet of the raw body for easier debugging.

* test(openrouter): add timeout_secs configuration tests

Verify that the default timeout is 120s and that with_timeout_secs
correctly overrides it.

* style: run rustfmt on openrouter.rs
2026-03-19 09:12:14 -04:00
argenis de la rosa
b6c2930a70 fix(agent): enforce autonomy level in gateway and channel paths (#3952)
- Channel tool filtering (`non_cli_excluded_tools`) now respects
  `autonomy.level = "full"` — full-autonomy agents keep all tools
  available regardless of channel.
- Gateway `process_message` now creates and passes an `ApprovalManager`
  to `agent_turn`, so `ReadOnly`/`Supervised` policies are enforced
  instead of silently skipped.
- Gateway also applies `non_cli_excluded_tools` filtering with the same
  full-autonomy bypass.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 08:56:45 -04:00
argenis de la rosa
181cafff70 fix(config): warn when conversational_ai.enabled is set (#3958)
The conversational_ai config section is parsed but not yet consumed by
any runtime code. Emit a startup warning so users know the setting is
ignored, and update the doc comment to mark it as reserved for future use.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 08:56:38 -04:00
Argenis
d87f387111
fix(docker): add memory-postgres feature to Docker and CI builds (#3971)
The GHCR image v0.5.0 fails to start when users configure the postgres
memory backend because the binary was compiled without the
memory-postgres cargo feature. Add it to all build configurations:
Dockerfiles (default ARG), release workflows (cargo build --features),
and Docker push steps (ZEROCLAW_CARGO_FEATURES build-arg).

Fixes #3946
2026-03-19 08:51:20 -04:00
Argenis
7068079028
fix: make channel system prompt respect autonomy.level = full (#3952) (#3970)
When autonomy.level is set to "full", the channel/web system prompt no
longer includes instructions telling the model to ask for permission
before executing tools. Previously these safety lines were hardcoded
regardless of autonomy config, causing the LLM to simulate approval
dialogs in channel and web-interface modes even though the
ApprovalManager correctly allowed execution.

The fix adds an autonomy_level parameter to build_system_prompt_with_mode
and conditionally omits the "ask before acting" instructions when the
level is Full. Core safety rules (no data exfiltration, prefer trash)
are always included.
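
A sketch of the conditional omission; names other than the autonomy_level parameter on build_system_prompt_with_mode are assumptions:

```rust
enum AutonomyLevel {
    ReadOnly,
    Supervised,
    Full,
}

fn safety_section(level: &AutonomyLevel) -> String {
    // Core safety rules are always present.
    let mut section = String::from(
        "Never exfiltrate data. Prefer moving files to trash over deletion.\n",
    );
    // "Ask before acting" lines are omitted at full autonomy so the model
    // does not simulate approval dialogs the ApprovalManager already allows.
    if !matches!(level, AutonomyLevel::Full) {
        section.push_str("Ask the user before executing destructive commands.\n");
    }
    section
}
```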
2026-03-19 08:48:38 -04:00
Argenis
a9b511e6ec
fix: omit experimental conversational_ai section from default config (#3969)
The [conversational_ai] config section was serialized into every
freshly-generated config.toml despite the feature being experimental
and not yet wired into the agent runtime. This confused new users who
found an undocumented section in their config.

Add skip_serializing_if = "ConversationalAiConfig::is_disabled" so the
section is only written when a user has explicitly enabled it. Existing
configs that already contain the section continue to deserialize
correctly via #[serde(default)].

Fixes #3958
2026-03-19 08:48:33 -04:00
Argenis
65cb4fe099
feat(heartbeat): default interval 30→5min + prune heartbeat from auto-save (#3938)
Lower the default heartbeat interval to 5 minutes to match the renewable
partial wake-lock cadence. Add `[heartbeat task` to the memory auto-save
skip filter so heartbeat prompts (both Phase 1 decision and Phase 2 task
execution) do not pollute persistent conversation memory.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-19 08:17:08 -04:00
Alix-007
1bbc159e0e style(zai): satisfy rustfmt in tool_stream request 2026-03-19 19:15:58 +08:00
Alix-007
0d28cca843 build(release): drop stale docker feature args 2026-03-19 19:14:07 +08:00
Alix-007
b1d20d38f9 feat(skills): add read_skill for compact mode 2026-03-19 17:53:40 +08:00
Alix-007
2bad6678ec fix(prompt): respect autonomy level in channel prompts 2026-03-19 16:54:51 +08:00
Alix-007
b6fe054915 fix(zai): send tool_stream for tool-capable requests 2026-03-19 16:32:06 +08:00
Alix-007
7ddd2aace3 build(release): ship postgres-capable release artifacts 2026-03-19 15:37:45 +08:00
Alix-007
c7b3b762e0 fix(onboard): warn when Homebrew service uses another workspace 2026-03-19 15:30:40 +08:00
Alix-007
4b00e8ba75 fix(cron): default channel delivery to active reply target 2026-03-19 15:11:47 +08:00
Alix-007
dd462a2b04 test(claude_code): isolate echo script per test run 2026-03-19 13:47:08 +08:00
Alix-007
2d68b880c2 Fix /new regression test lint scope 2026-03-19 12:19:31 +08:00
Alix-007
3a672a2ede Refresh skills after new channel sessions 2026-03-19 12:07:08 +08:00
Argenis
2e48cbf7c3
fix(tools): use resolve_tool_path for consistent path resolution (#3937)
Replace workspace_dir.join(path) with resolve_tool_path(path) in
file_write, file_edit, and pdf_read tools to correctly handle absolute
paths within the workspace directory, preventing path doubling.

Closes #3774
2026-03-18 23:51:35 -04:00
Argenis
e4910705d1
fix(config): add missing challenge_max_attempts field to OtpConfig (#3919) (#3936)
The OtpConfig struct uses deny_unknown_fields but was missing the
challenge_max_attempts field, causing zeroclaw config schema to fail
with a TOML parse error when the field appeared in config files.

Add challenge_max_attempts as an optional u32 field with a default of 3,
plus a validation check ensuring it is greater than 0.
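
A sketch of the field addition; surrounding fields are elided and the serde attribute placement is an assumption:

```rust
use serde::Deserialize;

#[derive(Deserialize)]
#[serde(deny_unknown_fields)]
struct OtpConfig {
    // other fields elided
    #[serde(default = "default_challenge_max_attempts")]
    challenge_max_attempts: u32,
}

fn default_challenge_max_attempts() -> u32 {
    3
}

fn validate(config: &OtpConfig) -> Result<(), String> {
    if config.challenge_max_attempts == 0 {
        return Err("otp.challenge_max_attempts must be greater than 0".into());
    }
    Ok(())
}
```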
2026-03-18 23:48:53 -04:00
Argenis
1b664143c2
fix: move misplaced include key from [lib] to [package] in Cargo.toml (#3935)
The `include` array was placed after `[lib]` without a section header,
causing Cargo to parse it as `lib.include` — an invalid manifest key.
This triggered a warning during builds and caused lockfile mismatch
errors when building with --locked in Docker (Dockerfile.debian).

Move the `include` key to the `[package]` section where it belongs and
regenerate Cargo.lock to stay in sync.

Fixes #3925
2026-03-18 23:48:50 -04:00
Argenis
950f996812
Merge pull request #3926 from zeroclaw-labs/fix/pairing-code-terminal-display
fix(gateway): move pairing code below dashboard URL in terminal
2026-03-18 20:34:08 -04:00
argenis de la rosa
b74c5cfda8 fix(gateway): move pairing code below dashboard URL in terminal banner
Repositions the one-time pairing code display to appear directly below
the dashboard URL for cleaner terminal output, and removes the duplicate
display that was showing at the bottom of the route list.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 19:50:26 -04:00
Argenis
02688eb124
feat(skills): autonomous skill creation from multi-step tasks (#3916)
Add SkillCreator module that persists successful multi-step task
executions as reusable SKILL.toml definitions under the workspace
skills directory.

- SkillCreationConfig in [skills.skill_creation] (disabled by default)
- Slug validation, TOML generation, embedding-based deduplication
- LRU eviction when max_skills limit is reached
- Agent loop integration post-success
- Gated behind `skill-creation` compile-time feature flag

Closes #3825.
2026-03-18 17:15:02 -04:00
Argenis
2c92cf913b
fix: ensure SOUL.md and IDENTITY.md exist in non-tty sessions (#3915)
When the workspace was created outside of `zeroclaw onboard` (e.g., via
cron, daemon, or `< /dev/null`), SOUL.md and IDENTITY.md were never
scaffolded, causing the agent to activate without identity files.

Added `ensure_bootstrap_files()` in `Config::load_or_init()` that
idempotently creates default SOUL.md and IDENTITY.md if missing.

Closes #3819.
2026-03-18 17:12:44 -04:00
Argenis
3c117d2d7b
feat(delegate): make sub-agent timeouts configurable via config.toml (#3909)
Add `timeout_secs` and `agentic_timeout_secs` fields to
`DelegateAgentConfig` so users can tune per-agent timeouts instead
of relying on the hardcoded 120s / 300s defaults.

Validation rejects values of 0 or above 3600s, matching the pattern
used by MCP timeout validation.

Closes #3898
2026-03-18 17:07:03 -04:00
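A minimal sketch of the validation rule described above; field names follow the commit, the error type is a stand-in:

```rust
use serde::Deserialize;

#[derive(Debug, Deserialize)]
pub struct DelegateAgentConfig {
    /// Per-task timeout; falls back to the old 120s default when unset.
    pub timeout_secs: Option<u64>,
    /// Agentic (multi-step) timeout; previously hardcoded to 300s.
    pub agentic_timeout_secs: Option<u64>,
}

fn validate_timeout(name: &str, value: Option<u64>) -> Result<(), String> {
    match value {
        Some(0) => Err(format!("{name} must be greater than 0")),
        Some(v) if v > 3600 => Err(format!("{name} must be at most 3600 seconds")),
        _ => Ok(()), // unset values keep the hardcoded defaults
    }
}

impl DelegateAgentConfig {
    pub fn validate(&self) -> Result<(), String> {
        validate_timeout("timeout_secs", self.timeout_secs)?;
        validate_timeout("agentic_timeout_secs", self.agentic_timeout_secs)
    }
}
```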
Argenis
1f7c3c99e4
feat(i18n): externalize tool descriptions for translation (#3912)
Add a locale-aware tool description system that loads translations from
TOML files in tool_descriptions/. This enables non-English users to see
tool descriptions in their language.

- Add src/i18n.rs module with ToolDescriptions loader, locale detection
  (ZEROCLAW_LOCALE, LANG, LC_ALL env vars), and English fallback chain
- Add locale config field to Config struct for explicit locale override
- Create tool_descriptions/en.toml with all 47 tool descriptions
- Create tool_descriptions/zh-CN.toml with Chinese translations
- Integrate with ToolsSection::build() and build_tool_instructions()
  to resolve descriptions from locale files before hardcoded fallback
- Add PromptContext.tool_descriptions field for prompt-time resolution
- Add AgentBuilder.tool_descriptions() setter for Agent construction
- Include tool_descriptions/ in Cargo.toml package include list
- Add 8 unit tests covering locale loading, fallback chains, env
  detection, and config override

Closes #3901
2026-03-18 17:01:39 -04:00
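A minimal sketch of the locale fallback chain; the precedence order follows the env var list in the commit (ZEROCLAW_LOCALE, LANG, LC_ALL), and normalizing "zh_CN.UTF-8" to "zh-CN" is a simplification of whatever the real loader does:

```rust
use std::env;

/// Resolve the active locale: explicit config override first, then env
/// vars, falling back to "en" (the English fallback chain).
fn detect_locale(config_locale: Option<&str>) -> String {
    config_locale
        .map(str::to_owned)
        .or_else(|| env::var("ZEROCLAW_LOCALE").ok())
        .or_else(|| env::var("LANG").ok())
        .or_else(|| env::var("LC_ALL").ok())
        .map(|raw| {
            // Drop the encoding suffix and use BCP 47-style hyphens.
            raw.split('.').next().unwrap_or("en").replace('_', "-")
        })
        .filter(|locale| !locale.is_empty())
        .unwrap_or_else(|| "en".to_string())
}
```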
Argenis
92940a3d16
Merge pull request #3904 from zeroclaw-labs/fix/install-stale-build-cache
fix(install): clean stale build cache on upgrade
2026-03-18 15:49:10 -04:00
Argenis
d77c616905
fix: reset tool call dedup cache each iteration to prevent loops (#3910)
The seen_tool_signatures HashSet was initialized outside the iteration loop, causing cross-iteration deduplication of legitimate tool calls. This triggered a self-correction spiral where the agent repeatedly attempted skipped calls until hitting max_iterations.

Moving the HashSet inside the loop ensures deduplication only applies within a single iteration, as originally intended.

Fixes #3798
2026-03-18 15:45:10 -04:00
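A minimal sketch of the scoping fix; the call/dispatch types are stand-ins for the real agent loop:

```rust
use std::collections::HashSet;

struct ToolCall {
    name: String,
    args_json: String,
}

fn run_loop(iterations: Vec<Vec<ToolCall>>) {
    for calls in iterations {
        // Fix: the dedup set lives *inside* the iteration loop, so
        // identical calls are only skipped within a single model
        // response. Hoisting it above the loop (the original bug)
        // suppressed legitimate repeats across iterations and caused
        // the self-correction spiral.
        let mut seen_tool_signatures: HashSet<String> = HashSet::new();
        for call in calls {
            let signature = format!("{}:{}", call.name, call.args_json);
            if !seen_tool_signatures.insert(signature) {
                continue; // duplicate within this iteration only
            }
            println!("executing {}", call.name); // stand-in for dispatch
        }
    }
}
```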
Argenis
ac12470c27
fix(channels): respect ack_reactions config for Telegram channel (#3834) (#3913)
The Telegram channel was ignoring the ack_reactions setting because it
sent setMessageReaction API calls directly in its polling loop, bypassing
the top-level channels_config.ack_reactions check.

- Add optional ack_reactions field to TelegramConfig so it can be set
  under [channels_config.telegram] without "unknown key" warnings
- Add ack_reactions field and with_ack_reactions() builder to
  TelegramChannel, defaulting to true
- Guard try_add_ack_reaction_nonblocking() behind self.ack_reactions
- Wire channel-level override with fallback to top-level default
- Add config deserialization and channel behavior tests
2026-03-18 15:40:31 -04:00
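A sketch of the guard and override wiring described above; the builder shape follows the commit, the method bodies are reduced to the relevant parts:

```rust
pub struct TelegramChannel {
    ack_reactions: bool,
}

impl TelegramChannel {
    pub fn new() -> Self {
        // Defaults to true, per the commit.
        Self { ack_reactions: true }
    }

    pub fn with_ack_reactions(mut self, enabled: bool) -> Self {
        self.ack_reactions = enabled;
        self
    }

    fn try_add_ack_reaction_nonblocking(&self) {
        // Guard: skip the setMessageReaction call entirely when
        // reactions are disabled, instead of firing it unconditionally
        // from the polling loop.
        if !self.ack_reactions {
            return;
        }
        // ... fire the Telegram setMessageReaction request here ...
    }
}

/// Channel-level override with fallback to the top-level default.
fn effective_ack_reactions(channel_override: Option<bool>, top_level: bool) -> bool {
    channel_override.unwrap_or(top_level)
}
```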
Nim G
a322e01b5f feat(channel): add interrupt_on_new_message support for Mattermost 2026-03-18 16:40:25 -03:00
Argenis
c5a1148ae9
fix: ensure install.sh creates config.toml and workspace files (#3852) (#3906)
When running install.sh with --docker --skip-build --prefer-prebuilt
(especially with podman via ZEROCLAW_CONTAINER_CLI), the script would
skip creating config.toml and workspace scaffold files because these
were only generated by the onboard wizard, which requires an interactive
terminal or explicit API key.

Add ensure_default_config_and_workspace() that creates a minimal
config.toml (with provider, workspace_dir, and optional api_key/model)
and seeds the workspace directory structure (sessions/, memory/, state/,
cron/, skills/ subdirectories plus IDENTITY.md, USER.md, MEMORY.md,
AGENTS.md, and SOUL.md) when they don't already exist.

This function is called:
- At the end of run_docker_bootstrap(), so config and workspace files
  exist on the host volume regardless of whether onboard ran inside the
  container.
- After the [3/3] Finalizing setup onboard block in the native install
  path, covering --skip-build, --prefer-prebuilt, --skip-onboard, and
  cases where the binary wasn't found.

The function is idempotent: it only writes files that don't already
exist, so it never overwrites config or workspace files created by a
successful onboard run.

Also makes the container onboard failure non-fatal (|| true) so that
the fallback config generation always runs.

Fixes #3852
2026-03-18 15:15:47 -04:00
Argenis
440ad6e5b5
fix: handle double-serialized schedule in cron_add and cron_update (#3860) (#3905)
When LLMs pass the schedule parameter as a JSON string instead of a JSON
object, serde fails with "invalid type: string, expected internally
tagged enum Schedule". Add a deserialize_maybe_stringified helper that
detects stringified JSON values and parses the inner string before
deserializing, providing backward compatibility for both object and
string representations.

Fixes #3860
2026-03-18 15:15:22 -04:00
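A minimal sketch of the helper the commit names, using serde_json directly:

```rust
use serde::de::DeserializeOwned;
use serde_json::Value;

/// Accept either a JSON object or the same object double-serialized as
/// a string, e.g. {"cron":"0 9 * * *"} vs "{\"cron\":\"0 9 * * *\"}".
fn deserialize_maybe_stringified<T: DeserializeOwned>(
    value: Value,
) -> Result<T, serde_json::Error> {
    match value {
        // Stringified JSON: parse the inner string first.
        Value::String(inner) => serde_json::from_str(&inner),
        // Already an object (or other value): deserialize directly.
        other => serde_json::from_value(other),
    }
}
```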
Argenis
2e41cb56f6
fix: enable vision support for llamacpp provider (#3907)
The llamacpp provider was instantiated with vision disabled by default, causing image transfers from Telegram to fail. Use new_with_vision() with vision enabled, matching the behavior of other compatible providers.

Fixes #3802
2026-03-18 15:14:57 -04:00
Argenis
2227fadb66
fix(tools): include tool_search instruction in deferred tools system prompt (#3826) (#3914)
The deferred MCP tools section in the system prompt only listed tool
names inside <available-deferred-tools> tags without any instruction
telling the LLM to call tool_search to activate them. In daemon and
Telegram mode, where conversations are shorter and less guided, the
LLM never discovered it should call tool_search, so deferred tools
were effectively unavailable.

Add a "## Deferred Tools" heading with explicit instructions that
the LLM MUST call tool_search before using any listed tool. This
ensures the LLM knows to activate deferred tools in all modes
(CLI, daemon, Telegram) consistently.

Also add tests covering:
- Instruction presence in the deferred section
- Multiple-server deferred tool search
- Cross-server keyword search ranking
- Activation persistence across multiple tool_search calls
- Idempotent re-activation
2026-03-18 15:13:58 -04:00
Argenis
162efbb49c
fix(providers): recover from context window errors by truncating history (#3908)
When a provider returns a context-size-exceeded error, truncate the
oldest non-system messages from conversation history and retry instead
of immediately bailing out. This enables local models with small
context windows (llamafile, llama.cpp) to work by automatically
fitting the conversation within available context.

Closes #3894
2026-03-18 14:54:56 -04:00
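A minimal sketch of the truncation step, assuming history is ordered oldest-first and that `Message` is a stand-in for the real type:

```rust
struct Message {
    role: String,
    content: String,
}

/// Drop up to `drop_count` of the oldest non-system messages, returning
/// how many were actually removed. A return of 0 means nothing is left
/// to trim, so the caller should bail instead of retrying forever.
fn truncate_oldest_non_system(history: &mut Vec<Message>, drop_count: usize) -> usize {
    let mut dropped = 0;
    history.retain(|m| {
        if dropped < drop_count && m.role != "system" {
            dropped += 1;
            false // remove: oldest non-system entries go first
        } else {
            true // keep system prompt and newer messages
        }
    });
    dropped
}
```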
argenis de la rosa
3c8b6d219a fix(test): use PID-scoped script path to prevent ETXTBSY in CI
The echo_provider() test helper writes a fake_claude.sh script to
a shared temp directory. When lib and bin test binaries run in
parallel (separate processes, separate OnceLock statics), one
process can overwrite the script while the other is executing it,
causing "Text file busy" (ETXTBSY). Scope the filename with PID
to isolate each test process.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 14:33:04 -04:00
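A minimal sketch of the PID scoping; the filename follows the commit, the directory choice is an assumption:

```rust
use std::path::PathBuf;

/// Scope the fake CLI script per test process so parallel lib and bin
/// test binaries never overwrite a script the other is executing
/// (the ETXTBSY race described above).
fn fake_claude_script_path() -> PathBuf {
    let mut path = std::env::temp_dir();
    path.push(format!("fake_claude_{}.sh", std::process::id()));
    path
}
```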
Vasanth
58b98c59a8
feat(agent): add runtime model switching via model_switch tool (#3853)
Add support for switching AI models at runtime during a conversation.
The model_switch tool allows users to:
- Get current model state
- List available providers
- List models for a provider
- Switch to a different model

The switch takes effect immediately for the current conversation by
recreating the provider with the new model after tool execution.

Risk: Medium - internal state changes and provider recreation
2026-03-18 14:17:52 -04:00
argenis de la rosa
d72e9379f7 fix(install): clean stale build cache on upgrade
When upgrading an existing installation, stale build artifacts in
target/release/build/ can cause compilation failures (e.g.
libsqlite3-sys bindgen.rs not found). Run cargo clean --release
before building when an upgrade is detected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-18 14:15:59 -04:00
Alix-007
e3e9db5210 fix(openrouter): respect provider timeout config 2026-03-19 01:36:57 +08:00
Nim G
ad8a209bd7 feat(channel): add /stop command to cancel in-flight tasks
Adds an explicit /stop slash command that allows users on any non-CLI
channel (Matrix, Telegram, Discord, Slack, etc.) to cancel an agent
task that is currently running.

Changes:
- is_stop_command(): new helper that detects /stop (case-insensitive,
  optional @botname suffix), not gated on channel type
- /stop fast path in run_message_dispatch_loop: intercepts /stop before
  semaphore acquisition so the target task is never replaced in the store;
  fires CancellationToken on the running task; sends reply via tokio::spawn
  using the established two-step channel lookup pattern
- register_in_flight separated from interrupt_enabled: all non-CLI tasks
  now enter the in_flight_by_sender store, enabling /stop to reach them;
  auto-cancel-on-new-message remains gated on interrupt_enabled (Telegram/
  Slack only) — this is a deliberate broadening, not a side effect

Deferred to follow-up (feat/matrix-interrupt-on-new-message):
- interrupt_on_new_message config field for Matrix
- thread-aware interruption_scope_key (requires per-channel thread_ts
  semantics analysis; Slack always sets thread_ts, Matrix only for replies)

Supersedes #2855

Tests: 7 new unit tests for is_stop_command; all 4075 tests pass.
2026-03-18 12:32:20 -04:00
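A minimal sketch of the detection helper described above; matching details beyond "case-insensitive, optional @botname suffix" are assumptions:

```rust
/// Detect an explicit /stop command: "/stop", "/STOP", "/stop@my_bot".
/// Not gated on channel type, per the commit above.
fn is_stop_command(text: &str) -> bool {
    let lower = text.trim().to_ascii_lowercase();
    lower == "/stop" || lower.starts_with("/stop@")
}
```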
Nim G
571ccd67cb feat(channel): add /stop command to cancel in-flight tasks
Adds an explicit /stop slash command that allows users on any non-CLI
channel (Matrix, Telegram, Discord, Slack, etc.) to cancel an agent
task that is currently running.

Changes:
- is_stop_command(): new helper that detects /stop (case-insensitive,
  optional @botname suffix), not gated on channel type
- /stop fast path in run_message_dispatch_loop: intercepts /stop before
  semaphore acquisition so the target task is never replaced in the store;
  fires CancellationToken on the running task; sends reply via tokio::spawn
  using the established two-step channel lookup pattern
- register_in_flight separated from interrupt_enabled: all non-CLI tasks
  now enter the in_flight_by_sender store, enabling /stop to reach them;
  auto-cancel-on-new-message remains gated on interrupt_enabled (Telegram/
  Slack only) — this is a deliberate broadening, not a side effect

Deferred to follow-up (feat/matrix-interrupt-on-new-message):
- interrupt_on_new_message config field for Matrix
- thread-aware interruption_scope_key (requires per-channel thread_ts
  semantics analysis; Slack always sets thread_ts, Matrix only for replies)

Supersedes #2855

Tests: 7 new unit tests for is_stop_command; all 4075 tests pass.
2026-03-18 13:29:40 -03:00
Argenis
959b933841
fix(providers): preserve conversation context in Claude Code CLI (#3885)
* fix(providers): preserve conversation context in Claude Code CLI provider

Override chat_with_history to format full multi-turn conversation
history into a single prompt for the claude CLI, instead of only
forwarding the last user message.

Closes #3878

* fix(providers): fix ETXTBSY race in claude_code tests

Use OnceLock to initialize the fake_claude.sh test script exactly
once, preventing "Text file busy" errors when parallel tests
concurrently write and execute the same script file.
2026-03-18 11:13:42 -04:00
Argenis
caf7c7194f
fix(cron): prevent one-shot jobs from re-executing indefinitely (#3886)
Handle Schedule::At jobs in reschedule_after_run by disabling them
instead of rescheduling to a past timestamp. Also add a fallback in
persist_job_result to disable one-shot jobs if removal fails.

Closes #3868
2026-03-18 11:03:44 -04:00
Argenis
ee7d542da6
fix: pass route-specific api_key through channel provider creation (#3881)
When using Channel mode with dynamic classification and routing, the
route-specific `api_key` from `[[model_routes]]` was silently dropped.
The system always fell back to the global `api_key`, causing 401 errors
when routing to `custom:` providers that require distinct credentials.

Root cause: `ChannelRouteSelection` only stored provider + model, and
`get_or_create_provider` always used `ctx.api_key` (the global key).

Changes:
- Add `api_key` field to `ChannelRouteSelection` so the matched route's
  credential survives through to provider creation.
- Update `get_or_create_provider` to accept and prefer a route-specific
  `api_key` over the global key.
- Use a composite cache key (provider name + api_key hash) to prevent
  cache poisoning when multiple routes target the same provider with
  different credentials.
- Wire the route api_key through query classification matching and the
  `/model` (SetModel) command path.

Fixes #3838
2026-03-18 10:06:06 -04:00
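A minimal sketch of the composite cache key; DefaultHasher is illustrative (it is process-local, which is fine for an in-memory client cache), and the real code may hash the credential differently:

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};

/// Provider name plus a hash of the credential, so two routes hitting
/// the same provider with different api_keys never share a cached (and
/// wrongly authenticated) client.
fn provider_cache_key(provider: &str, api_key: Option<&str>) -> String {
    let mut hasher = DefaultHasher::new();
    api_key.unwrap_or("").hash(&mut hasher);
    format!("{provider}:{:x}", hasher.finish())
}
```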
Argenis
d51ec4b43f
fix(docker): remove COPY commands for dockerignored paths (#3880)
The Dockerfile and Dockerfile.debian COPY `firmware/`, `crates/robot-kit/`,
and `crates/robot-kit/Cargo.toml`, but `.dockerignore` excludes both
`firmware/` and `crates/robot-kit/`, causing COPY failures during build.

Since these are hardware-only paths not needed for the Docker runtime:
- Remove COPY commands for `firmware/` and `crates/robot-kit/`
- Remove dummy `crates/robot-kit/src` creation in dep-caching steps
- Use sed to strip `crates/robot-kit` from workspace members in the
  copied Cargo.toml so Cargo doesn't look for the missing manifest

Fixes #3836
2026-03-18 10:06:03 -04:00
Alix-007
d81eeefe52 fix(tools): normalize workspace-prefixed paths 2026-03-18 10:35:25 +08:00
Argenis
3d92b2a652
Merge pull request #3833 from zeroclaw-labs/fix/pairing-code-display
fix(web): display pairing code in dashboard
2026-03-17 22:16:50 -04:00
argenis de la rosa
3255051426 fix(web): display pairing code in dashboard instead of terminal-only
Fetch the current pairing code from GET /admin/paircode (localhost-only)
and display it in both the initial PairingDialog and the /pairing
management page. Users no longer need to check the terminal to find
the 6-digit code — it appears directly in the web UI.

Falls back gracefully when the admin endpoint is unreachable (e.g.
non-localhost access), showing the original "check your terminal" prompt.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 22:01:03 -04:00
Argenis
dcaf330848
Merge pull request #3828 from zeroclaw-labs/fix/readme
fix(readme): update social links across all locales
2026-03-17 19:15:29 -04:00
argenis de la rosa
7f8de5cb17 fix(readme): update Facebook group URL and add Discord, TikTok, RedNote badges
Update Facebook group link from /groups/zeroclaw to /groups/zeroclawlabs
across all 31 README locale files. Add Discord, TikTok, and RedNote
social badges to the badge section of all READMEs.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 19:02:53 -04:00
Argenis
1341cfb296
Merge pull request #3827 from zeroclaw-labs/feat/plugin-wasm
feat(plugins): add WASM plugin system with Extism runtime
2026-03-17 18:51:41 -04:00
argenis de la rosa
9da620a5aa fix(ci): add cargo-audit ignore for wasmtime vulns from extism
cargo-audit uses .cargo/audit.toml (not deny.toml) for its ignore
list. These 3 wasmtime advisories are transitive via extism 1.13.0
with no upstream fix available. Plugin system is feature-gated.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:38:02 -04:00
argenis de la rosa
d016e6b1a0 fix(ci): ignore wasmtime vulns from extism 1.13.0 (no upstream fix)
RUSTSEC-2026-0006, RUSTSEC-2026-0020, RUSTSEC-2026-0021 are all in
wasmtime 37.x pinned by extism. No newer extism release available.
Plugin system is behind a feature flag to limit exposure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:35:08 -04:00
argenis de la rosa
9b6360ad71 fix(ci): ignore unmaintained transitive deps from extism and indicatif
Add cargo-deny ignore entries for RUSTSEC-2024-0388 (derivative),
RUSTSEC-2025-0057 (fxhash), and RUSTSEC-2025-0119 (number_prefix).
All are transitive dependencies we cannot directly control.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:33:03 -04:00
argenis de la rosa
dc50ca9171 fix(plugins): update lockfile and fix ws.rs formatting
Sync Cargo.lock with new Extism/WASM plugin dependencies and apply
rustfmt line-wrap fix in gateway WebSocket handler.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:30:41 -04:00
argenis de la rosa
67edd2bc60 fix(plugins): integrate WASM tools into registry, add gateway routes and tests
- Wire WASM plugin tools into all_tools_with_runtime() behind
  cfg(feature = "plugins-wasm"), discovering and registering tool-capable
  plugins from the configured plugins directory at startup.
- Add /api/plugins gateway endpoint (cfg-gated) for listing plugin status.
- Add mod plugins declaration to main.rs binary crate so crate::plugins
  resolves when the feature is enabled.
- Add unit tests for PluginHost: empty dir, manifest discovery, capability
  filtering, lookup, and removal.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:10:24 -04:00
argenis de la rosa
dcf66175e4 feat(plugins): add example weather plugin and manifest
Add a standalone example plugin demonstrating the WASM plugin interface:
- example-plugin/Cargo.toml: cdylib crate targeting wasm32-wasip1
- example-plugin/src/lib.rs: mock weather tool using extism-pdk
- example-plugin/manifest.toml: plugin manifest declaring tool capability

This crate is intentionally NOT added to the workspace members since it
targets wasm32-wasip1 and would break the main build.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:09:54 -04:00
argenis de la rosa
b3bb79d805 feat(plugins): add PluginHost, WasmTool, and WasmChannel bridges
Implement the core plugin infrastructure:
- PluginHost: discovers plugins from the workspace plugins directory,
  loads manifest.toml files, supports install/remove/list/info operations
- WasmTool: bridges WASM plugins to the Tool trait (execute stub pending
  Extism runtime wiring)
- WasmChannel: bridges WASM plugins to the Channel trait (send/listen
  stubs pending Extism runtime wiring)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:09:54 -04:00
argenis de la rosa
c857b64bb4 feat(plugins): add Extism dependency, feature flag, and plugin module skeleton
Introduce the WASM plugin system foundation:
- Add extism 1.9 as an optional dependency behind `plugins-wasm` feature
- Create `src/plugins/` module with manifest types, error types, and stub host
- Add `Plugin` CLI subcommands (list, install, remove, info) behind cfg gate
- Add `PluginsConfig` to the config schema with sensible defaults

All plugin code is behind `#[cfg(feature = "plugins-wasm")]` so the default
build is unaffected.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 18:09:54 -04:00
Argenis
c051f0323e
Merge pull request #3822 from zeroclaw-labs/feat/pairing-dashboard
feat(gateway): add pairing dashboard with device management
2026-03-17 17:54:28 -04:00
Argenis
dea5c67ab0
Merge pull request #3821 from zeroclaw-labs/feat/self-test-update
feat(cli): add self-test and update commands
2026-03-17 17:54:25 -04:00
Argenis
a14afd7ef9
Merge pull request #3820 from zeroclaw-labs/feat/docker-fix
feat(docker): web-builder stage, healthcheck probe, resource limits
2026-03-17 17:54:22 -04:00
argenis de la rosa
4455b24056 fix(pairing): add SQLite persistence, fix config defaults, align with plan
- Add SQLite persistence to DeviceRegistry (backed by rusqlite)
- Rename config fields: ttl_secs -> code_ttl_secs, max_pending -> max_pending_codes, max_attempts -> max_failed_attempts
- Update defaults: code_length 6 -> 8, ttl_secs 300 -> 3600, max_pending 10 -> 3
- Add attempts tracking to PendingPairing struct
- Add token_hash() and authenticate_and_hash() to PairingGuard
- Fix route paths: /api/pairing/submit -> /api/pair, /api/devices/{id}/rotate -> /api/devices/{id}/token/rotate
- Add QR code placeholder to Pairing.tsx
- Pass workspace_dir to DeviceRegistry constructor

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:44:55 -04:00
argenis de la rosa
8ec6522759 fix(gateway): add new fields to test AppState and GatewayConfig constructors
Add device_registry, pending_pairings to test AppState instances and
pairing_dashboard to test GatewayConfig to fix compilation of tests
after the new pairing dashboard fields were introduced.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:36:01 -04:00
argenis de la rosa
a818edb782 feat(web): add pairing dashboard page
Add Pairing page with device list table, pairing code generation,
and device revocation. Create useDevices hook for reusable device
fetching. Wire /pairing route into App.tsx router.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:35:39 -04:00
argenis de la rosa
e0af3d98dd feat(gateway): extend WebSocket handshake with optional connect params
Add ConnectParams struct for an optional first-frame connect handshake.
If the first WebSocket message is {"type":"connect",...}, connection
parameters (session_id, device_name, capabilities) are extracted and
a "connected" ack is sent back. Old clients sending "message" first
still work unchanged (backward-compatible).

Extract process_chat_message() helper to avoid duplication between
fallback first-message handling and the main message loop.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:35:39 -04:00
argenis de la rosa
48bdbde26c feat(gateway): add device registry and pairing API handlers
Introduce DeviceRegistry, PairingStore, and five new API endpoints:
- POST /api/pairing/initiate — generate a new pairing code
- POST /api/pairing/submit — submit code with device metadata
- GET /api/devices — list paired devices
- DELETE /api/devices/{id} — revoke a paired device
- POST /api/devices/{id}/rotate — rotate a device token

Wire into AppState and gateway router. Registry is only created
when require_pairing is enabled.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:34:56 -04:00
argenis de la rosa
dc495a105f feat(config): add PairingDashboardConfig to gateway schema
Add PairingDashboardConfig struct with configurable code_length,
ttl_secs, max_pending, max_attempts, and lockout_secs fields.
Nested under GatewayConfig as `pairing_dashboard` with serde defaults.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:32:27 -04:00
argenis de la rosa
fe9addcfe0 fix(cli): align self-test and update commands with implementation plan
- Export commands module from lib.rs (pub mod commands) for external consumers
- Add --force and --version flags to the Update CLI command
- Wire version parameter through to check() and run() in update.rs,
  supporting targeted version fetches via GitHub releases/tags API
- Add WebSocket handshake check (check_websocket_handshake) to the full
  self-test suite in self_test.rs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:24:59 -04:00
argenis de la rosa
5bfa5f18e1 feat(cli): add update command with 6-phase pipeline and rollback
Add `zeroclaw update` command with a 6-phase self-update pipeline:
1. Preflight — check GitHub releases API for newer version
2. Download — fetch platform-specific binary to temp dir
3. Backup — copy current binary to .bak for rollback
4. Validate — size check + --version smoke test on download
5. Swap — overwrite current binary with new version
6. Smoke test — verify updated binary runs, rollback on failure

Supports --check flag for update-check-only mode without installing.
Includes version comparison logic with unit tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:24:58 -04:00
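A minimal sketch of the kind of version comparison the preflight phase needs; pre-release and beta handling are omitted, and the real logic is presumably richer:

```rust
/// Compare dotted versions numerically, so "0.4.10" > "0.4.3".
fn is_newer(candidate: &str, current: &str) -> bool {
    let parse = |v: &str| -> Vec<u64> {
        v.trim_start_matches('v')
            .split('.')
            .filter_map(|part| part.parse().ok())
            .collect()
    };
    // Lexicographic comparison of numeric components.
    parse(candidate) > parse(current)
}
```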
argenis de la rosa
72b7e1e647 feat(cli): add self-test command with quick and full modes
Add `zeroclaw self-test` command with two modes:
- Quick mode (--quick): 8 offline checks including config, workspace,
  SQLite, provider/tool/channel registries, security policy, and version
- Full mode (default): adds gateway health and memory round-trip checks

Creates src/commands/ module structure with self_test and update stubs.
Adds indicatif and tempfile runtime dependencies for the update pipeline.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 17:24:58 -04:00
argenis de la rosa
413c94befe chore(docker): tighten compose resource limits
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 16:02:15 -04:00
argenis de la rosa
5aa6026fa1 feat(cli): add status --format=exit-code for Docker healthcheck
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 16:02:15 -04:00
argenis de la rosa
6eca841bd7 feat(docker): add web-builder stage and update .dockerignore
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 16:02:15 -04:00
Argenis
50e8d4f5f8
fix(ci): use pre-built binaries for Debian Docker image (#3814)
The Debian compatibility image was building from source with QEMU
cross-compilation for ARM64, which is extremely slow and was getting
cancelled by the concurrency group. Switch to using pre-built binaries
(same as the distroless image) with a debian:bookworm-slim runtime base.

- Add Dockerfile.debian.ci (mirrors Dockerfile.ci with Debian runtime)
- Update release-beta-on-push.yml to use docker-ctx + pre-built bins
- Update release-stable-manual.yml with same fix
- Drop GHA cache layers (no longer building from source)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 15:21:15 -04:00
Argenis
fc2aac7c94
feat(gateway): persist WS chat sessions across restarts (#3813)
Gateway WebSocket chat sessions were in-memory only — conversation
history was lost on gateway restart, macOS sleep/wake, or client
reconnect. This wires up the existing SessionBackend (SQLite) to
the gateway WS handler so sessions survive restarts and reconnections.

Changes:
- Add delete_session() to SessionBackend trait + SQLite implementation
- Add session_persistence and session_ttl_hours to GatewayConfig
- Add Agent::seed_history() to hydrate agent from persisted messages
- Initialize SqliteSessionBackend in run_gateway() when enabled
- Send session_start message on WS connect with session_id + resumed
- Persist user/assistant messages after each turn
- Add GET /api/sessions and DELETE /api/sessions/{id} REST endpoints
- Bump version to 0.5.0

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 14:26:39 -04:00
Argenis
4caa3f7e6f
fix(web): remove duplicate dashboard keys in Turkish locale (#3812)
The Turkish (tr) locale section had a duplicate "Dashboard specific
labels" block that repeated 19 keys already defined earlier, causing
TypeScript error TS1117. Moved the unique keys (provider_model,
paired_yes, etc.) into the primary dashboard section and removed
the duplicate block.

Fixes build failure introduced by #3777.
2026-03-17 14:13:46 -04:00
Argenis
3bc6ec3cf5
fix: only tweet for stable releases, not beta builds (#3808)
Remove tweet job from beta workflow. Update tweet-release.yml to diff
against previous stable tag (excluding betas) to capture all features
across the full release cycle. Simplify tweet format to feature-focused
style without contributor counts.

Supersedes #3575.
2026-03-17 14:06:46 -04:00
Argenis
f3fbd1b094
fix(web): preserve provider runtime options in ws agent (#3807)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 14:06:22 -04:00
Yingpeng MA
79e8252d7a
feat(web/i18n): add full Chinese locale and complete Turkish translations (#3777)
- Add comprehensive Simplified Chinese (zh) translations for all UI strings
- Extend and complete Turkish (tr) translations
- Fill in missing English (en) translation keys
- Reset default locale to 'en'
- Update language toggle to cycle through all three locales: en → zh → tr
2026-03-17 13:40:25 -04:00
Marijan Petričević
924521c927
config/schema: add serde default to AutonomyConfig (#3691)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-17 13:40:18 -04:00
Argenis
07ca270f03
fix(security): restore tokens.is_empty() guard, add re-pairing hint (#3738)
Revert "always generate pairing code" to tighter security posture:
codes are only generated on first startup when no tokens exist. Add
a CLI hint to the gateway banner so operators know how to re-pair
on demand. Fix install.sh to not use --new on fresh install (avoids
invalidating the auto-generated code). Fix onboard to show an
informational message instead of a throwaway PairingGuard.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-17 13:40:02 -04:00
Alix-007
e08091a2e2
fix(install): print PATH guidance after cargo install (#3769)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:39:53 -04:00
Alix-007
1f1123d071
fix(channels): allow low-risk shell in non-interactive mode (#3771)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:39:37 -04:00
Alix-007
d5bc46238a
fix(install): skip prebuilt flow on musl (#3788)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:39:29 -04:00
Alix-007
843973762a
ci(docker): publish debian compatibility image (#3789)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:39:20 -04:00
Alix-007
5f8d7d7347
fix(daemon): preserve deferred MCP tools in /api/chat (#3790)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:39:12 -04:00
Alix-007
7b3bea8d01
fix(agent): resolve deferred MCP tools by suffix (#3793)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:39:03 -04:00
Alix-007
ac461dc704
fix(docker): align debian image glibc baseline (#3794)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:38:55 -04:00
Alix-007
f04e56d9a1
feat(skills): support YAML frontmatter in SKILL.md (#3797)
* feat(skills): support YAML frontmatter in SKILL.md

* fix(skills): preserve nested open-skill names

---------

Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:38:49 -04:00
Alix-007
1d6f482b04
fix(build): rerun embedded web assets when dist changes (#3799)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:38:40 -04:00
Alix-007
ba6d0a4df9
fix(release): include matrix channel in official builds (#3800)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 13:38:33 -04:00
Alix-007
3cf873ab85
fix(groq): fall back on tool validation 400s (#3778)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-17 09:23:39 -04:00
Argenis
025724913d
feat(runtime): add configurable reasoning effort (#3785)
* feat(runtime): add configurable reasoning effort

* fix(test): add missing reasoning_effort field in live test

Add reasoning_effort: None to ProviderRuntimeOptions construction in
openai_codex_vision_e2e.rs to fix E0063 compile error.

---------

Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
2026-03-17 09:21:53 -04:00
project516
49dd4cd9da
Change AppImage to tar.gz in arduino-uno-q-setup.md (#3754)
Arduino App Lab is a tar.gz file for Linux, not an AppImage

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-17 09:19:38 -04:00
dependabot[bot]
0664a5e854
chore(deps): bump rust from 7d37016 to da9dab7 (#3776)
Bumps rust from `7d37016` to `da9dab7`.

---
updated-dependencies:
- dependency-name: rust
  dependency-version: 1.94-slim
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-17 09:16:21 -04:00
Argenis
acd09fbd86
feat(ci): use pre-built binaries for Docker images (#3784)
Instead of compiling Rust from source inside Docker (~60 min),
download the already-built linux binaries from the build matrix
and copy them into a minimal distroless image (~2 min).

- Add Dockerfile.ci for release workflows (no Rust toolchain needed)
- Update both beta and stable workflows to use pre-built artifacts
- Drop Docker job timeout from 60 to 15 minutes
- Original Dockerfile unchanged for local dev builds
2026-03-17 09:03:13 -04:00
Alix-007
0f7d1fceeb
fix(channels): hide tool-call notifications by default (#3779)
Co-authored-by: Alix-007 <267018309+Alix-007@users.noreply.github.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-17 08:52:49 -04:00
GhostC
01e13ac92d
fix(skills): allow sibling markdown links within skills root (#3781)
Made-with: Cursor
2026-03-17 08:31:20 -04:00
Argenis
a9a6113093
fix(docs): revert unauthorized CLAUDE.md additions from #3604 (#3761)
PR #3604 included CLAUDE.md changes referencing non-existent modules
(src/security/taint.rs, src/sop/workflow.rs) and duplicating content
from CONTRIBUTING.md. These additions violate the anti-pattern rule
against modifying CLAUDE.md in feature PRs.
2026-03-17 01:56:51 -04:00
Giulio V
906951a587
feat(multi): LinkedIn tool, WhatsApp voice notes, and Anthropic OAuth fix (#3604)
* feat(tools): add native LinkedIn integration tool

Add a config-gated LinkedIn tool that enables ZeroClaw to interact with
LinkedIn's REST API via OAuth2. Supports creating posts, listing own
posts, commenting, reacting, deleting posts, viewing engagement stats,
and retrieving profile info.

Architecture:
- linkedin.rs: Tool trait impl with action-dispatched design
- linkedin_client.rs: OAuth2 token management and API wrappers
- Config-gated via [linkedin] enabled = false (default off)
- Credentials loaded from workspace .env file
- Automatic token refresh with line-targeted .env update

39 unit tests covering security enforcement, parameter validation,
credential parsing, and token management.

* feat(linkedin): configurable content strategy and API version

- Expand LinkedInConfig with api_version and nested LinkedInContentConfig
  (rss_feeds, github_users, github_repos, topics, persona, instructions)
- Add get_content_strategy tool action so agents can read config at runtime
- Fix hardcoded LinkedIn API version 202402 (expired) → configurable,
  defaulting to 202602
- LinkedInClient accepts api_version as parameter instead of static header
- 4 new tests (43 total), all passing

* feat(linkedin): add multi-provider image generation for posts

Add ImageGenerator with provider chain (DALL-E, Stability AI, Imagen, Flux)
and SVG fallback card. LinkedIn tool create_post now supports generate_image
parameter. Includes LinkedIn image upload (register → upload → reference),
configurable provider priority, and 14 new tests.

* feat(whatsapp): add voice note transcription and TTS voice replies

- Add STT support: download incoming voice notes via wa-rs, transcribe
  with OpenAI Whisper (or Groq), send transcribed text to agent
- Add TTS support: synthesize agent replies to Opus audio via OpenAI
  TTS, upload encrypted media, send as WhatsApp voice note (ptt=true)
- Voice replies only trigger when user sends a voice note; text
  messages get text replies only. Flag is consumed after one use to
  prevent multiple voice notes per agent turn
- Fix transcription module to support OpenAI API key (not just Groq):
  auto-detect provider from API URL, check ANTHROPIC_OAUTH_TOKEN /
  OPENAI_API_KEY / GROQ_API_KEY env vars in priority order
- Add optional api_key field to TranscriptionConfig for explicit key
- Add response_format: opus to OpenAI TTS for WhatsApp compatibility
- Add channel capability note so agent knows TTS is automatic
- Wire transcription + TTS config into WhatsApp Web channel builder

* fix(providers): prefer ANTHROPIC_OAUTH_TOKEN over global api_key

When the Anthropic provider is used alongside a non-Anthropic primary
provider (e.g. custom: gateway), the global api_key would be passed
as credential override, bypassing provider-specific env vars. This
caused Claude Code subscription tokens (sk-ant-oat01-*) to be ignored
in favor of the unrelated gateway JWT.

Fix: for the anthropic provider, check ANTHROPIC_OAUTH_TOKEN and
ANTHROPIC_API_KEY env vars before falling back to the credential
override. This mirrors the existing MiniMax OAuth pattern and enables
subscription-based auth to work as a fallback provider.

* feat(linkedin): add scheduled post support via LinkedIn API

Add scheduled_at parameter to create_post and create_post_with_image.
When provided (RFC 3339 timestamp), the post is created as a DRAFT
with scheduledPublishOptions so LinkedIn publishes it automatically
at the specified time. This enables the cron job to schedule a week
of posts in advance directly on LinkedIn.

* fix(providers): prefer env vars for openai and groq credential resolution

Generalize the Anthropic OAuth fix to also cover openai and groq
providers. When used alongside a non-matching primary provider (e.g.
a custom: gateway), the global api_key would be passed as credential
override, causing auth failures. Now checks provider-specific env
vars (OPENAI_API_KEY, GROQ_API_KEY) before falling back to the
credential override.

* fix(whatsapp): debounce voice replies to voice final answer only

The voice note TTS was triggering on the first send() call, which was
often intermediate tool output (URLs, JSON, web fetch results) rather
than the actual answer. This produced incomprehensible voice notes.

Fix: accumulate substantive replies (>30 chars, not URLs/JSON/code)
in a pending_voice map. A spawned debounce task waits 4 seconds after
the last substantive message, then synthesizes and sends ONE voice
note with the final answer. Intermediate tool outputs are skipped.

This ensures the user hears the actual answer in the correct language,
not raw tool output in English.

* fix(whatsapp): voice in = voice out, text in = text out

Rewrite voice reply logic with clean separation:
- Voice note received: ALL text output suppressed. Latest message
  accumulated silently. After 5s of no new messages, ONE voice note
  sent with the final answer. No tool outputs, no text, just voice.
- Text received: normal text reply, no voice.

Atomic debounce: multiple spawned tasks race but only one can extract
the pending message (remove-inside-lock pattern). Prevents duplicate
voice notes.

* fix(whatsapp): voice replies send both text and voice note

Voice note in → text replies sent normally in real-time PLUS one
voice note with the final answer after 10s debounce. Only substantive
natural-language messages are voiced (tool outputs, URLs, JSON, code
blocks filtered out). Longer debounce (10s) ensures the agent
completes its full tool chain before the voice note fires.

Text in → text out only, no voice.

* fix(channels): suppress tool narration and ack reactions

- Add system prompt instruction telling the agent to NEVER narrate
  tool usage (no "Let me fetch..." or "I will use http_request...")
- Disable ack_reactions (emoji reactions on incoming messages)
- Users see only the final answer, no intermediate steps

* docs(claude): add full CONTRIBUTING.md guidelines to CLAUDE.md

Add PR template requirements, code naming conventions, architecture
boundary rules, validation commands, and branch naming guidance
directly to CLAUDE.md for AI assistant reference.

* fix(docs): add blank lines around headings in CLAUDE.md for markdown lint

* fix(channels): strengthen tool narration suppression and fix large_futures

- Move anti-narration instruction to top of channel system prompt
- Add emphatic instruction for WhatsApp/voice channels specifically
- Add outbound message filter to strip tool-call-like patterns (🔧)

- Box::pin the two-phase heartbeat agent::run call (16664 bytes on Linux)
2026-03-17 01:55:05 -04:00
Giulio V
220745e217
feat(channels): add Reddit, Bluesky, and generic Webhook adapters (#3598)
* feat(channels): add Reddit, Bluesky, and generic Webhook adapters

- Reddit: OAuth2 polling for mentions/DMs/replies, comment and DM sending
- Bluesky: AT Protocol session auth, notification polling, post replies
- Webhook: Axum HTTP server for inbound, configurable outbound POST/PUT
- All three follow existing channel patterns with tests

* fix(channels): use neutral test fixtures and improve test naming in webhook
2026-03-17 01:26:58 -04:00
Giulio V
61de3d5648
feat(knowledge): add knowledge graph for expertise capture and reuse (#3596)
* feat(knowledge): add knowledge graph for expertise capture and reuse

SQLite-backed knowledge graph system for consulting firms to capture,
organize, and reuse architecture decisions, solution patterns, lessons
learned, and expert matching across client engagements.

- KnowledgeGraph (src/memory/knowledge_graph.rs): node CRUD, edge
  creation, FTS5 full-text search, tag filtering, subgraph traversal,
  expert ranking by authored contributions, graph statistics
- KnowledgeTool (src/tools/knowledge_tool.rs): Tool trait impl with
  capture, search, relate, suggest, expert_find, lessons_extract, and
  graph_stats actions
- KnowledgeConfig (src/config/schema.rs): disabled by default,
  configurable db_path/max_nodes, cross_workspace_search off by default
  for client data isolation
- Wired into tools factory (conditional on config.knowledge.enabled)

20 unit tests covering node CRUD, edge creation, search ranking,
subgraph queries, expert ranking, and tool actions.

* fix: address CodeRabbit review findings

- Fix UTF-8 truncation panic in truncate_str by using char-based
  iteration instead of byte indexing
- Add config validation for knowledge.max_nodes > 0
- Add subgraph depth boundary validation (must be > 0, capped at 100)

* fix(knowledge): address remaining CodeRabbit review issues

- MAJOR: Add db_path non-empty validation in Config::validate()
- MAJOR: Reject tags containing commas in add_node (comma is separator)
- MAJOR: Fix subgraph depth boundary (0..depth instead of 0..=depth)
- MAJOR: Apply project and node_type filters consistently in both
  tag-only and similarity search paths

* fix: correct subgraph traversal test assertion and sync CI workflows
2026-03-17 01:11:29 -04:00
Giulio V
675a5c9af0
feat(tools): add Google Workspace CLI (gws) integration (#3616)
* feat(tools): add Google Workspace CLI (gws) integration

Adds GoogleWorkspaceTool for interacting with Google Drive, Sheets,
Gmail, Calendar, Docs, and other Workspace services via CLI.

- Config-gated (google_workspace.enabled)
- Service allowlist for restricted access
- Requires shell access for CLI delegation
- Input validation against shell injection
- Wrong-type rejection for all optional parameters
- Config validation for allowed_services (empty, duplicate, malformed)
- Registered in integrations registry and CLI discovery

Closes #2986

* style: fix cargo fmt + clippy violations

* feat(google-workspace): expand config with auth, rate limits, and audit settings

* fix(tools): define missing GWS_TIMEOUT_SECS constant

* fix: Box::pin large futures and resolve duplicate Default impl

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-17 00:52:59 -04:00
Giulio V
b099728c27
feat(stt): multi-provider STT with TranscriptionProvider trait (#3614)
* feat(stt): add multi-provider STT with TranscriptionProvider trait

Refactors single-endpoint transcription to support multiple providers:
Groq (existing), OpenAI Whisper, Deepgram, AssemblyAI, and Google Cloud
Speech-to-Text. Adds TranscriptionManager for provider routing with
backward-compatible config fields.

* style: fix cargo fmt + clippy violations

* fix: Box::pin large futures and resolve merge conflicts with master

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-17 00:33:41 -04:00
Alix-007
f87c7442b9 chore(pr): restore merge-base docs file for #3356 2026-03-17 12:11:35 +08:00
Alix-007
0a191fc02c chore(pr): drop unrelated docs delta from #3356 2026-03-17 12:08:20 +08:00
Argenis
1ca2092ca0
test(channel): add QQ markdown msg_type regression test (#3752)
Verify that QQ send body uses msg_type 2 with nested markdown object
instead of msg_type 0 with top-level content. Adapted from #3668.
2026-03-16 22:03:43 -04:00
Giulio V
5e3308eaaa
feat(providers): add Claude Code, Gemini CLI, and KiloCLI subprocess providers (#3615)
* feat(providers): add Claude Code, Gemini CLI, and KiloCLI subprocess providers

Adds three new local subprocess-based providers for AI CLI tools.
Each provider spawns the CLI as a child process, communicates via
stdin/stdout pipes, and parses responses into ChatResponse format.

* fix: resolve clippy unnecessary_debug_formatting and rustfmt violations

* fix: resolve remaining clippy unnecessary_debug_formatting in CLI providers

* fix(providers): add AiAgent CLI category for subprocess providers
2026-03-16 21:51:05 -04:00
Chris Hengge
ec255ad788
fix(tool): expand cron_add and cron_update parameter schemas (#3671)
The schedule field in cron_add used a bare {"type":"object"} with a
description string encoding a tagged union in pseudo-notation. The patch
field in cron_update was an opaque {"type":"object"} despite CronJobPatch
having nine fully-typed fields. Both gaps cause weaker instruction-following
models to produce malformed or missing nested JSON when invoking these tools.

Changes:
- cron_add: expand schedule into a oneOf discriminated union with explicit
  properties and required fields for each variant (cron/at/every), matching
  the Schedule enum in src/cron/types.rs exactly
- cron_add: add descriptions to all previously undocumented top-level fields
- cron_add: expand delivery from a bare inline comment to fully-specified
  properties with per-field descriptions
- cron_update: expand patch from opaque object to full properties matching
  CronJobPatch (name, enabled, command, prompt, model, session_target,
  delete_after_run, schedule, delivery)
- cron_update: schedule inside patch mirrors the same oneOf expansion
- Both: add inline NOTE comments flagging that oneOf is correct for
  OpenAI-compatible APIs but SchemaCleanr::clean_for_gemini must be
  applied if Gemini native tool calling is ever wired up
- Both: add schema-shape tests using the existing test_config/test_security
  helper pattern, covering oneOf variant structure, required fields, and
  delivery channel enum completeness

No behavior changes. No new dependencies. Backward compatible: the runtime
deserialization path (serde on Schedule/CronJobPatch) is unchanged.

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-16 21:45:49 -04:00
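A trimmed sketch of what the oneOf expansion for `schedule` can look like, built with serde_json; the variant field names (expr, timestamp, seconds) are assumptions, since the real schema mirrors the Schedule enum in src/cron/types.rs:

```rust
use serde_json::json;

fn schedule_schema() -> serde_json::Value {
    json!({
        "oneOf": [
            {
                "type": "object",
                "properties": {
                    "type": { "const": "cron" },
                    "expr": { "type": "string", "description": "5-field cron expression" }
                },
                "required": ["type", "expr"]
            },
            {
                "type": "object",
                "properties": {
                    "type": { "const": "at" },
                    "timestamp": { "type": "string", "description": "RFC 3339 one-shot time" }
                },
                "required": ["type", "timestamp"]
            },
            {
                "type": "object",
                "properties": {
                    "type": { "const": "every" },
                    "seconds": { "type": "integer", "minimum": 1 }
                },
                "required": ["type", "seconds"]
            }
        ]
    })
}
```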
Sid Jain
7182f659ce
fix(slack): honor mention_only in runtime channel wiring (#3715)
* feat(slack): wire mention_only group reply policy

* feat(slack): expose mention_only in config and wizard defaults
2026-03-16 21:40:47 -04:00
Ericsunsk
ae7681209d
fix(openai-codex): decode utf-8 safely across stream chunks (#3723) 2026-03-16 21:40:45 -04:00
Markus Bergholz
ee3469e912
Fix: Support Nextcloud Talk Activity Streams 2.0 webhook format (#3737)
* fix

* fix

* format
2026-03-16 21:40:42 -04:00
Argenis
fec81d8e75
ci: auto-sync Scoop and AUR on stable release (#3743)
Add workflow_call triggers to pub-scoop.yml and pub-aur.yml so the
stable release workflow can invoke them automatically after publish.

Wire scoop and aur jobs into release-stable-manual.yml as post-publish
steps (parallel with tweet), gated on publish success.

Update ci-map.md trigger docs to reflect auto-called behavior.
2026-03-16 21:34:29 -04:00
Ricardo Madriz
9a073fae1a
fix(tools) Wire activated toolset into dispatch (#3747)
* fix(tools): wire ActivatedToolSet into tool dispatch and spec advertisement

When deferred MCP tools are activated via tool_search, they are stored
in ActivatedToolSet but never consulted by the tool call loop.
tool_specs is built once before the iteration loop and never refreshed,
so the provider API tools[] parameter never includes activated tools.
find_tool only searches the static registry, so execution dispatch also
fails silently.

Thread Arc<Mutex<ActivatedToolSet>> from creation sites through to
run_tool_call_loop. Rebuild tool_specs each iteration to merge base
registry specs with activated specs. Add fallback in execute_one_tool
to check the activated set when the static registry lookup misses.

Change ActivatedToolSet internal storage from Box<dyn Tool> to
Arc<dyn Tool> so we can clone the Arc out of the mutex guard before
awaiting tool.execute() (std::sync::MutexGuard is not Send).

* fix(tools): add activated_tools field to new ChannelRuntimeContext test site
2026-03-16 21:34:08 -04:00
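A minimal sketch of the clone-out-of-the-guard pattern the commit describes; the trait and map shapes are reduced stand-ins:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

trait Tool: Send + Sync {
    fn name(&self) -> &str;
}

struct ActivatedToolSet {
    tools: HashMap<String, Arc<dyn Tool>>,
}

async fn execute_activated(set: &Mutex<ActivatedToolSet>, name: &str) -> Option<()> {
    // Clone the Arc while holding the lock, then release the guard
    // *before* awaiting: std::sync::MutexGuard is not Send, so holding
    // it across an .await would make the future non-Send.
    let tool: Arc<dyn Tool> = {
        let guard = set.lock().ok()?;
        Arc::clone(guard.tools.get(name)?)
    };
    // Safe to await here with the guard dropped.
    run_tool(tool).await;
    Some(())
}

async fn run_tool(tool: Arc<dyn Tool>) {
    let _ = tool.name(); // stand-in for tool.execute()
}
```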
Chris Hengge
f0db63e53c
fix(integrations): wire Cron and Browser status to config fields (#3750)
Both entries had hardcoded |_| IntegrationStatus::Available, ignoring
the live config entirely. Users with cron.enabled = true or
browser.enabled = true saw 'Available' on the /integrations dashboard
card instead of 'Active'.

Root cause: status_fn closures did not capture the Config argument.

Fix: replace the |_| stubs with |c| closures that check c.cron.enabled
and c.browser.enabled respectively, matching the pattern used by every
other wired entry in the registry (Telegram, Discord, Shell, etc.).

What did NOT change: ComingSoon entries, always-Active entries (Shell,
File System), platform entries, or any other registry logic.
2026-03-16 21:34:06 -04:00
Argenis
df4dfeaf66
chore: bump version to 0.4.3 (#3749)
Update version across Cargo.toml, Cargo.lock, Scoop manifest,
and AUR PKGBUILD/.SRCINFO for the v0.4.3 stable release.
2026-03-16 21:23:04 -04:00
Giulio V
e4ef25e913
feat(security): add Merkle hash-chain audit trail (#3601)
* feat(security): add Merkle hash-chain audit trail

Each audit entry now includes a SHA-256 hash linking it to the previous
entry (entry_hash, prev_hash, sequence), forming a tamper-evident chain.
Modifying any entry invalidates all subsequent hashes.

- Chain fields added to AuditEvent with #[serde(default)] for backward compat
- AuditLogger tracks chain state and recovers from existing logs on restart
- verify_chain() validates hash linkage, sequence continuity, and integrity
- Five new tests: genesis seed, multi-entry verify, tamper detection,
  sequence gap detection, and cross-restart chain recovery

* fix(security): replace personal name with neutral label in audit tests
2026-03-16 18:38:59 -04:00
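A minimal sketch of the chaining scheme, assuming the sha2 crate; the field layout and the genesis seed value are assumptions:

```rust
use sha2::{Digest, Sha256};

/// Each entry hashes its own payload together with the previous entry's
/// hash, so editing any entry invalidates every subsequent entry_hash.
struct ChainedEntry {
    sequence: u64,
    payload: String,
    prev_hash: [u8; 32],
    entry_hash: [u8; 32],
}

fn append(chain: &mut Vec<ChainedEntry>, payload: String) {
    let (sequence, prev_hash) = match chain.last() {
        Some(last) => (last.sequence + 1, last.entry_hash),
        None => (0, [0u8; 32]), // genesis seed (real value is an assumption)
    };
    let mut hasher = Sha256::new();
    hasher.update(sequence.to_be_bytes());
    hasher.update(prev_hash);
    hasher.update(payload.as_bytes());
    let entry_hash: [u8; 32] = hasher.finalize().into();
    chain.push(ChainedEntry { sequence, payload, prev_hash, entry_hash });
}
```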
Argenis
c3a3cfc9a6
fix(agent): prevent duplicate tool schema injection in XML dispatcher (#3744)
Remove duplicate tool listing from XmlToolDispatcher::prompt_instructions()
since tool listing is already handled by ToolsSection in prompt.rs. The
method now only emits the XML protocol envelope.

Also fix UTF-8 char boundary panics in memory consolidation truncation by
using char_indices() instead of manual byte-boundary scanning.

Fixes #3643
Supersedes #3678

Co-authored-by: TJUEZ <TJUEZ@users.noreply.github.com>
2026-03-16 18:38:44 -04:00
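A minimal sketch of char-boundary-safe truncation with char_indices(), the fix named above:

```rust
/// Truncate to at most `max_bytes` without panicking on a char
/// boundary: walk char_indices() and keep only whole codepoints
/// that fit, instead of slicing at a raw byte offset.
fn truncate_utf8(s: &str, max_bytes: usize) -> &str {
    if s.len() <= max_bytes {
        return s;
    }
    let mut end = 0;
    for (i, c) in s.char_indices() {
        let next = i + c.len_utf8();
        if next > max_bytes {
            break;
        }
        end = next;
    }
    &s[..end]
}
```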
伊姆
013fca6ad2
fix(config): support socks proxy scheme for Clash Verge (#3001)
Co-authored-by: imu <imu@sgcc.com.cn>
2026-03-16 18:38:03 -04:00
linyibin
23a0f25b44
fix(web): ensure web/dist exists in fresh clones (#3114)
The Rust build expects web/dist to exist (static assets). Track an empty
placeholder via web/dist/.gitkeep and adjust ignore rules to keep build
artifacts ignored while allowing the placeholder file.

Made-with: Cursor
2026-03-16 18:37:14 -04:00
Giulio V
2eaa8c45f4
feat(whatsapp-web): add voice message transcription support (#3617)
Adds audio message detection and transcription to WhatsApp Web channel.
Voice messages (PTT) are downloaded, transcribed via the existing
transcription subsystem (Groq Whisper), and delivered as text content.

- TranscriptionConfig field with builder pattern
- Duration limit enforcement before download
- MIME type mapping for audio formats
- Graceful error handling (skip on failure)
- Preserves full retry/reconnect state machine from master
2026-03-16 18:34:50 -04:00
Sandeep Ghael
85bf649432
fix(channel): resolve multi-room reply routing regression (#3224) (#3378)
* fix(channel): resolve multi-room reply routing regression (#3224)

PR #3224 (f0f0f808, "feat(matrix): add multi-room support") changed the
channel name format in matrix.rs from "matrix" to "matrix:!roomId", but
the channel lookup in mod.rs still does an exact match against
channels_by_name, which is keyed by Channel::name() (returns "matrix").

This mismatch causes target_channel to always resolve to None for Matrix
messages, silently dropping all replies.

Fix: fall back to a prefix match on the base channel name (before ':')
when the exact lookup fails. This preserves multi-room conversation
isolation while correctly routing replies to the originating channel.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: apply cargo fmt to channel routing fix

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Sandeep (Claude) <sghael+claude@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 18:32:56 -04:00
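A minimal sketch of the exact-then-prefix lookup fallback; the map and value types are stand-ins:

```rust
use std::collections::HashMap;

/// Exact lookup first; on miss, fall back to the base channel name
/// before ':' so "matrix:!roomId" still resolves to the entry keyed by
/// Channel::name() ("matrix"), preserving multi-room isolation while
/// routing replies to the originating channel.
fn resolve_channel<'a, T>(
    channels_by_name: &'a HashMap<String, T>,
    target: &str,
) -> Option<&'a T> {
    channels_by_name.get(target).or_else(|| {
        let base = target.split(':').next()?;
        channels_by_name.get(base)
    })
}
```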
Giulio V
3ea99a7619
feat(tools): add browser delegation tool (#3610)
* feat(tools): add browser delegation tool for corporate web app interaction

Adds BrowserDelegateTool that delegates browser-based tasks to Claude Code
(or other browser-capable CLIs) for interacting with corporate tools
(Teams, Outlook, Jira, Confluence) via browser automation. Includes domain
validation (allow/blocklist), task templates, Chrome profile persistence
for SSO sessions, and timeout management.

* fix: resolve clippy violation in browser delegation tool

* fix(browser-delegate): validate URLs embedded in task text against domain policy

Scan the task text for http(s):// URLs using regex and validate each
against the allow/block domain lists before forwarding to the browser
CLI subprocess. This prevents bypassing domain restrictions by
embedding blocked URLs in the task parameter.

* fix(browser-delegate): constrain URL schemes, gate on runtime, document config

- Add has_shell_access gate so BrowserDelegateTool is only registered on
  shell-capable runtimes (skipped with warning on WASM/edge runtimes)
- Add boundary tests for javascript: and data: URL scheme rejection
- URL scheme validation (http/https only) and config docs were already
  addressed by a prior commit on this branch

* fix(tools): address CodeRabbit review findings for browser delegation

Remove dead `max_concurrent_tasks` config field and expand doc comments
on the `[browser_delegate]` config section in schema.rs.
2026-03-16 18:32:20 -04:00
Christian Pojoni
14f58c77c1
fix(tool+channel): revert invalid model set via model_routing_config (#3497)
When the LLM hallucinates an invalid model ID through the
model_routing_config tool's set_default action, the invalid model gets
persisted to config.toml. The channel hot-reload then picks it up and
every subsequent message fails with a non-retryable 404, permanently
killing the connection with no user recovery path.

Fix with two layers of defense:

1. Tool probe-and-rollback: after saving the new model, send a minimal
   chat request to verify the model is accessible. If the API returns a
   non-retryable error (404, auth failure, etc.), automatically restore
   the previous config and return a failure notice to the LLM.

2. Channel safety net: in maybe_apply_runtime_config_update, reject
   config reloads when warmup fails with a non-retryable error instead
   of applying the broken config anyway.

Co-authored-by: Christian Pojoni <christian.pojoni@gmail.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-16 18:30:36 -04:00
dependabot[bot]
b833eb19f5
chore(deps): bump rust in the docker-all group (#3692)
Bumps the docker-all group with 1 update: rust.


Updates `rust` from 1.93-slim to 1.94-slim

---
updated-dependencies:
- dependency-name: rust
  dependency-version: 1.94-slim
  dependency-type: direct:production
  dependency-group: docker-all
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-16 18:28:51 -04:00
DotViegas
5db883b453
fix(providers): adjust temperature for OpenAI reasoning models (#2936)
Some OpenAI models (o1, o3, o4, gpt-5 variants) only accept temperature=1.0 and return errors with other values like 0.7. This change automatically adjusts the temperature parameter based on the model being used.

Changes:
- Add adjust_temperature_for_model() function to detect reasoning models
- Apply temperature adjustment in chat_with_system(), chat(), and chat_with_tools()
- Preserve user-specified temperature for standard models (gpt-4o, gpt-4-turbo, etc.)
- Force temperature=1.0 for reasoning models (o1, o3, o4, gpt-5, gpt-5-mini, gpt-5-nano, gpt-5.x-chat-latest)

Testing:
- Add 7 unit tests covering reasoning models, standard models, and edge cases
- All tests pass successfully
- Empirical testing documented in docs/openai-temperature-compatibility.md

Impact:
- Fixes temperature errors when using o1, o3, o4, and gpt-5 model families
- No breaking changes - transparent adjustment for end users
- Standard models continue to work with flexible temperature values

Risk: Low - isolated change within OpenAI provider, well-tested

Rollback: Revert this commit to restore previous behavior
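
A hedged sketch of the detection logic; the exact prefix list and matching
rules in the provider may differ:

```rust
// Force temperature = 1.0 for reasoning-model families; pass through otherwise.
fn adjust_temperature_for_model(model: &str, requested: f32) -> f32 {
    let reasoning_prefixes = ["o1", "o3", "o4", "gpt-5"];
    let is_reasoning = reasoning_prefixes.iter().any(|p| {
        model == *p
            || model.starts_with(&format!("{p}-")) // e.g. gpt-5-mini, o1-preview
            || model.starts_with(&format!("{p}.")) // e.g. gpt-5.x-chat-latest
    });
    if is_reasoning { 1.0 } else { requested }
}
```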

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-16 18:28:01 -04:00
Eddie's AI Agent
0ae515b6b8
fix(channel): correct Matrix image marker casing to match canonical format (#3519)
Co-authored-by: Eddie Tong <xinhant@gmail.com>
2026-03-16 18:25:33 -04:00
Giulio V
2deb91455d
feat(observability): add Hands dashboard metrics and events (#3595)
Add HandStarted, HandCompleted, and HandFailed event variants to
ObserverEvent, and HandRunDuration, HandFindingsCount, HandSuccessRate
metric variants to ObserverMetric. Update all observer backends (log,
noop, verbose, prometheus, otel) to handle the new variants with
appropriate instrumentation. Prometheus backend registers hand_runs
counter, hand_duration histogram, and hand_findings counter. OTel
backend creates spans and records metrics for hand runs.
2026-03-16 18:24:47 -04:00
smallwhite
595b81be41
fix(telegram): avoid duplicate finalize_draft messages (#3259) 2026-03-16 18:24:19 -04:00
Chris Hengge
4f9d817ddb
fix(memory): serialize MemoryCategory as plain string and guard dashboard render crashes (#3051)
The /memory dashboard page rendered a black screen when MemoryCategory::Custom
was serialized by serde's derived impl as a tagged object {"custom":"..."} but
the frontend expected a plain string. No navigation was possible without using
the browser Back button.

Changes:
- src/memory/traits.rs: replace derived serde impls with custom serialize
  (delegates to Display, emits plain snake_case string) and deserialize
  (parses known variants by name, falls through to Custom(s) for unknown).
  Adds memory_category_serde_uses_snake_case and memory_category_custom_roundtrip
  tests. No persistent storage migration needed — all backends (SQLite, Markdown,
  Postgres) use their own category_to_str/str_to_category helpers and never
  read serde-serialized category values back from disk.
- web/src/App.tsx: export ErrorBoundary class so render crashes surface a
  recoverable UI instead of a black screen. Adds aria-live="polite" to the
  pairing error paragraph for screen reader accessibility.
- web/src/components/layout/Layout.tsx: wrap Outlet in ErrorBoundary keyed
  by pathname so the navigation shell stays mounted during a page crash and
  the boundary resets on route change.
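
A condensed sketch of the traits.rs change, with the variant set trimmed to
two known categories plus Custom (the real enum has more variants):

```rust
use serde::{Deserialize, Deserializer, Serialize, Serializer};

#[derive(Debug, Clone, PartialEq)]
enum MemoryCategory { Core, Daily, Custom(String) }

impl std::fmt::Display for MemoryCategory {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        match self {
            Self::Core => write!(f, "core"),
            Self::Daily => write!(f, "daily"),
            Self::Custom(s) => write!(f, "{s}"),
        }
    }
}

impl Serialize for MemoryCategory {
    // Delegate to Display: emits "core", not the tagged object {"custom":"..."}.
    fn serialize<S: Serializer>(&self, s: S) -> Result<S::Ok, S::Error> {
        s.collect_str(self)
    }
}

impl<'de> Deserialize<'de> for MemoryCategory {
    fn deserialize<D: Deserializer<'de>>(d: D) -> Result<Self, D::Error> {
        let s = String::deserialize(d)?;
        Ok(match s.as_str() {
            "core" => Self::Core,
            "daily" => Self::Daily,
            // Unknown names fall through to Custom, preserving round-trips.
            other => Self::Custom(other.to_string()),
        })
    }
}
```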

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-16 18:23:03 -04:00
Darren.Zeng
d13f5500e9
fix(install): add missing libssl-dev for Debian/Ubuntu (#3285)
The install.sh script was missing libssl-dev package for Debian/Ubuntu
systems. This caused compilation failures on Debian 12 and other
Debian-based distributions when building ZeroClaw from source.

The package is already included for other distributions:
- Alpine: openssl-dev
- Fedora/RHEL: openssl-devel
- Arch: openssl

This change adds libssl-dev to the apt-get install command to ensure
OpenSSL headers are available during compilation.

Fixes #2914

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <noreply@anthropic.com>
2026-03-16 18:22:10 -04:00
Chris Hengge
1ccfe643ba
fix(channel): bypass mention_only gate for Discord DMs (#2983)
When mention_only is enabled, the bot correctly requires an @mention in
guild (server) channels. However, Direct Messages have no guild_id and
are inherently private and addressed to the bot — requiring a @mention
in a DM is never correct and silently drops all DM messages.

Changes:
- src/channels/discord.rs: detect DMs via absence of guild_id in the
  gateway payload, compute effective_mention_only = self.mention_only && !is_dm,
  and pass that to normalize_incoming_content instead of self.mention_only.
  DMs bypass the mention gate; guild messages retain existing behaviour.
- Adds three tests: DM bypasses mention gate, guild message without mention
  is rejected, guild message with mention passes and strips the mention tag.
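
The gate itself reduces to a one-liner; a sketch:

```rust
// DMs carry no guild_id in the gateway payload, so the mention gate must not apply.
fn effective_mention_only(mention_only: bool, guild_id: Option<&str>) -> bool {
    let is_dm = guild_id.is_none();
    mention_only && !is_dm
}
```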

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-16 18:21:27 -04:00
Vadim Rutkovsky
d4d3e03e34
fix: add dummy src/lib.rs in Dockerfile.debian for dep caching stage (#3553) 2026-03-16 18:20:43 -04:00
simonfr
aa0f11b0a2
fix(docker): copy build.rs into builder stage to invalidate dummy binary cache (#3570)
* Fix build with Docker

* fix(docker): copy build.rs into builder stage to fix rag feature activation
2026-03-16 18:19:55 -04:00
Argenis
806f8b4020
fix(docker): purge stale zeroclawlabs fingerprints before build (#3741)
The BuildKit cache mount persists .fingerprint and .d files across
source tree changes, causing compilation errors when imports change.
Clear zeroclawlabs-specific artifacts before building to ensure
a clean recompile of the main crate while preserving dep caches.
2026-03-16 18:10:36 -04:00
Ericsunsk
83803cef5b
fix(memory): filter autosave noise and scope recall/store by session (#3695)
* fix(memory): filter autosave noise and scope memory by session

* style: format rebase-resolved gateway and memory loader

* fix(tests): update memory loader mock for session-aware context

* fix(openai-codex): decode utf-8 safely across stream chunks
2026-03-16 16:36:35 -04:00
Vast-stars
dcb182cdd5
fix(agent): remove bare URL → curl fallback in GLM-style tool call parser (#3694)
* fix(agent): remove bare URL → curl fallback in GLM-style tool call parser

The `parse_glm_style_tool_calls` function had a "Plain URL" fallback
that converted any bare URL line (e.g. `https://example.com`) into a
`shell` tool call running `curl -s '<url>'`. This caused:

- False positives: normal URLs in LLM replies misinterpreted as tool calls
- Swallowed replies: text with URLs not forwarded to the channel
- Unintended shell commands: `curl` executed without user intent

Explicit GLM-format tool calls like `browser_open/url>https://...` and
`shell/command>...` are unaffected — only the bare URL catch-all is
removed.

* style: cargo fmt

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-16 16:36:27 -04:00
Argenis
7c36a403b0
chore: sync Scoop and AUR templates to v0.4.1 (#3736) 2026-03-16 16:36:26 -04:00
Argenis
058dbc8786
feat(channels): add X/Twitter and Mochat channel integrations (#3735)
* feat(channels): add X/Twitter and Mochat channel integrations

Add two new channel implementations to close competitive gaps:

- X/Twitter: Twitter API v2 with mentions polling, tweet threading
  (auto-splits at 280 chars), DM support, and rate limit handling
- Mochat: HTTP polling-based integration with Mochat customer service
  platform, configurable poll interval, message dedup

Both channels follow the existing Channel trait pattern with full
config schema integration, health checks, and dedup.

Closes competitive gap: NanoClaw had X/Twitter, Nanobot had Mochat.
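
A hedged sketch of the 280-char threading split mentioned above; the real
splitter likely also handles URLs and reply numbering, and this version does
not hard-split single words longer than the limit:

```rust
// Greedy word-wrap into tweet-sized chunks (counting chars, not bytes).
fn split_into_tweets(text: &str, limit: usize) -> Vec<String> {
    let mut tweets = Vec::new();
    let mut current = String::new();
    for word in text.split_whitespace() {
        let needed = word.chars().count() + usize::from(!current.is_empty());
        if !current.is_empty() && current.chars().count() + needed > limit {
            tweets.push(std::mem::take(&mut current));
        }
        if !current.is_empty() {
            current.push(' ');
        }
        current.push_str(word);
    }
    if !current.is_empty() {
        tweets.push(current);
    }
    tweets
}
```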

* fix(channels): use write! instead of format_push_string for clippy

Replace url.push_str(&format!(...)) with write!(url, ...) to satisfy
clippy::format_push_string lint on CI.

* fix(channels): rename reply_to parameter to avoid legacy field grep

The component test source_does_not_use_legacy_reply_to_field greps
for "reply_to:" in source files. Rename the parameter to
reply_tweet_id to pass this check.
2026-03-16 16:35:21 -04:00
SimianAstronaut7
aff7a19494
Merge pull request #3641 from zeroclaw-labs/work-issues/3011-fix-dashboard-ws-protocols
fix(gateway): pass bearer token in WebSocket subprotocol for dashboard auth
2026-03-16 16:34:46 -04:00
SimianAstronaut7
0894429b54
Merge pull request #3640 from zeroclaw-labs/work-issues/2881-transcription-initial-prompt
feat(config): support initial_prompt in transcription config
2026-03-16 16:34:43 -04:00
SimianAstronaut7
6b03e885fc
Merge pull request #3639 from zeroclaw-labs/work-issues/3474-docker-restart-docs
docs(setup): add Docker/Podman stop/restart instructions
2026-03-16 16:34:41 -04:00
Argenis
84470a2dd2
fix(agent): strip vision markers from history for non-vision providers (#3734)
* fix(agent): strip vision markers from history for non-vision providers

When a user sends an image via Telegram to a non-vision provider, the
`[IMAGE:/path]` marker gets stored in the JSONL session file. Previously,
the rollback only removed it from in-memory history, not from the JSONL
file. On restart, the marker was reloaded and permanently broke the
conversation.

Two fixes:
1. `rollback_orphan_user_turn` now also calls `remove_last` on the
   session store so the poisoned entry is removed from disk.
2. When building history for a non-vision provider, `[IMAGE:]` markers
   are stripped from older history messages (and empty turns are dropped).

Fixes #3674

* fix(agent): only strip vision markers from older history, not current message

The initial fix stripped [IMAGE:] markers from all prior_turns including
the current message, which caused the vision check to never fire. Now
only strip from turns before the last one (the current request), so
fresh image sends still get a proper vision capability error.
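
A minimal sketch of the second fix, assuming turns are plain strings:

```rust
// Strip [IMAGE:...] markers from every turn except the last (the current
// request), then drop turns the stripping left empty.
fn strip_vision_markers(turns: &mut Vec<String>) {
    let last = turns.len().saturating_sub(1);
    for (i, turn) in turns.iter_mut().enumerate() {
        if i == last {
            continue; // keep the current message so the vision check still fires
        }
        while let Some(start) = turn.find("[IMAGE:") {
            match turn[start..].find(']') {
                Some(off) => turn.replace_range(start..start + off + 1, ""),
                None => break, // unterminated marker: leave as-is
            }
        }
    }
    turns.retain(|t| !t.trim().is_empty());
}
```
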
2026-03-16 16:25:45 -04:00
Argenis
8a890be021
chore: bump version to 0.4.2 (#3733)
Update version across Cargo.toml, Cargo.lock, Scoop manifest,
and AUR PKGBUILD/.SRCINFO for the v0.4.2 stable release.
2026-03-16 16:02:34 -04:00
Argenis
74a5ff78e7
fix(qq): send markdown messages instead of plain text (#3732)
* fix(ci): decouple tweet from Docker push in release workflows

Remove Docker from the tweet job's dependency chain in both beta and
stable release workflows. Docker multi-platform builds are slow and
can be cancelled by concurrency groups, which was blocking the tweet
from ever firing. The tweet announces the GitHub Release, not the
Docker image.

* fix(qq): send markdown messages instead of plain text

Change msg_type from 0 (plain text) to 2 (markdown) and wrap content
in a markdown object per QQ's API documentation. This ensures markdown
formatting (bold, italic, code blocks, etc.) renders properly in QQ
clients instead of displaying raw syntax.

Fixes #3647
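
Roughly, the payload change looks like this (field names as described above;
the full message struct carries more fields):

```rust
use serde_json::json;

fn build_qq_message(content: &str) -> serde_json::Value {
    json!({
        "msg_type": 2,                      // was 0 (plain text); 2 = markdown
        "markdown": { "content": content }, // wrap content per QQ's API docs
    })
}
```
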
2026-03-16 15:56:09 -04:00
Argenis
93b16dece5
Merge pull request #3731 from zeroclaw-labs/fix/tweet-decouple-docker
fix(ci): decouple tweet from Docker push in release workflows
2026-03-16 15:52:05 -04:00
Argenis
c773170753
feat(providers): close AiHubMix, SiliconFlow, and Codex OAuth provider gaps (#3730)
Add env var resolution for AiHubMix (AIHUBMIX_API_KEY) and SiliconFlow
(SILICONFLOW_API_KEY) so users can authenticate via environment variables.

Add factory and credential resolution tests for AiHubMix, SiliconFlow,
and Codex OAuth to ensure all provider aliases work correctly.
2026-03-16 15:48:27 -04:00
argenis de la rosa
f210b43977 fix(ci): decouple tweet from Docker push in release workflows
Remove Docker from the tweet job's dependency chain in both beta and
stable release workflows. Docker multi-platform builds are slow and
can be cancelled by concurrency groups, which was blocking the tweet
from ever firing. The tweet announces the GitHub Release, not the
Docker image.
2026-03-16 15:32:29 -04:00
Argenis
50bc360bf4
Merge pull request #3729 from zeroclaw-labs/fix/crates-auto-publish-idempotent
fix(ci): make crates.io publish idempotent across all workflows
2026-03-16 15:30:30 -04:00
Argenis
fc8ed583a0
feat(providers): add VOLCENGINE_API_KEY env var for VolcEngine/ByteDance gateway (#3725) 2026-03-16 15:29:36 -04:00
argenis de la rosa
d593b6b1e4 fix(ci): make crates.io publish idempotent across all workflows
Both publish-crates-auto.yml and publish-crates.yml now treat
"already exists" from cargo publish as success instead of failing
the workflow. This prevents false failures when the auto-sync and
stable release workflows race or when re-running a publish.
2026-03-16 15:11:29 -04:00
Argenis
426faa3923
Merge pull request #3724 from zeroclaw-labs/fix/release-sync-tweet-and-crates
fix(ci): ensure tweet posts for stable releases and fix beta concurrency
2026-03-16 15:07:58 -04:00
argenis de la rosa
85429b3657 fix(ci): ensure tweet posts for stable releases and fix beta concurrency
- Tweet workflow: stable releases always tweet (no feature-check gate);
  beta tweets now also trigger on fix() commits, not just feat()
- Beta release: use cancel-in-progress to avoid queued runs getting
  stuck when rapid merges hit the concurrency group
2026-03-16 14:54:32 -04:00
Argenis
8adf05f307
Merge pull request #3722 from zeroclaw-labs/ci/restore-package-manager-syncs
ci: restore Homebrew and add Scoop/AUR package manager workflows
2026-03-16 14:30:09 -04:00
Argenis
a5f844d7cc
fix(daemon): ignore SIGHUP to survive terminal/SSH disconnect (#3721)
* fix(daemon): ignore SIGHUP to survive terminal/SSH disconnect (#3688)

* style: remove redundant continue flagged by clippy
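
For reference, a minimal tokio sketch of the SIGHUP handling described in
the title (Unix-only; the daemon's actual signal wiring may differ). Note the
handler loop needs no explicit `continue`, matching the clippy cleanup above:

```rust
use tokio::signal::unix::{signal, SignalKind};

// Install a handler that acknowledges SIGHUP and does nothing, so a closing
// terminal or dropped SSH session no longer kills the daemon.
fn ignore_sighup() -> std::io::Result<()> {
    let mut hup = signal(SignalKind::hangup())?;
    tokio::spawn(async move {
        while hup.recv().await.is_some() {
            tracing::debug!("SIGHUP received; ignoring");
        }
    });
    Ok(())
}
```
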
2026-03-16 14:21:59 -04:00
Argenis
7a9e815948
fix(config): add serde default for cli field in ChannelsConfig (#3720)
* fix(config): add serde default for cli field in ChannelsConfig (#3710)

* style: fix rustfmt formatting in test
2026-03-16 14:21:51 -04:00
Argenis
46378cf8b4
fix(security): validate command before rate-limiting in cron once (#3699) (#3719) 2026-03-16 14:21:45 -04:00
Argenis
c2133e6e62
fix(docker): prevent dummy binary from being shipped in container (#3687) (#3718) 2026-03-16 14:21:37 -04:00
Argenis
e9b3148e73
Merge pull request #3717 from zeroclaw-labs/chore/version-bump-0.4.1
chore: bump version to 0.4.1
2026-03-16 14:21:20 -04:00
argenis de la rosa
3d007f6b55 docs: add Scoop and AUR workflows to CI map and release process 2026-03-16 14:17:18 -04:00
argenis de la rosa
f349de78ed ci(aur): add AUR PKGBUILD template and publishing workflow
Adds Arch Linux distribution via AUR:
- dist/aur/PKGBUILD: package build template with cargo dist profile
- dist/aur/.SRCINFO: AUR metadata
- .github/workflows/pub-aur.yml: manual workflow to push to AUR
2026-03-16 14:16:30 -04:00
argenis de la rosa
cd40051f4c ci(scoop): add Scoop manifest template and publishing workflow
Adds Windows package distribution via Scoop:
- dist/scoop/zeroclaw.json: manifest template with checkver/autoupdate
- .github/workflows/pub-scoop.yml: manual workflow to update Scoop bucket
2026-03-16 14:15:29 -04:00
argenis de la rosa
6e4b1ede28 ci(homebrew): restore Homebrew core formula publishing workflow
Re-adds the manual-dispatch workflow for bumping the zeroclaw formula
in Homebrew/homebrew-core via a bot-owned fork. Improved from the
previously removed version with safer env-var handling.

Requires secrets: HOMEBREW_CORE_BOT_TOKEN or HOMEBREW_UPSTREAM_PR_TOKEN
Requires variables: HOMEBREW_CORE_BOT_FORK_REPO, HOMEBREW_CORE_BOT_EMAIL
2026-03-16 14:12:48 -04:00
argenis de la rosa
cfba009833 chore: regenerate Cargo.lock for v0.4.1 2026-03-16 13:49:39 -04:00
argenis de la rosa
45abd27e4a chore: bump version to 0.4.1
Reflects the addition of heartbeat metrics, SQLite session backend,
and two-tier response cache in this release cycle.
2026-03-16 13:49:36 -04:00
Argenis
566e3cf35b
Merge pull request #3712 from zeroclaw-labs/feat/competitive-edge-heartbeat-sessions-caching
feat: competitive edge — heartbeat metrics, SQLite sessions, two-tier prompt cache
2026-03-16 13:47:42 -04:00
Argenis
d642b0f3c8
fix(ci): decouple tweet from crates.io and fix duplicate publish handling (#3716)
Drop crates-io from the tweet job's needs so the release announcement
goes out with GitHub release, Docker, and website — not blocked by
crates.io publish timing.

Also fix the duplicate detection: the previous curl-based check used
a URL that didn't match the actual crate, causing cargo publish to
hit "already exists" and fail the whole job. Now we just run cargo
publish and treat "already exists" as success.
2026-03-16 13:46:05 -04:00
Argenis
8153992a11
fix(ci): scope rust-cache by OS image in stable release workflow (#3711)
Same glibc cache mismatch fix as the beta workflow — ubuntu-22.04 builds
must not share cached build-script binaries with ubuntu-latest (24.04).
2026-03-16 12:56:30 -04:00
argenis de la rosa
98688c61ff feat(cache): wire two-tier response cache, multi-provider token tracking, and cache analytics
- Two-tier response cache: in-memory LRU (hot) + SQLite (warm) with TTL-aware eviction
- Wire response cache into agent turn loop (temp==0.0, text-only responses only)
- Parse Anthropic cache_creation_input_tokens/cache_read_input_tokens
- Parse OpenAI prompt_tokens_details.cached_tokens
- Add cached_input_tokens to TokenUsage, prompt_caching to ProviderCapabilities
- Add CacheHit/CacheMiss observer events with Prometheus counters
- Add response_cache_hot_entries config field (default: 256)
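
A hedged sketch of the two-tier lookup path using the `lru` and `rusqlite`
crates; the schema and key derivation are assumptions:

```rust
struct ResponseCache {
    hot: lru::LruCache<String, String>, // in-memory hot tier
    warm: rusqlite::Connection,         // SQLite warm tier with TTL column
}

impl ResponseCache {
    fn get(&mut self, key: &str, now: i64) -> Option<String> {
        if let Some(v) = self.hot.get(key) {
            return Some(v.clone()); // hot hit
        }
        let v: Option<String> = self
            .warm
            .query_row(
                "SELECT value FROM cache WHERE key = ?1 AND expires_at > ?2",
                rusqlite::params![key, now],
                |row| row.get(0),
            )
            .ok();
        if let Some(ref v) = v {
            self.hot.put(key.to_string(), v.clone()); // promote warm hit to hot
        }
        v
    }
}
```
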
2026-03-16 12:44:48 -04:00
argenis de la rosa
9ba5ba5632 feat(sessions): add SQLite backend with FTS5, trait abstraction, and migration
- Add SessionBackend trait abstracting over storage backends (load,
  append, remove_last, list, search, cleanup_stale, compact)
- Add SqliteSessionBackend with WAL mode, FTS5 full-text search,
  session metadata tracking, and TTL-based cleanup
- Add remove_last() and compact() to JSONL SessionStore
- Implement SessionBackend for both JSONL and SQLite backends
- Add automatic JSONL-to-SQLite migration (renames .jsonl → .jsonl.migrated)
- Add config: session_backend ("jsonl"/"sqlite"), session_ttl_hours
- SQLite is the new default backend; JSONL preserved for backward compat
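
The method set above suggests a trait roughly like this (signatures are
assumptions):

```rust
use anyhow::Result;

#[async_trait::async_trait]
trait SessionBackend: Send + Sync {
    async fn load(&self, session_id: &str) -> Result<Vec<String>>;
    async fn append(&self, session_id: &str, entry: &str) -> Result<()>;
    async fn remove_last(&self, session_id: &str) -> Result<()>;
    async fn list(&self) -> Result<Vec<String>>;
    async fn search(&self, query: &str) -> Result<Vec<String>>; // FTS5 on SQLite
    async fn cleanup_stale(&self, ttl_hours: u64) -> Result<usize>;
    async fn compact(&self, session_id: &str) -> Result<()>;
}
```
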
2026-03-16 12:23:18 -04:00
argenis de la rosa
318ed8e9f1 feat(heartbeat): add health metrics, adaptive intervals, and task history
- Add HeartbeatMetrics struct with uptime, consecutive success/failure
  counts, EMA tick duration, and total ticks
- Add compute_adaptive_interval() for exponential backoff on failures
  and faster polling when high-priority tasks are present
- Add SQLite-backed task run history (src/heartbeat/store.rs) mirroring
  the cron/store.rs pattern with output truncation and pruning
- Add dead-man's switch that alerts if heartbeat stops ticking
- Wire metrics, history recording, and adaptive sleep into daemon worker
- Add config fields: adaptive, min/max_interval_minutes,
  deadman_timeout_minutes, deadman_channel, deadman_to, max_run_history
- All new fields are backward-compatible with serde defaults
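
A hedged sketch of compute_adaptive_interval; the exact backoff curve is an
assumption:

```rust
fn compute_adaptive_interval(
    base_minutes: u64,
    min_minutes: u64,
    max_minutes: u64,
    consecutive_failures: u32,
    has_high_priority_tasks: bool,
) -> u64 {
    if has_high_priority_tasks {
        return min_minutes; // poll faster while urgent tasks are queued
    }
    // Exponential backoff on consecutive failures, clamped to the configured range.
    let backoff = base_minutes.saturating_mul(1u64 << consecutive_failures.min(6));
    backoff.clamp(min_minutes, max_minutes)
}
```
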
2026-03-16 12:08:32 -04:00
Argenis
5ddea82789
fix(ci): scope rust-cache by OS image and target in release beta workflow (#3708)
The aarch64-unknown-linux-gnu build runs on ubuntu-22.04 (glibc 2.35)
but was restoring cached build-script binaries compiled on ubuntu-latest
(24.04, glibc 2.39), causing GLIBC_2.39 not found failures.

Adding prefix-key with the OS image and target ensures each matrix entry
gets its own isolated cache, preventing cross-image glibc mismatches.
2026-03-16 11:37:14 -04:00
Argenis
a2b18bdb16
chore: gitignore stale target-* build cache dirs (#3707)
Adds /target-*/ to .gitignore to prevent alternative Cargo build
directories (e.g. target-ms365, target-pr) from being tracked.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 11:21:04 -04:00
Argenis
85271441b5
docs: remove duplicate Vietnamese docs and orphan Greek locale (#3701)
- Remove docs/i18n/vi/ — duplicate of canonical docs/vi/ per docs-contract
- Remove docs/i18n/el/ — orphan Greek locale, not in supported locales list
- Update docs/i18n/README.md to reflect correct canonical paths

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 10:41:43 -04:00
Argenis
813ae17f48
fix(install): correct version display, show pairing code, bump to 0.4.0 (#3669)
* style: cargo fmt Box::pin calls in cron scheduler

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(install): correct version display, show pairing code, bump to 0.4.0

- Reorder ZEROCLAW_BIN resolution to prefer ~/.cargo/bin over PATH,
  preventing stale binaries from shadowing the fresh install
- Sync binary to ~/.local/bin after cargo install
- Fetch and display gateway pairing code via `gateway get-paircode`
  after service restart instead of referencing non-existent output
- Update fallback message to actionable CLI command
- Bump Cargo.toml version from 0.3.4 to 0.4.0

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: update Cargo.lock for 0.4.0 version bump

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 09:58:31 -04:00
Argenis
bb99d2b57a
Merge branch 'master' into fix-2400-block-config-self-mutation 2026-03-16 09:21:57 -04:00
Argenis
794d87d95c
fix(config): add missing #[serde(default)] to Config struct fields (#3700)
Several recently-added Config fields (data_retention, cloud_ops,
conversational_ai, security_ops) were missing #[serde(default)],
causing deserialization failures when those sections are absent
from config files. Also fixes security field whose #[serde(default)]
was accidentally consumed by the backup doc comment.

Fixes test failures: agent_config_deserializes and
browser_config_backward_compat_missing_section.
2026-03-16 09:21:41 -04:00
李龙 0668001470
81256dbf42 test(config): fix helper lint and swarms fixture 2026-03-16 17:56:30 +08:00
李龙 0668001470
eb9b26cea0 test(config): centralize backward-compat fixtures 2026-03-16 17:45:20 +08:00
李龙 0668001470
6211824f01 fix(security): block runtime config state edits 2026-03-16 17:21:22 +08:00
Argenis
d6e5907b65
feat(tools): add cloud transformation accelerator tools (#3663)
* feat(tools): add cloud transformation accelerator tools

Add cloud_ops and cloud_patterns tools providing read-only cloud
transformation analysis: IaC review, migration assessment, cost
analysis, and Well-Architected Framework architecture review.

Includes CloudOpsConfig, SecurityOpsConfig, and ConversationalAiConfig
schema additions, Box::pin fixes for recursive async in cron scheduler,
and approval_manager field in ChannelRuntimeContext test constructors.

Original work by @rareba. Rebased on latest master with conflict
resolution (kept SwarmConfig/SwarmStrategy exports, swarm tool
registration, and approval_manager in test constructors).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: cargo fmt Box::pin calls

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 02:43:55 -04:00
Argenis
861dd3e2e9
feat(tools): add backup/restore and data management tools (#3662)
Add BackupTool for creating, listing, verifying, and restoring
timestamped workspace backups with SHA-256 manifest integrity
checking. Add DataManagementTool for retention status, time-based
purge, and storage statistics. Both tools are config-driven via
new BackupConfig and DataRetentionConfig sections.

Original work by @rareba. Rebased on latest master with conflict
resolution for SwarmConfig/SwarmStrategy exports and swarm tool
registration, and added missing approval_manager fields in
ChannelRuntimeContext test constructors.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 02:35:44 -04:00
Argenis
8a61a283b2
feat(security): add MCSS security operations tool (#3657)
* feat(security): add MCSS security operations tool

Add managed cybersecurity service (MCSS) tool with alert triage,
incident response playbook execution, vulnerability scan parsing,
and security report generation. Includes SecurityOpsConfig, playbook
engine with approval gating, vulnerability scoring, and full test
coverage. Also fixes pre-existing missing approval_manager field in
ChannelRuntimeContext test constructors.

Original work by @rareba. Supersedes #3599.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add SecurityOpsConfig to re-exports, fix test constructors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 02:28:54 -04:00
Argenis
dcc0a629ec
feat(tools): add project delivery intelligence tool (#3656)
Add a new read-only project_intel tool that provides:
- Status report generation (weekly/sprint/month)
- Risk scanning with configurable sensitivity
- Client update drafting (formal/casual, client/internal)
- Sprint summary generation
- Heuristic effort estimation

Includes multi-language report templates (EN, DE, FR, IT),
ProjectIntelConfig schema with validation, and comprehensive tests.

Also fixes missing approval_manager field in 4 ChannelRuntimeContext
test constructors.

Supersedes #3591 — rebased on latest master. Original work by @rareba.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 02:10:14 -04:00
Argenis
a8c6363cde
feat(nodes): add secure HMAC-SHA256 node transport layer (#3654)
Add a new `nodes` module with HMAC-SHA256 authenticated transport for
secure inter-node communication over standard HTTPS. Includes replay
protection via timestamped nonces and constant-time signature
comparison.
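
The signing core reduces to something like the sketch below (header layout
and field order are assumptions); folding the timestamp and nonce into the
MAC is what gives the replay protection teeth:

```rust
use hmac::{Hmac, Mac};
use sha2::Sha256;

fn sign(secret: &[u8], timestamp: u64, nonce: &str, body: &[u8]) -> Vec<u8> {
    let mut mac = Hmac::<Sha256>::new_from_slice(secret).expect("any key length is valid");
    mac.update(timestamp.to_be_bytes().as_slice());
    mac.update(nonce.as_bytes());
    mac.update(body);
    mac.finalize().into_bytes().to_vec()
}

fn verify(secret: &[u8], timestamp: u64, nonce: &str, body: &[u8], sig: &[u8]) -> bool {
    let mut mac = Hmac::<Sha256>::new_from_slice(secret).expect("any key length is valid");
    mac.update(timestamp.to_be_bytes().as_slice());
    mac.update(nonce.as_bytes());
    mac.update(body);
    mac.verify_slice(sig).is_ok() // constant-time comparison
}
```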

Also adds `NodeTransportConfig` to the config schema and fixes missing
`approval_manager` field in four `ChannelRuntimeContext` test
constructors that failed compilation on latest master.

Original work by @rareba. Rebased on latest master to resolve merge
conflicts (SwarmConfig/SwarmStrategy exports, duplicate MCP validation,
test constructor fields).

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 01:53:47 -04:00
Argenis
d9ab017df0
feat(tools): add Microsoft 365 integration via Graph API (#3653)
Add Microsoft 365 tool providing access to Outlook mail, Teams messages,
Calendar events, OneDrive files, and SharePoint search via Microsoft
Graph API. Includes OAuth2 token caching (client credentials and device
code flows), security policy enforcement, and config validation.

Rebased on latest master, resolving conflicts with SwarmConfig exports
and adding approval_manager to ChannelRuntimeContext test constructors.

Original work by @rareba.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 01:44:39 -04:00
李龙 0668001470
b4decb40c6 Merge upstream/master into fix/config-load-initialized-state 2026-03-16 13:18:30 +08:00
Argenis
249434edb2
feat(notion): add Notion database poller channel and API tool (#3650)
Add Notion integration with two components:
- NotionChannel: polls a Notion database for tasks with configurable
  status properties, concurrency limits, and stale task recovery
- NotionTool: provides CRUD operations (query_database, read_page,
  create_page, update_page) for agent-driven Notion interactions

Includes config schema (NotionConfig), onboarding wizard support,
and full unit test coverage for both channel and tool.

Supersedes #3609 — rebased on latest master to resolve merge conflicts
with swarm feature additions in config/mod.rs.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 00:55:23 -04:00
Argenis
62781a8d45
fix(lint): Box::pin crate::agent::run calls to satisfy large_futures (#3675)
Wrap all crate::agent::run() calls with Box::pin() across scheduler,
daemon, gateway tests, and main.rs to satisfy clippy::large_futures.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 00:54:27 -04:00
Chris Hengge
0adec305f9
fix(tools): qualify is_service_environment with super:: inside mod native_backend (#3659)
Commit 811fab3b added is_service_environment() as a top-level function and
called it from two sites. The call at line 445 is at module scope and resolves
fine. The call at line 1473 is inside mod native_backend, which is a child
module — Rust does not implicitly import parent-scope items, so the unqualified
name fails with E0425 (cannot find function in this scope).

Fix: prefix the call with super:: so it resolves to the parent module's
function, matching how mod native_backend already imports other parent items
(e.g. use super::BrowserAction).

The browser-native feature flag is required to reproduce:
  cargo check --features browser-native  # fails without this fix
  cargo check --features browser-native  # clean with this fix
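
The scoping rule in miniature:

```rust
fn is_service_environment() -> bool {
    std::env::var("INVOCATION_ID").is_ok() // simplified detection
}

mod native_backend {
    fn apply_service_flags() {
        // A bare `is_service_environment()` here fails with E0425: child
        // modules do not implicitly see parent-scope items.
        if super::is_service_environment() {
            // apply --no-sandbox etc.
        }
    }
}
```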

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-16 00:35:09 -04:00
Argenis
75701195d7
feat(security): add Nevis IAM integration for SSO/MFA authentication (#3651)
* feat(security): add Nevis IAM integration for SSO/MFA authentication

Add NevisAuthProvider supporting OAuth2/OIDC token validation (local JWKS +
remote introspection), FIDO2/passkey/OTP MFA verification, session management,
and health checks. Add IamPolicy engine mapping Nevis roles to ZeroClaw tool
and workspace permissions with deny-by-default enforcement and audit logging.

Add NevisConfig and NevisRoleMappingConfig to config schema with client_secret
wired through SecretStore encrypt/decrypt. All features disabled by default.

Rebased on latest master to resolve merge conflicts in security/mod.rs (redact
function) and config/schema.rs (test section).

Original work by @rareba. Supersedes #3593.

Co-Authored-By: rareba <5985289+rareba@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: cargo fmt Box::pin calls

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: rareba <5985289+rareba@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 00:34:52 -04:00
Argenis
82fe2e53fd
feat(tunnel): add OpenVPN tunnel provider (#3648)
* feat(tunnel): add OpenVPN tunnel provider

Add OpenVPN as a new tunnel provider alongside cloudflare, tailscale,
ngrok, and custom. Includes config schema, validation, factory wiring,
and comprehensive unit tests.

Co-authored-by: rareba <rareba@users.noreply.github.com>
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix: add missing approval_manager field to ChannelRuntimeContext constructors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: rareba <rareba@users.noreply.github.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-16 00:34:34 -04:00
Argenis
327e2b4c47
style: cargo fmt Box::pin calls in cron scheduler (#3667)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 23:34:26 -04:00
Argenis
5a5d9ae5f9
fix(lint): Box::pin large futures in cron scheduler and cron_run tool (#3666)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 23:25:28 -04:00
Argenis
d8f228bd15
Merge pull request #3661 from zeroclaw-labs/work/multi-client-workspaces
feat(workspace): add multi-client workspace isolation
2026-03-15 23:10:01 -04:00
Argenis
d5eeaed3d9
Merge pull request #3649 from zeroclaw-labs/work/capability-tool-access
feat(security): add capability-based tool access control
2026-03-15 23:09:51 -04:00
Argenis
429094b049
Merge pull request #3660 from zeroclaw-labs/fix/cron-large-future
fix: add Box::pin for large future and missing approval_manager in tests
2026-03-15 23:08:47 -04:00
李龙 0668001470
2b30f060fe test(config): move initialized log regression away from merge hotspot 2026-03-16 11:06:18 +08:00
argenis de la rosa
80213b08ef feat(workspace): add multi-client workspace isolation
Add workspace profile management, security boundary enforcement, and
a workspace management tool for isolated client engagements.

Original work by @rareba. Supersedes #3597 — rebased on latest master.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 22:41:18 -04:00
argenis de la rosa
bf67124499 fix: add Box::pin for large future and missing approval_manager in tests
- Box::pin the cron_run execute_job_now call to satisfy clippy::large_futures
- Add missing approval_manager field to 4 query_classification test constructors

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 22:32:06 -04:00
李龙 0668001470
f994979380 fix(config): avoid clippy used_underscore_binding 2026-03-16 09:13:18 +08:00
argenis de la rosa
cb250dfecf fix: add missing approval_manager field to ChannelRuntimeContext test constructors
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 20:03:59 -04:00
argenis de la rosa
fabd35c4ea feat(security): add capability-based tool access control
Add an optional `allowed_tools` parameter that restricts which tools are
available to the agent. When `Some(list)`, only tools whose name appears
in the list are retained; when `None`, all tools remain available
(backward compatible). This enables fine-grained capability control for
cron jobs, heartbeat tasks, and CLI invocations.
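
In sketch form, with a stand-in `Tool` struct in place of the real tool
trait objects:

```rust
struct Tool { name: String }

fn filter_tools(tools: Vec<Tool>, allowed_tools: Option<&[String]>) -> Vec<Tool> {
    match allowed_tools {
        None => tools, // backward compatible: every tool stays available
        Some(list) => tools
            .into_iter()
            .filter(|t| list.contains(&t.name)) // retain only named tools
            .collect(),
    }
}
```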

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 19:34:34 -04:00
Argenis
a695ca4b9c
fix(onboard): auto-detect TTY instead of --interactive flag (#3573)
Remove the --interactive flag from `zeroclaw onboard`. The command now
auto-detects whether stdin/stdout are a TTY: if yes and no provider
flags are given, it launches the full interactive wizard; otherwise it
runs the quick (scriptable) setup path.

This means all three install methods work with a single flow:
  curl -fsSL https://zeroclawlabs.ai/install.sh | bash
  cargo install zeroclawlabs && zeroclaw onboard
  docker run … zeroclaw onboard --api-key …
2026-03-15 19:25:55 -04:00
Argenis
811fab3b87
fix(service): headless browser works in service mode (systemd/OpenRC) (#3645)
When zeroclaw runs as a service, the process inherits a minimal
environment without HOME, DISPLAY, or user namespaces. Headless
browsers (Chromium/Firefox) need HOME for profile/cache dirs and
fail with sandbox errors without user namespaces.

- Detect service environment via INVOCATION_ID, JOURNAL_STREAM,
  or missing HOME on Linux
- Auto-apply --no-sandbox and --disable-dev-shm-usage for Chrome
  in service mode
- Set HOME fallback and CHROMIUM_FLAGS on agent-browser commands
- systemd unit: add Environment=HOME=%h and PassEnvironment
- OpenRC script: export HOME=/var/lib/zeroclaw with start_pre()
  to create the directory

Closes #3584
2026-03-15 19:16:36 -04:00
Argenis
1a5d91fe69
fix(channels): wire query_classification config into channel message processing (#3619)
The QueryClassificationConfig was parsed from config but never applied
during channel message processing. This adds the query_classification
field to ChannelRuntimeContext and invokes the classifier in
process_channel_message to override the route when a classification
rule matches a model_routes hint.

Closes #3579
2026-03-15 19:16:32 -04:00
Argenis
6eec1c81b9
fix(ci): use ubuntu-22.04 for Linux release builds (#3573)
Build against glibc 2.35 to ensure binary compatibility with Ubuntu 22.04+.
2026-03-15 18:57:30 -04:00
Argenis
602db8bca1
fix: exclude name field from Mistral tool_calls (#3572)
* fix: exclude name field from Mistral tool_calls (#3572)

Add skip_serializing_if to the compatibility fields (name, arguments,
parameters) on the ToolCall struct so they are omitted from the JSON
payload when None. Mistral's API returns 422 "Extra inputs are not
permitted" when these extra null fields are present in tool_calls.

* fix: format serde attribute for CI lint compliance
2026-03-15 18:38:41 -04:00
simianastronaut
2539bcafe0 fix(gateway): pass bearer token in WebSocket subprotocol for dashboard auth
The dashboard WebSocket client was only sending ['zeroclaw.v1'] as the
protocols parameter, omitting the bearer token subprotocol. When
require_pairing = true, the server extracts the token from
Sec-WebSocket-Protocol as a fallback (browsers cannot set custom
headers on WebSocket connections). Without the bearer.<token> entry
in the protocols array, subprotocol-based authentication always failed.

Include `bearer.<token>` in the protocols array when a token is
available, matching the server's extract_ws_token() expectation.

Closes #3011

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 16:37:41 -04:00
simianastronaut
37d76f7c42 feat(config): support initial_prompt in transcription config for proper noun recognition
Add `initial_prompt: Option<String>` to `TranscriptionConfig` and pass
it as the `prompt` field in the Whisper API multipart POST when present.
This lets users bias transcription toward expected vocabulary (proper
nouns, technical terms) via the config file.
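
With reqwest's multipart API the conditional field is small; a hedged sketch
(the surrounding request code and the "whisper-1" model name are illustrative):

```rust
// Attach the configured initial_prompt as the Whisper `prompt` field when present.
fn build_form(
    audio: reqwest::multipart::Part,
    initial_prompt: Option<&str>,
) -> reqwest::multipart::Form {
    let mut form = reqwest::multipart::Form::new()
        .part("file", audio)
        .text("model", "whisper-1");
    if let Some(prompt) = initial_prompt {
        form = form.text("prompt", prompt.to_string());
    }
    form
}
```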

Closes #2881

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 16:31:55 -04:00
simianastronaut
41b46f23e3 docs(setup): add Docker/Podman stop/restart instructions
Users who installed via `./install.sh --docker` had no documented way to
restart the container after stopping it. Add clear lifecycle instructions
(stop, start, restart, logs, health check) to both the bootstrap guide and
the operations runbook, covering docker-compose, manual `docker run`, and
Podman-specific flags.

Closes #3474

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 16:26:28 -04:00
SimianAstronaut7
314e1d3ae8
Merge pull request #3638 from zeroclaw-labs/work-issues/3487-channel-approval-manager
fix(security): enforce approval policy for channel-driven runs
2026-03-15 16:11:14 -04:00
SimianAstronaut7
82be05b1e9
Merge pull request #3636 from zeroclaw-labs/work-issues/3628-surface-tool-failures-in-chat
feat(agent): surface tool call failure reasons in chat
2026-03-15 16:07:38 -04:00
SimianAstronaut7
1373659058
Merge pull request #3634 from zeroclaw-labs/work-issues/3477-fix-matrix-channel-key
fix(channel): use plain "matrix" channel key for consistent outbound routing
2026-03-15 16:07:36 -04:00
Argenis
c7f064e866
fix(channels): surface visible warning when whatsapp-web feature is missing (#3629)
The WhatsApp Web QR code was not shown during onboarding channel launch
because the wizard allowed configuring WhatsApp Web mode even when the
binary was built without the `whatsapp-web` feature flag. At runtime,
the channel was silently skipped with only a tracing::warn that most
users never see.

- Add compile-time warning in the onboarding wizard when WhatsApp Web
  mode is selected but the feature is not compiled in
- Add eprintln! in collect_configured_channels so users see a visible
  terminal warning when the feature is missing at startup

Closes #3577
2026-03-15 16:07:13 -04:00
Giulio V
9c1d63e109
feat(hands): add autonomous knowledge-accumulating agent packages (#3603)
Introduce the Hands system — autonomous agent packages that run on
schedules and accumulate knowledge over time. Each Hand maintains a
rolling context of findings across runs so the agent grows smarter
with every execution.

This PR adds:
- Hand definition type (TOML-deserializable, reuses cron Schedule)
- HandRun / HandRunStatus for execution records
- HandContext for rolling cross-run knowledge accumulation
- File-based persistence (load/save context as JSON)
- Directory-based Hand loading from ~/.zeroclaw/hands/*.toml
- 20 unit tests covering deserialization, persistence roundtrip,
  history capping, fact deduplication, and error handling

Execution integration with the agent loop is deferred to a follow-up.
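
A hedged sketch of the rolling context type; field and method names beyond
the concepts listed above are assumptions:

```rust
use serde::{Deserialize, Serialize};

#[derive(Default, Serialize, Deserialize)]
struct HandContext {
    facts: Vec<String>,   // accumulated knowledge, deduplicated
    history: Vec<String>, // rolling run summaries, capped
}

impl HandContext {
    fn add_fact(&mut self, fact: String) {
        if !self.facts.contains(&fact) {
            self.facts.push(fact); // fact deduplication
        }
    }

    fn record_run(&mut self, summary: String, max_history: usize) {
        self.history.push(summary);
        if self.history.len() > max_history {
            let overflow = self.history.len() - max_history;
            self.history.drain(..overflow); // history capping
        }
    }
}
```
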
2026-03-15 16:06:14 -04:00
Argenis
966edf1553
Merge pull request #3635 from zeroclaw-labs/chore/bump-v0.3.4
chore: bump version to v0.3.4
2026-03-15 15:59:51 -04:00
simianastronaut
a1af84d992 fix(security): enforce approval policy for channel-driven runs
Channel-driven runs (Telegram, Matrix, Discord, etc.) previously bypassed
the ApprovalManager entirely — `None` was passed into the tool-call loop,
so `auto_approve`, `always_ask`, and supervised approval checks were
silently skipped for all non-CLI execution paths.

Add a non-interactive mode to ApprovalManager that enforces the same
autonomy config policies but auto-denies tools requiring interactive
approval (since no operator is present on channel runs). Specifically:

- Add `ApprovalManager::for_non_interactive()` constructor that creates
  a manager which auto-denies tools needing approval instead of prompting
- Add `is_non_interactive()` method so the tool-call loop can distinguish
  interactive (CLI prompt) from non-interactive (auto-deny) managers
- Update tool-call loop: non-interactive managers auto-deny instead of
  the previous auto-approve behavior for non-CLI channels
- Wire the non-interactive approval manager into ChannelRuntimeContext
  so channel runs enforce the full approval policy
- Add 8 tests covering non-interactive approval behavior

Security implications:
- `always_ask` tools are now denied on channels (previously bypassed)
- Supervised-mode unknown tools are now denied on channels (previously
  bypassed)
- `auto_approve` tools continue to work on channels unchanged
- `full` autonomy mode is unaffected (no approval needed regardless)
- `read_only` mode is unaffected (blocks execution elsewhere)
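
A condensed sketch of the decision flow; only `for_non_interactive` and
`is_non_interactive` come from the change itself, the rest is assumed:

```rust
enum Decision { Allow, Deny, PromptOperator }

struct ApprovalManager { interactive: bool, auto_approve: Vec<String> }

impl ApprovalManager {
    fn for_non_interactive(auto_approve: Vec<String>) -> Self {
        Self { interactive: false, auto_approve }
    }

    fn is_non_interactive(&self) -> bool { !self.interactive }

    fn check(&self, tool: &str) -> Decision {
        if self.auto_approve.iter().any(|t| t == tool) {
            return Decision::Allow; // auto_approve tools run on channels unchanged
        }
        if self.is_non_interactive() {
            return Decision::Deny; // no operator present: deny instead of prompting
        }
        Decision::PromptOperator // interactive CLI path asks the operator
    }
}
```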

Closes #3487

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 15:56:57 -04:00
simianastronaut
0ad1965081 feat(agent): surface tool call failure reasons in chat progress messages
When a tool call fails (security policy block, hook cancellation, user
denial, or execution error), the failure reason is now included in the
progress message sent to the chat channel via on_delta. Previously only
a generic failure icon was shown; now users see the actual reason (e.g. "Command not
allowed by security policy") without needing to check `zeroclaw doctor
traces`.

Closes #3628

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 15:49:27 -04:00
argenis de la rosa
70e8e7ebcd chore: bump version to v0.3.4
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 15:44:59 -04:00
Alix-007
2bcb82c5b3
fix(python): point docs URL at master branch (#3334)
Co-authored-by: Alix-007 <Alix-007@users.noreply.github.com>
2026-03-15 15:43:35 -04:00
simianastronaut
e211b5c3e3 fix(channel): use plain "matrix" channel key for consistent outbound routing
The Matrix channel listener was building channel keys as `matrix:<room_id>`,
but the runtime channel mapping expects the plain key `matrix`. This mismatch
caused replies to silently drop in deployments using the Matrix channel.

Closes #3477

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 15:42:43 -04:00
Argenis
8691476577
Merge pull request #3624 from zeroclaw-labs/feat/multi-swarm-and-bugfixes
feat(swarm): multi-agent swarm orchestration + bug fixes (#3572, #3573)
2026-03-15 15:42:03 -04:00
SimianAstronaut7
e34a804255
Merge pull request #3632 from zeroclaw-labs/work-issues/3544-fix-codex-sse-buffering
fix(provider): use incremental SSE stream reading for openai-codex responses
2026-03-15 15:34:39 -04:00
SimianAstronaut7
6120b3f705
Merge pull request #3630 from zeroclaw-labs/work-issues/3567-allow-commands-bypass-high-risk
fix(security): let explicit allowed_commands bypass high-risk block
2026-03-15 15:34:37 -04:00
SimianAstronaut7
f175261e32
Merge pull request #3631 from zeroclaw-labs/work-issues/3486-fix-matrix-image-marker
fix(channels): use canonical IMAGE marker in Matrix channel
2026-03-15 15:34:31 -04:00
simianastronaut
fd9f66cad7 fix(provider): use incremental SSE stream reading for openai-codex responses
Replace full-body buffering (`response.text().await`) in
`decode_responses_body()` with incremental `bytes_stream()` chunk
processing.  The previous approach held the HTTP connection open until
every byte had arrived; on high-latency links the long-lived connection
would frequently drop mid-read, producing the "error decoding response
body" failure on the first attempt (succeeding only after retry).

Reading chunks incrementally lets each network segment complete within
its own timeout window, eliminating the systematic first-attempt failure.
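
A hedged sketch of the shape of the change (SSE line parsing elided):

```rust
use futures_util::StreamExt;

// Read the response incrementally so each chunk completes within its own
// timeout window, instead of holding the connection open for the full body.
async fn read_body_incrementally(response: reqwest::Response) -> anyhow::Result<Vec<u8>> {
    let mut stream = response.bytes_stream();
    let mut buf = Vec::new();
    while let Some(chunk) = stream.next().await {
        buf.extend_from_slice(&chunk?);
        // A real SSE decoder would parse complete `data:` lines from `buf` here.
    }
    Ok(buf)
}
```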

Closes #3544

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 15:22:55 -04:00
simianastronaut
d928ebc92e fix(channels): use canonical IMAGE marker in Matrix channel
Matrix image messages used lowercase `[image: ...]` format instead of
the canonical `[IMAGE:...]` marker used by all other channels (Telegram,
Slack, Discord, QQ, LinQ). This caused Matrix image attachments to
bypass the multimodal vision pipeline which looks for `[IMAGE:...]`.

Closes #3486

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 15:14:53 -04:00
simianastronaut
9fca9f478a fix(security): let explicit allowed_commands bypass high-risk block
When `block_high_risk_commands = true`, commands like `curl` and `wget`
were unconditionally blocked even if explicitly listed in
`allowed_commands`. This made it impossible to use legitimate API calls
in full autonomy mode.

Now, if a command is explicitly named in `allowed_commands` (not via
the wildcard `*`), it is exempt from the `block_high_risk_commands`
gate. The wildcard entry intentionally does NOT grant this exemption,
preserving the safety net for broad allowlists.

Other security gates (supervised-mode approval, rate limiting, path
policy, argument validation) remain fully enforced.
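
The exemption rule in sketch form:

```rust
fn blocked_as_high_risk(
    command: &str,
    block_high_risk: bool,
    high_risk: &[&str],
    allowed_commands: &[String],
) -> bool {
    if !block_high_risk || !high_risk.contains(&command) {
        return false;
    }
    // An explicit allowlist entry exempts the command; the wildcard "*" does not.
    !allowed_commands.iter().any(|c| c == command)
}
```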

Closes #3567

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 15:13:32 -04:00
SimianAstronaut7
7106632b51
Merge pull request #3627 from zeroclaw-labs/work-issues/3533-fix-utf8-slice-panic
fix(agent): use char-boundary-safe slicing to prevent CJK text panic
2026-03-15 14:36:49 -04:00
SimianAstronaut7
b834278754
Merge pull request #3626 from zeroclaw-labs/work-issues/3563-fix-cron-add-nl-security
fix(cron): add --agent flag so natural language prompts bypass shell security
2026-03-15 14:36:46 -04:00
SimianAstronaut7
186f6d9797
Merge pull request #3625 from zeroclaw-labs/work-issues/3568-http-request-private-hosts
feat(tool): add allow_private_hosts option to http_request tool
2026-03-15 14:36:44 -04:00
simianastronaut
6cdc92a256 fix(agent): use char-boundary-safe slicing to prevent CJK text panic
Replace unsafe byte-index string slicing (`&text[..N]`) with
char-boundary-safe alternatives in memory consolidation and security
redaction to prevent panics when multi-byte UTF-8 characters (e.g.
Chinese/Japanese/Korean) span the slice boundary.

Fixes the same class of bug as the prior fix in `execute_one_tool`
(commit 8fcbb6eb), applied to two remaining instances:
- `src/memory/consolidation.rs`: truncation at byte 4000 and 200
- `src/security/mod.rs`: `redact()` prefix at byte 4

Adds regression tests with CJK input for both locations.
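
The safe pattern, in sketch form:

```rust
// Truncate on a UTF-8 char boundary instead of panicking mid-character.
fn truncate_at_char_boundary(text: &str, max_bytes: usize) -> &str {
    if text.len() <= max_bytes {
        return text;
    }
    let mut end = max_bytes;
    while !text.is_char_boundary(end) {
        end -= 1; // walk back at most 3 bytes to the previous boundary
    }
    &text[..end]
}
```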

Closes #3533

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 14:27:09 -04:00
simianastronaut
02599dcd3c fix(cron): add --agent flag to CLI cron commands to bypass shell security validation
The CLI `cron add` command always routed the second positional argument
through shell security policy validation, which blocked natural language
prompts like "Check server health: disk space, memory, CPU load". This
adds an `--agent` flag to `cron add`, `cron add-at`, `cron add-every`,
and `cron once` so that natural language prompts are correctly stored as
agent jobs without shell command validation.

Closes #3563

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 14:26:45 -04:00
simianastronaut
fe64d7ef7e feat(tool): add allow_private_hosts option to http_request tool (#3568)
The http_request tool unconditionally blocked all private/LAN hosts with
no opt-out, preventing legitimate use cases like calling a local Home
Assistant instance or internal APIs. This adds an `allow_private_hosts`
config flag (default: false) under `[http_request]` that, when set to
true, skips the private-host SSRF check while still enforcing the domain
allowlist.
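
A hedged sketch of the gate ordering (the real SSRF check covers more
address ranges):

```rust
use std::net::IpAddr;

fn check_host(ip: IpAddr, allow_private_hosts: bool, domain_allowlisted: bool) -> Result<(), String> {
    if !domain_allowlisted {
        return Err("domain not in allowlist".into()); // always enforced
    }
    let private = match ip {
        IpAddr::V4(v4) => v4.is_private() || v4.is_loopback() || v4.is_link_local(),
        IpAddr::V6(v6) => v6.is_loopback(), // simplified: real checks cover more v6 ranges
    };
    if private && !allow_private_hosts {
        return Err("private/LAN host blocked; set allow_private_hosts = true to permit".into());
    }
    Ok(())
}
```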

Closes #3568

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-15 14:23:54 -04:00
argenis de la rosa
996dbe95cf feat(swarm): multi-agent swarm orchestration, Mistral tool fix, restore --interactive
- Add SwarmTool with sequential (pipeline), parallel (fan-out/fan-in),
  and router (LLM-selected) strategies for multi-agent workflows
- Add SwarmConfig and SwarmStrategy to config schema
- Fix Mistral 422 error by adding skip_serializing_if to ToolCall
  compat fields (name, arguments, parameters, kind) — Fixes #3572
- Restore `zeroclaw onboard --interactive` flag with run_wizard
  routing and mutual-exclusion validation — Fixes #3573
- 20 new swarm tests, 2 serialization tests, 1 CLI test, config tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 14:23:20 -04:00
Argenis
45f953be6d
Merge pull request #3578 from zeroclaw-labs/chore/bump-v0.3.3
chore: bump version to v0.3.3
2026-03-15 09:53:13 -04:00
argenis de la rosa
82f29bbcb1 chore: bump version to v0.3.3
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 09:41:22 -04:00
Argenis
93b5a0b824
feat(context): token-based compaction, persistent sessions, and LLM consolidation (#3574)
Comprehensive long-running context upgrades:

- Token-based compaction: replace message-count trigger with token
  estimation (~4 chars/token). Compaction fires when estimated tokens
  exceed max_context_tokens (default 32K) OR message count exceeds
  max_history_messages. Cuts at user-turn boundaries only.

- Persistent sessions: JSONL append-only session files per channel
  sender in {workspace}/sessions/. Sessions survive daemon restarts.
  Hydrates in-memory history from disk on startup.

- LLM-driven memory consolidation: two-phase extraction after each
  conversation turn. Phase 1 writes a timestamped history entry (Daily).
  Phase 2 extracts new facts/preferences to Core memory (if any).
  Replaces raw message auto-save with semantic extraction.

- New config fields: agent.max_context_tokens (32000),
  channels_config.session_persistence (true).
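
The trigger condition in sketch form, using the ~4 chars/token heuristic
from above:

```rust
fn should_compact(history: &[String], max_context_tokens: usize, max_history_messages: usize) -> bool {
    // Estimate tokens at roughly 4 characters per token.
    let estimated_tokens: usize = history.iter().map(|m| m.chars().count() / 4).sum();
    estimated_tokens > max_context_tokens || history.len() > max_history_messages
}
```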

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 09:25:23 -04:00
Argenis
08a67c4a2d
chore: bump version to v0.3.2 (#3564)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 06:47:37 -04:00
Argenis
c86a0673ba
feat(heartbeat): two-phase execution, structured tasks, and auto-routing (#3562)
Upgrade heartbeat system with 4 key improvements:

- Two-phase heartbeat: Phase 1 asks LLM "skip or run?" to save API cost
  on quiet periods. Phase 2 executes only selected tasks.
- Structured task format: `- [priority|status] task text` with
  high/medium/low priority and active/paused/completed status.
- Decision intelligence: LLM-driven smart filtering via structured prompt
  at temperature 0.0 for deterministic decisions.
- Delivery routing: auto-detect best configured channel when no explicit
  target is set (telegram > discord > slack > mattermost).

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-15 06:11:59 -04:00
Argenis
cabf99ba07
Merge pull request #3539 from zeroclaw-labs/cleanup
chore: add .wrangler/ to gitignore
2026-03-14 22:53:12 -04:00
argenis de la rosa
2d978a6b64 chore: add .wrangler/ to gitignore and clean up stale files
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 22:41:17 -04:00
Argenis
4dbc9266c1
Merge pull request #3536 from zeroclaw-labs/test/termux-release-validation
test: add Termux release validation script
2026-03-14 22:21:21 -04:00
argenis de la rosa
ea0b3c8c8c test: add Termux release validation script
Validates the aarch64-linux-android release artifact: download, archive
integrity, ELF format, architecture, checksum, and install.sh detection.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 22:10:33 -04:00
Argenis
0c56834385
chore(ci): remove sync-readme workflow and script (#3535)
The auto-generated What's New and Recent Contributors README sections
have been removed, so the sync-readme workflow and its backing script
are no longer needed.
2026-03-14 21:38:04 -04:00
Argenis
caccf0035e
docs(readme): remove stale What's New and Contributors sections (#3534)
Clear auto-generated sections so they repopulate correctly on the next
release via sync-readme workflow.
2026-03-14 21:36:26 -04:00
Argenis
627b160f55
fix(install): clean up stale cargo tracking from old package name (#3532)
When the crate was renamed from `zeroclaw` to `zeroclawlabs`, users who
installed via install.sh (which uses `cargo install --path`) had the
binary tracked under the old package name. Later running
`cargo install zeroclawlabs` from crates.io fails with:

  error: binary `zeroclaw` already exists as part of `zeroclaw v0.1.9`

The installer now detects and removes stale tracking for the old
`zeroclaw` package name before installing, so both install.sh and
`cargo install zeroclawlabs` work cleanly for upgrades.
2026-03-14 21:20:02 -04:00
Argenis
6463bc84b0
fix(ci): sync tweet and README after all release artifacts are published (#3531)
The tweet was firing on `release: published` immediately when the GitHub
Release was created, but Docker push, crates.io publish, and website
redeploy were still in-flight. This meant users saw the tweet before
they could actually pull the new Docker image or install from crates.io.

Convert tweet-release.yml and sync-readme.yml to reusable workflows
(workflow_call) and call them as the final jobs in both release
pipelines, gated on completion of docker, crates-io, and
redeploy-website jobs.

Before: gh release create → tweet fires immediately (race condition)
After:  gh release create → docker + crates + website → tweet + readme
2026-03-14 21:19:56 -04:00
ZeroClaw Bot
f84f1229af docs(readme): auto-sync What's New and Contributors 2026-03-15 01:07:01 +00:00
ZeroClaw Bot
f85d21097b docs(readme): auto-sync What's New and Contributors 2026-03-15 01:01:55 +00:00
Argenis
306821d6a2
Merge pull request #3530 from zeroclaw-labs/chore/bump-v0.3.1
chore: bump version to v0.3.1
2026-03-14 20:29:13 -04:00
argenis de la rosa
06f9424274 chore: bump version to v0.3.1
Patch release for Termux/Android support. Corrects the version from the mistaken v0.4.0 bump to v0.3.1.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 20:17:53 -04:00
Argenis
fa14ab4ab2
Merge pull request #3528 from zeroclaw-labs/chore/bump-v0.4.0
chore: bump version to v0.4.0
2026-03-14 20:06:32 -04:00
argenis de la rosa
36a0c8eba9 chore: bump version to v0.4.0
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 19:49:07 -04:00
Argenis
f4c82d5797
Merge pull request #3525 from zeroclaw-labs/feat/termux-release-support
feat(ci): add Termux (aarch64-linux-android) release target
2026-03-14 19:47:50 -04:00
ZeroClaw Bot
5edebf4869 docs(readme): auto-sync What's New and Contributors 2026-03-14 23:47:45 +00:00
argenis de la rosa
613fa79444 feat(ci): add Termux (aarch64-linux-android) release target
Add prebuilt binary support for Termux/Android devices:
- Add aarch64-linux-android to stable and beta release matrices with NDK setup
- Detect Termux in install.sh and map to the correct android target
- Add Termux pkg manager support for system dependency installation

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 19:36:44 -04:00
ZeroClaw Bot
44c8e1eaac docs(readme): auto-sync What's New and Contributors 2026-03-14 22:35:02 +00:00
Argenis
414b1fa8dd
fix(docs): remove stale onboarding flags after CLI changes (#3516)
Remove references to --interactive, --security-level, and --onboard
flags that no longer exist in the CLI. Update examples to use current
supported flags.

Closes #3484
2026-03-14 17:54:14 -04:00
Argenis
913d5ee851
fix(provider): skip empty text content blocks in Anthropic API requests (#3515)
Filter out empty/whitespace-only text content blocks from assistant,
tool, and user messages before sending to the Anthropic API.

Closes #3483
2026-03-14 17:52:53 -04:00
Argenis
5a4332d0e4
fix(install): avoid /dev/stdin file redirect in guided installer (#3514)
Probe /dev/stdin readability before selecting it, and add /proc/self/fd/0
as a fallback for constrained containers where /dev/stdin is inaccessible.

Closes #3482
2026-03-14 17:52:03 -04:00
ZeroClaw Bot
970c80d278 docs(readme): auto-sync What's New and Contributors 2026-03-14 21:46:24 +00:00
ZeroClaw Bot
9514c20038 docs(readme): auto-sync What's New and Contributors 2026-03-14 21:18:20 +00:00
Argenis
9cf31a732c
fix: increase recursion limit for matrix-sdk 0.16 on Rust 1.94+ (#3512)
matrix-sdk 0.16 generates deeply nested types that exceed the default
recursion limit (128) on newer Rust compilers. Bump to 256.

Closes #3468
2026-03-14 17:07:28 -04:00
Argenis
7f0ddf06a9
fix: wire Signal channel into scheduled announcement delivery (#3511)
Add SignalChannel import and match arm in deliver_announcement() so
cron jobs with delivery.channel = "signal" are handled instead of
rejected as unsupported.

Closes #3476
2026-03-14 17:07:26 -04:00
Argenis
b79e88662e
fix: resolve web dashboard 404 on static assets and SPA fallback (#3510)
* fix: resolve web dashboard 404 on static assets and SPA fallback

Strip leading slash from asset paths after prefix removal so rust-embed
can find the files. Return 503 with a helpful build hint when index.html
is missing instead of a generic 404.

Closes #3508

* fix: return concrete Response type to fix match arm type mismatch
2026-03-14 17:07:24 -04:00
ZeroClaw Bot
b71b69fa6e docs(readme): auto-sync What's New and Contributors 2026-03-14 21:02:25 +00:00
Argenis
8b961b7bd9
chore: bump version to v0.3.0 (#3509)
Milestone release covering:
- Comprehensive channel matrix test suite (50 tests across 16 platforms)
- Auto-synced README What's New and Contributors across 31 locales
- Anthropic consecutive message merging fix
- Nextcloud Talk Create event support
- Tweet content-hash dedup
- CI/CD stability improvements (Docker timeout, release token, tweet URLs)

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 16:55:38 -04:00
ZeroClaw Bot
3b67d8a797 docs(readme): auto-sync What's New and Contributors 2026-03-14 20:45:42 +00:00
Argenis
f0e111bf7b
feat(channels): comprehensive channel matrix tests + v0.2.2 (#3507)
* feat(channels): add comprehensive channel matrix tests and bump to v0.2.2

Add 50 integration tests covering the full Channel trait contract across
all 16 platforms: identity semantics, threading, draft lifecycle, reactions,
pinning, concurrency, edge cases, and cross-channel routing. Bump version
to 0.2.2.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix rustfmt formatting in channel matrix tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(lint): remove empty line after doc comment in channel matrix tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 16:34:00 -04:00
ZeroClaw Bot
b3071f622a docs(readme): auto-sync What's New and Contributors 2026-03-14 20:31:58 +00:00
Argenis
8dbf142c7b
feat(ci): auto-sync README What's New and Contributors on release (#3505)
* feat(ci): auto-sync README What's New and Contributors on release

Replace hardcoded What's New (v0.1.9b) and Recent Contributors sections
with marker-based auto-sync powered by a new GitHub Actions workflow.
The sync-readme workflow runs on each release and updates both sections
from git history.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(ci): auto-sync What's New and Contributors across all 31 READMEs

Add marker comments to all 30 translated README files and update the
sync script to process all README*.md files on each release. Fix regex
to handle empty markers on first run.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 16:00:18 -04:00
Argenis
2b3603bacf
fix(anthropic): merge consecutive same-role messages to prevent 500 errors (#3501)
When multiple tool calls execute in a single turn, each tool result was
emitted as a separate role="user" message. Anthropic's API rejects
adjacent messages with the same role, and newer models like
claude-sonnet-4-6 respond with 500 Internal Server Error instead of a
descriptive 400.

Merge consecutive same-role messages in convert_messages() so that
multiple tool_result blocks are combined into one user message, and
consecutive user/assistant messages are also properly coalesced.

Fixes #3493
2026-03-14 15:10:37 -04:00
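The coalescing might look roughly like the sketch below; `Role`, `ChatMessage`, and `merge_consecutive_roles` are illustrative stand-ins for the real `convert_messages()` types, not the actual code:

```rust
#[derive(Debug, Clone, PartialEq)]
enum Role {
    User,
    Assistant,
}

#[derive(Debug, Clone)]
struct ChatMessage {
    role: Role,
    content: Vec<String>, // simplified content blocks (text / tool_result)
}

/// Fold adjacent same-role messages into one so the API never sees
/// two consecutive messages with the same role.
fn merge_consecutive_roles(messages: Vec<ChatMessage>) -> Vec<ChatMessage> {
    let mut merged: Vec<ChatMessage> = Vec::with_capacity(messages.len());
    for msg in messages {
        match merged.last_mut() {
            // Same role as the previous message: fold the content blocks
            // into one message instead of emitting an adjacent duplicate.
            Some(prev) if prev.role == msg.role => prev.content.extend(msg.content),
            _ => merged.push(msg),
        }
    }
    merged
}

fn main() {
    let msgs = vec![
        ChatMessage { role: Role::Assistant, content: vec!["tool_use".into()] },
        ChatMessage { role: Role::User, content: vec!["tool_result A".into()] },
        ChatMessage { role: Role::User, content: vec!["tool_result B".into()] },
    ];
    // The two adjacent user tool_results collapse into a single message.
    assert_eq!(merge_consecutive_roles(msgs).len(), 2);
}
```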
Argenis
02369d892d
fix(ci): bump Docker timeout to 60m and skip duplicate crates.io publish (#3503)
2026-03-14 15:07:57 -04:00
argenis de la rosa
b38db505ff fix(ci): bump Docker timeout to 60m and skip duplicate crates.io publish
Multi-platform Docker builds (linux/amd64 + linux/arm64) with fat LTO
and codegen-units=1 consistently exceed 30 minutes. Bump to 60m.

Also skip crates.io publish when the version already exists (the
auto-sync workflow may have published it before the stable release
runs).
2026-03-14 14:56:16 -04:00
Argenis
06b9525263
fix(nextcloud_talk): accept "Create" event type for new messages (#3500)
Nextcloud Talk bot webhooks send event type "Create" for new chat
messages, but the parser only accepted "message". This caused all
valid messages to be skipped with "skipping non-message event: Create".

Accept both "message" and "Create" as valid event types.

Closes #3491
2026-03-14 14:48:48 -04:00
Argenis
8fe573330c
fix(tweet): add content-hash dedup to prevent duplicate tweets (#3499)
Uses GitHub Actions cache with SHA-256 hash of tweet content. If the
exact same tweet text was already posted in a previous run, the tweet
is skipped with a warning. Works for both auto-release and manual
dispatch tweets.

Three-layer dedup:
1. feat() commit check — skip if no new features since last release
2. Content hash — skip if identical tweet already posted (cache hit)
3. Beta-to-beta diff — only new features since previous release tag

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 14:14:28 -04:00
Argenis
dc61f08f60
fix(ci): preserve release URL in tweets instead of truncating (#3498)
2026-03-14 14:09:42 -04:00
Argenis
f7c8a3b8be
fix(ci): use RELEASE_TOKEN for stable release tag creation (#3496)
2026-03-14 14:03:34 -04:00
argenis de la rosa
6cd8749eb4 fix(ci): preserve release URL in tweets instead of truncating it
The 280-char truncation was blindly chopping the tweet text, cutting
off the GitHub release URL. Now we extract the URL first, truncate
only the body text to fit (accounting for X's 23-char t.co counting),
then re-append the full URL so it always renders correctly.
2026-03-14 13:56:18 -04:00
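A minimal sketch of that truncation strategy (X's real character-weighting rules are more involved; `truncate_preserving_url` and the constants are illustrative):

```rust
const TWEET_LIMIT: usize = 280;
const TCO_URL_LEN: usize = 23; // X counts any URL as 23 chars via t.co

/// Extract the trailing URL, trim only the body text to fit, then
/// re-append the full URL so it always renders as a working link.
fn truncate_preserving_url(text: &str) -> String {
    // Assume the release URL is the last https:// occurrence in the tweet.
    let Some(url_start) = text.rfind("https://") else {
        return text.chars().take(TWEET_LIMIT).collect();
    };
    let (body, url) = text.split_at(url_start);
    let budget = TWEET_LIMIT.saturating_sub(TCO_URL_LEN + 1); // +1 for the space
    let mut out: String = body.trim_end().chars().take(budget).collect();
    out.push(' ');
    out.push_str(url.trim());
    out
}

fn main() {
    let tweet = format!("{} https://github.com/example/releases/v1", "x".repeat(400));
    let t = truncate_preserving_url(&tweet);
    assert!(t.ends_with("https://github.com/example/releases/v1"));
}
```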
argenis de la rosa
c54f374c7e fix(ci): use RELEASE_TOKEN for stable release creation
GITHUB_TOKEN lacks permission to create tags, causing 'Published
releases must have a valid tag' errors. Use RELEASE_TOKEN (same PAT
the beta workflow uses successfully).
2026-03-14 13:49:58 -04:00
argenis de la rosa
dd00b57f6f fix: add dummy src/lib.rs in Dockerfile for dep caching stage 2026-03-14 13:49:19 -04:00
argenis de la rosa
c8db32389d feat: auto-sync crates.io on version bump
Triggers on push to master when Cargo.toml changes. Detects version
bumps by comparing current vs previous commit, checks crates.io to
avoid duplicate publishes, then publishes automatically.
2026-03-14 13:28:52 -04:00
Argenis
611e4702c2
Merge pull request #3495 from zeroclaw-labs/chore/bump-v0.2.1
chore: bump version to v0.2.1
2026-03-14 13:17:34 -04:00
argenis de la rosa
99cfae1f00 fix: use --no-verify and clean web source to prevent build.rs from re-running npm 2026-03-14 13:05:03 -04:00
argenis de la rosa
1c624f0c51 chore: bump version to v0.2.1
GitHub's immutability policy prevents reuse of the v0.2.0 tag after the
original release was deleted. Bump to v0.2.1 for a clean stable release.
2026-03-14 13:02:05 -04:00
argenis de la rosa
8a8f946269 fix: clean web/node_modules before cargo package to prevent leaking into crate 2026-03-14 12:59:06 -04:00
argenis de la rosa
17469945f7 fix: keep lib/bin name as zeroclaw while publishing as zeroclawlabs
The source uses `zeroclaw::` internally, so we need explicit bin/lib
names to avoid breaking crate references after the package rename.
2026-03-14 12:52:17 -04:00
argenis de la rosa
6dcd7b7222 fix: regenerate Cargo.lock after crate rename to zeroclawlabs 2026-03-14 12:48:21 -04:00
argenis de la rosa
342f743720 fix: rename crate to zeroclawlabs (zeroclaw taken by another user) 2026-03-14 12:47:13 -04:00
Argenis
e05fd75763
feat(providers): add 17 providers, total now 61 — most in any AI agent (#3492)
New fast inference providers:
- Cerebras, SambaNova, Hyperbolic

New model hosting platforms:
- DeepInfra, Hugging Face, AI21 Labs, Reka, Baseten, Nscale,
  Anyscale, Nebius AI Studio, Friendli AI, Lepton AI

New Chinese AI providers:
- Stepfun, Baichuan, 01.AI (Yi), Tencent Hunyuan

Also fixed missing list_providers() entries for Telnyx and Azure OpenAI.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 12:39:28 -04:00
argenis de la rosa
4daeb380a0 fix: allow-dirty flag for cargo publish (web/dist built in CI) 2026-03-14 12:06:17 -04:00
Argenis
ddacb2a917
feat: add crates.io publish workflow and package config (#3490)
- Add include filter to Cargo.toml (750 -> 235 files) for clean crate packaging
- Add publish-crates.yml workflow for manual crates.io publishing with dry-run support
- Add crates-io job to release-stable-manual.yml for auto-publish on stable releases
- Requires CARGO_REGISTRY_TOKEN secret to be configured in repo settings
2026-03-14 12:01:53 -04:00
Argenis
77ca576be6
feat(release+providers): fix release race condition, add 3 providers (#3489)
* feat(release+providers): fix release race condition, add 3 providers

Release fix (two parts):
1. Replace softprops/action-gh-release with `gh release create` — the
   CLI uploads assets atomically with the release in a single call,
   avoiding the immutable release race condition
2. Move website redeploy to a separate job with `if: always()` — so the
   website updates regardless of publish outcome

Both release-beta-on-push.yml and release-stable-manual.yml are fixed.

Provider additions:
- SiliconFlow (siliconflow, silicon-flow)
- AiHubMix (aihubmix)
- LiteLLM router (litellm, lite-llm)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: trigger CI

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 11:53:27 -04:00
Argenis
5a4f69e71f
Merge pull request #3488 from zeroclaw-labs/fix/release-immutable-asset-upload
fix(ci): fix immutable release error blocking release updates
2026-03-14 11:36:34 -04:00
argenis de la rosa
e952838eef fix(ci): harden release publish with notes-file and explicit repo flag
- Use --notes-file instead of --notes to safely handle multiline
  release notes with special characters
- Add explicit --repo flag for resilience
- Simplify redeploy-website condition to only run on publish success
2026-03-14 11:24:53 -04:00
argenis de la rosa
3fc4e66fb6 fix(ci): replace softprops/action-gh-release with gh CLI to fix immutable release error
The softprops/action-gh-release action creates the release first, then
uploads assets in separate API calls. GitHub's immutable release policy
rejects those subsequent uploads, causing "Cannot upload assets to an
immutable release" errors. The gh CLI uploads assets atomically with
release creation, avoiding this race condition.

Also moves website redeploy into a separate job that runs regardless of
publish outcome, so the site stays in sync even if asset upload fails.
2026-03-14 11:21:30 -04:00
Argenis
0a331a1440
docs: remove obsolete architecture/config/dev sections from all READMEs (#3485)
Remove Architecture, Security, Configuration, Python Companion, Identity
System, Gateway API, Commands, Development, CI/CD, and Build troubleshooting
sections from README.md and all 16 translated variants. These sections are
no longer relevant and the information lives in the docs/ directory.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 11:20:01 -04:00
guitaripod
5919becab9
fix(telegram): route image-extension Documents through vision pipeline (#3457)
Documents with image extensions (jpg, png, etc.) are routed to
[Document: name] /path instead of [IMAGE:/path], bypassing the
multimodal pipeline entirely. This causes the model to have no vision
input for images sent as Telegram Documents.

Re-applies fix from merged dev PR #1631 which was lost during the
master branch migration.

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-14 09:54:41 -04:00
guitaripod
287d9bdc17
fix(telegram): handle brackets in attachment filenames (#3458)
parse_attachment_markers uses .find(']') which matches the first ]
in the content. Filenames containing brackets (e.g. yt-dlp output
'Video [G4PvTrTp7Tc].mp4') get truncated at the inner bracket,
causing the send to fail with 'path not found'.

Uses depth-tracking bracket matching instead.

Re-applies fix from merged dev PR #1632 which was lost during the
master branch migration.

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-14 09:52:37 -04:00
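Depth-tracking bracket matching in this spirit could look like the following sketch; `find_marker_end` is an illustrative name, not the function from the actual parser:

```rust
/// Find the index of the `]` that closes the `[` at `open_idx`,
/// tracking nesting depth so brackets inside filenames don't end
/// the marker early.
fn find_marker_end(s: &str, open_idx: usize) -> Option<usize> {
    let bytes = s.as_bytes();
    debug_assert_eq!(bytes[open_idx], b'[');
    let mut depth = 0usize;
    for (i, &b) in bytes.iter().enumerate().skip(open_idx) {
        match b {
            b'[' => depth += 1,
            b']' => {
                depth -= 1;
                if depth == 0 {
                    return Some(i); // matching close for the opening bracket
                }
            }
            _ => {}
        }
    }
    None // unbalanced marker
}

fn main() {
    let s = "[ATTACH:Video [G4PvTrTp7Tc].mp4] hello";
    let end = find_marker_end(s, 0).unwrap();
    // .find(']') would have truncated at the inner bracket instead.
    assert_eq!(&s[..=end], "[ATTACH:Video [G4PvTrTp7Tc].mp4]");
}
```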
guitaripod
f900d7079e
feat(channels): add show_tool_calls config to suppress tool notifications (#3480)
* feat(channels): add show_tool_calls config to suppress tool notifications

When show_tool_calls is false, the ChannelNotifyObserver drains tool
events silently instead of forwarding them as individual messages to
the channel. Server-side logs remain unaffected.

Defaults to true for backwards compatibility.

* docs: add before/after screenshots for show_tool_calls PR

* docs(config): add doc comment on show_tool_calls field

---------

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-14 09:49:35 -04:00
Argenis
5a5b5a4402
fix(tweet): prevent duplicates, show contributor count, add hashtags (#3481)
- Only tweet when there are NEW feat() commits since the previous
  release (including betas) — skips if no new features to announce
- Feature diff uses previous release tag (beta-to-beta) to prevent
  listing the same features across consecutive releases
- Contributor count uses stable-to-HEAD range for full cycle credit
- Shows count instead of names to avoid 280 char clipping
- Added #zeroclaw #rust #ai #opensource hashtags

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 09:41:18 -04:00
Argenis
06f65fb711
feat(release): bump to v0.2.0 with auto-tweet and contributor notes (#3475)
* feat(release): bump to v0.2.0, auto-tweet with features + contributors

- Bump version from 0.1.9 to 0.2.0
- Release notes now auto-generate from feat() commits only (no bug
  fixes) keeping them clean and concise
- Contributors are always featured in release notes, pulled from
  git log authors and Co-Authored-By trailers
- Tweet workflow now fires on all releases (beta + stable), auto-
  generates punchy feature-focused tweets with contributor shoutouts
- Stable release workflow gets the same release notes treatment
- Tweet gracefully skips if Twitter secrets aren't configured

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(release): credit all contributors, wider range, filter bots

- Use wider git range for contributor collection — skips same-version
  betas to capture everyone who contributed to the release, not just
  the last incremental push
- Filter out bot accounts (dependabot, github-actions, copilot, etc.)
  from contributor lists
- Tweet shows up to 6 contributors with "+ N more" when there are
  extras, ensuring everyone gets credit
- Deduplicate contributors across git authors and Co-Authored-By
- Deduplicate features with sort -uf for cleaner notes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(release): use last stable tag for contributor range, not last beta

Ensures the tweet and release notes capture ALL contributors since the
last stable release (e.g. v0.1.9a), not just since the last beta tag.
This gives proper credit to everyone who contributed across the full
release cycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 07:59:04 -04:00
Argenis
46d4b13c22
feat(install): branded one-click installer with secure pairing flow (#3471)
* feat(install): consolidate one-click installer with branded output and inline onboarding

- Add blue color scheme with 🦀 crab emoji branding throughout installer
- Add structured [1/3] [2/3] [3/3] step output with ✓/·/✗ indicators
- Consolidate onboarding into install.sh: inline provider selection menu,
  API key prompt, and model override — no separate wizard step needed
- Replace --onboard/--interactive-onboard with --skip-onboard (opt-out)
- Add OS detection display, install method, version detection, upgrade vs
  fresh install logic
- Add post-install gateway service install/restart, doctor health check
- Add dashboard URL (port 42617) with clipboard copy and browser auto-open
- Add docs link (https://www.zeroclawlabs.ai/docs) to success output
- Display pairing code after onboarding in Rust CLI (src/main.rs)
- Remove --interactive flag from `zeroclaw onboard` CLI command
- Remove redundant scripts/install-release.sh legacy redirect
- Update all --interactive references across codebase

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* feat(onboard): auto-pair and include bearer token in dashboard URL

After onboarding, the CLI now auto-pairs using the generated pairing
code to produce a bearer token, then displays the dashboard URL with
the token embedded (e.g. http://127.0.0.1:42617?token=zc_...) so
users can access the dashboard immediately without a separate pairing
step. The token is also persisted to config for gateway restarts.

The install script captures this token-bearing URL from the onboard
output and uses it for clipboard copy and browser auto-open.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* security(onboard): revert token-in-URL, keep pairing code terminal-only

Removes the auto-pair + token-in-URL approach in favor of the original
secure pairing flow. Bearer tokens should never appear in URLs where
they can leak via browser history, Referer headers, clipboard, or
proxy logs. The pairing code stays in the terminal and the user enters
it in the dashboard to complete the handshake securely.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: apply cargo fmt formatting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 07:33:14 -04:00
Jacobinwwey
8fcbb6eb2d
fix(channels): harden slack threading and utf8 truncation (#3461)
* fix(channels): harden slack threading and utf8 truncation

* refactor(channel): collapse interrupt flags to satisfy clippy

---------

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-14 07:31:10 -04:00
Argenis
ce22eba7d0
fix(channels): proactively trim conversation history before provider call (#3473)
Conversation history on long-running channel sessions (e.g. Feishu) grew
unbounded until the provider returned a context-window-exceeded error.
The existing reactive compaction only kicked in *after* the error,
causing the user's message to be lost and requiring a resend.

Add proactive_trim_turns() which estimates total character count and
drops the oldest turns before the request reaches the provider. The
budget (400k chars ≈ 100k tokens) leaves headroom for system prompt,
memory context, and model output.

Closes #3460
2026-03-14 07:15:34 -04:00
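A simplified sketch of the proactive trim, assuming a plain `Turn` struct and a character-budget parameter; the real `proactive_trim_turns()` will differ in detail:

```rust
struct Turn {
    user: String,
    assistant: String,
}

/// Estimate history size by character count and drop the oldest turns
/// until the estimate fits the budget, always keeping the latest turn.
fn proactive_trim_turns(turns: &mut Vec<Turn>, budget_chars: usize) {
    let total = |ts: &[Turn]| -> usize {
        ts.iter().map(|t| t.user.len() + t.assistant.len()).sum()
    };
    while turns.len() > 1 && total(turns.as_slice()) > budget_chars {
        turns.remove(0); // oldest turn goes first
    }
}

fn main() {
    let mut turns = vec![
        Turn { user: "old".repeat(10), assistant: "old".repeat(10) },
        Turn { user: "new".into(), assistant: "reply".into() },
    ];
    proactive_trim_turns(&mut turns, 20);
    assert_eq!(turns.len(), 1); // oldest turn dropped before the provider call
}
```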
Argenis
7ba4d06e78
fix(docs): use absolute URL for install.sh One-Click Setup link (#3470)
The relative href="install.sh" in README nav headers resolves to
zeroclawlabs.ai/install.sh when viewed on the website, which returns
404 since the website does not serve repo-root files. Replace with
the raw.githubusercontent.com URL used elsewhere in the docs.

Fixes #3463
2026-03-14 07:14:47 -04:00
lilstaz
dc12d03876
feat(gateway): enable multi-turn chat for WebSocket connections (#3467)
Replace single-turn chat with a persistent Agent to maintain conversation
history across WebSocket turns within the same connection.

Co-authored-by: staz <starzwan2333@gmail.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-14 07:04:59 -04:00
Argenis
3151604b04
fix(ci): include install.sh in release assets and website dispatch (#3469)
The website at zeroclawlabs.ai shows a copy button with
curl -fsSL https://zeroclawlabs.ai/install.sh | bash but the website
does not serve the script, returning 404.

Add install.sh as a GitHub release asset in both beta and stable
workflows so it is always available at a stable URL. Pass the raw
GitHub URL in the website redeploy dispatch payload so the website
can configure a redirect or proxy for /install.sh.

Closes #3463
2026-03-14 07:04:17 -04:00
guitaripod
c5fcda06ad
fix(agent): add channel media markers to system prompt (#3459)
The system prompt has no documentation of channel media markers
([Voice], [IMAGE:], [Document:]), causing the LLM to misinterpret
transcribed voice messages as unprocessable audio attachments instead
of responding to the transcribed text content.

Re-applies fix from merged dev PR #1697 which was lost during the
master branch migration.

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-14 06:58:40 -04:00
Ericsunsk
51a52dcadb
fix(memory): pass embedding_routes in gateway and agent loop (#3462)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-14 06:56:55 -04:00
argenis de la rosa
dd9e26eac6 ci: trigger website redeploy after release publish
After publishing a beta or stable release, dispatch a
repository_dispatch event to zeroclaw-website so it rebuilds
with the latest version tag automatically.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-14 06:08:09 -04:00
argenis de la rosa
c4b2a21c61 ci: upgrade tweet-release with OpenClaw-style formatting, image support, and manual dispatch 2026-03-14 05:01:56 -04:00
argenis de la rosa
d6170ab49b ci: add tweet-release workflow to post to X on stable releases 2026-03-14 04:59:18 -04:00
Argenis
399c896c3b
chore: bump release notes to v0.1.9b (#3455)
Update What's New and Recent Contributors sections to reference
v0.1.9b release.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 22:57:11 -04:00
Argenis
71d32c3b04
docs(readme): add v0.1.9a release highlights and contributor credits (#3453)
* docs(readme): add v0.1.9a release highlights and contributor credits

Add "What's New in v0.1.9a" section covering web dashboard restyle,
new providers/channels, MCP tools, infrastructure updates, and fixes.
Add "Recent Contributors" table crediting key contributors with their
specific highlights for this release cycle.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(docs): use table format for release highlights to pass MD036 lint

Replace bold-text subsections with a table to satisfy the markdown
linter's no-emphasis-as-heading rule (MD036).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 22:41:11 -04:00
Christopher Wong
27936b051d
Windows/wizard deadlock (#3451)
* ci: add x86_64-pc-windows-msvc to build matrix

* fix: prevent test deadlock in ensure_onboard_overwrite_allowed

Gate non-interactive terminal check behind cfg!(not(test)) so tests with
force=false do not hang waiting on stdin. cfg!(test) path bails immediately
with a clear message. No changes to extra_headers, mcp, nodes, or shellexpand.

---------

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 21:59:05 -04:00
Alix-007
39d788a95f
fix(linq): accept latest webhook payload shape (#3351)
* fix(linq): accept current webhook payload shape

* style(linq): satisfy clippy lifetime lint

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-13 21:53:41 -04:00
Alix-007
5d921bd37d
fix(config): decrypt feishu channel secrets on load (#3355)
* fix(config): decrypt feishu channel secrets on load

* style(config): format feishu secret assertions

---------

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 21:53:11 -04:00
Alix-007
d17f0a946c
fix(config): recover docker runtime path on save (#3354)
* fix(config): recover docker runtime path on save

* style(config): align docker save path formatting

---------

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 21:52:40 -04:00
Alix-007
2710ce65cc
fix(cli): handle empty invocation before clap parse (#3353) 2026-03-13 21:52:15 -04:00
Alix-007
51e8fc8423
fix(install): restore legacy release installer path (#3352) 2026-03-13 21:51:56 -04:00
Alix-007
ecf9d477bd
fix(docs): use master install script URL (#3349)
Co-authored-by: Alix-007 <Alix-007@users.noreply.github.com>
2026-03-13 21:51:34 -04:00
Christopher Wong
625784c25f
ci: add x86_64-pc-windows-msvc to build matrix (#3449)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 21:45:08 -04:00
Asuta
348c0c37b7
feat(agent): support persisting and restoring interactive session state (#3421)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 18:55:42 -04:00
zq
11a1dae55b
docs(i18n/zh-CN): Complete full Chinese documentation translation and… (#3429)
* docs(i18n/zh-CN): Complete full Chinese documentation translation and i18n migration

  - Translate 58 core Chinese docs covering all modules (setup, reference, operations, security, hardware,
  contributor guides, etc.)
  - Migrate all .zh-CN.md files to i18n/zh-CN directory preserving original directory structure
  - Update internal links across 3 entry files:
    * Root README.zh-CN.md
    * docs/README.zh-CN.md
    * docs/SUMMARY.zh-CN.md
  - All links point to the correct i18n paths and resolve correctly
  - Align with Vietnamese i18n directory structure per project conventions

* fix(i18n/zh-CN): resolve all 30 blocking markdown lint errors
- Fix all MD022 blank lines around headings issues across 10 files
- Fix MD036 emphasis used as heading issue in refactor-candidates.md

* docs(i18n/zh-CN): resolve broken document reference link in root README.zh-CN.md

* fix(docs): repair double-quote HTML bug and broken relative links in zh-CN docs

- Remove stray extra `"` after href closing quotes in 6 HTML links in
  root README.zh-CN.md
- Fix relative link depth for files under docs/i18n/zh-CN/ that
  reference repo-root files (CONTRIBUTING.md, README.zh-CN.md,
  .github/workflows/, tests/, docs/SUMMARY.zh-CN.md): use correct
  number of `../` levels based on actual file location in the tree

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-13 18:46:29 -04:00
Argenis
dd1681be44
docs(i18n): add documentation hub translations for all 30 languages (#3450)
Add README and SUMMARY translations for 25 missing languages to match
the root-level README coverage. Update English docs hub and SUMMARY
with links to all localized hubs.

New languages: ar, bn, cs, da, de, el, es, fi, he, hi, hu, id, it,
ko, nb, nl, pl, pt, ro, sv, th, tl, tr, uk, ur. Also adds missing
SUMMARY.vi.md for Vietnamese.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 18:42:26 -04:00
Fausto
cabd3de3cb
Allow ZEROCLAW_PROVIDER_URL env variable to override api_url (#3414)
* Ignore JetBrains .idea folder

* fix(ollama): support stringified JSON tool call arguments

* providers: allow ZEROCLAW_PROVIDER_URL env var to override Ollama base URL

Supports container deployments where Ollama runs on a Docker network host
(e.g. http://ollama:11434) without requiring config.toml changes.

Includes regression test ensuring the environment override works.

* fix(clippy): replace Default::default() with ProviderRuntimeOptions::default()

---------

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 18:24:37 -04:00
Argenis
87cf6b0e93
feat(gateway): add dynamic node discovery and capability advertisement (#3448)
Add a WebSocket endpoint at /ws/nodes where external processes and
devices can connect and advertise their capabilities at runtime.
The gateway tracks connected nodes in a NodeRegistry and exposes
their capabilities as dynamically available tools via NodeTool.

- Add src/gateway/nodes.rs: WebSocket endpoint, NodeRegistry, protocol
- Add src/tools/node_tool.rs: Tool trait wrapper for node capabilities
- Add NodesConfig to config schema (disabled by default)
- Wire /ws/nodes route into gateway router
- Add NodeRegistry to AppState and all test constructions
- Re-export NodesConfig and NodeTool from module roots

Closes #3093
2026-03-13 18:23:48 -04:00
Marcelo Correa
2e2c1da4fa
fix(cron): skip unparseable job rows instead of aborting the scheduler (#3405)
A single cron job with a malformed `next_run` timestamp in the database
was silently stopping all scheduled jobs. The `due_jobs` query matched
rows whose `next_run` was lexicographically past-due (including
non-RFC3339 values like "2026-03-12 03:11:13" which sort before valid
RFC3339 strings), then `map_cron_job_row` failed to parse the timestamp,
the `row?` propagation caused `due_jobs` to return `Err`, and the
scheduler marked itself as `error` and skipped every subsequent tick —
taking down all other healthy jobs with it.

The fix changes the row iteration in `due_jobs` to log a warning and
skip unparseable rows rather than aborting the entire result set. Valid
jobs continue to fire; the broken row is surfaced in the logs without
collateral damage to the scheduler.

Co-authored-by: ZeroClaw <zeroclaw@users.noreply.github.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 18:17:08 -04:00
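The skip-and-warn iteration could be sketched like this, with `RawRow`/`CronJob` as simplified stand-ins for the real database row types:

```rust
#[derive(Debug)]
struct RawRow {
    id: i64,
    next_run: String,
}

#[derive(Debug)]
struct CronJob {
    id: i64,
    next_run_unix: i64,
}

// Stand-in for the timestamp parsing done in map_cron_job_row.
fn parse_row(row: &RawRow) -> Result<CronJob, String> {
    row.next_run
        .parse::<i64>()
        .map(|ts| CronJob { id: row.id, next_run_unix: ts })
        .map_err(|e| format!("bad next_run {:?}: {e}", row.next_run))
}

/// Skip unparseable rows with a warning instead of propagating the error
/// (the old `row?` turned one bad row into an Err for the whole set).
fn due_jobs(rows: Vec<RawRow>) -> Vec<CronJob> {
    rows.iter()
        .filter_map(|row| match parse_row(row) {
            Ok(job) => Some(job),
            Err(err) => {
                eprintln!("warning: skipping unparseable cron row {}: {err}", row.id);
                None
            }
        })
        .collect()
}

fn main() {
    let rows = vec![
        RawRow { id: 1, next_run: "1767225600".into() },
        RawRow { id: 2, next_run: "2026-03-12 03:11:13".into() }, // malformed
    ];
    assert_eq!(due_jobs(rows).len(), 1); // job 1 still fires; job 2 is logged
}
```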
SimianAstronaut7
736347c71b
fix(workflows): use RELEASE_TOKEN for beta release tag creation (#3366)
The default GITHUB_TOKEN cannot bypass the "Release Tags - Restricted
Operators" ruleset, causing beta releases to fail with a 422 validation
error. Switch to a PAT stored as RELEASE_TOKEN that has bypass permissions.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 18:09:03 -04:00
Argenis
c384c34c31
feat(provider): support custom API path suffix for custom: endpoints (#3447)
* feat(provider): support custom API path suffix for custom: endpoints

Allow users to configure a custom API path for custom/compatible
providers instead of hardcoding /v1/chat/completions. Some self-hosted
LLM servers use different API paths.

Adds an optional `api_path` field to:
- Config (top-level and model_providers profile)
- ProviderRuntimeOptions
- OpenAiCompatibleProvider

When set, the custom path is appended to base_url instead of the
default /chat/completions suffix.

Closes #3125

* fix: add missing api_path field to test ModelProviderConfig initializers
2026-03-13 17:54:21 -04:00
Argenis
4ca5fa500b
feat(web): preserve message draft in agent chat across view switches (#3443)
Add an in-memory DraftContext that persists textarea content when the
AgentChat component unmounts due to route navigation. The draft is
restored when the user returns to the chat view. The store is
session-scoped (not localStorage) so drafts are cleared on page reload.

Closes #3129
2026-03-13 17:40:23 -04:00
Argenis
1a0441a006
feat(web): electric blue dashboard restyle with animations and logo (#3445)
Restyle the entire web dashboard with an electric blue theme featuring
glassmorphism cards, smooth animations, and the ZeroClaw logo. Remove
duplicate Vite dev server infrastructure to ensure a single gateway.

- Add electric blue color palette and glassmorphism styling system
- Add 10+ keyframe animations (fade, slide, pulse-glow, shimmer, float)
- Restyle all 10 pages with glass cards and electric components
- Add ZeroClaw logo to sidebar, pairing screen, and favicon
- Remove Vite dev/preview scripts and proxy config (build-only now)
- Update pairing dialog with ambient glow and animated elements

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 17:27:41 -04:00
Argenis
ef770f15b9
feat(tool): on-demand MCP tool loading via tool_search (#3446)
Add deferred MCP tool activation to reduce context window waste.
When mcp.deferred_loading is true (the default), MCP tool schemas
are not eagerly included in the LLM context. Instead, only tool
names appear in an <available-deferred-tools> system prompt section,
and the LLM calls the built-in tool_search tool to fetch full schemas
on demand. Setting deferred_loading to false preserves the existing
eager behavior.

Closes #3095
2026-03-13 17:25:19 -04:00
Argenis
05a0cdf6f4
feat(tools): add Windows support for shell tool_call execution (#3442)
Use `cmd.exe /C` instead of `sh -c` on Windows via cfg(target_os).
Make the shell allowlist, forbidden paths, env vars, risk classification,
and path detection platform-aware so the shell tool works correctly on
Windows without changing Unix behavior.

Closes #3327

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 17:12:16 -04:00
Argenis
fc1b555b31
feat(matrix): add read markers and typing notifications (#3441)
Send a read receipt after receiving each message, start a typing
notification while processing, and stop it before sending the response.
This gives Matrix users visual feedback that the bot has seen their
message and is working on a reply.

Closes #3357

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 17:06:48 -04:00
Argenis
b8ebe7bcd3
feat(gateway): add cron run history API and dashboard panel (#3440)
Add GET /api/cron/{id}/runs?limit=N endpoint that returns recent stored
runs for a cron job, with server-side limit clamping to 1-100 (default 20).

Frontend adds a CronRun type, API client function, and an expandable
run history panel on the Cron page showing status, timestamps, duration,
and output for each run, with loading, empty, error, and refresh states.

Closes #3299

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 17:06:44 -04:00
Argenis
049edf4eec
feat(channel): add WeCom (WeChat Enterprise) Bot Webhook channel (#3439)
Implement the Channel trait for WeCom Bot Webhook, supporting
outbound text messages via the WeCom webhook API. The channel
is send-only; inbound messages can be routed through the gateway
webhook subsystem.

Closes #3396
2026-03-13 16:44:34 -04:00
Argenis
856a651dd1
feat(channels): add ack_reactions config to disable channel reactions (#3438)
Users can now set `ack_reactions = false` in `[channels_config]` to
suppress the 👀/⚠️ acknowledgement reactions on incoming messages.
The option defaults to `true`, preserving existing behavior.

Closes #3403
2026-03-13 16:43:47 -04:00
Argenis
fd3b17edc1
feat(docker): add Debian-based container variant with shell tools (#3437)
The default distroless image has no shell, preventing the agent from
using shell-based tools (pwd, ls, git, etc.). Add Dockerfile.debian
that uses debian:bookworm-slim as the runtime base and includes bash,
git, curl, and ca-certificates. The existing distroless Dockerfile
remains unchanged for security-conscious deployments.

Closes #3359
2026-03-13 16:40:29 -04:00
Argenis
a9c7697c6b
fix(slack): subscribe to thread message events in polling mode (#3435)
The polling-based Slack listener only called conversations.history, which
returns top-level channel messages but not thread replies. Users replying
inside a thread were invisible to the bot after its initial response.

Add conversations.replies polling for active threads discovered in
channel history. Track thread parents with reply_count > 0, periodically
fetch new replies, and emit them as ChannelMessage with the correct
thread_ts so the bot can continue multi-turn conversations in-thread.
Stale threads are evicted after 24 hours or when the tracker exceeds
50 entries.

Closes #3084
2026-03-13 16:33:26 -04:00
Argenis
939edf5e86
fix: expose MCP tools to delegate subagents (#3436)
MCP tools were not visible to delegate subagents because parent_tools
was a static snapshot taken before MCP tool wiring. Switch to interior
mutability (parking_lot::RwLock) so MCP wrappers pushed after
DelegateTool construction are visible at sub-agent execution time.

Closes #3069
2026-03-13 16:26:01 -04:00
Alix-007
bbc82fd4f9
fix(observability): support verbose backend selection (#3374)
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-13 16:15:43 -04:00
Argenis
a606f308f5
fix(security): respect allowed_roots in tool-level path pre-checks (#3434)
When workspace_only=true and allowed_roots is configured, several tools
(file_read, content_search, glob_search) rejected absolute paths before
the allowed_roots allowlist was consulted. Additionally, tilde paths
(~/...) passed is_path_allowed but were then incorrectly joined with
workspace_dir as literal relative paths.

Changes:
- Add SecurityPolicy::resolve_tool_path() to properly expand tilde
  paths and handle absolute vs relative path resolution for tools
- Add SecurityPolicy::is_under_allowed_root() for tool pre-checks to
  consult the allowed_roots allowlist before rejecting absolute paths
- Update file_read to use resolve_tool_path instead of workspace_dir.join
- Update content_search and glob_search absolute-path pre-checks to
  allow paths under allowed_roots
- Add tests covering workspace_only + allowed_roots scenarios

Closes #3082
2026-03-13 16:15:30 -04:00
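A rough sketch of the pre-check order described above; the path checks are simplified assumptions, not the real `SecurityPolicy` code:

```rust
use std::path::{Path, PathBuf};

/// Consult the allowed_roots allowlist before rejecting an absolute path.
fn is_under_allowed_root(path: &Path, allowed_roots: &[PathBuf]) -> bool {
    allowed_roots.iter().any(|root| path.starts_with(root))
}

/// workspace_only still holds: an absolute path must sit inside the
/// workspace or under one of the explicitly configured roots.
fn absolute_path_permitted(
    path: &Path,
    workspace_dir: &Path,
    allowed_roots: &[PathBuf],
) -> bool {
    path.starts_with(workspace_dir) || is_under_allowed_root(path, allowed_roots)
}

fn main() {
    let ws = Path::new("/home/user/.zeroclaw/workspace");
    let roots = vec![PathBuf::from("/srv/shared")];
    assert!(absolute_path_permitted(Path::new("/srv/shared/report.txt"), ws, &roots));
    assert!(!absolute_path_permitted(Path::new("/etc/passwd"), ws, &roots));
}
```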
Argenis
1162e3adc2
fix(channel): resolve Matrix channel compilation errors with channel-matrix feature (#3433)
Add missing Relation import from ruma events::room::message, remove
unused InReplyTo import, suppress unused matrix_skip_context variable,
and fix additional clippy lints (split_once, single-char patterns,
collapsible replace, wildcard match, ignored unit pattern).

Closes #3425
2026-03-13 15:45:28 -04:00
Alix-007
bd75799644
fix(build): unblock strict 32-bit no-default-features builds (#3375)
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-13 15:45:03 -04:00
Argenis
35217bf457
fix: use cfg-conditional AtomicU32 fallback for 32-bit targets in mcp_client (#3432)
PR #3409 fixed AtomicU64 usage on 32-bit targets in other files but
missed src/tools/mcp_client.rs. Apply the same cfg(target_has_atomic)
pattern used in channels/irc.rs to conditionally select AtomicU64 vs
AtomicU32.

Closes #3430
2026-03-13 15:33:31 -04:00
Alix-007
e5e3761020
fix(cron): support Matrix announce delivery (#3373)
* fix(cron): support Matrix announce delivery

* fix(cron): expose Matrix delivery in tool schemas
2026-03-13 15:16:10 -04:00
Vernon Stinebaker
a52446c637
feat(agent): add tool_filter_groups for per-turn MCP tool schema filtering (#3395)
* feat(agent): add tool_filter_groups for per-turn MCP tool schema filtering

Introduces per-turn MCP tool schema filtering to reduce token overhead when
many MCP tools are registered. Filtering is driven by a new config field
`agent.tool_filter_groups`, which is a list of named groups that each
specify tool glob patterns and an activation mode (`always` or `dynamic`).

Built-in (non-MCP) tools always pass through unchanged; the feature is fully
backward-compatible — an empty `tool_filter_groups` list (the default) leaves
all existing behaviour untouched.

Changes:
- src/config/schema.rs: add `ToolFilterGroupMode`, `ToolFilterGroup` types
  and `tool_filter_groups` field on `AgentConfig`
- src/config/mod.rs: re-export `ToolFilterGroup`, `ToolFilterGroupMode`
- src/agent/loop_.rs: add `glob_match()`, `filter_tool_specs_for_turn()`,
  `compute_excluded_mcp_tools()` helpers; wire call sites in both single-shot
  and interactive REPL modes; add unit tests for all three functions
- docs/reference/api/config-reference.md: document `tool_filter_groups`
  field and sub-table schema with example
- docs/i18n/el/config-reference.md: add Greek locale config-reference with
  `tool_filter_groups` section (2026-03-12 update)

* Remove accidentally committed worktree directories

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-13 14:23:57 -04:00
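A minimal recursive matcher in the spirit of the `glob_match()` helper, supporting only `*`; this is a sketch, not the implementation in src/agent/loop_.rs:

```rust
/// Match `name` against `pattern`, where `*` matches any run of characters.
fn glob_match(pattern: &str, name: &str) -> bool {
    fn inner(p: &[u8], n: &[u8]) -> bool {
        match (p.first(), n.first()) {
            (None, None) => true,
            (Some(b'*'), _) => {
                // `*` either consumes zero characters, or one and stays.
                inner(&p[1..], n) || (!n.is_empty() && inner(p, &n[1..]))
            }
            (Some(pc), Some(nc)) if pc == nc => inner(&p[1..], &n[1..]),
            _ => false,
        }
    }
    inner(pattern.as_bytes(), name.as_bytes())
}

fn main() {
    assert!(glob_match("mcp_github_*", "mcp_github_create_issue"));
    assert!(!glob_match("mcp_github_*", "mcp_slack_post"));
}
```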
Vernon Stinebaker
292952e563
feat(tools/mcp): add MCP subsystem tools layer with multi-transport client (#3394)
* feat(tools/mcp): add MCP subsystem tools layer with multi-transport client

Introduces a new MCP (Model Context Protocol) subsystem to the tools layer,
providing a multi-transport client implementation (stdio, HTTP, SSE) that
allows ZeroClaw agents to connect to external MCP servers and register their
exposed tools into the runtime tool registry.

New files:
- src/tools/mcp_client.rs: McpRegistry — lifecycle manager for MCP server connections
- src/tools/mcp_protocol.rs: protocol types (request/response/notifications)
- src/tools/mcp_tool.rs: McpToolWrapper — bridges MCP tools to ZeroClaw Tool trait
- src/tools/mcp_transport.rs: transport abstraction (Stdio, Http, Sse)

Wiring changes:
- src/tools/mod.rs: pub mod + pub use for new MCP modules
- src/config/schema.rs: McpTransport, McpServerConfig, McpConfig types; mcp field
  on Config; validate_mcp_config; mcp unit tests
- src/config/mod.rs: re-exports McpConfig, McpServerConfig, McpTransport
- src/channels/mod.rs: MCP server init block in start_channels()
- src/agent/loop_.rs: MCP registry init in run() and process_message()
- src/onboard/wizard.rs: mcp: McpConfig::default() in both wizard constructors

* fix(tools/mcp): inject MCP tools after built-in tool filter, not before

MCP servers are user-declared external integrations. The built-in
agent.allowed_tools / agent.denied_tools filter (filter_primary_agent_tools_or_fail)
governs built-in tool governance only. Injecting MCP tools before that
filter would silently drop all MCP tools when a restrictive allowlist is
configured.

Add ordering comments at both call sites (run() CLI path and
process_message() path) to make this contract explicit for reviewers
and future merges.

Identified via: shady831213/zeroclaw-agent-mcp@3f90b78

* fix(tools/mcp): strip approved field from MCP tool args before forwarding

ZeroClaw's security model injects `approved: bool` into built-in tool
args for supervised-mode confirmation. MCP servers have no knowledge of
this field and reject calls that include it as an unexpected parameter.

Strip `approved` from object-typed args in McpToolWrapper::execute()
before forwarding to the MCP server. Non-object args pass through
unchanged (no silent conversion or rejection).

Add two unit tests:
- execute_strips_approved_field_from_object_args: verifies removal
- execute_handles_non_object_args_without_panic: verifies non-object
  shapes are not broken by the stripping logic

Identified via: shady831213/zeroclaw-agent-mcp@c68be01

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
2026-03-13 14:23:48 -04:00
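The stripping step reduces to something like this sketch using serde_json; `strip_approved` is an illustrative name:

```rust
use serde_json::Value;

/// Remove ZeroClaw's internal `approved` flag before forwarding tool args
/// to an MCP server; non-object args pass through untouched.
fn strip_approved(mut args: Value) -> Value {
    if let Value::Object(ref mut map) = args {
        map.remove("approved");
    }
    args
}

fn main() {
    let args = serde_json::json!({ "path": "/tmp/x", "approved": true });
    assert!(strip_approved(args).get("approved").is_none());

    // Non-object shapes are neither converted nor rejected.
    let scalar = serde_json::json!(42);
    assert_eq!(strip_approved(scalar), serde_json::json!(42));
}
```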
SimianAstronaut7
d115b28a1f
fix(daemon): expand tilde to home directory in file paths (#3424)
Rust treats `~` as a literal path character, not a home directory
shorthand. Several config resolution paths used `PathBuf::from()` on
user-provided strings without expanding `~` first, causing a literal
`~` folder to be created in the working directory.

Apply `shellexpand::tilde()` to all user-facing path inputs:
- ZEROCLAW_CONFIG_DIR env var (config/schema.rs, onboard/wizard.rs)
- ZEROCLAW_WORKSPACE env var (config/schema.rs, onboard/wizard.rs,
  channels/matrix.rs)
- active_workspace.toml marker file config_dir (config/schema.rs)

The WhatsApp Web session_path was already correctly expanded via
shellexpand::tilde() in whatsapp_web.rs.

Closes #3417

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 14:22:57 -04:00
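Assuming the shellexpand crate named in the commit, the fix pattern is essentially:

```rust
use std::path::PathBuf;

/// Run user-provided paths through shellexpand::tilde() before
/// PathBuf::from(), so `~/x` becomes `$HOME/x` instead of a literal
/// `~` directory in the working directory.
fn expand_user_path(raw: &str) -> PathBuf {
    PathBuf::from(shellexpand::tilde(raw).into_owned())
}

fn main() {
    // With HOME=/home/user this yields /home/user/.zeroclaw, not ./~/.zeroclaw
    let p = expand_user_path("~/.zeroclaw");
    assert!(!p.starts_with("~"));
    println!("{}", p.display());
}
```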
Jacobinwwey
b563f5954e
fix(llamacpp): send responses fallback history in llama.cpp-compatible item shape (#3391)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 14:21:17 -04:00
SimianAstronaut7
e3e711073a
feat(providers): support custom HTTP headers for LLM API requests (#3423)
Add `extra_headers` config field and `ZEROCLAW_EXTRA_HEADERS` env var
support so users can specify custom HTTP headers for provider API
requests. This enables connecting to providers that require specific
headers (e.g., User-Agent, HTTP-Referer, X-Title) without a reverse
proxy.

Config file headers serve as the base; env var headers override them.
Format: `Key:Value,Key2:Value2`

Closes #3189

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-13 14:15:42 -04:00
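Parsing and merging the documented `Key:Value,Key2:Value2` format could look like this sketch (`parse_extra_headers` and `merged_headers` are illustrative names):

```rust
use std::collections::HashMap;

/// Parse the `Key:Value,Key2:Value2` format into a header map.
fn parse_extra_headers(spec: &str) -> HashMap<String, String> {
    spec.split(',')
        .filter_map(|pair| {
            let (k, v) = pair.split_once(':')?;
            Some((k.trim().to_string(), v.trim().to_string()))
        })
        .collect()
}

/// Config-file headers are the base; env var headers override them.
fn merged_headers(config: &str, env: Option<&str>) -> HashMap<String, String> {
    let mut headers = parse_extra_headers(config);
    if let Some(env_spec) = env {
        headers.extend(parse_extra_headers(env_spec));
    }
    headers
}

fn main() {
    let h = merged_headers("User-Agent:zeroclaw,X-Title:dev", Some("X-Title:prod"));
    assert_eq!(h["X-Title"], "prod"); // env var wins for the same key
    assert_eq!(h["User-Agent"], "zeroclaw");
}
```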
SimianAstronaut7
1033287f38
fix(gateway): skip pairing dialog when require_pairing is disabled (#3422)
When `require_pairing = false` in config, the dashboard showed the
pairing dialog even though no pairing code exists, creating a deadlock.

Add `requiresPairing` field to AuthState (defaults to `true` as a safe
fallback) and update it from the `/health` endpoint response. Gate the
pairing dialog in App.tsx on both `!isAuthenticated` and
`requiresPairing` so the dashboard loads directly when pairing is
disabled.

Closes #3267

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 14:08:59 -04:00
Argenis
6a4ccaeb73
fix: strip stale tool_result from conversation history and memory context (#3418)
Prevent orphan `<tool_result>` blocks from leaking into LLM sessions:

- Strip `<tool_result>` blocks from cached prior turns in
  `process_channel_message` so the LLM never sees a tool result
  without a preceding tool call (Case A — in-memory accumulation).
- Skip memory entries containing `<tool_result` in both
  `should_skip_memory_context_entry` (channel path) and
  `build_context` (agent path) so SQLite-recalled tool output
  is never injected as memory context (Case B — post-restart).

Closes #3402
2026-03-13 09:55:57 -04:00
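A simple scan in the spirit of Case A and Case B could look like the sketch below; the real stripping logic may differ:

```rust
/// Case A: remove `<tool_result>` blocks from cached prior turns so the
/// LLM never sees a result without its preceding tool call.
fn strip_tool_result_blocks(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut rest = text;
    while let Some(start) = rest.find("<tool_result") {
        out.push_str(&rest[..start]);
        match rest[start..].find("</tool_result>") {
            Some(end_rel) => rest = &rest[start + end_rel + "</tool_result>".len()..],
            None => rest = "", // unterminated block: drop the tail
        }
    }
    out.push_str(rest);
    out
}

/// Case B: never re-inject recalled tool output as memory context.
fn should_skip_memory_context_entry(entry: &str) -> bool {
    entry.contains("<tool_result")
}

fn main() {
    let turn = "before <tool_result>ls output</tool_result> after";
    assert_eq!(strip_tool_result_blocks(turn), "before  after");
    assert!(should_skip_memory_context_entry(turn));
}
```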
Argenis
4aead04916
fix: skip documentation URLs in cloudflare tunnel URL parser (#3416)
The URL parser captured the first https:// URL found in cloudflared
stderr output. When cloudflared emits a quic-go UDP buffer warning
containing a github.com link, that documentation URL was incorrectly
captured as the tunnel's public URL.

Extract URL parsing into a testable helper function that skips known
documentation domains (github.com, cloudflare.com/docs,
developers.cloudflare.com) and recognises tunnel-specific log prefixes
("Visit it at", "Route at", "Registered tunnel connection") and the
.trycloudflare.com domain.

Closes #3413
2026-03-13 09:40:02 -04:00
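A sketch of the hardened parser under those rules (the domain and prefix lists come from the commit text; the parsing details are assumptions):

```rust
/// Only accept a URL from lines carrying known tunnel prefixes or the
/// .trycloudflare.com domain, and skip documentation links outright.
fn parse_tunnel_url(line: &str) -> Option<String> {
    const DOC_DOMAINS: [&str; 3] =
        ["github.com", "cloudflare.com/docs", "developers.cloudflare.com"];
    const TUNNEL_PREFIXES: [&str; 3] =
        ["Visit it at", "Route at", "Registered tunnel connection"];

    let start = line.find("https://")?;
    let url: String = line[start..]
        .split_whitespace()
        .next()?
        .trim_end_matches(|c: char| !c.is_alphanumeric() && c != '/')
        .to_string();

    if DOC_DOMAINS.iter().any(|d| url.contains(d)) {
        return None; // quic-go warnings link to docs, not the tunnel
    }
    if url.contains(".trycloudflare.com")
        || TUNNEL_PREFIXES.iter().any(|p| line.contains(p))
    {
        return Some(url);
    }
    None
}

fn main() {
    let log = "INF Visit it at https://abc-def.trycloudflare.com";
    assert!(parse_tunnel_url(log).is_some());
    let warn = "See https://github.com/quic-go/quic-go/wiki/UDP-Buffer-Sizes";
    assert!(parse_tunnel_url(warn).is_none());
}
```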
Argenis
7b23c8934c
docs: clarify master-only branch policy and clean up stale references (#3415)
Add a prominent migration notice to CONTRIBUTING.md with explicit
instructions for contributors who still have local or forked main
branches. Fix the last remaining main branch reference in
python/pyproject.toml. Stale merged branches and main-related remote
branches have been deleted.

Refs: #2929, #3061
2026-03-13 09:38:11 -04:00
Argenis
5d1543100d
fix: support Linq 2026-02-03 webhook payload shape (#3337) (#3410)
Closes #3337

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:31:53 -04:00
Argenis
e3a91bc805
fix: gate prometheus and fix AtomicU64 for 32-bit targets (#3409)
* fix: gate prometheus and fix AtomicU64 for 32-bit targets (#3335)

Closes #3335

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* style: fix import ordering for cfg-gated atomics

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:31:25 -04:00
Argenis
833fdefbe5
fix: restore MCP support missing from master branch (#3412)
MCP (Model Context Protocol) config and tool modules were added on the
old `main` branch but never made it to `master`. This restores the full
MCP subsystem: config schema, transport layer (stdio/HTTP/SSE), client
registry, tool wrapper, config validation, and channel wiring.

Closes #3379

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:20:37 -04:00
Argenis
13f74f0ecc
fix: restore web dashboard by auto-building frontend in build.rs (#3408)
The build.rs was reduced to only creating an empty web/dist/ directory,
which meant rust-embed would embed no files and the SPA fallback handler
would return 404 for every request including `/`. This is a regression
from v0.1.8 where web/dist/ was still tracked in git.

Update build.rs to detect when web/dist/index.html is missing and
automatically run `npm ci && npm run build` if npm is available. The
build is best-effort: when Node.js is absent the Rust build still
succeeds with an empty dist directory (release workflows pre-build the
frontend separately).

Closes #3386

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:20:34 -04:00
Argenis
9ff045d2e9
fix: resolve install.sh prebuilt download 404 by querying releases API (#3406)
The /releases/latest/download/ URL only resolves to the latest non-prerelease,
non-draft release. When that release has no binary assets (e.g. v0.1.9a),
--prebuilt-only fails with a 404. This adds resolve_asset_url() which queries
the GitHub releases API for the newest release (including prereleases) that
actually contains the requested asset, falling back to /releases/latest/ if
the API call fails.

Closes #3389

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:20:31 -04:00
Argenis
6fe8e3a5bb
fix: gracefully handle reasoning_enabled for unsupported Ollama models (#3411)
When reasoning_enabled is configured, the Ollama provider sends
think=true to all models. Models that don't support the think parameter
(e.g. qwen3.5:0.8b) cause request failures that the reliable provider
classifies as retryable, leading to an infinite retry loop.

Fix: when a request with think=true fails, automatically retry once
with think omitted. This lets the call succeed on models that lack
reasoning support while preserving thinking for capable models.

Closes #3183
Related #850

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:16:03 -04:00
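The retry shape is small enough to sketch generically; `send` stands in for the actual Ollama HTTP call and `Option<bool>` for the optional think parameter:

```rust
/// If a request with think=true fails, retry exactly once with think
/// omitted; a second failure is treated as a real error.
fn chat_with_reasoning_fallback<F>(mut send: F) -> Result<String, String>
where
    F: FnMut(Option<bool>) -> Result<String, String>,
{
    match send(Some(true)) {
        Ok(resp) => Ok(resp),
        Err(_) => send(None), // model lacks reasoning support: drop think
    }
}

fn main() {
    let mut calls = 0;
    let resp = chat_with_reasoning_fallback(|think| {
        calls += 1;
        match think {
            Some(true) => Err("model does not support think".to_string()),
            _ => Ok("hello".to_string()),
        }
    });
    assert_eq!(resp, Ok("hello".to_string()));
    assert_eq!(calls, 2); // one failed think=true attempt, one clean retry
}
```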
Argenis
5dc1750df7
fix: add crypto.randomUUID fallback for older browsers (#3407)
Replace direct `crypto.randomUUID()` calls in the web dashboard with a
`generateUUID()` utility that falls back to a manual UUID v4 implementation
using `crypto.getRandomValues()` when `randomUUID` is unavailable (older
Safari, some Electron builds, Raspberry Pi browsers).

Closes #3303
Closes #3261

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-13 09:16:00 -04:00
SimianAstronaut7
b40c9e77af
Merge pull request #3365 from zeroclaw-labs/ci/fix-glibc-cache-mismatch
ci: pin release workflows to ubuntu-latest to fix glibc cache mismatch
2026-03-12 17:44:42 -04:00
simianastronaut
34cac3d9dd ci: pin release workflows to ubuntu-latest to fix glibc cache mismatch
CI workflows use ubuntu-latest (24.04, glibc 2.39) while release
workflows used ubuntu-22.04 (glibc 2.35). Swatinem/rust-cache keys
on runner.os ("Linux"), not the specific version, so cached build
scripts compiled on 24.04 would fail on 22.04 with GLIBC_2.39 not
found errors.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 17:37:31 -04:00
SimianAstronaut7
badf96dcab
Merge pull request #3363 from zeroclaw-labs/ci/faster-apple-build
ci: use thin LTO profile for faster CI builds
2026-03-12 17:31:29 -04:00
simianastronaut
c1e1228fb0 ci: use thin LTO profile for faster CI builds
The release profile uses fat LTO + codegen-units=1, which is
optimal for distribution binaries but unnecessarily slow for CI
validation builds. Add a dedicated `ci` profile with thin LTO and
codegen-units=16, and use it in both CI workflows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 17:18:36 -04:00
Alix-007
04ea5093d4 fix(config): log existing config as initialized 2026-03-13 02:41:39 +08:00
SimianAstronaut7
d2b923ae07
Merge pull request #3322 from zeroclaw-labs/work-issues/2984-fix-cli-chinese-input-crash
fix(agent): use byte-level stdin reads to prevent CJK input crash
2026-03-12 16:58:46 +00:00
SimianAstronaut7
21fdef95f4
Merge pull request #3324 from zeroclaw-labs/work-issues/2907-channel-send-message
feat(channel): add `channel send` CLI command for outbound messages
2026-03-12 16:58:44 +00:00
SimianAstronaut7
d02fbf2d76
Merge pull request #3326 from zeroclaw-labs/work-issues/2978-tool-call-dedup-exempt
feat(agent): add tool_call_dedup_exempt config to bypass within-turn dedup
2026-03-12 16:58:41 +00:00
SimianAstronaut7
05cede29a8
Merge pull request #3328 from zeroclaw-labs/fix/2926-configurable-provider-timeout
feat(provider): make HTTP request timeout configurable
2026-03-12 16:58:38 +00:00
SimianAstronaut7
d34a2e6d3f
Merge pull request #3329 from zeroclaw-labs/work-issues/2443-approval-manager-shadowed-binding
fix(channel): remove shadowed variable bindings in test functions
2026-03-12 16:58:35 +00:00
SimianAstronaut7
576d22fedd
Merge pull request #3330 from zeroclaw-labs/work-issues/2884-ws-token-query-param
fix(gateway): restore multi-source WebSocket auth token extraction
2026-03-12 16:58:32 +00:00
SimianAstronaut7
d5455c694c
Merge pull request #3332 from zeroclaw-labs/work-issues/2896-discord-ws-silent-stop
fix(channel): handle websocket Ping frames and read errors in Discord gateway
2026-03-12 16:58:30 +00:00
SimianAstronaut7
90275b057e
Merge pull request #3339 from zeroclaw-labs/work-issues/2880-fix-workspace-path-blocked
fix(security): allow absolute paths within workspace when workspace_only is set
2026-03-12 16:58:27 +00:00
SimianAstronaut7
d46b4f29d2
Merge pull request #3341 from zeroclaw-labs/work-issues/2403-telegram-photo-duplicate
fix(channel): prevent first-turn photo duplication in memory context
2026-03-12 16:58:23 +00:00
simianastronaut
f25835f98c fix(channel): prevent first-turn photo duplication in memory context (#2403)
When auto_save is enabled and a photo is sent on the first turn of a
Telegram session, the [IMAGE:] marker was duplicated because:

1. auto_save stores the photo message (including the marker) to memory
2. build_memory_context recalls the just-saved entry as relevant context
3. The recalled marker is prepended to the original message content

Fix: skip memory context entries containing [IMAGE:] markers in
should_skip_memory_context_entry so auto-saved photo messages are not
re-injected through memory context enrichment.

Closes #2403

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 12:03:29 -04:00
simianastronaut
376579f9fa fix(security): allow absolute paths within workspace when workspace_only is set (#2880)
When workspace_only=true, is_path_allowed() blanket-rejected all
absolute paths.  This blocked legitimate tool calls that referenced
files inside the workspace using an absolute path (e.g. saving a
screenshot to /home/user/.zeroclaw/workspace/images/example.png).

The fix checks whether an absolute path falls within workspace_dir or
any configured allowed_root before rejecting it, mirroring the priority
order already used by is_resolved_path_allowed().  Paths outside the
workspace and allowed roots are still blocked, and the forbidden-paths
list continues to apply to all other absolute paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:59:35 -04:00
simianastronaut
b620fd6bba fix(channel): handle websocket Ping frames and read errors in Discord gateway
The Discord gateway event loop silently dropped websocket Ping frames
(via a catch-all `_ => continue`) without responding with Pong. After
splitting the websocket stream into read/write halves, automatic
Ping/Pong handling is disabled, so the server-side (Cloudflare/Discord)
eventually considers the client unresponsive and stops sending events.

Additionally, websocket read errors (`Some(Err(_))`) were silently
swallowed by the same catch-all, preventing reconnection on transient
failures.

This patch:
- Responds to `Message::Ping` with `Message::Pong` to maintain the
  websocket keepalive contract
- Breaks the event loop on `Some(Err(_))` with a warning log, allowing
  the supervisor to reconnect

Closes #2896

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:15:53 -04:00
simianastronaut
98d6c5af9e fix(gateway): restore multi-source WebSocket auth token extraction (#2884)
The Electric Blue dashboard (PR #2804) sends the pairing token as a
?token= query parameter, but the WS handler only checked that single
source. Earlier PR #2193 had established a three-source precedence
chain (header > subprotocol > query param) which was lost.

Add extract_ws_token() with the documented precedence:
  1. Authorization: Bearer <token> header
  2. Sec-WebSocket-Protocol: bearer.<token> subprotocol
  3. ?token=<token> query parameter

This ensures browser-based clients (which cannot set custom headers)
can authenticate via query param or subprotocol, while non-browser
clients can use the standard Authorization header.

Includes 9 unit tests covering each source, precedence ordering,
and empty-value fallthrough.

Closes #2884

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:01:26 -04:00
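A sketch of that precedence chain over pre-extracted values (the real extract_ws_token() reads these from the request parts):

```rust
/// Documented precedence: header > subprotocol > query param, with
/// empty values falling through to the next source.
fn extract_ws_token(
    auth_header: Option<&str>,
    ws_protocol: Option<&str>,
    query_token: Option<&str>,
) -> Option<String> {
    // 1. Authorization: Bearer <token>
    if let Some(tok) = auth_header
        .and_then(|h| h.strip_prefix("Bearer "))
        .filter(|t| !t.is_empty())
    {
        return Some(tok.to_string());
    }
    // 2. Sec-WebSocket-Protocol: bearer.<token>
    if let Some(tok) = ws_protocol
        .and_then(|p| p.strip_prefix("bearer."))
        .filter(|t| !t.is_empty())
    {
        return Some(tok.to_string());
    }
    // 3. ?token=<token> query parameter (the only option for browsers
    //    that cannot set custom headers)
    query_token.filter(|t| !t.is_empty()).map(str::to_string)
}

fn main() {
    let t = extract_ws_token(None, Some("bearer.abc"), Some("xyz"));
    assert_eq!(t.as_deref(), Some("abc")); // subprotocol beats query param
}
```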
simianastronaut
c51ca19dc1 fix(channel): remove shadowed variable bindings in test functions (#2443)
Rename shadowed `histories` and `store` bindings in three test functions
to eliminate variable shadowing that is flagged under stricter lint
configurations (clippy::shadow_unrelated). The initial bindings are
consumed by struct initialization; the second bindings that lock the
mutex guard are now named distinctly (`locked_histories`, `cleanup_store`).

Closes #2443

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 11:00:31 -04:00
simianastronaut
ea6abc9f42 feat(provider): make HTTP request timeout configurable (#2926)
The provider HTTP request timeout was hardcoded at 120 seconds in
`OpenAiCompatibleProvider::http_client()`. This makes it configurable
via the `provider_timeout_secs` config key and the
`ZEROCLAW_PROVIDER_TIMEOUT_SECS` environment variable, defaulting
to 120s for backward compatibility.
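
A sketch of the resolution order (env var over config key, 120s default), with names taken from the message:

```rust
use std::time::Duration;

fn default_provider_timeout_secs() -> u64 {
    120 // preserves the previous hardcoded behavior
}

#[derive(serde::Deserialize)]
struct Config {
    #[serde(default = "default_provider_timeout_secs")]
    provider_timeout_secs: u64,
}

fn resolve_provider_timeout(cfg: &Config) -> Duration {
    let secs = std::env::var("ZEROCLAW_PROVIDER_TIMEOUT_SECS")
        .ok()
        .and_then(|v| v.parse().ok())
        .unwrap_or(cfg.provider_timeout_secs);
    Duration::from_secs(secs)
}
```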

Changes:
- Add `provider_timeout_secs` field to Config with serde default
- Add `ZEROCLAW_PROVIDER_TIMEOUT_SECS` env var override
- Add `timeout_secs` field and `with_timeout_secs()` builder on
  `OpenAiCompatibleProvider`
- Add `provider_timeout_secs` to `ProviderRuntimeOptions`
- Thread config value through agent loop, channels, gateway, and tools
- Use `compat()` closure in provider factory to apply timeout to all
  compatible providers

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 10:40:18 -04:00
simianastronaut
e2f6f20bfb feat(agent): add tool_call_dedup_exempt config to bypass within-turn dedup (#2978)
Add `agent.tool_call_dedup_exempt` config key (list of tool names) to
allow specific tools to bypass the within-turn identical-signature
deduplication check in run_tool_call_loop. This fixes the browser
snapshot polling use case where repeated calls with identical arguments
are legitimate and should not be suppressed.
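
A sketch of the exemption check inside the dedup gate (signature assumed):

```rust
use std::collections::HashSet;

fn is_duplicate_tool_call(
    tool_name: &str,
    call_signature: &str,
    seen_this_turn: &HashSet<String>,
    dedup_exempt: &[String],
) -> bool {
    // Exempt tools may legitimately repeat identical calls within a turn,
    // e.g. polling a browser snapshot until it changes.
    if dedup_exempt.iter().any(|t| t == tool_name) {
        return false;
    }
    seen_this_turn.contains(call_signature)
}
```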

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 10:28:42 -04:00
simianastronaut
88df3d4b2e feat(channel): add channel send CLI command for outbound messages (#2907)
Add a `zeroclaw channel send` subcommand that sends a one-off message
through a configured channel without starting the full agent loop.
This enables hardware sensor triggers (e.g., range sensors on
Raspberry Pi) to push notifications to Telegram and other platforms.

Usage:
  zeroclaw channel send 'Alert!' --channel-id telegram --recipient <chat_id>

Supported channels: telegram, discord, slack.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 10:18:04 -04:00
simianastronaut
0dba55959d fix(agent): use byte-level stdin reads to prevent CJK input crash
When running through a PTY chain (kubectl exec, SSH, remote terminals),
the transport layer may split data frames at space (0x20) boundaries,
interrupting multi-byte UTF-8 characters mid-sequence. Rust's
BufRead::read_line requires valid UTF-8 and returns InvalidData
immediately, crashing the interactive agent loop.

Replace stdin().read_line() with byte-level read_until(b'\n') followed
by String::from_utf8_lossy() in both the main input loop and the
/clear confirmation prompt. This reads raw bytes without UTF-8
validation during transport, then does lossy conversion (replacing any
truly invalid bytes with U+FFFD instead of crashing).
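
A sketch of the byte-level read path (the prompt-loop wiring is omitted):

```rust
use std::io::BufRead;

fn read_input_line() -> std::io::Result<Option<String>> {
    let stdin = std::io::stdin();
    let mut buf = Vec::new();
    // read_until performs no UTF-8 validation, so frames split
    // mid-character by a PTY chain cannot trigger InvalidData.
    let n = stdin.lock().read_until(b'\n', &mut buf)?;
    if n == 0 {
        return Ok(None); // EOF
    }
    // Lossy conversion maps truly invalid bytes to U+FFFD instead of failing.
    Ok(Some(String::from_utf8_lossy(&buf).trim_end().to_string()))
}
```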

Also set ENV LANG=C.UTF-8 in both Dockerfile stages as defense-in-depth
to ensure the container locale defaults to UTF-8.

Closes #2984

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 10:08:49 -04:00
SimianAstronaut7
893788f04d
fix(security): strip URLs before high-entropy token extraction (#3064) (#3321)
The credential leak detector's check_high_entropy_tokens would
false-positive on URL path segments (e.g. long alphanumeric filenames)
because extract_candidate_tokens included '/' in the token character
set, creating long mixed-alpha-digit tokens that exceeded the Shannon
entropy threshold.

Fix: strip URLs from content before extracting candidate tokens for
entropy analysis. Structural pattern checks (API keys, JWTs, AWS
credentials) use dedicated regexes and are unaffected.
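
A minimal sketch of the pre-pass, assuming a simple URL pattern (the real regex may be stricter):

```rust
fn strip_urls(content: &str) -> String {
    let url = regex::Regex::new(r"https?://\S+").expect("static pattern is valid");
    // Replace with a space so adjacent tokens do not fuse into one candidate.
    url.replace_all(content, " ").into_owned()
}
```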

Closes #3064

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:53:38 +00:00
SimianAstronaut7
0fea62d114
fix(tool): resolve Brave API key lazily with decryption support (#3078) (#3320)
WebSearchTool previously stored the Brave API key once at boot and never
re-read it. This caused three failures: (1) keys set after boot via
web_search_config were ignored, (2) encrypted keys passed as raw enc2:
blobs to the Brave API, and (3) keys absent at startup left the tool
permanently broken.

The fix adds lazy key resolution at execution time. A fast path returns
the boot-time key when it is plaintext and non-empty. When the boot key
is missing or still encrypted, the tool re-reads config.toml, decrypts
the value through SecretStore, and uses the result. This also means
runtime config updates (e.g. `web_search_config set brave_api_key=...`)
are picked up on the next search invocation.
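
A sketch of the two-path resolution; `Config::load`, `SecretStore`, and the field path are assumptions standing in for the real APIs:

```rust
fn resolve_brave_api_key(boot_key: &str) -> anyhow::Result<String> {
    // Fast path: plaintext key captured at boot.
    if !boot_key.is_empty() && !boot_key.starts_with("enc2:") {
        return Ok(boot_key.to_string());
    }
    // Slow path: re-read config.toml so post-boot updates are seen,
    // then decrypt enc2: blobs through the secret store (names assumed).
    let cfg = Config::load()?;
    SecretStore::open()?.decrypt(&cfg.web_search.brave_api_key)
}
```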

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:53:35 +00:00
SimianAstronaut7
cca3cf8f84
fix(agent): use char-boundary-safe slicing in scrub_credentials (#3024) (#3319)
Replace byte-level `&val[..4]` slice with `char_indices().nth(4)` to
prevent a panic when the captured credential value contains multi-byte
UTF-8 characters (e.g. Chinese text).  Adds a regression test with
CJK input.
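
A sketch of the boundary-safe preview (the redaction suffix is illustrative):

```rust
fn credential_preview(val: &str) -> String {
    // Take the first four *characters*, not bytes: a byte slice like
    // &val[..4] panics when index 4 falls inside a multi-byte sequence.
    let cut = val.char_indices().nth(4).map(|(i, _)| i).unwrap_or(val.len());
    format!("{}***", &val[..cut])
}
```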

Closes #3024

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:53:32 +00:00
SimianAstronaut7
7170810e98
fix(channel): drop MutexGuard before .await in WhatsApp Web listen (#3315)
Extract `self.bot_handle.lock().take()` into a separate `let` binding
so the parking_lot::MutexGuard is dropped before the `.await`, making
the listen future Send again.
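
The shape of the fix, with types simplified for illustration:

```rust
use std::sync::Arc;

async fn stop_bot(bot_handle: Arc<parking_lot::Mutex<Option<tokio::task::JoinHandle<()>>>>) {
    // The temporary guard from lock() is dropped at the end of this
    // statement, before any await point, so the future stays Send.
    let taken = bot_handle.lock().take();
    if let Some(handle) = taken {
        let _ = handle.await;
    }
}
```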

Closes #3312

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:16:40 +00:00
SimianAstronaut7
816fa87d60
fix(channel): restore Lark/Feishu channel compilation (#3318)
Replace the `use_feishu: bool` field on `LarkChannel` with a
`platform: LarkPlatform` enum field, add `mention_only` to the
`new_with_platform` constructor, and introduce `from_lark_config` /
`from_feishu_config` factory methods so the channel factory in
`mod.rs` and the existing tests compile.

Resolves #3302

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 13:16:31 +00:00
Argenis
dcffa4d7fb
fix(ci): add missing ci-run.yml workflow (#3268)
The ci-run.yml workflow was referenced in docs/contributing/ci-map.md
and branch protection rules but never existed in the repository,
causing push-triggered CI runs to fail immediately with zero jobs
and no logs.

This adds the workflow with all documented jobs: lint, strict delta
lint, test, build (linux + macOS), docs quality, and the CI Required
Gate composite check. Triggers on both push and pull_request to master.

Fixes #2853

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 12:48:16 +00:00
Argenis
e03dc4bfce
fix(security): unify cron shell validation across API/CLI/scheduler (#3270)
Centralize cron shell command validation so all entrypoints enforce the
same security policy (allowlist + risk gate + approval) before
persistence and execution.
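
A sketch of the single gate every entrypoint routes through; beyond `is_command_allowed` and the error format, the helper names are assumptions:

```rust
struct SecurityPolicy; // stand-in for the real security config

impl SecurityPolicy {
    fn is_command_allowed(&self, _cmd: &str) -> bool { true }
    fn passes_risk_gate(&self, _cmd: &str) -> bool { true }
}

fn validate_shell_command_with_security(cmd: &str, policy: &SecurityPolicy) -> Result<(), String> {
    if !policy.is_command_allowed(cmd) {
        return Err("blocked by security policy: command not on allowlist".to_string());
    }
    if !policy.passes_risk_gate(cmd) {
        return Err("blocked by security policy: failed risk gate".to_string());
    }
    Ok(())
}
```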

Changes:
- Add validate_shell_command() and validate_shell_command_with_security()
  as the single validation gate for all cron shell paths
- Add add_shell_job_with_approval() and update_shell_job_with_approval()
  that validate before persisting
- Add add_once_validated() and add_once_at_validated() for one-shot jobs
- Make raw add_shell_job/add_job/add_once/add_once_at pub(crate) to
  prevent unvalidated writes from outside the cron module
- Route gateway API through validated creation path
- Route schedule tool through validated helpers (single validation)
- Route cron_add/cron_update tools through validated helpers
- Unify scheduler execution validation via validate_shell_command_with_security
- CLI update handler uses full validate_command_execution instead of
  just is_command_allowed
- Add focused tests for validation parity across entrypoints
- Standardize error format to "blocked by security policy: {reason}"

Closes #2741
Closes #2742

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 12:48:13 +00:00
Darren.Zeng
2cb57e6b2d
fix(cli): honor config default_temperature in agent command (#3106)
* fix(cli): honor config default_temperature in agent command

Fixes #3033

The agent command was using a hardcoded default_value of 0.7 for the
--temperature parameter, which ignored the default_temperature setting
in the config file.

Changes:
- Changed temperature from f64 to Option<f64>
- Removed hardcoded default_value
- Use config.default_temperature when --temperature is not provided

Users can now set default_temperature in config.toml and have it
honored when running 'zeroclaw agent' without --temperature.

Risk: low (behavior change: now honors config instead of hardcoded value)

* fix: resolve moved value error for config in agent command

Extract final_temperature before passing config to agent::run() to
avoid use-of-moved-value error (config is moved as the first argument
while config.default_temperature was being accessed in the same call).
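
A sketch of the resolution and the move fix (argument order illustrative):

```rust
// --temperature is now Option<f64>; resolve the effective value *before*
// `config` is moved into agent::run(), avoiding the use-after-move error.
let final_temperature = cli_temperature.unwrap_or(config.default_temperature);
agent::run(config, final_temperature).await?;
```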

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 00:34:32 -04:00
Alix-007
079e972c81
fix(release): use master in cut_release_tag helper (#3249)
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-12 00:20:36 -04:00
Argenis
448682c440
feat(providers): add Azure OpenAI provider support (#3246)
Closes #3176
2026-03-12 00:06:11 -04:00
Argenis
3fe3fe23b1
feat(build): add 32-bit system support via feature gates (#3245)
Closes #3174
2026-03-12 00:06:08 -04:00
Alix-007
9e9052634d
fix(ci): use master merge-base in collect_changed_links (#3250)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:45:36 -04:00
Alix-007
b227b1abc3
fix(docs): correct default branch in triage snapshot (#3253)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:45:34 -04:00
Alix-007
fb9501afd5
fix(install): guard empty docker namespace args on Bash 3.2 (#3248)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:43:21 -04:00
Alix-007
9df0a76f4a
fix(ci): use master merge-base in docs_quality_gate (#3251)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:43:19 -04:00
Alix-007
9501555448
fix(ci): use master merge-base in rust_strict_delta_gate (#3252)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:43:16 -04:00
Alix-007
138ada0fd6
fix(release): lower GNU Linux build runner baseline (#3257)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:43:14 -04:00
Darren.Zeng
3c2c5aa78c
feat(web): add auto-expanding multiline chat composer (#3185)
* feat(web): add auto-expanding multiline chat composer

Convert chat input from single-line input to auto-expanding textarea:

- Replace input with textarea for multiline support
- Add auto-resize logic that grows up to max height (200px)
- Enter sends message, Shift+Enter adds new line
- Reset height after sending message
- Add overflow-y-auto for scrolling when max height reached
- Update placeholder to indicate Shift+Enter for new line

Fixes #3119

* fix: run cargo fmt to fix lint CI

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: trigger CI re-run

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 23:35:38 -04:00
ImanHashemi
273bd00d08
feat(hooks): add webhook-audit builtin hook (#3212)
Co-authored-by: ImanHashemi
2026-03-11 23:34:17 -04:00
Darren.Zeng
546873e2bb
feat(cargo): add channel-feishu feature alias for channel-lark (#3105)
Fixes #3012

Adds a feature alias `channel-feishu` that maps to `channel-lark`.
This makes it easier for Feishu users to discover and enable the feature,
as Lark and Feishu are the same platform with different branding.

Users can now use either:
  cargo install zeroclaw --features channel-lark
  cargo install zeroclaw --features channel-feishu

Risk: low (feature alias only, no functional change)

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:28:03 -04:00
Darren.Zeng
bf7f568eef
docs(contributing): update contributing guide (#3109)
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:28:01 -04:00
Darren.Zeng
d33b73974b
feat(provider): add vision support for Anthropic provider (#3177)
Add vision support to Anthropic provider to enable image understanding:

- Add ImageSource struct for Anthropic's image content block format
- Add Image variant to NativeContentOut enum
- Implement capabilities() returning vision: true
- Update convert_messages() to parse [IMAGE:...] markers and convert
them to Anthropic's native image content blocks
- Support both data URIs and local file paths
- Add comprehensive tests for vision functionality

Fixes #3163

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:27:58 -04:00
Darren.Zeng
a99ec631aa
feat(web): add per-message copy button on hover (#3178)
* feat(web): add per-message copy button on hover

Add a copy affordance to each chat message that appears on hover/focus:

- Add MessageCopyButton component with clipboard integration
- Button appears on hover/focus for both user and agent messages
- Shows checkmark icon briefly after successful copy
- Graceful fallback for non-secure contexts
- Keyboard accessible with focus state
- Does not disrupt message layout or text selection

Fixes #3120

* fix: run cargo fmt to fix lint CI

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: trigger CI re-run

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 23:27:56 -04:00
Darren.Zeng
ab6846cb9f
feat(channel): make email subject configurable (#3190)
Add default_subject field to EmailConfig to allow users to customize
the default subject line for outgoing emails. Previously hardcoded as
"ZeroClaw Message".

- Add default_subject field with serde default
- Update send() method to use configured default
- Add tests for new functionality

Fixes #2878

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:27:52 -04:00
Darren.Zeng
ca683816a7
fix(config): fix web_fetch allowed_domains serde default (#3192)
When [web_fetch] section was specified in config without explicit
allowed_domains, serde used Vec::default() (empty vector) instead of
the wildcard ["*"] default. This caused all web fetch requests to be
rejected unexpectedly.

Fix by adding explicit serde default function that returns vec!["*"].
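
A sketch of the corrected default:

```rust
fn default_allowed_domains() -> Vec<String> {
    vec!["*".to_string()] // documented wildcard default
}

#[derive(serde::Deserialize)]
struct WebFetchConfig {
    // Without the explicit default fn, a present-but-partial [web_fetch]
    // section deserializes to an empty Vec and rejects every request.
    #[serde(default = "default_allowed_domains")]
    allowed_domains: Vec<String>,
}
```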

Fixes #2941

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 23:27:49 -04:00
Argenis
59014641bf
fix(web): update rollup to patch path traversal vulnerability (#3258)
Updates rollup to 4.59.0 to fix CVE-2026-27606, an arbitrary file write
via path traversal vulnerability (GHSA-mw96-cpmx-2vgc).
Resolves Dependabot alert #2.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 23:24:04 -04:00
Darren.Zeng
8d7abb73e7
fix(config): accept openai-* aliases for wire_api config (#3191)
Add support for openai-responses, open-ai-responses, openai-chat-completions,
and open-ai-chat-completions as aliases for wire_api configuration values.
This aligns the parser with documented values that users expect to work.
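
A sketch of the alias handling; the canonical spellings on the left of each arm are assumptions:

```rust
enum WireApi {
    Responses,
    ChatCompletions,
}

fn parse_wire_api(value: &str) -> Option<WireApi> {
    match value.to_ascii_lowercase().as_str() {
        "responses" | "openai-responses" | "open-ai-responses" => Some(WireApi::Responses),
        "chat-completions" | "openai-chat-completions" | "open-ai-chat-completions" => {
            Some(WireApi::ChatCompletions)
        }
        _ => None,
    }
}
```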

Fixes #2735
2026-03-11 22:09:50 -04:00
Darren.Zeng
7ddf10f86d
feat(onboard): add --reinit flag to prevent accidental config overwrite (#3102)
* feat(onboard): add --reinit flag to prevent accidental config overwrite

Add --reinit flag to onboard command that:
- Backs up existing ~/.zeroclaw directory with timestamp
- Starts fresh initialization after backup
- Requires --interactive mode to work
- Prevents accidental configuration loss

This addresses issue #3013 where onboard could accidentally
overwrite all configuration without warning.

Closes #3013

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix(ci): SHA-pin all third-party GitHub Actions

Replace mutable version tags with immutable commit SHAs to prevent
tag-hijacking supply chain attacks (P1 finding).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* chore: retrigger CI after startup_failure

* fix(onboard): address PR #3102 review issues for --reinit flag

- Use resolve_runtime_dirs_for_onboarding() instead of hardcoded ~/.zeroclaw
- Remove unsafe relative path fallback, bail instead
- Add user confirmation prompt before reinitializing config
- Update docs/reference/cli/commands-reference.md with --reinit docs

* style: fix cargo fmt and clippy violations

- Fix import ordering in src/config/mod.rs (rustfmt)
- Collapse single-arg encrypt/decrypt calls in src/config/schema.rs (rustfmt)
- Box::pin large onboard futures to fix clippy::large_futures in src/main.rs

These violations were blocking CI lint checks.

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Simian Astronaut 7 <simianastronaut7@gmail.com>
2026-03-11 22:09:39 -04:00
Darren.Zeng
6352f024b2
docs(readme): add features documentation (#3107) 2026-03-11 22:09:36 -04:00
Darren.Zeng
ef2f8e9808
fix(daemon): handle SIGTERM for graceful shutdown (#3193)
The daemon previously only handled SIGINT (Ctrl+C), ignoring SIGTERM
which is the standard termination signal used by Docker, Kubernetes,
and systemd. This caused containers to wait for the grace period
then be force-killed with SIGKILL.

Now the daemon handles both SIGINT and SIGTERM for graceful shutdown.
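
A minimal sketch of the combined wait (Unix-only, using tokio's signal APIs):

```rust
use tokio::signal::unix::{signal, SignalKind};

async fn shutdown_signal() {
    let mut term = signal(SignalKind::terminate()).expect("install SIGTERM handler");
    // Resolve on whichever arrives first; both trigger graceful shutdown.
    tokio::select! {
        _ = tokio::signal::ctrl_c() => {}
        _ = term.recv() => {}
    }
}
```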

Fixes #2529
2026-03-11 22:09:34 -04:00
Darren.Zeng
6f482051ec
docs: replace main branch references with master (#3194)
The repository uses master as the default branch, but many documentation
files and scripts were referencing main branch URLs that would 404.

Updated all references from zeroclaw-labs/zeroclaw/main to
zeroclaw-labs/zeroclaw/master in README files and documentation.

Fixes #2929
Fixes #3061

Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 21:54:08 -04:00
Argenis
8bed9c5485
fix(ci,release): update all scripts from origin/main to origin/master (#3238)
* fix(ci): update rust_strict_delta_gate.sh to reference origin/master

* fix(ci): update docs_quality_gate.sh to reference origin/master

* fix(ci): update collect_changed_links.py to reference origin/master

* fix(release): update cut_release_tag.sh to reference origin/master
2026-03-11 21:53:48 -04:00
Argenis
6861664258
docs: add branching model section to CONTRIBUTING.md (#3237)
Adds a prominent Branching Model section to CONTRIBUTING.md explaining:
- master is the sole default branch (main has been removed)
- Contributors should use feat/* and fix/* branches off master
- Historical context for the migration (refs #2929, #3061, #3194)

Also updates fork workflow instructions to show both feat/ and fix/ prefixes.
2026-03-11 21:53:38 -04:00
Argenis
5cf1c77531
style: fix cargo fmt violations blocking CI lint (#3244)
* style: fix cargo fmt in ollama.rs test

* style: cargo fmt dispatcher.rs

* style: cargo fmt loop_.rs

* style: cargo fmt schema.rs

* style: cargo fmt mod.rs

* style: cargo fmt ws.rs

* style: cargo fmt ollama.rs
2026-03-11 21:53:25 -04:00
Argenis
483d3c0853
fix(ollama): strip <think> tags from Qwen responses and validate tool calls (#3079) (#3241)
- Strip `<think>...</think>` blocks in parse_tool_calls(), XmlToolDispatcher,
  and OllamaProvider before processing tool-call XML
- Add effective_content() fallback: when content is empty after stripping
  think tags, check the thinking field for tool-call XML
- Add strip_think_tags() to ollama.rs, loop_.rs, and dispatcher.rs (sketched below)
- Add comprehensive tests for think-tag stripping and tool-call parsing
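
A sketch of the stripper (regex form assumed):

```rust
fn strip_think_tags(content: &str) -> String {
    // (?s) lets `.` match newlines; non-greedy so multiple blocks are
    // removed independently before tool-call XML is parsed.
    let re = regex::Regex::new(r"(?s)<think>.*?</think>").expect("static pattern is valid");
    re.replace_all(content, "").trim().to_string()
}
```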

Fixes #3079

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:35:33 -04:00
Argenis
f5cd6baec3
fix(gateway): add /api/integrations/settings, echo WS protocol, persist session ID (#3242)
- #3009: Add handle_api_integrations_settings endpoint returning JSON with
  per-integration enabled/category/status so /api/integrations/settings no
  longer falls through to the SPA static handler.

- #3010: Extract Sec-WebSocket-Protocol header in handle_ws_chat and echo
  back "zeroclaw.v1" via ws.protocols() when the client requests it.

- #3038: Generate a persistent session_id (crypto.randomUUID stored in
  sessionStorage) in the web WS client and pass it as a query parameter.
  Add session_id: Option<String> to WsQuery on the server side so the
  backend can key conversations by session across reconnects.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:35:27 -04:00
Argenis
dfe0221e49
feat(web): auto-expanding chat textarea and copy-on-hover for messages (#3119, #3120) (#3243)
Replace single-line <input> with auto-expanding <textarea> (min 44px,
max 200px) that resizes on each keystroke. Add a copy button that
appears on hover for every message bubble, showing a checkmark on
successful clipboard write.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:35:22 -04:00
Argenis
12cfe48047
fix(config): add channel secrets roundtrip test and gitignore entries (#3126) (#3240)
Add a test verifying Telegram bot_token is encrypted on save and
decrypted on load, and add .gitignore entries for local state backups.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:35:11 -04:00
Jaime Linares
0479bfca36
fix(channel): reconnect WhatsApp Web session and show QR on logout (#3045)
Wraps the WhatsApp Web listen() in a reconnect loop. When Event::LoggedOut
fires, the bot is torn down, stale session DB files are deleted, and after
exponential backoff (3s base, 300s cap, max 10 retries) the loop restarts,
triggering fresh QR pairing.
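
A sketch of the delay schedule; the 3s base and 300s cap are from the message, the doubling factor is an assumption:

```rust
use std::time::Duration;

fn retry_delay(attempt: u32) -> Duration {
    let secs = 3u64.saturating_mul(1 << attempt.min(20)).min(300);
    Duration::from_secs(secs)
}
```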

- Broadcast channel for logout signaling from event handler to listen loop
- Session file cleanup (primary + WAL + SHM) only on explicit LoggedOut
- Proper resource ordering: client lock, abort handle, drop bot/device
- Tests for retry delay, counter, session purge, and file paths helpers
2026-03-11 20:30:28 -04:00
Argenis
67e581d8ae
fix(gateway): add health and pairing proxy routes to vite dev server (#3056)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 20:08:22 -04:00
Argenis
195c7ba919
fix(agent): resolve display text for tool-only turns (#3054) 2026-03-11 20:08:16 -04:00
James Cowan
bb66df5276
fix(config): encrypt and decrypt all channel secrets on save/load (#3217)
Adds symmetric encrypt/decrypt calls for all channel secret fields in
Config::save() and Config::load_or_init(). Previously only nostr.private_key
was handled, leaving all other channel secrets (bot_token, app_token,
access_token, api_token, password, etc.) and gateway.paired_tokens stored
as plaintext when secrets.encrypt = true.

Closes #3175, closes #3173.

Co-authored-by: jameslcowan
2026-03-11 20:02:20 -04:00
Thomas Tuffin
d47f6703d8
fix: revert Rust bump to 1.93 in Dockerfile (#3208)
Reverts Dockerfile from rust:1.94-slim to rust:1.93-slim.

Rust 1.94 triggers a recursion limit overflow in matrix-sdk 0.16.0
(rust-lang/rust#152942, matrix-org/matrix-rust-sdk#6254). No upstream
fix available yet. CI uses Rust 1.92 and is unaffected.

Fixes #3207
2026-03-11 19:38:36 -04:00
Alan P John
74fe29d772
feat: Add Opencode-go provider (#3113)
Adds opencode-go as a first-class provider with dedicated API endpoint,
env var, onboarding wizard wiring, and test coverage.

CI failures are pre-existing on master (Rust 1.94 formatting/lint changes per #3207).
2026-03-11 19:35:43 -04:00
Argenis
86ca34ac1f
fix(tests): update assertions for live tool call notifications (#3232)
The live tool call notifications feature sends extra messages to the
channel when tools are invoked. Update 5 tests that assumed exactly
1 sent message to instead check the last message in the list.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 19:26:45 -04:00
Argenis
f0f0f80895
feat(matrix): add multi-room support (#3224)
Enable a single Matrix bot instance to respond in multiple rooms:
- Disable the room_id filter so messages from all joined rooms are processed
- Embed room_id in reply_target as "user||room_id" for routing replies
- Include room_id in channel field for per-room conversation isolation
- Extract room_id from recipient in send() for correct message routing

The configured room_id still serves as a fallback for direct sends
without a "||" separator in the recipient.
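
A sketch of the recipient routing:

```rust
/// Split a "user||room_id" recipient; plain recipients fall back to the
/// configured room.
fn route_recipient<'a>(recipient: &'a str, default_room: &'a str) -> (&'a str, &'a str) {
    match recipient.split_once("||") {
        Some((user, room)) => (user, room),
        None => (recipient, default_room),
    }
}
```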

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 19:13:18 -04:00
Argenis
01bcba83ad
feat(matrix): add file upload handling and voice message support (#3222) 2026-03-11 19:12:44 -04:00
Argenis
4da85cee6e
chore(github): update review ownership routing (#3216)
- Add @theonlyhennygod as first-listed code owner on all CODEOWNERS paths
- Add SimianAstronaut7 as maintainer with PR approval authority in docs
- Normalize WORKFLOW_OWNER_LOGINS casing to canonical GitHub logins
2026-03-11 19:11:53 -04:00
Argenis
bdc0f325bf
feat(matrix): add pin and unpin message support (#3220)
Implement pin_message and unpin_message for the Matrix channel using
the m.room.pinned_events state event. Adds default no-op trait methods
to the Channel trait so other channel implementations are unaffected.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 19:10:08 -04:00
Abdullah Imad
248348bd80
feat(matrix): reaction and threading support (#3219)
Implement add_reaction/remove_reaction for Matrix channel using
ReactionEventContent and redaction. Add threading support via
Relation::Thread in send() and thread_ts extraction from incoming
messages, enabling threaded conversations.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 19:07:49 -04:00
Abdullah Imad
bd70c0f45b
feat(observer): live tool call notifications (#3221)
Add ChannelNotifyObserver that wraps the observer to forward tool-call
events as real-time threaded messages on messaging channels. Include
tool arguments (truncated) in ToolCallStart events for better
visibility into what tools are doing. Auto-thread final replies when
tools were used.

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 19:07:34 -04:00
Abdullah Imad
d0edcec1f9
feat(prompt): refresh stale datetime (#3223)
The system prompt is built once at daemon startup and cached. The
"Current Date & Time" section becomes stale immediately. This patch
replaces it with a fresh timestamp every time build_channel_system_prompt
is called (i.e. per incoming message).
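
A sketch of the per-message stamp (section title from the message; format string illustrative):

```rust
fn build_channel_system_prompt(base_prompt: &str) -> String {
    // Stamped on every call, so the prompt never carries a stale time.
    let now = chrono::Local::now().format("%Y-%m-%d %H:%M:%S");
    format!("{base_prompt}\n\n## Current Date & Time\n{now}")
}
```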

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 19:04:23 -04:00
SimianAstronaut7
9423b9c94e
Merge pull request #3215 from zeroclaw-labs/feat-readme-fix
docs(readme): remove retired social links
2026-03-11 22:30:30 +00:00
SimianAstronaut7
95da0062de
chore: bump version to 0.1.9 for stable release (#3225)
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 18:23:38 -04:00
SimianAstronaut7
7634d8d1f8
Merge branch 'master' into feat-readme-fix 2026-03-11 20:50:56 +00:00
JordanTheJet
744620bc34
Merge pull request #3214 from zeroclaw-labs/artifact-enhancement
fix(ci): downgrade action-gh-release to v2.4.2 to fix release finalization
2026-03-11 15:53:44 -04:00
argenis de la rosa
f67abd9fc8 docs(readme): remove retired social links
Remove WeChat, Xiaohongshu, and Telegram from the README social badges and the aligned locale entry points.
2026-03-11 15:48:43 -04:00
SimianAstronaut7
94a4d72e71
Merge branch 'master' into artifact-enhancement 2026-03-11 19:41:45 +00:00
Aleksandr Prilipko
39353748fa
fix(memory): resolve embedding api_key from embedding_provider env var, not default_provider key (#3184)
When embedding_provider differs from default_provider (e.g. default=gemini,
embedding=openai), the caller-supplied api_key belongs to the chat provider.
Passing it to the embedding endpoint causes 401 Unauthorized (gemini key
sent to api.openai.com/v1/embeddings).

Add embedding_provider_env_key() which looks up OPENAI_API_KEY,
OPENROUTER_API_KEY, or COHERE_API_KEY before falling back to the
caller-supplied key. This matches the provider-specific env var resolution
in providers/mod.rs without introducing cross-module coupling.
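
A sketch of the lookup order (env var names from the message; the fallback is the caller-supplied key):

```rust
fn embedding_provider_env_key(embedding_provider: &str, caller_key: &str) -> String {
    let var = match embedding_provider {
        "openai" => "OPENAI_API_KEY",
        "openrouter" => "OPENROUTER_API_KEY",
        "cohere" => "COHERE_API_KEY",
        _ => return caller_key.to_string(),
    };
    std::env::var(var).unwrap_or_else(|_| caller_key.to_string())
}
```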

Also add config_secrets_survive_save_load_roundtrip test: full save→load
cycle with channel credentials (telegram, discord, slack bot_token,
slack app_token) and gateway paired_tokens, verifying that enc2: values
are correctly decrypted by Config::load_or_init(). Regression guard for
issues #3173 and #3175.

Closes #3083

Co-authored-by: ZeroClaw Bot <zeroclaw_bot@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Argenis <theonlyhennygod@gmail.com>
2026-03-11 15:39:54 -04:00
Simian Astronaut 7
71f013af49 chore: update action-gh-release to version 2.4.2 in release workflows 2026-03-11 15:39:00 -04:00
Argenis
74f411bd2b
Merge pull request #3198 from panviktor/fix/channel-model-switch-resolve-routes
fix(channels): resolve provider from model_routes on /model, preserve context
2026-03-11 13:48:30 -04:00
Argenis
50a65ea0e5
Merge branch 'master' into fix/channel-model-switch-resolve-routes 2026-03-11 13:32:15 -04:00
Argenis
655e5fd56a
Merge pull request #3204 from zeroclaw-labs/codex/add-simian-codeowners-master
chore(codeowners): add SimianAstronaut7 to review routing
2026-03-11 12:33:14 -04:00
argenis de la rosa
072a10eff4 chore(codeowners): add SimianAstronaut7 to review routing 2026-03-11 12:32:49 -04:00
SimianAstronaut7
9baa02a40b
Merge pull request #3203 from zeroclaw-labs/build-artifact-fix
fix(ci): exclude stale artifacts from release and Docker builds
2026-03-11 16:31:35 +00:00
Simian Astronaut 7
0af0f0344e fix: remove data directory copy from Dockerfile and update artifact download patterns in release workflows 2026-03-11 12:20:54 -04:00
Argenis
9cfc88c38c
Merge pull request #3201 from whtiehack/pr/fix-channel-embedding-routes
fix(channels): pass embedding routes to channel memory init
2026-03-11 12:07:53 -04:00
Argenis
87127e2a08
Merge pull request #3200 from whtiehack/pr/fix-strip-tool-result-display
fix(agent): strip prompt-guided tool artifacts from visible replies
2026-03-11 12:07:08 -04:00
smallwhite
7cacdef2d1 fix(channels): pass embedding routes to channel memory init 2026-03-11 23:57:46 +08:00
SimianAstronaut7
9469b5bdbe
Merge pull request #3199 from zeroclaw-labs/e2e-testing
test: restructure test suite into five-level taxonomy. cleaned up + documented
2026-03-11 15:54:38 +00:00
Simian Astronaut 7
ecbe5e2c68 docs(testing): note both co-located and separate unit test patterns
The codebase uses both #[cfg(test)] blocks in source files (194 files)
and separate tests.rs files (src/agent/tests.rs). Document both.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 11:52:09 -04:00
Simian Astronaut 7
86a2fc2594 docs(testing): clarify unit tests are co-located in source files
Unit tests use #[cfg(test)] mod tests blocks inside src/**/*.rs,
not separate tests.rs files.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 11:50:47 -04:00
smallwhite
1c2a49459e fix(agent): strip prompt-guided tool artifacts from visible replies 2026-03-11 23:50:39 +08:00
SimianAstronaut7
e1f37a307e
Merge branch 'master' into e2e-testing 2026-03-11 15:26:19 +00:00
Argenis
fd40eeb408
Merge pull request #3182 from rareba/fix/cargo-fmt-config-imports
style: fix cargo fmt violations blocking all PR CI
2026-03-11 11:10:49 -04:00
panviktor
0ee3b6d617 fix(channels): resolve provider from model_routes on /model, preserve context
- `/model <name>` now auto-resolves provider from configured model_routes
  by matching model name or hint, fixing 404 when switching to models on
  different providers (e.g. `/model kimi-k2.5` with anthropic default)
- Conversation history is no longer cleared on `/model` or `/models` —
  users can explicitly reset via `/new`
- Matrix channel now supports `/model`, `/models`, and `/new` commands
- `/model` (no args) lists configured model routes with hints

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 16:10:42 +01:00
Simian Astronaut 7
d0781f31a5 test: restructure test suite into five-level taxonomy
Reorganize the flat tests/ directory into a structured taxonomy:
- component/ (174 tests): single-subsystem tests
- integration/ (65 tests): multi-component wiring tests
- system/ (5 tests): full-stack with real SQLite memory
- live/ (5 tests, #[ignore]): real external API tests
- manual/: shell/Python scripts for human-driven testing
- support/: shared mock infrastructure (MockProvider, EchoTool, etc.)
- fixtures/: JSON trace fixtures for declarative LLM replay

Deduplicates ~180 lines of identical mock code from agent_e2e.rs and
agent_loop_robustness.rs into shared support modules. Adds new component
tests for gateway (HMAC signature verification) and security (config
defaults, TOML round-trips). Adds system-level tests exercising the
full agent turn cycle with real SQLite memory.

Updates Cargo.toml with [[test]] entries, dev/ci.sh with level-specific
commands, and docs/contributing/testing.md with a comprehensive testing
guide.

Zero impact on src/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 11:03:33 -04:00
SimianAstronaut7
a3bc7a04dd
Merge pull request #3196 from zeroclaw-labs/hardening-gateway-slack-tts
fix(multi): harden gateway auth, Slack file handling, and TTS subprocess safety
2026-03-11 14:32:40 +00:00
Simian Astronaut 7
5901f70dc0 fix(clippy): box-pin large onboard wizard futures
The onboard wizard futures exceed clippy's large_futures threshold
(16KB+). Wrap in Box::pin to heap-allocate and fix the lint.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 10:21:46 -04:00
Simian Astronaut 7
026158557c chore: fix rustfmt violations from TTS provider merge
Import ordering in config/mod.rs and line-wrapping in config/schema.rs
were left unformatted by PR #2994. Run cargo fmt to fix.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 10:09:06 -04:00
Simian Astronaut 7
162a2b65a5 fix(multi): harden Edge TTS path validation, add subprocess timeout, restore Slack symlink protection
Edge TTS: reject binary_path containing path separators to prevent
arbitrary executable paths bypassing the basename allowlist. Wrap
subprocess execution in tokio::time::timeout (60s) matching HTTP
providers.

Slack: restore symlink_metadata check before rename to prevent
symlink-following attacks on attachment output paths.

Docs: add TTS module misplacement note to refactor-candidates.md —
tts.rs belongs in src/tools/, not src/channels/.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 10:05:52 -04:00
Simian Astronaut 7
e6bfb1ebee fix(slack): add HTTP timeouts, proper jitter, and cache bounds
HTTP clients had no timeouts and could hang forever; add 30s total and
10s connect timeouts. Replace clock-nanos-based jitter with rand::random
for proper randomness. Add a 1000-entry cap to the user display name
cache with expired-entry pruning. Fix truncate_text to avoid scanning
the full string twice when checking for truncation.
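
A sketch of the client construction (reqwest builder):

```rust
use std::time::Duration;

fn slack_http_client() -> reqwest::Result<reqwest::Client> {
    reqwest::Client::builder()
        .timeout(Duration::from_secs(30))         // total per-request cap
        .connect_timeout(Duration::from_secs(10)) // connection-establishment cap
        .build()
}
```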

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 08:58:20 -04:00
Simian Astronaut 7
9a6de3b204 fix(tts): move Google API key to header and sanitize provider inputs
Google TTS was passing the API key as a URL query parameter, which can
appear in logs and proxy access records. Move it to the x-goog-api-key
header instead. Add input validation for ElevenLabs voice IDs (reject
non-alphanumeric/dash/underscore characters) and restrict Edge TTS
binary_path to allowed basenames (edge-tts, edge-playback).
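
A sketch of the header-based call (endpoint shown for illustration; request body assumed):

```rust
async fn synthesize(
    client: &reqwest::Client,
    api_key: &str,
    request_body: &serde_json::Value,
) -> reqwest::Result<reqwest::Response> {
    client
        .post("https://texttospeech.googleapis.com/v1/text:synthesize")
        // Header instead of ?key=..., which lands in logs and proxy records.
        .header("x-goog-api-key", api_key)
        .json(request_body)
        .send()
        .await
}
```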

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 08:58:13 -04:00
Simian Astronaut 7
f894bc4140 fix(slack): only send bearer token on first redirect hop
Slack private file fetching was sending the bot token on every redirect
hop. Since Slack CDN redirects use pre-signed URLs, sending the bearer
token to CDN hosts is unnecessary credential exposure.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 08:58:05 -04:00
Simian Astronaut 7
a1c65558d8 fix(gateway): restrict admin endpoints to localhost and replace process::exit with graceful shutdown
Admin endpoints (/admin/shutdown, /admin/paircode, /admin/paircode/new) were
completely unauthenticated, allowing any network client to shut down the gateway
or read/generate pairing codes. Add require_localhost() guard that returns 403
for non-loopback IPs.
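
A sketch of the guard:

```rust
use std::net::SocketAddr;

fn require_localhost(peer: SocketAddr) -> Result<(), http::StatusCode> {
    // Admin endpoints answer loopback clients only; everyone else gets 403.
    if peer.ip().is_loopback() {
        Ok(())
    } else {
        Err(http::StatusCode::FORBIDDEN)
    }
}
```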

Replace std::process::exit(0) in shutdown handler with a tokio watch channel
for graceful shutdown, allowing proper destructor cleanup and connection
draining. Replace the 500ms sleep race in the restart command with a poll loop
that waits for the port to actually become free.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 08:57:34 -04:00
Giulio V
ed7c191fcd style: fix cargo fmt + clippy violations from TTS merge
- config/mod.rs: reorder TTS imports alphabetically (rustfmt)
- config/schema.rs: collapse single-arg encrypt/decrypt calls (rustfmt)
- main.rs: Box::pin large onboard futures to fix clippy::large_futures

These violations were introduced by the TTS providers merge (#2994)
and are blocking CI lint on all open PRs targeting master.
2026-03-11 11:41:11 +01:00
Giulio V
69abd217d7 style: fix cargo fmt violations in config module
The TTS providers merge (#2994) introduced import ordering and
line-wrapping that doesn't match rustfmt output, causing lint
failures on all open PRs targeting master.
2026-03-11 11:30:56 +01:00
Argenis
f2035819c2
Merge pull request #2994 from rareba/feature/tts-providers
feat(tts): add multi-provider TTS system
2026-03-11 05:23:50 -04:00
Argenis
46d68fc8ba
Merge pull request #3067 from kunalk16/fix-honor-default-temperature-agent-command
fix(config): honor default_temperature config while running "zeroclaw agent" without temperature parameter
2026-03-11 04:50:31 -04:00
Kunal Karmakar
06926a189d Merge branch 'master' of https://github.com/kunalk16/zeroclaw into fix-honor-default-temperature-agent-command 2026-03-11 08:36:35 +00:00
dependabot[bot]
8f68afb70c chore(deps): bump rust in the docker-all group
Bumps the docker-all group with 1 update: rust.


Updates `rust` from 1.93-slim to 1.94-slim

---
updated-dependencies:
- dependency-name: rust
  dependency-version: 1.94-slim
  dependency-type: direct:production
  dependency-group: docker-all
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-11 04:33:19 -04:00
Sid Jain
6016491985 fix(slack): harden redirect chain and auth health checks 2026-03-11 04:32:56 -04:00
Sid Jain
bf11b7b1a0 fix(slack): resolve clippy warnings in URL and tests 2026-03-11 04:32:56 -04:00
Sid Jain
436619b015 chore(slack): satisfy rustfmt wrap in MIME warning 2026-03-11 04:32:56 -04:00
Sid Jain
c6233b066c Revert "chore(docker): make cargo profile configurable for builds"
This reverts commit 942e6cb97452b3330827ea485473fc176e5bf103.
2026-03-11 04:32:56 -04:00
Sid Jain
b33af6894e fix(slack): redact private URLs and harden attachment writes 2026-03-11 04:32:56 -04:00
Sid Jain
f98a18502f chore(docker): make cargo profile configurable for builds 2026-03-11 04:32:56 -04:00
Sid Jain
14de5e527e fix(slack): preserve auth and validate image bytes 2026-03-11 04:32:56 -04:00
Sid Jain
7b7e08dc21 fix(slack): align cron workspace dir and socket retry backoff 2026-03-11 04:32:56 -04:00
Sid Jain
526d63fd75 feat: add slack file reading capability to zeroclaw 2026-03-11 04:32:56 -04:00
dependabot[bot]
e20e0f7cb0 chore(deps): bump the rust-all group across 1 directory with 10 updates
Bumps the rust-all group with 10 updates in the / directory:

| Package | From | To |
| --- | --- | --- |
| [tokio](https://github.com/tokio-rs/tokio) | `1.49.0` | `1.50.0` |
| [toml](https://github.com/toml-rs/toml) | `1.0.3+spec-1.1.0` | `1.0.6+spec-1.1.0` |
| [shellexpand](https://gitlab.com/ijackson/rust-shellexpand) | `3.1.1` | `3.1.2` |
| [fantoccini](https://github.com/jonhoo/fantoccini) | `0.22.0` | `0.22.1` |
| [uuid](https://github.com/uuid-rs/uuid) | `1.21.0` | `1.22.0` |
| [chrono](https://github.com/chronotope/chrono) | `0.4.43` | `0.4.44` |
| [which](https://github.com/harryfei/which-rs) | `8.0.0` | `8.0.2` |
| [rustls](https://github.com/rustls/rustls) | `0.23.36` | `0.23.37` |
| [libc](https://github.com/rust-lang/libc) | `0.2.182` | `0.2.183` |
| [tempfile](https://github.com/Stebalien/tempfile) | `3.25.0` | `3.26.0` |



Updates `tokio` from 1.49.0 to 1.50.0
- [Release notes](https://github.com/tokio-rs/tokio/releases)
- [Commits](https://github.com/tokio-rs/tokio/compare/tokio-1.49.0...tokio-1.50.0)

Updates `toml` from 1.0.3+spec-1.1.0 to 1.0.6+spec-1.1.0
- [Commits](https://github.com/toml-rs/toml/compare/toml-v1.0.3...toml-v1.0.6)

Updates `shellexpand` from 3.1.1 to 3.1.2
- [Commits](https://gitlab.com/ijackson/rust-shellexpand/compare/shellexpand-3.1.1...shellexpand-3.1.2)

Updates `fantoccini` from 0.22.0 to 0.22.1
- [Commits](https://github.com/jonhoo/fantoccini/compare/v0.22.0...v0.22.1)

Updates `uuid` from 1.21.0 to 1.22.0
- [Release notes](https://github.com/uuid-rs/uuid/releases)
- [Commits](https://github.com/uuid-rs/uuid/compare/v1.21.0...v1.22.0)

Updates `chrono` from 0.4.43 to 0.4.44
- [Release notes](https://github.com/chronotope/chrono/releases)
- [Changelog](https://github.com/chronotope/chrono/blob/main/CHANGELOG.md)
- [Commits](https://github.com/chronotope/chrono/compare/v0.4.43...v0.4.44)

Updates `which` from 8.0.0 to 8.0.2
- [Release notes](https://github.com/harryfei/which-rs/releases)
- [Changelog](https://github.com/harryfei/which-rs/blob/master/CHANGELOG.md)
- [Commits](https://github.com/harryfei/which-rs/compare/8.0.0...8.0.2)

Updates `rustls` from 0.23.36 to 0.23.37
- [Release notes](https://github.com/rustls/rustls/releases)
- [Changelog](https://github.com/rustls/rustls/blob/main/CHANGELOG.md)
- [Commits](https://github.com/rustls/rustls/compare/v/0.23.36...v/0.23.37)

Updates `libc` from 0.2.182 to 0.2.183
- [Release notes](https://github.com/rust-lang/libc/releases)
- [Changelog](https://github.com/rust-lang/libc/blob/0.2.183/CHANGELOG.md)
- [Commits](https://github.com/rust-lang/libc/compare/0.2.182...0.2.183)

Updates `tempfile` from 3.25.0 to 3.26.0
- [Changelog](https://github.com/Stebalien/tempfile/blob/master/CHANGELOG.md)
- [Commits](https://github.com/Stebalien/tempfile/commits/v3.26.0)

---
updated-dependencies:
- dependency-name: tokio
  dependency-version: 1.50.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: rust-all
- dependency-name: toml
  dependency-version: 1.0.6+spec-1.1.0
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-all
- dependency-name: shellexpand
  dependency-version: 3.1.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-all
- dependency-name: fantoccini
  dependency-version: 0.22.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-all
- dependency-name: uuid
  dependency-version: 1.22.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: rust-all
- dependency-name: chrono
  dependency-version: 0.4.44
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-all
- dependency-name: which
  dependency-version: 8.0.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-all
- dependency-name: rustls
  dependency-version: 0.23.37
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-all
- dependency-name: libc
  dependency-version: 0.2.183
  dependency-type: direct:production
  update-type: version-update:semver-patch
  dependency-group: rust-all
- dependency-name: tempfile
  dependency-version: 3.26.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: rust-all
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-11 04:32:15 -04:00
曾文锋0668000834
e3cc40c64c feat(gateway): add --new flag to get-paircode for non-disruptive pairing code generation
- Add --new flag to GetPaircode command in src/lib.rs
- Update main.rs to handle GetPaircode { new } parameter
- Add /admin/paircode/new POST endpoint in gateway/mod.rs
- Enhance documentation for constant_time_eq security function

Refs: #3015
2026-03-11 04:30:58 -04:00
曾文锋0668000834
9d182c6dd8 fix(security): restore constant-time comparison bitwise operator
The bitwise & operator is intentional in constant_time_eq() to prevent
timing side-channel attacks. Both comparisons must always execute to
ensure constant-time behavior regardless of the first comparison result.

- Revert logical && back to bitwise &
- Add #[allow(clippy::needless_bitwise_bool)] annotation
- Add explanatory comment documenting the intentional use (see the sketch below)
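
A sketch of the intended shape:

```rust
#[allow(clippy::needless_bitwise_bool)]
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    // Bitwise & is intentional: both operands are always evaluated, so
    // timing does not reveal which comparison failed. && would short-circuit.
    let mut diff = 0u8;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y;
    }
    (a.len() == b.len()) & (diff == 0)
}
```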
2026-03-11 04:30:58 -04:00
Darren.Zeng
d7d114eae7 Update package-lock.json 2026-03-11 04:30:58 -04:00
曾文锋0668000834
b1dfa192b8 fix(gateway): address CodeRabbit review feedback
- Add security warning for 0.0.0.0 binding in help text
- Implement proper gateway shutdown before restart via /admin/shutdown endpoint
- Fetch live pairing code from running gateway via /admin/paircode endpoint
- Extract duplicate code into helper functions
- Fix clippy warnings
2026-03-11 04:30:58 -04:00
曾文锋0668000834
7ed1bdc104 fix(gateway): address CodeRabbit review issues
- Add security warning for 0.0.0.0 binding in GatewayCommands::Start help
- Implement proper gateway shutdown via /admin/shutdown endpoint for Restart command
- Add /admin/paircode endpoint and update GetPaircode to fetch live pairing code
- Extract helper functions: resolve_gateway_addr, log_gateway_start, shutdown_gateway, fetch_paircode

Fixes: #3101
2026-03-11 04:30:58 -04:00
曾文锋0668000834
7f6bd651f7 fix(gateway): address CodeRabbit review feedback for PR #3101
- Fix Critical: Split illegal or-pattern (Some(...) | None) into separate match arms
- Fix Major: Implement restart command with graceful shutdown check
- Fix Major: Improve get-paircode to check gateway status and provide clear instructions
- Fix Minor: Update help text to document public-bind precondition

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-11 04:30:58 -04:00
曾文锋0668000834
491ac601e6 feat(gateway): add restart and get-paircode subcommands
Add GatewayCommands enum with three subcommands:
- start: Start the gateway server (default behavior preserved)
- restart: Restart the gateway server
- get-paircode: Show current pairing status without restarting

This improves gateway management by allowing users to:
1. Restart gateway without manual stop/start
2. Check pairing status without disrupting running gateway

Closes #3014
Closes #3015

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
2026-03-11 04:30:58 -04:00
dependabot[bot]
b4da085f0c chore(deps): bump actions/download-artifact from 4 to 8
Bumps [actions/download-artifact](https://github.com/actions/download-artifact) from 4 to 8.
- [Release notes](https://github.com/actions/download-artifact/releases)
- [Commits](https://github.com/actions/download-artifact/compare/v4...v8)

---
updated-dependencies:
- dependency-name: actions/download-artifact
  dependency-version: '8'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
2026-03-11 04:30:08 -04:00
Kunal Karmakar
0829bb92df Merge branch 'master' of https://github.com/kunalk16/zeroclaw into fix-honor-default-temperature-agent-command 2026-03-11 08:28:23 +00:00
曾文锋0668000834
7950f51bb9 fix(docker): add missing COPY data/ directive
Fixes #3063

The Dockerfile was missing the COPY directive for the data/ directory,
which contains security/attack-corpus-v1.jsonl required by the
semantic guard feature via include_str!() at compile time.

This caused Docker builds to fail with:
  error: couldn't read src/security/../../data/security/attack-corpus-v1.jsonl

Risk: low (build fix only, no runtime behavior change)
2026-03-11 04:23:21 -04:00
Jiecheng Wu
79cff39aa5 fix(docker): enable BuildKit and write config.toml via printf
- Dockerfile: replace heredoc with printf for config.toml to avoid
  Docker parse error (unknown instruction WORKSPACE_DIR)
- install.sh: set DOCKER_BUILDKIT=1 when building image so RUN --mount works
2026-03-11 04:21:44 -04:00
曾文锋0668000834
e94dafbbfb chore(examples): add example configuration 2026-03-11 04:18:55 -04:00
曾文锋0668000834
99460c3fff chore(editor): add .editorconfig 2026-03-11 04:18:41 -04:00
曾文锋0668000834
8e3775362e chore(git): add .gitattributes for line endings 2026-03-11 04:18:31 -04:00
曾文锋0668000834
1ef1b5d02d docs(changelog): add changelog 2026-03-11 04:18:16 -04:00
Kunal Karmakar
e2a3907d35 Merge branch 'master' of https://github.com/kunalk16/zeroclaw into fix-honor-default-temperature-agent-command 2026-03-11 08:08:22 +00:00
guan
980ad7ebc9 fix(docs): correct broken path references in contributing docs
Several files in docs/contributing/ referenced other docs files using
incorrect paths like "docs/pr-workflow.md" when the files are actually
in the same directory (docs/contributing/).

Changes:
- docs/contributing/README.md: fix Suggested Reading Order paths
- docs/contributing/pr-workflow.md: fix link text references
- docs/contributing/reviewer-playbook.md: fix link text reference
- docs/vi/contributing/README.md: fix Vietnamese translation paths
- docs/i18n/vi/contributing/README.md: fix Vietnamese i18n paths
- docs/setup-guides/zai-glm-setup.md: fix custom-providers.md link

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 03:31:14 -04:00
Rick Morgans
da0a592163 fix(imessage): parse attributedBody when text column is NULL
Modern macOS (Ventura+) stores iMessage content in the attributedBody
column as a binary typedstream blob rather than the text column. The
existing SQL filter `AND m.text IS NOT NULL` silently dropped all
incoming messages on affected systems.

Add a length-prefix extractor for the typedstream format and fall back
to attributedBody when text is NULL or empty. Includes real captured
blob fixtures and 14 new parser/integration tests.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-11 03:25:05 -04:00
Kunal Karmakar
a6cd877ac8 Merge branch 'master' of https://github.com/kunalk16/zeroclaw into fix-honor-default-temperature-agent-command 2026-03-11 02:13:12 +00:00
SimianAstronaut7
138ff4249c
Merge pull request #3147 from zeroclaw-labs/simianastronaut7/claude-skill-for-general-use
feat(skills): add zeroclaw operational skill for CLI and REST API usage
2026-03-11 00:26:04 +00:00
SimianAstronaut7
ff2da533a5
Merge branch 'master' into simianastronaut7/claude-skill-for-general-use 2026-03-10 23:49:34 +00:00
SimianAstronaut7
86ca1ba2b6
Merge pull request #3142 from zeroclaw-labs/simianastronaut7/api-key-preflight
feat(provider): add API key prefix pre-flight validation
2026-03-10 23:49:21 +00:00
SimianAstronaut7
f16a71826c
Merge branch 'master' into simianastronaut7/api-key-preflight 2026-03-10 23:41:23 +00:00
simianastronaut
0c5b41b288 fix(tests): remove nonexistent examples dir from regression scan
The `examples/` directory no longer exists, causing
`source_does_not_use_legacy_reply_to_field` to panic in CI.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 19:33:36 -04:00
simianastronaut
f127ca8c93 feat(skills): add zeroclaw operational skill for CLI and REST API usage
Adds a Claude Code skill that helps users operate their ZeroClaw instance
through both the CLI and gateway API. The skill provides adaptive expertise
(adjusts detail level based on user signals), discovery logic (finds the
binary, checks gateway status), and covers all core operations: agent chat,
memory, cron, providers, config, SSE events, estop, and troubleshooting.

Also adds gitignore entry for skill eval workspaces.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 19:05:30 -04:00
simianastronaut
2e86d83ed7 feat(api): add API key prefix validation to prevent cross-provider mismatches
Introduce `check_api_key_prefix` to validate API key prefixes against their associated providers, catching mismatches early. Unit tests cover known and unknown key formats. This improves error handling and user guidance when a key for the wrong provider is supplied.
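
A sketch of the prefix check; the provider-to-prefix table is illustrative, not the shipped list:

```rust
fn check_api_key_prefix(provider: &str, key: &str) -> Result<(), String> {
    let expected = match provider {
        "openai" => {
            // Anthropic keys also start with sk-, so exclude them explicitly.
            if key.starts_with("sk-ant-") {
                return Err("looks like an Anthropic key, not an OpenAI key".to_string());
            }
            "sk-"
        }
        "anthropic" => "sk-ant-",
        _ => return Ok(()), // unknown formats pass through unchecked
    };
    if key.starts_with(expected) {
        Ok(())
    } else {
        Err(format!("API key does not match the expected {provider} prefix ({expected}...)"))
    }
}
```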
2026-03-10 18:00:38 -04:00
SimianAstronaut7
9b9b85217c
Merge pull request #3140 from zeroclaw-labs/simianastronaut7/readme-contributors-grid
docs: add dynamic contributor badge and contributors grid to README
2026-03-10 20:49:17 +00:00
simianastronaut
0e006b91e4 docs: update contributor badge in README and add contributors section 2026-03-10 16:39:02 -04:00
SimianAstronaut7
d93d7dab84
Merge pull request #3139 from zeroclaw-labs/simianastronaut7/build-fix
fix(build): ensure web/dist directory exists at compile time
2026-03-10 19:38:50 +00:00
simianastronaut
ee7048502c fix(build): ensure web/dist directory exists at compile time
Add build.rs script that creates web/dist/ if missing, preventing
build failures when the web dashboard hasn't been pre-built.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 15:35:28 -04:00
SimianAstronaut7
2c203571d0
Merge pull request #3138 from zeroclaw-labs/mikeboensel/examples-refactored
docs: consolidate extension examples into contributing guide
2026-03-10 19:15:39 +00:00
simianastronaut
9781f07a98 docs: add extension examples and update references in contributing materials
- Introduced a new document, `extension-examples.md`, providing code examples for implementing custom extensions in ZeroClaw.
- Updated `SUMMARY.md` and `README.md` to include links to the new examples.
- Modified `change-playbooks.md` to reference the new examples for better guidance on extension implementation.
- Removed outdated example files for custom channels, memory, providers, and tools to streamline documentation.
2026-03-10 15:11:28 -04:00
Kunal Karmakar
4600469717 Merge branch 'fix-honor-default-temperature-agent-command' of https://github.com/kunalk16/zeroclaw into fix-honor-default-temperature-agent-command 2026-03-10 08:29:56 +00:00
Kunal Karmakar
b2857cb836 Fix clippy linting 2026-03-10 08:29:45 +00:00
Kunal Karmakar
76d54193f1 Merge branch 'master' of https://github.com/kunalk16/zeroclaw into fix-honor-default-temperature-agent-command 2026-03-10 07:19:03 +00:00
SimianAstronaut7
046040d535
Merge pull request #3099 from zeroclaw-labs/cicd-best-practices
chore(ci,docs): SHA-pin Actions, scope permissions, add gate job, and localized READMEs
2026-03-10 02:47:59 -04:00
Simian Astronaut 7
283385624e chore(ci): rename gate job for clarity
Updated the name of the CI gate job from "Gate" to "CI Required Gate" to enhance clarity and better reflect its purpose in the workflow.
2026-03-10 02:47:07 -04:00
Simian Astronaut 7
f73f39d33d fix(ci): replace taiki-e/install-action with cargo install
The taiki-e/install-action is likely not on the org/repo Actions
allowlist, causing startup_failure for the entire workflow. Revert
to cargo install for cargo-audit and cargo-deny.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 02:37:22 -04:00
Simian Astronaut 7
d07d314fd0 chore: retrigger CI after startup_failure 2026-03-10 02:34:44 -04:00
Simian Astronaut 7
f7b057a743 Merge remote-tracking branch 'origin/master' into cicd-best-practices 2026-03-10 02:24:31 -04:00
SimianAstronaut7
276e3f67ca
Merge pull request #3087 from zeroclaw-labs/docs/readme-update
docs: add localized README files for all 31 supported languages
2026-03-10 02:20:27 -04:00
SimianAstronaut7
0a9acf8150
Merge branch 'master' into docs/readme-update 2026-03-10 02:19:27 -04:00
SimianAstronaut7
535995f2ab
Merge pull request #3097 from zeroclaw-labs/fix/release-web-build-step
fix(ci): add web dashboard build step to release workflows
2026-03-10 02:18:35 -04:00
Simian Astronaut 7
863d731b92 fix(docs): address CodeRabbit review feedback on localized READMEs
- README.ar.md: translate hero performance callout from English to Arabic
- README.nl.md: fix "Implantatie" → "Imitatie" in anti-impersonation heading
- README.tr.md: fix ambiguous security wording for native runtime sandbox
- README.uk.md: update "проект" → "проєкт" (modern Ukrainian orthography)
- README.fr/vi/ja/ru/zh-CN.md: update 6-language selector to full 31-language
  selector matching README.md for navigation parity
- README.de.md: add note clarifying docs links route to English sources
- README.el.md: add note marking this as a condensed translation with pointer
  to full English README

Addresses review comments from PR #3087.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 02:16:37 -04:00
Simian Astronaut 7
f19d4951b6 refactor(ci): extract web build into shared job to avoid 4x redundancy
Instead of each build matrix job independently installing Node.js and
running npm ci/build, extract web dashboard build into a single job
that uploads web/dist/ as an artifact. Build jobs download it before
cargo build. Reduces total CI time by ~3 Node installs + builds.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 02:09:30 -04:00
Simian Astronaut 7
7399894cc8 fix(ci): add web dashboard build step to release workflows
RustEmbed requires web/dist/ to exist at compile time. The PR checks
workflow used a placeholder mkdir, but release workflows need real
built assets since they produce the distributed binary. Add Node.js
setup and npm ci/build before cargo build in all three release/build
workflows.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 02:03:15 -04:00
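A sketch of the added step, assuming the dashboard is a standard npm project under web/ with a build script (the real workflow steps may differ):

```bash
# Build real dashboard assets before the Rust build so RustEmbed embeds
# the production bundle rather than an empty placeholder.
cd web
npm ci          # exact install from package-lock.json
npm run build   # emits web/dist/
cd ..
cargo build --release
```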
SimianAstronaut7
6e7c13a864
Merge pull request #3096 from zeroclaw-labs/project-cleanup
chore: project cleanup — restructure docs, rename firmware, update CI workflows

- Base branch target: `master` (all contributions target `master`)
- Problem: Repository structure had grown organically — docs were flat, firmware directories had redundant prefixes, CI workflow names were unclear, and stale build artifacts (`web/dist`) were tracked.
- Why it matters: Cleaner repo structure improves contributor onboarding, reduces confusion, and establishes consistent naming conventions across the project.
- What changed: Restructured `docs/` into topic-based subdirectories (setup-guides, reference, ops, security, hardware, contributing, maintainers); renamed firmware crate directories to drop `zeroclaw-` prefix; renamed CI workflow files for clarity; migrated GitHub URLs from `theonlyhennygod` to `zeroclaw-labs`; added `.vscode` config; added Claude skills (github-issue, github-pr, skill-creator); slimmed down AGENTS.md/CLAUDE.md; unified install scripts; removed tracked `web/dist`; relocated test scripts.
- What did **not** change (scope boundary): No runtime behavior, no Rust `src/` logic changes beyond path/URL reference updates in peripherals and providers.
2026-03-10 01:54:23 -04:00
Simian Astronaut 7
7ef9d8a7b5 Addressed clippy lint issues 2026-03-10 01:48:19 -04:00
Simian Astronaut 7
8d48472c9e fix(ci): add gate job and use pre-built security tools
Add composite Gate job to checks-on-pr.yml so branch protection
only needs a single required check. Replace `cargo install` with
taiki-e/install-action for cargo-audit and cargo-deny to cut
minutes off every PR run. Mark CI/CD P1/P2 findings as resolved
in refactor-candidates.md.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 01:10:19 -04:00
Simian Astronaut 7
d1fffc3b74 fix(ci): scope release workflow permissions per-job
Narrow workflow-level permissions to contents:read and grant
write access only to the specific jobs that need it (publish
gets contents:write, docker gets packages:write). Reduces blast
radius if a build step is compromised (P1 finding).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 01:09:51 -04:00
Simian Astronaut 7
8538c4105a fix(ci): SHA-pin all third-party GitHub Actions
Replace mutable version tags with immutable commit SHAs to prevent
tag-hijacking supply chain attacks (P1 finding).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 01:09:32 -04:00
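One way to resolve a mutable tag to the immutable commit SHA before pinning; actions/checkout here is only an illustration:

```bash
# Print the SHA the v4 tag points at. For annotated tags, list all tags
# with `git ls-remote --tags` and use the peeled "v4^{}" entry, which is
# the commit itself.
git ls-remote https://github.com/actions/checkout refs/tags/v4
# Then pin in the workflow:  uses: actions/checkout@<sha>  # v4
```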
Simian Astronaut 7
08d6959e0d security: Bump quinn-proto 0.11.13 → 0.11.14 to resolve a high-severity (8.7) denial-of-service vulnerability in Quinn endpoints. Lockfile-only change; no code modifications.
2026-03-10 00:55:30 -04:00
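The lockfile-only bump comes down to a single cargo command, verified with cargo-audit:

```bash
# Rewrite Cargo.lock to the patched release without touching any source.
cargo update -p quinn-proto --precise 0.11.14
# Confirm the RUSTSEC advisory no longer fires (requires cargo-audit).
cargo audit
```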
Simian Astronaut 7
ad8e1e65e0 fix(ci): resolve lint and build failures in PR checks
Apply cargo fmt to fix formatting diffs in openrouter.rs and serial.rs.
Add web/dist placeholder step to lint, test, and build jobs so
RustEmbed compiles without the gitignored frontend assets.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-10 00:36:24 -04:00
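The placeholder step is presumably no more than the following sketch:

```bash
# RustEmbed only needs web/dist/ to exist at compile time; lint and test
# jobs never ship the binary, so an empty directory suffices.
mkdir -p web/dist
```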
Simian Astronaut 7
c204d72cc4 TODOs 2026-03-10 00:19:18 -04:00
Simian Astronaut 7
852b67fff3 chore: add VSCode configuration files for project
- Introduced extensions.json to recommend essential Rust-related extensions.
- Added launch.json for debugging configurations with LLDB for various components.
- Created settings.json to configure editor preferences and Rust analyzer settings.
- Included tasks.json to define build, lint, test, and CI tasks for the project.
2026-03-10 00:19:12 -04:00
Simian Astronaut 7
df4e11bd1d ci(workflows): rename workflow files and add lint + security jobs
- Rename all 4 workflow files to match trigger and purpose
- Expand PR quality gate with dedicated lint and security audit jobs
- Align workflow display names, concurrency groups, and all doc references
2026-03-10 00:17:54 -04:00
Simian Astronaut 7
991a1ea9cc ignore coverage report 2026-03-09 23:51:44 -04:00
Simian Astronaut 7
97ebfe0549 minor typo 2026-03-09 23:51:26 -04:00
Simian Astronaut 7
ea61491f2a chore: migrate GitHub URLs from theonlyhennygod to zeroclaw-labs
- Replace all occurrences of theonlyhennygod/zeroclaw with zeroclaw-labs/zeroclaw
- Affects root metadata, source code headers, hardware setup docs, and crate docs
2026-03-09 23:49:56 -04:00
Simian Astronaut 7
1a78dcf447 refactor: update firmware path references to match directory renames
- Align src/peripherals/ and docs/hardware/ with the firmware directory renames
- Covers both compiled references (include_str!, constants) and documentation
2026-03-09 23:49:20 -04:00
Simian Astronaut 7
620e4adee5 refactor(firmware): update crate names and paths to match directory renames
- Align Cargo.toml package names and Cargo.lock with directory renames from previous commit
- Update build and flash instructions in firmware docs
2026-03-09 23:47:45 -04:00
Simian Astronaut 7
631e7f61b5 refactor(firmware): rename directories to drop zeroclaw- prefix
- zeroclaw-arduino → arduino
- zeroclaw-esp32 → esp32
- zeroclaw-esp32-ui → esp32-ui
- zeroclaw-nucleo → nucleo
- zeroclaw-uno-q-bridge → uno-q-bridge

Pure file moves — no content changes. Follow-up commits update
internal references (Cargo.toml names, include_str! paths, docs).
2026-03-09 23:46:17 -04:00
Simian Astronaut 7
36bc6c4aee refactor: rework AGENTS.md according to best practices (slim, light, project-specific, avoids things the models already know) 2026-03-09 23:16:48 -04:00
Simian Astronaut 7
67b0ff0ee9 docs: update all internal links to match topic-based directory layout
- Bulk-fix internal links broken by docs/ reorganization into
  topic-based subdirectories
- Cover root READMEs (en/fr/ja/ru/vi/zh-CN), docs hubs, SUMMARY
  TOCs, and all subdirectory docs
- Fix ancillary issues: GitHub URL org name, LICENSE path, Vietnamese
  shim depths, installer rename references

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 23:09:09 -04:00
Simian Astronaut 7
c99a046b1f Merge remote-tracking branch 'origin/master' into project-cleanup 2026-03-09 22:40:46 -04:00
SimianAstronaut7
d17e9586ae
Merge pull request #2976 from zeroclaw-labs/cicd-improvements
chore: update .gitignore, CODEOWNERS, and dependabot configuration, plus skills for standardizing GitHub work
2026-03-09 22:38:02 -04:00
Simian Astronaut 7
9f2655dc4d refactor:
- Move `recompute_contributor_tiers.sh` to automate labeling contributors by their merged PR counts.
- Remove web/dist from git tracking (it was committed by mistake)
2026-03-09 22:33:59 -04:00
Simian Astronaut 7
0b9a975da5 docs: restructure docs/ into topic-based directory layout
- Move ~60 flat files from docs/ root into category subdirs:
  security/, hardware/, ops/, reference/{api,cli,sop},
  setup-guides/, contributing/, maintainers/, assets/
- Absorb and remove datasheets/, operations/, getting-started/,
  project/, structure/, sop/ directories
- Move testing-telegram.md to tests/telegram/ alongside test scripts
- Merge FutureRefactorings.md into refactor-candidates.md
2026-03-09 22:25:20 -04:00
Simian Astronaut 7
507c93fcf5 chore(tests): relocate test scripts and testing docs
Move RUN_TESTS.md and TESTING_TELEGRAM.md to docs/contributing/, and move
quick_test.sh, test_telegram_integration.sh, generate_test_messages.py to
tests/telegram/. Move scripts/test_dockerignore.sh to tests/. Update internal
cross-references to reflect new paths.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 21:44:20 -04:00
Simian Astronaut 7
a6e5a6ffda chore(docs): relocate assets and project docs to docs/ subdirectories
Move zeroclaw.png and zero-claw.jpeg to docs/assets/, CLA.md to
docs/contributing/cla.md, and TRADEMARK.md to docs/project/trademark.md.
Update all cross-references in root README files (en, fr, ja, ru, vi, zh-CN)
to point to new locations.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 21:44:08 -04:00
SimianAstronaut7
cf21cf093a
Merge branch 'master' into cicd-improvements 2026-03-09 21:35:24 -04:00
Simian Astronaut 7
b454a9d301 Removed unused file 2026-03-09 20:33:34 -04:00
argenis de la rosa
c8df83dd17 docs: add localized README files for all 31 supported languages
This commit adds complete README translations for all 31 languages supported
by the ZeroClaw web dashboard, enabling users to access documentation in
their native language.

New README files added:
- README.ko.md (Korean - 한국어)
- README.tl.md (Tagalog)
- README.es.md (Spanish - Español)
- README.pt.md (Portuguese - Português)
- README.it.md (Italian - Italiano)
- README.de.md (German - Deutsch)
- README.ar.md (Arabic - العربية) [RTL]
- README.hi.md (Hindi - हिन्दी)
- README.bn.md (Bengali - বাংলা)
- README.he.md (Hebrew - עברית) [RTL]
- README.pl.md (Polish - Polski)
- README.cs.md (Czech - Čeština)
- README.nl.md (Dutch - Nederlands)
- README.tr.md (Turkish - Türkçe)
- README.uk.md (Ukrainian - Українська)
- README.id.md (Indonesian - Bahasa Indonesia)
- README.th.md (Thai - ไทย)
- README.ur.md (Urdu - اردو) [RTL]
- README.ro.md (Romanian - Română)
- README.sv.md (Swedish - Svenska)
- README.el.md (Greek - Ελληνικά)
- README.hu.md (Hungarian - Magyar)
- README.fi.md (Finnish - Suomi)
- README.da.md (Danish - Dansk)
- README.nb.md (Norwegian Bokmål - Norsk)

Updated README.md with complete language selector links for all 31 languages.
RTL support added for Arabic, Hebrew, and Urdu.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-09 18:01:00 -04:00
Kunal Karmakar
225edecc3d Reuse default temperature 2026-03-09 14:54:29 +00:00
Kunal Karmakar
465b2ea6af Fix temperature range 2026-03-09 14:31:47 +00:00
Kunal Karmakar
e70ca52dc4 Update wording as per review comment 2026-03-09 14:07:13 +00:00
Kunal Karmakar
45d2368730 Default of 0.7 2026-03-09 13:52:33 +00:00
Kunal Karmakar
026f2609f9 Support config without default_temperature 2026-03-09 13:28:08 +00:00
Kunal Karmakar
a6cf7d4015 Honor default_temperature from config.toml for zeroclaw agent calls 2026-03-09 10:31:43 +00:00
Giulio V
48c7414bb3 feat(tts): add multi-provider TTS system (OpenAI, ElevenLabs, Google, Edge TTS)
Adds pluggable Text-to-Speech subsystem with TtsProvider trait,
TtsManager for provider selection, and per-provider config structs.
Includes secret encryption for TTS API keys.
2026-03-09 09:03:44 +01:00
SimianAstronaut7
f7fefd4b6c
Merge pull request #2954 from antonvice/fix/unused-imports-and-peripherals
fix: resolve unused import warnings
2026-03-07 22:33:12 -05:00
SimianAstronaut7
2d359c1f74
Merge branch 'master' into fix/unused-imports-and-peripherals 2026-03-07 22:19:17 -05:00
simianastronaut
8439b40145 Returning file to prior state 2026-03-07 22:05:49 -05:00
simianastronaut
57065b07a3 PR creation/update skill 2026-03-07 21:54:16 -05:00
simianastronaut
a7e295c966 Skill for making PRs in the proper format 2026-03-07 21:48:58 -05:00
simianastronaut
d4caba0967 Removing extraneous required fields from Github PRs 2026-03-07 21:39:31 -05:00
simianastronaut
8c0375a9ba Skill Creator (from anthropic) 2026-03-07 21:39:10 -05:00
simianastronaut
fa2faf408d chore: update .gitignore, CODEOWNERS, and dependabot configuration
- Add .zeroclaw/* to .gitignore to exclude ZeroClaw files from version control.
- Update CODEOWNERS to include @SimianAstronaut7 as a maintainer alongside @jordanthejet.
- Change dependabot target branch from dev to master for all update configurations.
- Revise master-branch-flow documentation to clarify active workflows and triggers.
2026-03-07 21:05:23 -05:00
Argenis
a6102f8dd6
Merge pull request #2928 from zeroclaw-labs/chore/master-branch-model
chore: migrate to single master branch model and update maintainers
2026-03-07 11:03:16 -05:00
jordanthejet
b4d619dd2b fix: remove deprecated macos-13 x86_64-apple-darwin target from release pipelines
The macos-13 runner is deprecated by GitHub Actions, causing the
x86_64-apple-darwin build to instantly cancel with no runner assigned.
This cascades to skip publish and docker jobs since they depend on all
matrix builds succeeding.

Intel Macs have been EOL since 2022; aarch64-apple-darwin via macos-14
covers all current macOS users (Rosetta handles x86_64 if needed).

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-07 10:40:46 -05:00
Anton Vice
e098c6d242 fix: resolve unused import warnings in channels and peripherals 2026-03-07 09:34:46 -05:00
jordanthejet
5dfe3372f5 fix: sync Vietnamese release-process with canonical English version
- Add missing Homebrew Core formula section (step 6)
- Add pub-homebrew-core.yml to workflow contract listing
- Update GHCR tag verification to include SHA tag detail

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 18:18:56 -05:00
jordanthejet
ce6349741b fix: sync Vietnamese ci-map with canonical English version
- Add missing Rust-gate parity line to CI merge-blocking section
- Add Sec Vorpal Reviewdog and Pub Homebrew Core workflow entries
- Add Pub Homebrew Core and Sec Vorpal Reviewdog to trigger map
- Update Dependabot trigger wording to match English (PRs target master)
- Add Homebrew formula triage entry and fix numbering

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 18:13:40 -05:00
jordanthejet
e3612880f3 fix: address remaining CodeRabbit review comments on PR #2928
- Fix Docker trigger semantics in Vietnamese ci-map docs to match
  canonical English wording (publish on tag push v*, smoke on master PRs)
- Add missing Rust-gate parity bullet to Vietnamese pr-workflow docs

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 18:07:45 -05:00
jordanthejet
73b862bb1f fix: address round-2 review comments on PR #2928
- docs/pr-workflow.md: replace hardcoded maintainer handles with
  generic WORKFLOW_OWNER_LOGINS + CODEOWNERS reference
- docs/vi/pr-workflow.md, docs/i18n/vi/pr-workflow.md: fix awkward
  "và hoặc là" phrasing, sync new branch-model bullets (workflow-owner
  config line + all PRs target master directly)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 17:20:44 -05:00
jordanthejet
31c027ed6d fix: replace remaining origin/main and main refs in release-process docs
- docs/release-process.md: 3 origin/main occurrences
- docs/vi/release-process.md: 3 origin/main + 6 `main` occurrences
- docs/i18n/vi/release-process.md: 3 origin/main + 6 `main` occurrences

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 16:53:03 -05:00
jordanthejet
571091ecef fix: resolve remaining main/dev references flagged in PR review
- docs/release-process.md: replace all `main` with `master` (lines 10,
  37, 45, 47, 54, 64, 73)
- docs/pr-workflow.md: fix "behind main" on line 219 to "behind master"
- docs/vi/pr-workflow.md: fix lines 216 and 273 (main -> master)
- docs/i18n/vi/pr-workflow.md: same fixes for canonical vi locale
- scripts/bootstrap.sh: fix raw URL from /main/ to /master/ (line 61),
  pin git clone to --branch master (line 909)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 16:37:36 -05:00
jordanthejet
44ac470d78 chore: migrate to single master branch model and update maintainers
- Replace all dev/main branch references with master across docs,
  templates, CI docs, and localized files (en, vi)
- Remove dev->main promotion model (no more Main Promotion Gate)
- Rename main-branch-flow.md to master-branch-flow.md and rewrite
  for single-branch workflow
- Update maintainers to theonlyhennygod and jordanthejet
- Update CODEOWNERS: replace @chumyin with @jordanthejet
- Update WORKFLOW_OWNER_LOGINS fallback references
- Update CODE_OF_CONDUCT enforcement contact to @argenistherose

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-06 13:01:32 -05:00
JordanTheJet
92e0f7aefd
Merge pull request #2895 from zeroclaw-labs/sundai
ci: replace all workflows with simplified CI/CD pipeline
2026-03-05 22:52:53 -05:00
jordanthejet
db536935bf fix: update coderabbit config for master branch and remove unsupported fields
- Changed base_branches from main/dev to master
- Removed unsupported base_branch_analysis field

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 22:36:24 -05:00
JordanTheJet
d64be99621
Update .github/workflows/ci-full.yml
macOS runner version update

Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
2026-03-05 22:35:07 -05:00
jordanthejet
f981d9ea69 fix: address CodeRabbit review comments
- Add concurrency group to promote-release workflow
- Fix markdown emphasis style in README (MD049)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 22:26:29 -05:00
jordanthejet
8230b26171 fix(ci): remove sccache, keep mold + nextest + no-incremental
sccache GHA cache backend is fragile — fails the entire build when
GitHub's artifact cache service is unavailable. Removed in favor of
Swatinem/rust-cache which handles failures gracefully.

Kept: mold linker, cargo-nextest, CARGO_INCREMENTAL=0.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:53:24 -05:00
jordanthejet
4f62fb2ecb ci: retrigger after allowlist update 2026-03-05 21:51:11 -05:00
jordanthejet
fee163fc74 perf(ci): add sccache, mold linker, and cargo-nextest for faster CI
- sccache: compiler caching for test builds (11-14% faster compilation)
- mold: faster linker on Linux builds
- cargo-nextest: parallel test runner (up to 35-60% faster tests)
- CARGO_INCREMENTAL=0: disable incremental compilation overhead in CI

Allowlist impact: added mozilla-actions/sccache-action@*

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:50:07 -05:00
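A sketch of the nextest portion of the change (exact flags are assumptions, not the workflow's literal steps):

```bash
# cargo-nextest is a drop-in replacement for `cargo test` that runs each
# test in its own process, in parallel.
cargo install cargo-nextest --locked
cargo nextest run --workspace
```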
jordanthejet
32e8dbbec5 feat(ci): split CI into auto (linux+macOS) and manual full matrix
Auto CI on PRs builds linux x86_64 and macOS arm64 only.
Remaining targets (linux arm64, macOS x86, Windows) available via
manual workflow_dispatch in ci-full.yml.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:43:33 -05:00
jordanthejet
6eef5bafcb feat(ci): full matrix build in CI + fix flaky bedrock test
- CI now builds across all 5 targets (linux x86/arm64, macOS x86/arm64,
  Windows) matching the release matrix
- Fix chat_fails_without_credentials test to accept "builder error"
  which occurs in CI environments without native TLS

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:40:45 -05:00
jordanthejet
53c1a3ecea fix(ci): target master branch instead of main
The default branch is master, not main. Updates CI and Beta Release
workflow triggers and corresponding docs references.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:34:33 -05:00
jordanthejet
db917bc37b docs: update actions source policy for simplified workflow system
Reflects the complete workflow overhaul from 22 workflows to 3.
Updates allowlist to match current action usage and removes stale entries.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:24:42 -05:00
jordanthejet
47ea46e694 chore: restore .coderabbit.yaml config
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:18:38 -05:00
jordanthejet
9923544769 ci: replace all workflows with simplified CI/CD pipeline
Remove 22 workflow files and 9 JS scripts. Replace with 3 workflows:
- ci.yml: test + build on PRs
- release.yml: auto beta release on merge to main
- promote-release.yml: manual stable release promotion

Update README Development section to document the new CI/CD system.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 21:15:36 -05:00
1600 changed files with 203253 additions and 212103 deletions


@ -1,19 +0,0 @@
{
"arch": "arm",
"crt-static-defaults": true,
"data-layout": "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64",
"emit-debug-gdb-scripts": false,
"env": "musl",
"executables": true,
"is-builtin": false,
"linker": "arm-linux-gnueabihf-gcc",
"linker-flavor": "gcc",
"llvm-target": "armv6-unknown-linux-musleabihf",
"max-atomic-width": 32,
"os": "linux",
"panic-strategy": "unwind",
"relocation-model": "static",
"target-endian": "little",
"target-pointer-width": "32",
"vendor": "unknown"
}

.cargo/audit.toml Normal file

@ -0,0 +1,12 @@
# cargo-audit configuration
# https://rustsec.org/
[advisories]
ignore = [
# wasmtime vulns via extism 1.13.0 — no upstream fix; plugins feature-gated
"RUSTSEC-2026-0006", # wasmtime f64.copysign segfault on x86-64
"RUSTSEC-2026-0020", # WASI guest-controlled resource exhaustion
"RUSTSEC-2026-0021", # WASI http fields panic
# instant crate unmaintained — transitive dep via nostr; no upstream fix
"RUSTSEC-2024-0384",
]


@ -1,33 +1,13 @@
-# macOS targets — pin minimum OS version so binaries run on supported releases.
-# Intel (x86_64): target macOS 10.15 Catalina and later.
-# Apple Silicon (aarch64): target macOS 11.0 Big Sur and later (no Catalina hardware exists).
-[target.x86_64-apple-darwin]
-rustflags = ["-C", "link-arg=-mmacosx-version-min=10.15"]
-[target.aarch64-apple-darwin]
-rustflags = ["-C", "link-arg=-mmacosx-version-min=11.0"]
 [target.x86_64-unknown-linux-musl]
 rustflags = ["-C", "link-arg=-static"]
 [target.aarch64-unknown-linux-musl]
-rustflags = ["-C", "link-arg=-static"]
+rustflags = ["-C", "link-arg=-static", "-C", "link-arg=-Wl,-z,stack-size=8388608"]
 # ARMv6 musl (Raspberry Pi Zero W)
 [target.armv6l-unknown-linux-musleabihf]
 rustflags = ["-C", "link-arg=-static"]
-# Android targets (Termux-native defaults).
-# CI/NDK cross builds can override these via CARGO_TARGET_*_LINKER.
+# Android targets (NDK toolchain)
 [target.armv7-linux-androideabi]
-linker = "clang"
+linker = "armv7a-linux-androideabi21-clang"
 [target.aarch64-linux-android]
-linker = "clang"
-# Windows targets — increase stack size for large JsonSchema derives
-[target.x86_64-pc-windows-msvc]
-rustflags = ["-C", "link-args=/STACK:8388608"]
-[target.aarch64-pc-windows-msvc]
-rustflags = ["-C", "link-args=/STACK:8388608"]
+linker = "aarch64-linux-android21-clang"
+rustflags = ["-C", "link-arg=-Wl,-z,stack-size=8388608"]


@ -0,0 +1,133 @@
# Skill: github-issue
File a structured GitHub issue (bug report or feature request) for ZeroClaw interactively from Claude Code.
## When to Use
Trigger when the user wants to file a GitHub issue, report a bug, or request a feature for ZeroClaw. Keywords: "file issue", "report bug", "feature request", "open issue", "create issue", "github issue".
## Instructions
You are filing a GitHub issue against the ZeroClaw repository using structured issue forms. Follow this workflow exactly.
### Step 1: Detect Issue Type and Read the Template
Determine from the user's message whether this is a **bug report** or **feature request**.
- If unclear, use AskUserQuestion to ask: "Is this a bug report or a feature request?"
Then read the corresponding issue template to understand the required fields:
- Bug report: `.github/ISSUE_TEMPLATE/bug_report.yml`
- Feature request: `.github/ISSUE_TEMPLATE/feature_request.yml`
Parse the YAML to extract:
- The `title` prefix (e.g. `[Bug]: `, `[Feature]: `)
- The `labels` array
- Each field in the `body` array: its `type` (dropdown, textarea, input, checkboxes, markdown), `id`, `attributes.label`, `attributes.options` (for dropdowns), `attributes.description`, `attributes.placeholder`, and `validations.required`
This is the source of truth for what fields exist, what they're called, what options are available, and which are required. Do not assume or hardcode any field names or options — always derive them from the template file.
### Step 2: Auto-Gather Context
Before asking the user anything, silently gather environment and repo context:
```bash
# Git context
git log --oneline -5
git status --short
git diff --stat HEAD~1 2>/dev/null
# For bug reports — environment detection
uname -s -r -m # OS info
sw_vers 2>/dev/null # macOS version
rustc --version 2>/dev/null # Rust version
cargo metadata --format-version=1 --no-deps 2>/dev/null | jq -r '.packages[] | select(.name=="zeroclaw") | .version' 2>/dev/null # ZeroClaw version
git rev-parse --short HEAD # commit SHA fallback
```
Also read recently changed files to infer the affected component and architecture impact.
### Step 3: Pre-Fill and Present the Form
Using the parsed template fields and gathered context, draft values for ALL fields from the template:
- **dropdown** fields: select the most likely option from `attributes.options` based on context. For dropdowns where you're uncertain, note your best guess and flag it for the user.
- **textarea** fields: draft content based on the user's description, git context, and the field's `attributes.description`/`attributes.placeholder` for guidance on what's expected.
- **input** fields: fill with auto-detected values (versions, OS) or draft from user context.
- **checkboxes** fields: auto-check all items (the skill itself ensures compliance with the stated checks).
- **markdown** fields: skip these — they're informational headers, not form inputs.
- **optional fields** (where `validations.required` is false): fill if there's enough context, otherwise note "(optional — not enough context to fill)".
Present the complete draft to the user in a clean readable format:
```
## Issue Draft: [Bug]: <title> / [Feature]: <title>
**Labels**: <from template>
### <Field Label>
<proposed value or selection>
### <Field Label>
<proposed value>
...
```
Use AskUserQuestion to ask the user to review:
- "Here's the pre-filled issue. Please review and let me know what to change, or say 'submit' to file it."
If the user requests changes, update the draft and re-present. Iterate until the user approves.
### Step 4: Scope Guard
Before final submission, analyze the collected content for scope creep:
- Does the bug report describe multiple independent defects?
- Does the feature request bundle unrelated changes?
If multi-concept issues are detected:
1. Inform the user: "This issue appears to cover multiple distinct topics. Focused, single-concept issues are strongly preferred and more likely to be accepted."
2. Break down the distinct groups found.
3. Offer to file separate issues for each group, reusing shared context (environment, etc.).
4. Let the user decide: proceed as-is or split.
### Step 5: Construct Issue Body
Build the issue body as markdown sections matching GitHub's form-field rendering format. GitHub renders form-submitted issues with `### <Field Label>` sections, so use that exact structure.
For each non-markdown field from the template, in order:
```markdown
### <attributes.label>
<value>
```
For optional fields with no content, use `_No response_` as the value (this matches GitHub's native rendering for empty optional fields).
For checkbox fields, render each option as:
```markdown
- [X] <option label text>
```
### Step 6: Final Preview and Submit
Show the final constructed issue (title + labels + full body) for one last confirmation.
Then submit using a HEREDOC for the body to preserve formatting:
```bash
gh issue create --title "<title prefix><user title>" --label "<label1>,<label2>" --body "$(cat <<'ISSUE_EOF'
<body content>
ISSUE_EOF
)"
```
Return the resulting issue URL to the user.
### Important Rules
- **Always read the template file** — never assume field names, options, or structure. The templates are the source of truth and may change over time.
- **Never include personal/sensitive data** in the issue. Redact secrets, tokens, emails, real names.
- **Use neutral project-scoped placeholders** per ZeroClaw's privacy contract.
- **One concept per issue** — enforce the scope guard.
- **Auto-detect, don't guess** — use real command output for environment fields.
- **Match GitHub's rendering** — use `### Field Label` sections so issues look consistent whether filed via web UI or this skill.


@ -0,0 +1,209 @@
# Skill: github-pr
Open or update a GitHub Pull Request for ZeroClaw. Handles creating new PRs with a fully filled-out template body, and updating existing PRs (title, body sections, labels, comments). Use this skill whenever the user wants to open a PR, create a pull request, update a PR, edit PR description, add labels to a PR, or sync a PR after new commits — even if they don't say "PR" explicitly (e.g., "submit this for review", "push and open for merge").
## Instructions
This skill supports two modes: **Open** (create a new PR) and **Update** (edit an existing PR). Detect the mode from context — if there's already an open PR for the current branch and the user didn't say "open a new PR", default to update mode.
The PR template at `.github/pull_request_template.md` is the source of truth for the PR body structure. Read it every time — never assume or hardcode section names, fields, or their order. The template may change over time and the skill should always reflect its current state.
---
## Shared: Read the PR Template
Before opening or updating a PR body, read `.github/pull_request_template.md` and parse it to understand:
- The `## ` section headers (these are the top-level sections of the PR body)
- The bullet points, fields, and prompts within each section
- Which sections are marked `(required)` vs optional/recommended
- Any inline formatting conventions (backtick options, Yes/No fields, etc.)
This parsed structure drives how you fill, present, and edit the PR body.
---
## Mode: Open a New PR
### Step 1: Gather Context
Collect information to pre-fill the PR body. Run these in parallel:
```bash
# Branch and commit context
git branch --show-current
git log master..HEAD --oneline
git diff master...HEAD --stat
# Check if branch is pushed
git rev-parse --abbrev-ref --symbolic-full-name @{u} 2>/dev/null
# Environment (for validation evidence)
rustc --version 2>/dev/null
```
Also review the changed files and commit messages to understand the nature of the change (bug fix, feature, refactor, docs, chore, etc.) and which subsystems are affected.
### Step 2: Pre-Fill the Template
Using the parsed template structure and gathered context, draft a complete PR body:
- For each `## ` section from the template, fill in the bullet points and fields based on context from the commits, diff, and changed files.
- Use the field descriptions and placeholder text in the template as guidance for what each field expects.
- For Yes/No fields, infer from the diff (e.g., if no files in `src/security/` changed, security impact is likely all No).
- For required sections, always provide a substantive answer. For optional sections, fill if there's enough context, otherwise leave the template prompts in place.
- Draft a conventional commit-style PR title based on the changes (e.g., `feat(provider): add retry budget override`, `fix(channel): handle disconnect gracefully`, `chore(ci): update workflow targets`).
### Step 3: Present Draft for Review
Show the user the complete draft:
```
## PR Draft: <title>
**Branch**: <head> -> master
**Labels**: <suggested labels>
<full body with all sections filled>
```
Ask the user to review: "Here's the pre-filled PR. Review and let me know what to change, or say 'submit' to open it."
Iterate on changes until the user approves.
### Step 4: Push and Create
1. If the branch isn't pushed yet, push it:
```bash
git push -u origin <branch>
```
2. Create the PR using a HEREDOC for the body:
```bash
gh pr create --title "<title>" --base master --body "$(cat <<'PR_BODY_EOF'
<full body>
PR_BODY_EOF
)"
```
3. If labels were agreed on, add them:
```bash
gh pr edit <number> --add-label "<label1>,<label2>"
```
4. Return the PR URL to the user.
---
## Mode: Update an Existing PR
### Step 1: Identify the PR
1. **If a PR number or URL is given**: use that directly.
2. **If on a branch with an open PR**: auto-detect:
```bash
gh pr view --json number,title,body,labels,state,author,url,headRefName 2>/dev/null
```
3. **If neither**: ask the user for the PR number.
Verify the current user is the PR author:
```bash
CURRENT_USER=$(gh api user --jq '.login')
PR_AUTHOR=$(gh pr view <number> --json author --jq '.author.login')
```
If not the author, stop and inform the user.
### Step 2: Fetch Current State
```bash
gh pr view <number> --json number,title,body,labels,state,baseRefName,headRefName,url,author,reviewDecision,statusCheckRollup,commits
```
Display a summary:
```
## PR #<number>: <title>
**State**: <open/closed/merged>
**Branch**: <head> -> <base>
**Labels**: <label list>
**Checks**: <pass/fail/pending>
**URL**: <url>
```
### Step 3: Determine What to Update
Support these operations:
| Operation | How |
|---|---|
| **Edit title** | `gh pr edit <number> --title "<new title>"` |
| **Edit full body** | `gh pr edit <number> --body "<new body>"` |
| **Add labels** | `gh pr edit <number> --add-label "<label1>,<label2>"` |
| **Remove labels** | `gh pr edit <number> --remove-label "<label1>"` |
| **Edit specific section** | Parse body by `## ` headers, modify target section, re-submit full body |
| **Add a comment** | `gh pr comment <number> --body "<comment>"` |
| **Link an issue** | Edit the linked-issue section in the body |
| **Smart update after new commits** | Re-analyze and suggest section updates |
### Step 4: Handle Body Section Edits
When editing a specific section:
1. Parse the current PR body into sections by `## ` headers
2. Match the user's request to the corresponding section from the template
3. Show the current content of that section and the proposed replacement
4. On confirmation, modify only that section, reconstruct the full body, and submit
### Step 5: Smart Update After New Commits
When the user wants to sync the PR description after pushing new changes:
1. Identify new commits:
```bash
gh pr view <number> --json commits --jq '.commits[].messageHeadline'
git log <base>..<head> --oneline
git diff <base>...<head> --stat
```
2. Re-read the PR template. Analyze which sections are now stale based on the new changes — use the template's section names and field descriptions to identify what needs updating rather than relying on hardcoded assumptions.
3. Present proposed updates section-by-section and confirm before applying.
### Step 6: Apply Updates
For title/label changes, use direct `gh pr edit` flags.
For body edits, use a HEREDOC:
```bash
gh pr edit <number> --body "$(cat <<'PR_BODY_EOF'
<full updated body>
PR_BODY_EOF
)"
```
For comments:
```bash
gh pr comment <number> --body "$(cat <<'COMMENT_EOF'
<comment text>
COMMENT_EOF
)"
```
### Step 7: Confirm
Fetch and display the updated state:
```bash
gh pr view <number> --json number,title,labels,url
```
Return the PR URL.
---
## Important Rules
- **Always read `.github/pull_request_template.md`** before filling or editing a PR body. Never assume section names, fields, or structure — derive everything from the template. It's the source of truth and may change.
- **For updates, only modify requested sections.** Preserve everything else exactly as-is.
- **Always show diffs before applying body edits.** Present current vs proposed for each changed section.
- **Never include personal/sensitive data** in PR content per ZeroClaw's privacy contract.
- **For label changes**, only use labels that exist in the repository. Check with `gh label list` if unsure.
- **Fetch the latest body before editing** to avoid clobbering concurrent changes.
- **For new PRs**, push the branch before creating (with `-u` to set upstream tracking).


@ -0,0 +1,202 @@
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright [yyyy] [name of copyright owner]
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.


@ -0,0 +1,485 @@
---
name: skill-creator
description: Create new skills, modify and improve existing skills, and measure skill performance. Use when users want to create a skill from scratch, edit, or optimize an existing skill, run evals to test a skill, benchmark skill performance with variance analysis, or optimize a skill's description for better triggering accuracy.
---
# Skill Creator
A skill for creating new skills and iteratively improving them.
At a high level, the process of creating a skill goes like this:
- Decide what you want the skill to do and roughly how it should do it
- Write a draft of the skill
- Create a few test prompts and run claude-with-access-to-the-skill on them
- Help the user evaluate the results both qualitatively and quantitatively
- While the runs happen in the background, draft some quantitative evals if there aren't any (if there are some, you can either use as is or modify if you feel something needs to change about them). Then explain them to the user (or if they already existed, explain the ones that already exist)
- Use the `eval-viewer/generate_review.py` script to show the user the results for them to look at, and also let them look at the quantitative metrics
- Rewrite the skill based on feedback from the user's evaluation of the results (and also if there are any glaring flaws that become apparent from the quantitative benchmarks)
- Repeat until you're satisfied
- Expand the test set and try again at larger scale
Your job when using this skill is to figure out where the user is in this process and then jump in and help them progress through these stages. So for instance, maybe they're like "I want to make a skill for X". You can help narrow down what they mean, write a draft, write the test cases, figure out how they want to evaluate, run all the prompts, and repeat.
On the other hand, maybe they already have a draft of the skill. In this case you can go straight to the eval/iterate part of the loop.
Of course, you should always be flexible and if the user is like "I don't need to run a bunch of evaluations, just vibe with me", you can do that instead.
Then after the skill is done (but again, the order is flexible), you can also run the skill description improver, which we have a whole separate script for, to optimize the triggering of the skill.
Cool? Cool.
## Communicating with the user
The skill creator is liable to be used by people across a wide range of familiarity with coding jargon. If you haven't heard (and how could you, it's only very recently that it started), there's a trend now where the power of Claude is inspiring plumbers to open up their terminals, parents and grandparents to google "how to install npm". On the other hand, the bulk of users are probably fairly computer-literate.
So please pay attention to context cues to understand how to phrase your communication! In the default case, just to give you some idea:
- "evaluation" and "benchmark" are borderline, but OK
- for "JSON" and "assertion" you want to see serious cues from the user that they know what those things are before using them without explaining them
It's OK to briefly explain terms if you're in doubt, and feel free to clarify terms with a short definition if you're unsure if the user will get it.
---
## Creating a skill
### Capture Intent
Start by understanding the user's intent. The current conversation might already contain a workflow the user wants to capture (e.g., they say "turn this into a skill"). If so, extract answers from the conversation history first — the tools used, the sequence of steps, corrections the user made, input/output formats observed. The user may need to fill the gaps, and should confirm before proceeding to the next step.
1. What should this skill enable Claude to do?
2. When should this skill trigger? (what user phrases/contexts)
3. What's the expected output format?
4. Should we set up test cases to verify the skill works? Skills with objectively verifiable outputs (file transforms, data extraction, code generation, fixed workflow steps) benefit from test cases. Skills with subjective outputs (writing style, art) often don't need them. Suggest the appropriate default based on the skill type, but let the user decide.
### Interview and Research
Proactively ask questions about edge cases, input/output formats, example files, success criteria, and dependencies. Wait to write test prompts until you've got this part ironed out.
Check available MCPs - if useful for research (searching docs, finding similar skills, looking up best practices), research in parallel via subagents if available, otherwise inline. Come prepared with context to reduce burden on the user.
### Write the SKILL.md
Based on the user interview, fill in these components:
- **name**: Skill identifier
- **description**: When to trigger, what it does. This is the primary triggering mechanism - include both what the skill does AND specific contexts for when to use it. All "when to use" info goes here, not in the body. Note: currently Claude has a tendency to "undertrigger" skills -- to not use them when they'd be useful. To combat this, please make the skill descriptions a little bit "pushy". So for instance, instead of "How to build a simple fast dashboard to display internal Anthropic data.", you might write "How to build a simple fast dashboard to display internal Anthropic data. Make sure to use this skill whenever the user mentions dashboards, data visualization, internal metrics, or wants to display any kind of company data, even if they don't explicitly ask for a 'dashboard.'"
- **compatibility**: Required tools, dependencies (optional, rarely needed)
- **the rest of the skill :)**
### Skill Writing Guide
#### Anatomy of a Skill
```
skill-name/
├── SKILL.md (required)
│ ├── YAML frontmatter (name, description required)
│ └── Markdown instructions
└── Bundled Resources (optional)
├── scripts/ - Executable code for deterministic/repetitive tasks
├── references/ - Docs loaded into context as needed
└── assets/ - Files used in output (templates, icons, fonts)
```
#### Progressive Disclosure
Skills use a three-level loading system:
1. **Metadata** (name + description) - Always in context (~100 words)
2. **SKILL.md body** - In context whenever skill triggers (<500 lines ideal)
3. **Bundled resources** - As needed (unlimited, scripts can execute without loading)
These word counts are approximate and you can feel free to go longer if needed.
**Key patterns:**
- Keep SKILL.md under 500 lines; if you're approaching this limit, add an additional layer of hierarchy along with clear pointers about where the model using the skill should go next to follow up.
- Reference files clearly from SKILL.md with guidance on when to read them
- For large reference files (>300 lines), include a table of contents
**Domain organization**: When a skill supports multiple domains/frameworks, organize by variant:
```
cloud-deploy/
├── SKILL.md (workflow + selection)
└── references/
├── aws.md
├── gcp.md
└── azure.md
```
Claude reads only the relevant reference file.
#### Principle of Lack of Surprise
This goes without saying, but skills must not contain malware, exploit code, or any content that could compromise system security. A skill's contents should never surprise the user: what the skill actually does should match what its description says it does. Don't go along with requests to create misleading skills or skills designed to facilitate unauthorized access, data exfiltration, or other malicious activities. Things like a "roleplay as an XYZ" are OK though.
#### Writing Patterns
Prefer using the imperative form in instructions.
**Defining output formats** - You can do it like this:
```markdown
## Report structure
ALWAYS use this exact template:
# [Title]
## Executive summary
## Key findings
## Recommendations
```
**Examples pattern** - It's useful to include examples. You can format them like this (but if "Input" and "Output" are in the examples you might want to deviate a little):
```markdown
## Commit message format
**Example 1:**
Input: Added user authentication with JWT tokens
Output: feat(auth): implement JWT-based authentication
```
### Writing Style
Try to explain to the model why things are important in lieu of heavy-handed musty MUSTs. Use theory of mind and try to make the skill general and not super-narrow to specific examples. Start by writing a draft and then look at it with fresh eyes and improve it.
### Test Cases
After writing the skill draft, come up with 2-3 realistic test prompts — the kind of thing a real user would actually say. Share them with the user: [you don't have to use this exact language] "Here are a few test cases I'd like to try. Do these look right, or do you want to add more?" Then run them.
Save test cases to `evals/evals.json`. Don't write assertions yet — just the prompts. You'll draft assertions in the next step while the runs are in progress.
```json
{
"skill_name": "example-skill",
"evals": [
{
"id": 1,
"prompt": "User's task prompt",
"expected_output": "Description of expected result",
"files": []
}
]
}
```
See `references/schemas.md` for the full schema (including the `assertions` field, which you'll add later).
## Running and evaluating test cases
This section is one continuous sequence — don't stop partway through. Do NOT use `/skill-test` or any other testing skill.
Put results in `<skill-name>-workspace/` as a sibling to the skill directory. Within the workspace, organize results by iteration (`iteration-1/`, `iteration-2/`, etc.) and within that, each test case gets a directory (`eval-0/`, `eval-1/`, etc.). Don't create all of this upfront — just create directories as you go.
### Step 1: Spawn all runs (with-skill AND baseline) in the same turn
For each test case, spawn two subagents in the same turn — one with the skill, one without. This is important: don't spawn the with-skill runs first and then come back for baselines later. Launch everything at once so it all finishes around the same time.
**With-skill run:**
```
Execute this task:
- Skill path: <path-to-skill>
- Task: <eval prompt>
- Input files: <eval files if any, or "none">
- Save outputs to: <workspace>/iteration-<N>/eval-<ID>/with_skill/outputs/
- Outputs to save: <what the user cares about e.g., "the .docx file", "the final CSV">
```
**Baseline run** (same prompt, but the baseline depends on context):
- **Creating a new skill**: no skill at all. Same prompt, no skill path, save to `without_skill/outputs/`.
- **Improving an existing skill**: the old version. Before editing, snapshot the skill (`cp -r <skill-path> <workspace>/skill-snapshot/`), then point the baseline subagent at the snapshot. Save to `old_skill/outputs/`.
Write an `eval_metadata.json` for each test case (assertions can be empty for now). Give each eval a descriptive name based on what it's testing — not just "eval-0". Use this name for the directory too. If this iteration uses new or modified eval prompts, create these files for each new eval directory — don't assume they carry over from previous iterations.
```json
{
"eval_id": 0,
"eval_name": "descriptive-name-here",
"prompt": "The user's task prompt",
"assertions": []
}
```
### Step 2: While runs are in progress, draft assertions
Don't just wait for the runs to finish — you can use this time productively. Draft quantitative assertions for each test case and explain them to the user. If assertions already exist in `evals/evals.json`, review them and explain what they check.
Good assertions are objectively verifiable and have descriptive names — they should read clearly in the benchmark viewer so someone glancing at the results immediately understands what each one checks. Subjective skills (writing style, design quality) are better evaluated qualitatively — don't force assertions onto things that need human judgment.
Update the `eval_metadata.json` files and `evals/evals.json` with the assertions once drafted. Also explain to the user what they'll see in the viewer — both the qualitative outputs and the quantitative benchmark.
### Step 3: As runs complete, capture timing data
When each subagent task completes, you receive a notification containing `total_tokens` and `duration_ms`. Save this data immediately to `timing.json` in the run directory:
```json
{
"total_tokens": 84852,
"duration_ms": 23332,
"total_duration_seconds": 23.3
}
```
This is the only opportunity to capture this data — it comes through the task notification and isn't persisted elsewhere. Process each notification as it arrives rather than trying to batch them.
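For instance, a minimal sketch of that write, with the field names taken from the notification payload described above (the helper name is illustrative):
```python
import json
from pathlib import Path

def save_timing(run_dir: str, total_tokens: int, duration_ms: int) -> None:
    # Persist the notification's metrics immediately; they aren't available later.
    timing = {
        "total_tokens": total_tokens,
        "duration_ms": duration_ms,
        "total_duration_seconds": round(duration_ms / 1000, 1),
    }
    Path(run_dir, "timing.json").write_text(json.dumps(timing, indent=2))

save_timing("iteration-1/eval-0/with_skill", total_tokens=84852, duration_ms=23332)
```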
### Step 4: Grade, aggregate, and launch the viewer
Once all runs are done:
1. **Grade each run** — spawn a grader subagent (or grade inline) that reads `agents/grader.md` and evaluates each assertion against the outputs. Save results to `grading.json` in each run directory. The grading.json expectations array must use the fields `text`, `passed`, and `evidence` (not `name`/`met`/`details` or other variants) — the viewer depends on these exact field names. For assertions that can be checked programmatically, write and run a script rather than eyeballing it — scripts are faster, more reliable, and can be reused across iterations (see the sketch after this list).
2. **Aggregate into benchmark** — run the aggregation script from the skill-creator directory:
```bash
python -m scripts.aggregate_benchmark <workspace>/iteration-N --skill-name <name>
```
This produces `benchmark.json` and `benchmark.md` with pass_rate, time, and tokens for each configuration, with mean ± stddev and the delta. If generating benchmark.json manually, see `references/schemas.md` for the exact schema the viewer expects.
Put each with_skill version before its baseline counterpart.
3. **Do an analyst pass** — read the benchmark data and surface patterns the aggregate stats might hide. See `agents/analyzer.md` (the "Analyzing Benchmark Results" section) for what to look for — things like assertions that always pass regardless of skill (non-discriminating), high-variance evals (possibly flaky), and time/token tradeoffs.
4. **Launch the viewer** with both qualitative outputs and quantitative data:
```bash
nohup python <skill-creator-path>/eval-viewer/generate_review.py \
<workspace>/iteration-N \
--skill-name "my-skill" \
--benchmark <workspace>/iteration-N/benchmark.json \
> /dev/null 2>&1 &
VIEWER_PID=$!
```
For iteration 2+, also pass `--previous-workspace <workspace>/iteration-<N-1>`.
**Cowork / headless environments:** If `webbrowser.open()` is not available or the environment has no display, use `--static <output_path>` to write a standalone HTML file instead of starting a server. Feedback will be downloaded as a `feedback.json` file when the user clicks "Submit All Reviews". After download, copy `feedback.json` into the workspace directory for the next iteration to pick up.
Note: please use generate_review.py to create the viewer; there's no need to write custom HTML.
5. **Tell the user** something like: "I've opened the results in your browser. There are two tabs — 'Outputs' lets you click through each test case and leave feedback, 'Benchmark' shows the quantitative comparison. When you're done, come back here and let me know."
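As mentioned in step 1, here is a hypothetical example of a programmatic assertion check. The CSV path and column names are invented for illustration; only the `text`/`passed`/`evidence` shape is load-bearing, and the resulting entries still need to be merged into the full grading.json alongside the summary:
```python
import csv
import json

def check_csv_columns(path: str, expected: list[str]) -> dict:
    # Verify the output CSV has the expected header columns.
    with open(path, newline="") as f:
        header = next(csv.reader(f), [])
    return {
        "text": f"CSV contains columns {expected}",
        "passed": all(col in header for col in expected),
        "evidence": f"Header row was {header}",
    }

expectations = [check_csv_columns("outputs/report.csv", ["name", "total"])]
print(json.dumps({"expectations": expectations}, indent=2))
```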
### What the user sees in the viewer
The "Outputs" tab shows one test case at a time:
- **Prompt**: the task that was given
- **Output**: the files the skill produced, rendered inline where possible
- **Previous Output** (iteration 2+): collapsed section showing last iteration's output
- **Formal Grades** (if grading was run): collapsed section showing assertion pass/fail
- **Feedback**: a textbox that auto-saves as they type
- **Previous Feedback** (iteration 2+): their comments from last time, shown below the textbox
The "Benchmark" tab shows the stats summary: pass rates, timing, and token usage for each configuration, with per-eval breakdowns and analyst observations.
Navigation is via prev/next buttons or arrow keys. When done, they click "Submit All Reviews" which saves all feedback to `feedback.json`.
### Step 5: Read the feedback
When the user tells you they're done, read `feedback.json`:
```json
{
"reviews": [
{"run_id": "eval-0-with_skill", "feedback": "the chart is missing axis labels", "timestamp": "..."},
{"run_id": "eval-1-with_skill", "feedback": "", "timestamp": "..."},
{"run_id": "eval-2-with_skill", "feedback": "perfect, love this", "timestamp": "..."}
],
"status": "complete"
}
```
Empty feedback means the user thought it was fine. Focus your improvements on the test cases where the user had specific complaints.
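A minimal sketch for pulling out just the reviews that need action, using the field names shown above:
```python
import json

with open("feedback.json") as f:
    feedback = json.load(f)

# Empty feedback strings mean the user was satisfied with that run.
actionable = [r for r in feedback["reviews"] if r["feedback"].strip()]
for review in actionable:
    print(f'{review["run_id"]}: {review["feedback"]}')
```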
Kill the viewer server when you're done with it:
```bash
kill $VIEWER_PID 2>/dev/null
```
---
## Improving the skill
This is the heart of the loop. You've run the test cases, the user has reviewed the results, and now you need to make the skill better based on their feedback.
### How to think about improvements
1. **Generalize from the feedback.** The big picture thing that's happening here is that we're trying to create skills that can be used a million times (maybe literally, maybe even more who knows) across many different prompts. Here you and the user are iterating on only a few examples over and over again because it helps move faster. The user knows these examples in and out and it's quick for them to assess new outputs. But if the skill you and the user are codeveloping works only for those examples, it's useless. Rather than put in fiddly overfitty changes, or oppressively constrictive MUSTs, if there's some stubborn issue, you might try branching out and using different metaphors, or recommending different patterns of working. It's relatively cheap to try and maybe you'll land on something great.
2. **Keep the prompt lean.** Remove things that aren't pulling their weight. Make sure to read the transcripts, not just the final outputs — if it looks like the skill is making the model waste a bunch of time doing things that are unproductive, you can try getting rid of the parts of the skill that are making it do that and seeing what happens.
3. **Explain the why.** Try hard to explain the **why** behind everything you're asking the model to do. Today's LLMs are *smart*. They have good theory of mind and when given a good harness can go beyond rote instructions and really make things happen. Even if the feedback from the user is terse or frustrated, try to actually understand the task, what the user wrote, and why they wrote it, and then transmit this understanding into the instructions. If you find yourself writing ALWAYS or NEVER in all caps, or using super rigid structures, that's a yellow flag — if possible, reframe and explain the reasoning so that the model understands why the thing you're asking for is important. That's a more humane, powerful, and effective approach.
4. **Look for repeated work across test cases.** Read the transcripts from the test runs and notice if the subagents all independently wrote similar helper scripts or took the same multi-step approach to something. If all 3 test cases resulted in the subagent writing a `create_docx.py` or a `build_chart.py`, that's a strong signal the skill should bundle that script. Write it once, put it in `scripts/`, and tell the skill to use it. This saves every future invocation from reinventing the wheel.
This task is pretty important (we are trying to create billions a year in economic value here!) and your thinking time is not the blocker; take your time and really mull things over. I'd suggest writing a draft revision and then looking at it anew and making improvements. Really do your best to get into the head of the user and understand what they want and need.
### The iteration loop
After improving the skill:
1. Apply your improvements to the skill
2. Rerun all test cases into a new `iteration-<N+1>/` directory, including baseline runs. If you're creating a new skill, the baseline is always `without_skill` (no skill) — that stays the same across iterations. If you're improving an existing skill, use your judgment on what makes sense as the baseline: the original version the user came in with, or the previous iteration.
3. Launch the reviewer with `--previous-workspace` pointing at the previous iteration
4. Wait for the user to review and tell you they're done
5. Read the new feedback, improve again, repeat
Keep going until:
- The user says they're happy
- The feedback is all empty (everything looks good)
- You're not making meaningful progress
---
## Advanced: Blind comparison
For situations where you want a more rigorous comparison between two versions of a skill (e.g., the user asks "is the new version actually better?"), there's a blind comparison system. Read `agents/comparator.md` and `agents/analyzer.md` for the details. The basic idea is: give two outputs to an independent agent without telling it which is which, and let it judge quality. Then analyze why the winner won.
This is optional, requires subagents, and most users won't need it. The human review loop is usually sufficient.
---
## Description Optimization
The description field in SKILL.md frontmatter is the primary mechanism that determines whether Claude invokes a skill. After creating or improving a skill, offer to optimize the description for better triggering accuracy.
### Step 1: Generate trigger eval queries
Create 20 eval queries — a mix of should-trigger and should-not-trigger. Save as JSON:
```json
[
{"query": "the user prompt", "should_trigger": true},
{"query": "another prompt", "should_trigger": false}
]
```
The queries must be realistic and something a Claude Code or Claude.ai user would actually type. Not abstract requests, but requests that are concrete and specific and have a good amount of detail. For instance, file paths, personal context about the user's job or situation, column names and values, company names, URLs. A little bit of backstory. Some might be in lowercase or contain abbreviations or typos or casual speech. Use a mix of different lengths, and focus on edge cases rather than making them clear-cut (the user will get a chance to sign off on them).
Bad: `"Format this data"`, `"Extract text from PDF"`, `"Create a chart"`
Good: `"ok so my boss just sent me this xlsx file (its in my downloads, called something like 'Q4 sales final FINAL v2.xlsx') and she wants me to add a column that shows the profit margin as a percentage. The revenue is in column C and costs are in column D i think"`
For the **should-trigger** queries (8-10), think about coverage. You want different phrasings of the same intent — some formal, some casual. Include cases where the user doesn't explicitly name the skill or file type but clearly needs it. Throw in some uncommon use cases and cases where this skill competes with another but should win.
For the **should-not-trigger** queries (8-10), the most valuable ones are the near-misses — queries that share keywords or concepts with the skill but actually need something different. Think adjacent domains, ambiguous phrasing where a naive keyword match would trigger but shouldn't, and cases where the query touches on something the skill does but in a context where another tool is more appropriate.
The key thing to avoid: don't make should-not-trigger queries obviously irrelevant. "Write a fibonacci function" as a negative test for a PDF skill is too easy — it doesn't test anything. The negative cases should be genuinely tricky.
### Step 2: Review with user
Present the eval set to the user for review using the HTML template:
1. Read the template from `assets/eval_review.html`
2. Replace the placeholders:
- `__EVAL_DATA_PLACEHOLDER__` → the JSON array of eval items (no quotes around it — it's a JS variable assignment)
- `__SKILL_NAME_PLACEHOLDER__` → the skill's name
- `__SKILL_DESCRIPTION_PLACEHOLDER__` → the skill's current description
3. Write to a temp file (e.g., `/tmp/eval_review_<skill-name>.html`) and open it: `open /tmp/eval_review_<skill-name>.html`
4. The user can edit queries, toggle should-trigger, add/remove entries, then click "Export Eval Set"
5. The file downloads to `~/Downloads/eval_set.json` — check the Downloads folder for the most recent version in case there are multiple (e.g., `eval_set (1).json`)
This step matters — bad eval queries lead to bad descriptions.
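A minimal sketch of that substitution, with illustrative values for the skill name, description, and eval items:
```python
import json
from pathlib import Path

skill_name = "example-skill"  # illustrative values
skill_description = "The current description from SKILL.md frontmatter"
eval_items = [{"query": "the user prompt", "should_trigger": True}]

template = Path("assets/eval_review.html").read_text()
html = (
    template
    # The eval data is a raw JS array literal, so no surrounding quotes.
    .replace("__EVAL_DATA_PLACEHOLDER__", json.dumps(eval_items))
    .replace("__SKILL_NAME_PLACEHOLDER__", skill_name)
    .replace("__SKILL_DESCRIPTION_PLACEHOLDER__", skill_description)
)
Path(f"/tmp/eval_review_{skill_name}.html").write_text(html)
```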
### Step 3: Run the optimization loop
Tell the user: "This will take some time — I'll run the optimization loop in the background and check on it periodically."
Save the eval set to the workspace, then run in the background:
```bash
python -m scripts.run_loop \
--eval-set <path-to-trigger-eval.json> \
--skill-path <path-to-skill> \
--model <model-id-powering-this-session> \
--max-iterations 5 \
--verbose
```
Use the model ID from your system prompt (the one powering the current session) so the triggering test matches what the user actually experiences.
While it runs, periodically tail the output to give the user updates on which iteration it's on and what the scores look like.
This handles the full optimization loop automatically. It splits the eval set into 60% train and 40% held-out test, evaluates the current description (running each query 3 times to get a reliable trigger rate), then calls Claude to propose improvements based on what failed. It re-evaluates each new description on both train and test, iterating up to 5 times. When it's done, it opens an HTML report in the browser showing the results per iteration and returns JSON with `best_description` — selected by test score rather than train score to avoid overfitting.
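To illustrate the selection rule with invented numbers:
```python
# Each candidate description gets a train score and a held-out test score.
candidates = [
    {"description": "v1 ...", "train_score": 0.85, "test_score": 0.70},
    {"description": "v2 ...", "train_score": 0.95, "test_score": 0.65},  # overfit to train
    {"description": "v3 ...", "train_score": 0.90, "test_score": 0.80},
]

# best_description is chosen by test score, not train score.
best = max(candidates, key=lambda c: c["test_score"])
print(best["description"])  # v3 wins despite v2's higher train score
```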
### How skill triggering works
Understanding the triggering mechanism helps design better eval queries. Skills appear in Claude's `available_skills` list with their name + description, and Claude decides whether to consult a skill based on that description. The important thing to know is that Claude only consults skills for tasks it can't easily handle on its own — simple, one-step queries like "read this PDF" may not trigger a skill even if the description matches perfectly, because Claude can handle them directly with basic tools. Complex, multi-step, or specialized queries reliably trigger skills when the description matches.
This means your eval queries should be substantive enough that Claude would actually benefit from consulting a skill. Simple queries like "read file X" are poor test cases — they won't trigger skills regardless of description quality.
### Step 4: Apply the result
Take `best_description` from the JSON output and update the skill's SKILL.md frontmatter. Show the user before/after and report the scores.
---
### Package and Present (only if `present_files` tool is available)
Check whether you have access to the `present_files` tool. If you don't, skip this step. If you do, package the skill and present the .skill file to the user:
```bash
python -m scripts.package_skill <path/to/skill-folder>
```
After packaging, direct the user to the resulting `.skill` file path so they can install it.
---
## Claude.ai-specific instructions
In Claude.ai, the core workflow is the same (draft → test → review → improve → repeat), but because Claude.ai doesn't have subagents, some mechanics change. Here's what to adapt:
**Running test cases**: No subagents means no parallel execution. For each test case, read the skill's SKILL.md, then follow its instructions to accomplish the test prompt yourself. Do them one at a time. This is less rigorous than independent subagents (you wrote the skill and you're also running it, so you have full context), but it's a useful sanity check — and the human review step compensates. Skip the baseline runs — just use the skill to complete the task as requested.
**Reviewing results**: If you can't open a browser (e.g., Claude.ai's VM has no display, or you're on a remote server), skip the browser reviewer entirely. Instead, present results directly in the conversation. For each test case, show the prompt and the output. If the output is a file the user needs to see (like a .docx or .xlsx), save it to the filesystem and tell them where it is so they can download and inspect it. Ask for feedback inline: "How does this look? Anything you'd change?"
**Benchmarking**: Skip the quantitative benchmarking — it relies on baseline comparisons which aren't meaningful without subagents. Focus on qualitative feedback from the user.
**The iteration loop**: Same as before — improve the skill, rerun the test cases, ask for feedback — just without the browser reviewer in the middle. You can still organize results into iteration directories on the filesystem if you have one.
**Description optimization**: This section requires the `claude` CLI tool (specifically `claude -p`) which is only available in Claude Code. Skip it if you're on Claude.ai.
**Blind comparison**: Requires subagents. Skip it.
**Packaging**: The `package_skill.py` script works anywhere with Python and a filesystem. On Claude.ai, you can run it and the user can download the resulting `.skill` file.
**Updating an existing skill**: The user might be asking you to update an existing skill, not create a new one. In this case:
- **Preserve the original name.** Note the skill's directory name and `name` frontmatter field -- use them unchanged. E.g., if the installed skill is `research-helper`, output `research-helper.skill` (not `research-helper-v2`).
- **Copy to a writeable location before editing.** The installed skill path may be read-only. Copy to `/tmp/skill-name/`, edit there, and package from the copy.
- **If packaging manually, stage in `/tmp/` first**, then copy to the output directory -- direct writes may fail due to permissions.
---
## Cowork-Specific Instructions
If you're in Cowork, the main things to know are:
- You have subagents, so the main workflow (spawn test cases in parallel, run baselines, grade, etc.) all works. (However, if you run into severe problems with timeouts, it's OK to run the test prompts in series rather than parallel.)
- You don't have a browser or display, so when generating the eval viewer, use `--static <output_path>` to write a standalone HTML file instead of starting a server. Then proffer a link that the user can click to open the HTML in their browser.
- For whatever reason, the Cowork setup seems to disincline Claude from generating the eval viewer after running the tests, so to reiterate: whether you're in Cowork or in Claude Code, always generate the eval viewer with `generate_review.py` (not your own boutique HTML) after running tests, so the human can look at examples before you revise the skill and attempt corrections yourself. Sorry in advance but I'm gonna go all caps here: GENERATE THE EVAL VIEWER *BEFORE* evaluating outputs yourself. You want to get them in front of the human ASAP!
- Feedback works differently: since there's no running server, the viewer's "Submit All Reviews" button will download `feedback.json` as a file. You can then read it from there (you may have to request access first).
- Packaging works — `package_skill.py` just needs Python and a filesystem.
- Description optimization (`run_loop.py` / `run_eval.py`) should work in Cowork just fine since it uses `claude -p` via subprocess, not a browser, but please save it until you've fully finished making the skill and the user agrees it's in good shape.
- **Updating an existing skill**: The user might be asking you to update an existing skill, not create a new one. Follow the update guidance in the claude.ai section above.
---
## Reference files
The agents/ directory contains instructions for specialized subagents. Read them when you need to spawn the relevant subagent.
- `agents/grader.md` — How to evaluate assertions against outputs
- `agents/comparator.md` — How to do blind A/B comparison between two outputs
- `agents/analyzer.md` — How to analyze why one version beat another
The references/ directory has additional documentation:
- `references/schemas.md` — JSON structures for evals.json, grading.json, etc.
---
Repeating one more time the core loop here for emphasis:
- Figure out what the skill is about
- Draft or edit the skill
- Run claude-with-access-to-the-skill on test prompts
- With the user, evaluate the outputs:
- Create benchmark.json and run `eval-viewer/generate_review.py` to help the user review them
- Run quantitative evals
- Repeat until you and the user are satisfied
- Package the final skill and return it to the user.
Please add steps to your TodoList, if you have such a thing, to make sure you don't forget. If you're in Cowork, please specifically put "Create evals JSON and run `eval-viewer/generate_review.py` so human can review test cases" in your TodoList to make sure it happens.
Good luck!


@ -0,0 +1,274 @@
# Post-hoc Analyzer Agent
Analyze blind comparison results to understand WHY the winner won and generate improvement suggestions.
## Role
After the blind comparator determines a winner, the Post-hoc Analyzer "unblinds" the results by examining the skills and transcripts. The goal is to extract actionable insights: what made the winner better, and how can the loser be improved?
## Inputs
You receive these parameters in your prompt:
- **winner**: "A" or "B" (from blind comparison)
- **winner_skill_path**: Path to the skill that produced the winning output
- **winner_transcript_path**: Path to the execution transcript for the winner
- **loser_skill_path**: Path to the skill that produced the losing output
- **loser_transcript_path**: Path to the execution transcript for the loser
- **comparison_result_path**: Path to the blind comparator's output JSON
- **output_path**: Where to save the analysis results
## Process
### Step 1: Read Comparison Result
1. Read the blind comparator's output at comparison_result_path
2. Note the winning side (A or B), the reasoning, and any scores
3. Understand what the comparator valued in the winning output
### Step 2: Read Both Skills
1. Read the winner skill's SKILL.md and key referenced files
2. Read the loser skill's SKILL.md and key referenced files
3. Identify structural differences:
- Instructions clarity and specificity
- Script/tool usage patterns
- Example coverage
- Edge case handling
### Step 3: Read Both Transcripts
1. Read the winner's transcript
2. Read the loser's transcript
3. Compare execution patterns:
- How closely did each follow their skill's instructions?
- What tools were used differently?
- Where did the loser diverge from optimal behavior?
- Did either encounter errors or make recovery attempts?
### Step 4: Analyze Instruction Following
For each transcript, evaluate:
- Did the agent follow the skill's explicit instructions?
- Did the agent use the skill's provided tools/scripts?
- Were there missed opportunities to leverage skill content?
- Did the agent add unnecessary steps not in the skill?
Score instruction following 1-10 and note specific issues.
### Step 5: Identify Winner Strengths
Determine what made the winner better:
- Clearer instructions that led to better behavior?
- Better scripts/tools that produced better output?
- More comprehensive examples that guided edge cases?
- Better error handling guidance?
Be specific. Quote from skills/transcripts where relevant.
### Step 6: Identify Loser Weaknesses
Determine what held the loser back:
- Ambiguous instructions that led to suboptimal choices?
- Missing tools/scripts that forced workarounds?
- Gaps in edge case coverage?
- Poor error handling that caused failures?
### Step 7: Generate Improvement Suggestions
Based on the analysis, produce actionable suggestions for improving the loser skill:
- Specific instruction changes to make
- Tools/scripts to add or modify
- Examples to include
- Edge cases to address
Prioritize by impact. Focus on changes that would have changed the outcome.
### Step 8: Write Analysis Results
Save structured analysis to `{output_path}`.
## Output Format
Write a JSON file with this structure:
```json
{
  "comparison_summary": {
    "winner": "A",
    "winner_skill": "path/to/winner/skill",
    "loser_skill": "path/to/loser/skill",
    "comparator_reasoning": "Brief summary of why comparator chose winner"
  },
  "winner_strengths": [
    "Clear step-by-step instructions for handling multi-page documents",
    "Included validation script that caught formatting errors",
    "Explicit guidance on fallback behavior when OCR fails"
  ],
  "loser_weaknesses": [
    "Vague instruction 'process the document appropriately' led to inconsistent behavior",
    "No script for validation, agent had to improvise and made errors",
    "No guidance on OCR failure, agent gave up instead of trying alternatives"
  ],
  "instruction_following": {
    "winner": {
      "score": 9,
      "issues": [
        "Minor: skipped optional logging step"
      ]
    },
    "loser": {
      "score": 6,
      "issues": [
        "Did not use the skill's formatting template",
        "Invented own approach instead of following step 3",
        "Missed the 'always validate output' instruction"
      ]
    }
  },
  "improvement_suggestions": [
    {
      "priority": "high",
      "category": "instructions",
      "suggestion": "Replace 'process the document appropriately' with explicit steps: 1) Extract text, 2) Identify sections, 3) Format per template",
      "expected_impact": "Would eliminate ambiguity that caused inconsistent behavior"
    },
    {
      "priority": "high",
      "category": "tools",
      "suggestion": "Add validate_output.py script similar to winner skill's validation approach",
      "expected_impact": "Would catch formatting errors before final output"
    },
    {
      "priority": "medium",
      "category": "error_handling",
      "suggestion": "Add fallback instructions: 'If OCR fails, try: 1) different resolution, 2) image preprocessing, 3) manual extraction'",
      "expected_impact": "Would prevent early failure on difficult documents"
    }
  ],
  "transcript_insights": {
    "winner_execution_pattern": "Read skill -> Followed 5-step process -> Used validation script -> Fixed 2 issues -> Produced output",
    "loser_execution_pattern": "Read skill -> Unclear on approach -> Tried 3 different methods -> No validation -> Output had errors"
  }
}
```
## Guidelines
- **Be specific**: Quote from skills and transcripts, don't just say "instructions were unclear"
- **Be actionable**: Suggestions should be concrete changes, not vague advice
- **Focus on skill improvements**: The goal is to improve the losing skill, not critique the agent
- **Prioritize by impact**: Which changes would most likely have changed the outcome?
- **Consider causation**: Did the skill weakness actually cause the worse output, or is it incidental?
- **Stay objective**: Analyze what happened, don't editorialize
- **Think about generalization**: Would this improvement help on other evals too?
## Categories for Suggestions
Use these categories to organize improvement suggestions:
| Category | Description |
|----------|-------------|
| `instructions` | Changes to the skill's prose instructions |
| `tools` | Scripts, templates, or utilities to add/modify |
| `examples` | Example inputs/outputs to include |
| `error_handling` | Guidance for handling failures |
| `structure` | Reorganization of skill content |
| `references` | External docs or resources to add |
## Priority Levels
- **high**: Would likely change the outcome of this comparison
- **medium**: Would improve quality but may not change win/loss
- **low**: Nice to have, marginal improvement
---
# Analyzing Benchmark Results
When analyzing benchmark results, the analyzer's purpose is to **surface patterns and anomalies** across multiple runs, not suggest skill improvements.
## Role
Review all benchmark run results and generate freeform notes that help the user understand skill performance. Focus on patterns that wouldn't be visible from aggregate metrics alone.
## Inputs
You receive these parameters in your prompt:
- **benchmark_data_path**: Path to the in-progress benchmark.json with all run results
- **skill_path**: Path to the skill being benchmarked
- **output_path**: Where to save the notes (as JSON array of strings)
## Process
### Step 1: Read Benchmark Data
1. Read the benchmark.json containing all run results
2. Note the configurations tested (with_skill, without_skill)
3. Understand the run_summary aggregates already calculated
### Step 2: Analyze Per-Assertion Patterns
For each expectation across all runs:
- Does it **always pass** in both configurations? (may not differentiate skill value)
- Does it **always fail** in both configurations? (may be broken or beyond capability)
- Does it **always pass with skill but fail without**? (skill clearly adds value here)
- Does it **always fail with skill but pass without**? (skill may be hurting)
- Is it **highly variable**? (flaky expectation or non-deterministic behavior)
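A sketch of how this classification might be computed. The benchmark.json layout assumed here (a `runs` list carrying a `config` label and the grader's `expectations` array per run) is illustrative; adapt the field names to the actual schema:
```python
import json
from collections import defaultdict

def classify_assertions(benchmark_path: str) -> list[str]:
    with open(benchmark_path) as f:
        data = json.load(f)
    # verdicts[assertion_text][config] -> list of pass/fail booleans
    verdicts = defaultdict(lambda: defaultdict(list))
    for run in data["runs"]:
        for exp in run.get("expectations", []):
            verdicts[exp["text"]][run["config"]].append(exp["passed"])
    notes = []
    for text, by_config in verdicts.items():
        with_skill = by_config.get("with_skill", [])
        without = by_config.get("without_skill", [])
        if not with_skill or not without:
            continue  # need both configurations to compare
        if all(with_skill) and all(without):
            notes.append(f"'{text}' always passes in both configurations - may not differentiate skill value")
        elif not any(with_skill) and not any(without):
            notes.append(f"'{text}' always fails in both configurations - broken or beyond capability")
        elif all(with_skill) and not any(without):
            notes.append(f"'{text}' passes only with the skill - skill clearly adds value")
        elif not any(with_skill) and all(without):
            notes.append(f"'{text}' passes only without the skill - the skill may be hurting")
    return notes
```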
### Step 3: Analyze Cross-Eval Patterns
Look for patterns across evals:
- Are certain eval types consistently harder/easier?
- Do some evals show high variance while others are stable?
- Are there surprising results that contradict expectations?
### Step 4: Analyze Metrics Patterns
Look at time_seconds, tokens, tool_calls:
- Does the skill significantly increase execution time?
- Is there high variance in resource usage?
- Are there outlier runs that skew the aggregates?
### Step 5: Generate Notes
Write freeform observations as a list of strings. Each note should:
- State a specific observation
- Be grounded in the data (not speculation)
- Help the user understand something the aggregate metrics don't show
Examples:
- "Assertion 'Output is a PDF file' passes 100% in both configurations - may not differentiate skill value"
- "Eval 3 shows high variance (50% ± 40%) - run 2 had an unusual failure that may be flaky"
- "Without-skill runs consistently fail on table extraction expectations (0% pass rate)"
- "Skill adds 13s average execution time but improves pass rate by 50%"
- "Token usage is 80% higher with skill, primarily due to script output parsing"
- "All 3 without-skill runs for eval 1 produced empty output"
### Step 6: Write Notes
Save notes to `{output_path}` as a JSON array of strings:
```json
[
"Assertion 'Output is a PDF file' passes 100% in both configurations - may not differentiate skill value",
"Eval 3 shows high variance (50% ± 40%) - run 2 had an unusual failure",
"Without-skill runs consistently fail on table extraction expectations",
"Skill adds 13s average execution time but improves pass rate by 50%"
]
```
## Guidelines
**DO:**
- Report what you observe in the data
- Be specific about which evals, expectations, or runs you're referring to
- Note patterns that aggregate metrics would hide
- Provide context that helps interpret the numbers
**DO NOT:**
- Suggest improvements to the skill (that's for the improvement step, not benchmarking)
- Make subjective quality judgments ("the output was good/bad")
- Speculate about causes without evidence
- Repeat information already in the run_summary aggregates


@ -0,0 +1,202 @@
# Blind Comparator Agent
Compare two outputs WITHOUT knowing which skill produced them.
## Role
The Blind Comparator judges which output better accomplishes the eval task. You receive two outputs labeled A and B, but you do NOT know which skill produced which. This prevents bias toward a particular skill or approach.
Your judgment is based purely on output quality and task completion.
## Inputs
You receive these parameters in your prompt:
- **output_a_path**: Path to the first output file or directory
- **output_b_path**: Path to the second output file or directory
- **eval_prompt**: The original task/prompt that was executed
- **expectations**: List of expectations to check (optional - may be empty)
## Process
### Step 1: Read Both Outputs
1. Examine output A (file or directory)
2. Examine output B (file or directory)
3. Note the type, structure, and content of each
4. If outputs are directories, examine all relevant files inside
### Step 2: Understand the Task
1. Read the eval_prompt carefully
2. Identify what the task requires:
- What should be produced?
- What qualities matter (accuracy, completeness, format)?
- What would distinguish a good output from a poor one?
### Step 3: Generate Evaluation Rubric
Based on the task, generate a rubric with two dimensions:
**Content Rubric** (what the output contains):
| Criterion | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
|-----------|----------|----------------|---------------|
| Correctness | Major errors | Minor errors | Fully correct |
| Completeness | Missing key elements | Mostly complete | All elements present |
| Accuracy | Significant inaccuracies | Minor inaccuracies | Accurate throughout |
**Structure Rubric** (how the output is organized):
| Criterion | 1 (Poor) | 3 (Acceptable) | 5 (Excellent) |
|-----------|----------|----------------|---------------|
| Organization | Disorganized | Reasonably organized | Clear, logical structure |
| Formatting | Inconsistent/broken | Mostly consistent | Professional, polished |
| Usability | Difficult to use | Usable with effort | Easy to use |
Adapt criteria to the specific task. For example:
- PDF form → "Field alignment", "Text readability", "Data placement"
- Document → "Section structure", "Heading hierarchy", "Paragraph flow"
- Data output → "Schema correctness", "Data types", "Completeness"
### Step 4: Evaluate Each Output Against the Rubric
For each output (A and B):
1. **Score each criterion** on the rubric (1-5 scale)
2. **Calculate dimension totals**: Content score, Structure score
3. **Calculate overall score**: Average of dimension scores, scaled to 1-10
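A minimal sketch of this arithmetic, assuming the 1-5 dimension means are scaled to 1-10 by doubling (consistent with the sample output below); criterion names are illustrative:
```python
def overall_score(content: dict[str, int], structure: dict[str, int]) -> float:
    # Average each dimension's 1-5 criteria, then double the dimension mean.
    content_score = sum(content.values()) / len(content)
    structure_score = sum(structure.values()) / len(structure)
    return round((content_score + structure_score) / 2 * 2, 1)

# The rubric scores from output A in the sample below:
print(overall_score(
    {"correctness": 5, "completeness": 5, "accuracy": 4},
    {"organization": 4, "formatting": 5, "usability": 4},
))  # -> 9.0
```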
### Step 5: Check Assertions (if provided)
If expectations are provided:
1. Check each expectation against output A
2. Check each expectation against output B
3. Count pass rates for each output
4. Use expectation scores as secondary evidence (not the primary decision factor)
### Step 6: Determine the Winner
Compare A and B based on (in priority order):
1. **Primary**: Overall rubric score (content + structure)
2. **Secondary**: Assertion pass rates (if applicable)
3. **Tiebreaker**: If truly equal, declare a TIE
Be decisive - ties should be rare. One output is usually better, even if marginally.
### Step 7: Write Comparison Results
Save results to a JSON file at the path specified (or `comparison.json` if not specified).
## Output Format
Write a JSON file with this structure:
```json
{
  "winner": "A",
  "reasoning": "Output A provides a complete solution with proper formatting and all required fields. Output B is missing the date field and has formatting inconsistencies.",
  "rubric": {
    "A": {
      "content": {
        "correctness": 5,
        "completeness": 5,
        "accuracy": 4
      },
      "structure": {
        "organization": 4,
        "formatting": 5,
        "usability": 4
      },
      "content_score": 4.7,
      "structure_score": 4.3,
      "overall_score": 9.0
    },
    "B": {
      "content": {
        "correctness": 3,
        "completeness": 2,
        "accuracy": 3
      },
      "structure": {
        "organization": 3,
        "formatting": 2,
        "usability": 3
      },
      "content_score": 2.7,
      "structure_score": 2.7,
      "overall_score": 5.4
    }
  },
  "output_quality": {
    "A": {
      "score": 9,
      "strengths": ["Complete solution", "Well-formatted", "All fields present"],
      "weaknesses": ["Minor style inconsistency in header"]
    },
    "B": {
      "score": 5,
      "strengths": ["Readable output", "Correct basic structure"],
      "weaknesses": ["Missing date field", "Formatting inconsistencies", "Partial data extraction"]
    }
  },
  "expectation_results": {
    "A": {
      "passed": 4,
      "total": 5,
      "pass_rate": 0.80,
      "details": [
        {"text": "Output includes name", "passed": true},
        {"text": "Output includes date", "passed": true},
        {"text": "Format is PDF", "passed": true},
        {"text": "Contains signature", "passed": false},
        {"text": "Readable text", "passed": true}
      ]
    },
    "B": {
      "passed": 3,
      "total": 5,
      "pass_rate": 0.60,
      "details": [
        {"text": "Output includes name", "passed": true},
        {"text": "Output includes date", "passed": false},
        {"text": "Format is PDF", "passed": true},
        {"text": "Contains signature", "passed": false},
        {"text": "Readable text", "passed": true}
      ]
    }
  }
}
```
If no expectations were provided, omit the `expectation_results` field entirely.
## Field Descriptions
- **winner**: "A", "B", or "TIE"
- **reasoning**: Clear explanation of why the winner was chosen (or why it's a tie)
- **rubric**: Structured rubric evaluation for each output
- **content**: Scores for content criteria (correctness, completeness, accuracy)
- **structure**: Scores for structure criteria (organization, formatting, usability)
- **content_score**: Average of content criteria (1-5)
- **structure_score**: Average of structure criteria (1-5)
- **overall_score**: Combined score scaled to 1-10
- **output_quality**: Summary quality assessment
- **score**: 1-10 rating (should match rubric overall_score)
- **strengths**: List of positive aspects
- **weaknesses**: List of issues or shortcomings
- **expectation_results**: (Only if expectations provided)
- **passed**: Number of expectations that passed
- **total**: Total number of expectations
- **pass_rate**: Fraction passed (0.0 to 1.0)
- **details**: Individual expectation results
## Guidelines
- **Stay blind**: DO NOT try to infer which skill produced which output. Judge purely on output quality.
- **Be specific**: Cite specific examples when explaining strengths and weaknesses.
- **Be decisive**: Choose a winner unless outputs are genuinely equivalent.
- **Output quality first**: Assertion scores are secondary to overall task completion.
- **Be objective**: Don't favor outputs based on style preferences; focus on correctness and completeness.
- **Explain your reasoning**: The reasoning field should make it clear why you chose the winner.
- **Handle edge cases**: If both outputs fail, pick the one that fails less badly. If both are excellent, pick the one that's marginally better.


@ -0,0 +1,223 @@
# Grader Agent
Evaluate expectations against an execution transcript and outputs.
## Role
The Grader reviews a transcript and output files, then determines whether each expectation passes or fails. Provide clear evidence for each judgment.
You have two jobs: grade the outputs, and critique the evals themselves. A passing grade on a weak assertion is worse than useless — it creates false confidence. When you notice an assertion that's trivially satisfied, or an important outcome that no assertion checks, say so.
## Inputs
You receive these parameters in your prompt:
- **expectations**: List of expectations to evaluate (strings)
- **transcript_path**: Path to the execution transcript (markdown file)
- **outputs_dir**: Directory containing output files from execution
## Process
### Step 1: Read the Transcript
1. Read the transcript file completely
2. Note the eval prompt, execution steps, and final result
3. Identify any issues or errors documented
### Step 2: Examine Output Files
1. List files in outputs_dir
2. Read/examine each file relevant to the expectations. If outputs aren't plain text, use the inspection tools provided in your prompt — don't rely solely on what the transcript says the executor produced.
3. Note contents, structure, and quality
### Step 3: Evaluate Each Assertion
For each expectation:
1. **Search for evidence** in the transcript and outputs
2. **Determine verdict**:
- **PASS**: Clear evidence the expectation is true AND the evidence reflects genuine task completion, not just surface-level compliance
- **FAIL**: No evidence, or evidence contradicts the expectation, or the evidence is superficial (e.g., correct filename but empty/wrong content)
3. **Cite the evidence**: Quote the specific text or describe what you found
### Step 4: Extract and Verify Claims
Beyond the predefined expectations, extract implicit claims from the outputs and verify them:
1. **Extract claims** from the transcript and outputs:
- Factual statements ("The form has 12 fields")
- Process claims ("Used pypdf to fill the form")
- Quality claims ("All fields were filled correctly")
2. **Verify each claim**:
- **Factual claims**: Can be checked against the outputs or external sources
- **Process claims**: Can be verified from the transcript
- **Quality claims**: Evaluate whether the claim is justified
3. **Flag unverifiable claims**: Note claims that cannot be verified with available information
This catches issues that predefined expectations might miss.
### Step 5: Read User Notes
If `{outputs_dir}/user_notes.md` exists:
1. Read it and note any uncertainties or issues flagged by the executor
2. Include relevant concerns in the grading output
3. These may reveal problems even when expectations pass
### Step 6: Critique the Evals
After grading, consider whether the evals themselves could be improved. Only surface suggestions when there's a clear gap.
Good suggestions test meaningful outcomes — assertions that are hard to satisfy without actually doing the work correctly. Think about what makes an assertion *discriminating*: it passes when the skill genuinely succeeds and fails when it doesn't.
Suggestions worth raising:
- An assertion that passed but would also pass for a clearly wrong output (e.g., checking filename existence but not file content)
- An important outcome you observed — good or bad — that no assertion covers at all
- An assertion that can't actually be verified from the available outputs
Keep the bar high. The goal is to flag things the eval author would say "good catch" about, not to nitpick every assertion.
### Step 7: Write Grading Results
Save results to `{outputs_dir}/../grading.json` (sibling to outputs_dir).
## Grading Criteria
**PASS when**:
- The transcript or outputs clearly demonstrate the expectation is true
- Specific evidence can be cited
- The evidence reflects genuine substance, not just surface compliance (e.g., a file exists AND contains correct content, not just the right filename)
**FAIL when**:
- No evidence found for the expectation
- Evidence contradicts the expectation
- The expectation cannot be verified from available information
- The evidence is superficial — the assertion is technically satisfied but the underlying task outcome is wrong or incomplete
- The output appears to meet the assertion by coincidence rather than by actually doing the work
**When uncertain**: fail the expectation; the burden of proof is on showing that it passes.
### Step 8: Read Executor Metrics and Timing
1. If `{outputs_dir}/metrics.json` exists, read it and include in grading output
2. If `{outputs_dir}/../timing.json` exists, read it and include timing data
## Output Format
Write a JSON file with this structure:
```json
{
  "expectations": [
    {
      "text": "The output includes the name 'John Smith'",
      "passed": true,
      "evidence": "Found in transcript Step 3: 'Extracted names: John Smith, Sarah Johnson'"
    },
    {
      "text": "The spreadsheet has a SUM formula in cell B10",
      "passed": false,
      "evidence": "No spreadsheet was created. The output was a text file."
    },
    {
      "text": "The assistant used the skill's OCR script",
      "passed": true,
      "evidence": "Transcript Step 2 shows: 'Tool: Bash - python ocr_script.py image.png'"
    }
  ],
  "summary": {
    "passed": 2,
    "failed": 1,
    "total": 3,
    "pass_rate": 0.67
  },
  "execution_metrics": {
    "tool_calls": {
      "Read": 5,
      "Write": 2,
      "Bash": 8
    },
    "total_tool_calls": 15,
    "total_steps": 6,
    "errors_encountered": 0,
    "output_chars": 12450,
    "transcript_chars": 3200
  },
  "timing": {
    "executor_duration_seconds": 165.0,
    "grader_duration_seconds": 26.0,
    "total_duration_seconds": 191.0
  },
  "claims": [
    {
      "claim": "The form has 12 fillable fields",
      "type": "factual",
      "verified": true,
      "evidence": "Counted 12 fields in field_info.json"
    },
    {
      "claim": "All required fields were populated",
      "type": "quality",
      "verified": false,
      "evidence": "Reference section was left blank despite data being available"
    }
  ],
  "user_notes_summary": {
    "uncertainties": ["Used 2023 data, may be stale"],
    "needs_review": [],
    "workarounds": ["Fell back to text overlay for non-fillable fields"]
  },
  "eval_feedback": {
    "suggestions": [
      {
        "assertion": "The output includes the name 'John Smith'",
        "reason": "A hallucinated document that mentions the name would also pass — consider checking it appears as the primary contact with matching phone and email from the input"
      },
      {
        "reason": "No assertion checks whether the extracted phone numbers match the input — I observed incorrect numbers in the output that went uncaught"
      }
    ],
    "overall": "Assertions check presence but not correctness. Consider adding content verification."
  }
}
```
## Field Descriptions
- **expectations**: Array of graded expectations
- **text**: The original expectation text
- **passed**: Boolean - true if expectation passes
- **evidence**: Specific quote or description supporting the verdict
- **summary**: Aggregate statistics
- **passed**: Count of passed expectations
- **failed**: Count of failed expectations
- **total**: Total expectations evaluated
- **pass_rate**: Fraction passed (0.0 to 1.0)
- **execution_metrics**: Copied from executor's metrics.json (if available)
- **output_chars**: Total character count of output files (proxy for tokens)
- **transcript_chars**: Character count of transcript
- **timing**: Wall clock timing from timing.json (if available)
- **executor_duration_seconds**: Time spent in executor subagent
- **total_duration_seconds**: Total elapsed time for the run
- **claims**: Extracted and verified claims from the output
- **claim**: The statement being verified
- **type**: "factual", "process", or "quality"
- **verified**: Boolean - whether the claim holds
- **evidence**: Supporting or contradicting evidence
- **user_notes_summary**: Issues flagged by the executor
- **uncertainties**: Things the executor wasn't sure about
- **needs_review**: Items requiring human attention
- **workarounds**: Places where the skill didn't work as expected
- **eval_feedback**: Improvement suggestions for the evals (only when warranted)
- **suggestions**: List of concrete suggestions, each with a `reason` and optionally an `assertion` it relates to
- **overall**: Brief assessment — can be "No suggestions, evals look solid" if nothing to flag
## Guidelines
- **Be objective**: Base verdicts on evidence, not assumptions
- **Be specific**: Quote the exact text that supports your verdict
- **Be thorough**: Check both transcript and output files
- **Be consistent**: Apply the same standard to each expectation
- **Explain failures**: Make it clear why evidence was insufficient
- **No partial credit**: Each expectation is pass or fail, not partial


@ -0,0 +1,146 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>Eval Set Review - __SKILL_NAME_PLACEHOLDER__</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Poppins:wght@500;600&family=Lora:wght@400;500&display=swap" rel="stylesheet">
<style>
* { box-sizing: border-box; margin: 0; padding: 0; }
body { font-family: 'Lora', Georgia, serif; background: #faf9f5; padding: 2rem; color: #141413; }
h1 { font-family: 'Poppins', sans-serif; margin-bottom: 0.5rem; font-size: 1.5rem; }
.description { color: #b0aea5; margin-bottom: 1.5rem; font-style: italic; max-width: 900px; }
.controls { margin-bottom: 1rem; display: flex; gap: 0.5rem; }
.btn { font-family: 'Poppins', sans-serif; padding: 0.5rem 1rem; border: none; border-radius: 6px; cursor: pointer; font-size: 0.875rem; font-weight: 500; }
.btn-add { background: #6a9bcc; color: white; }
.btn-add:hover { background: #5889b8; }
.btn-export { background: #d97757; color: white; }
.btn-export:hover { background: #c4613f; }
table { width: 100%; max-width: 1100px; border-collapse: collapse; background: white; border-radius: 6px; overflow: hidden; box-shadow: 0 1px 3px rgba(0,0,0,0.08); }
th { font-family: 'Poppins', sans-serif; background: #141413; color: #faf9f5; padding: 0.75rem 1rem; text-align: left; font-size: 0.875rem; }
td { padding: 0.75rem 1rem; border-bottom: 1px solid #e8e6dc; vertical-align: top; }
tr:nth-child(even) td { background: #faf9f5; }
tr:hover td { background: #f3f1ea; }
.section-header td { background: #e8e6dc; font-family: 'Poppins', sans-serif; font-weight: 500; font-size: 0.8rem; color: #141413; text-transform: uppercase; letter-spacing: 0.05em; }
.query-input { width: 100%; padding: 0.4rem; border: 1px solid #e8e6dc; border-radius: 4px; font-size: 0.875rem; font-family: 'Lora', Georgia, serif; resize: vertical; min-height: 60px; }
.query-input:focus { outline: none; border-color: #d97757; box-shadow: 0 0 0 2px rgba(217,119,87,0.15); }
.toggle { position: relative; display: inline-block; width: 44px; height: 24px; }
.toggle input { opacity: 0; width: 0; height: 0; }
.toggle .slider { position: absolute; inset: 0; background: #b0aea5; border-radius: 24px; cursor: pointer; transition: 0.2s; }
.toggle .slider::before { content: ""; position: absolute; width: 18px; height: 18px; left: 3px; bottom: 3px; background: white; border-radius: 50%; transition: 0.2s; }
.toggle input:checked + .slider { background: #d97757; }
.toggle input:checked + .slider::before { transform: translateX(20px); }
.btn-delete { background: #c44; color: white; padding: 0.3rem 0.6rem; border: none; border-radius: 4px; cursor: pointer; font-size: 0.75rem; font-family: 'Poppins', sans-serif; }
.btn-delete:hover { background: #a33; }
.summary { margin-top: 1rem; color: #b0aea5; font-size: 0.875rem; }
</style>
</head>
<body>
<h1>Eval Set Review: <span id="skill-name">__SKILL_NAME_PLACEHOLDER__</span></h1>
<p class="description">Current description: <span id="skill-desc">__SKILL_DESCRIPTION_PLACEHOLDER__</span></p>
<div class="controls">
<button class="btn btn-add" onclick="addRow()">+ Add Query</button>
<button class="btn btn-export" onclick="exportEvalSet()">Export Eval Set</button>
</div>
<table>
<thead>
<tr>
<th style="width:65%">Query</th>
<th style="width:18%">Should Trigger</th>
<th style="width:10%">Actions</th>
</tr>
</thead>
<tbody id="eval-body"></tbody>
</table>
<p class="summary" id="summary"></p>
<script>
const EVAL_DATA = __EVAL_DATA_PLACEHOLDER__;
let evalItems = [...EVAL_DATA];
function render() {
  const tbody = document.getElementById('eval-body');
  tbody.innerHTML = '';
  // Sort: should-trigger first, then should-not-trigger
  const sorted = evalItems
    .map((item, origIdx) => ({ ...item, origIdx }))
    .sort((a, b) => (b.should_trigger ? 1 : 0) - (a.should_trigger ? 1 : 0));
  let lastGroup = null;
  sorted.forEach(item => {
    const group = item.should_trigger ? 'trigger' : 'no-trigger';
    if (group !== lastGroup) {
      const headerRow = document.createElement('tr');
      headerRow.className = 'section-header';
      headerRow.innerHTML = `<td colspan="3">${item.should_trigger ? 'Should Trigger' : 'Should NOT Trigger'}</td>`;
      tbody.appendChild(headerRow);
      lastGroup = group;
    }
    const idx = item.origIdx;
    const tr = document.createElement('tr');
    tr.innerHTML = `
      <td><textarea class="query-input" onchange="updateQuery(${idx}, this.value)">${escapeHtml(item.query)}</textarea></td>
      <td>
        <label class="toggle">
          <input type="checkbox" ${item.should_trigger ? 'checked' : ''} onchange="updateTrigger(${idx}, this.checked)">
          <span class="slider"></span>
        </label>
        <span style="margin-left:8px;font-size:0.8rem;color:#b0aea5">${item.should_trigger ? 'Yes' : 'No'}</span>
      </td>
      <td><button class="btn-delete" onclick="deleteRow(${idx})">Delete</button></td>
    `;
    tbody.appendChild(tr);
  });
  updateSummary();
}
function escapeHtml(text) {
  const div = document.createElement('div');
  div.textContent = text;
  return div.innerHTML;
}
function updateQuery(idx, value) { evalItems[idx].query = value; updateSummary(); }
function updateTrigger(idx, value) { evalItems[idx].should_trigger = value; render(); }
function deleteRow(idx) { evalItems.splice(idx, 1); render(); }
function addRow() {
  evalItems.push({ query: '', should_trigger: true });
  render();
  const inputs = document.querySelectorAll('.query-input');
  inputs[inputs.length - 1].focus();
}
function updateSummary() {
  const trigger = evalItems.filter(i => i.should_trigger).length;
  const noTrigger = evalItems.filter(i => !i.should_trigger).length;
  document.getElementById('summary').textContent =
    `${evalItems.length} queries total: ${trigger} should trigger, ${noTrigger} should not trigger`;
}
function exportEvalSet() {
  const valid = evalItems.filter(i => i.query.trim() !== '');
  const data = valid.map(i => ({ query: i.query.trim(), should_trigger: i.should_trigger }));
  const blob = new Blob([JSON.stringify(data, null, 2)], { type: 'application/json' });
  const url = URL.createObjectURL(blob);
  const a = document.createElement('a');
  a.href = url;
  a.download = 'eval_set.json';
  document.body.appendChild(a);
  a.click();
  document.body.removeChild(a);
  URL.revokeObjectURL(url);
}
render();
</script>
</body>
</html>


@ -0,0 +1,471 @@
#!/usr/bin/env python3
"""Generate and serve a review page for eval results.
Reads the workspace directory, discovers runs (directories with outputs/),
embeds all output data into a self-contained HTML page, and serves it via
a tiny HTTP server. Feedback auto-saves to feedback.json in the workspace.
Usage:
python generate_review.py <workspace-path> [--port PORT] [--skill-name NAME]
python generate_review.py <workspace-path> --previous-feedback /path/to/old/feedback.json
No dependencies beyond the Python stdlib are required.
"""
import argparse
import base64
import json
import mimetypes
import os
import re
import signal
import subprocess
import sys
import time
import webbrowser
from functools import partial
from http.server import HTTPServer, BaseHTTPRequestHandler
from pathlib import Path
# Files to exclude from output listings
METADATA_FILES = {"transcript.md", "user_notes.md", "metrics.json"}
# Extensions we render as inline text
TEXT_EXTENSIONS = {
    ".txt", ".md", ".json", ".csv", ".py", ".js", ".ts", ".tsx", ".jsx",
    ".yaml", ".yml", ".xml", ".html", ".css", ".sh", ".rb", ".go", ".rs",
    ".java", ".c", ".cpp", ".h", ".hpp", ".sql", ".r", ".toml",
}
# Extensions we render as inline images
IMAGE_EXTENSIONS = {".png", ".jpg", ".jpeg", ".gif", ".svg", ".webp"}
# MIME type overrides for common types
MIME_OVERRIDES = {
    ".svg": "image/svg+xml",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
}
def get_mime_type(path: Path) -> str:
    ext = path.suffix.lower()
    if ext in MIME_OVERRIDES:
        return MIME_OVERRIDES[ext]
    mime, _ = mimetypes.guess_type(str(path))
    return mime or "application/octet-stream"
def find_runs(workspace: Path) -> list[dict]:
    """Recursively find directories that contain an outputs/ subdirectory."""
    runs: list[dict] = []
    _find_runs_recursive(workspace, workspace, runs)
    # eval_id may be None when no metadata was found; sort those runs last
    # instead of letting a None-vs-int comparison raise TypeError.
    runs.sort(key=lambda r: (float("inf") if r.get("eval_id") is None else r["eval_id"], r["id"]))
    return runs
def _find_runs_recursive(root: Path, current: Path, runs: list[dict]) -> None:
    if not current.is_dir():
        return
    outputs_dir = current / "outputs"
    if outputs_dir.is_dir():
        run = build_run(root, current)
        if run:
            runs.append(run)
        return
    skip = {"node_modules", ".git", "__pycache__", "skill", "inputs"}
    for child in sorted(current.iterdir()):
        if child.is_dir() and child.name not in skip:
            _find_runs_recursive(root, child, runs)
def build_run(root: Path, run_dir: Path) -> dict | None:
"""Build a run dict with prompt, outputs, and grading data."""
prompt = ""
eval_id = None
# Try eval_metadata.json
for candidate in [run_dir / "eval_metadata.json", run_dir.parent / "eval_metadata.json"]:
if candidate.exists():
try:
metadata = json.loads(candidate.read_text())
prompt = metadata.get("prompt", "")
eval_id = metadata.get("eval_id")
except (json.JSONDecodeError, OSError):
pass
if prompt:
break
# Fall back to transcript.md
if not prompt:
for candidate in [run_dir / "transcript.md", run_dir / "outputs" / "transcript.md"]:
if candidate.exists():
try:
text = candidate.read_text()
match = re.search(r"## Eval Prompt\n\n([\s\S]*?)(?=\n##|$)", text)
if match:
prompt = match.group(1).strip()
except OSError:
pass
if prompt:
break
if not prompt:
prompt = "(No prompt found)"
run_id = str(run_dir.relative_to(root)).replace("/", "-").replace("\\", "-")
# Collect output files
outputs_dir = run_dir / "outputs"
output_files: list[dict] = []
if outputs_dir.is_dir():
for f in sorted(outputs_dir.iterdir()):
if f.is_file() and f.name not in METADATA_FILES:
output_files.append(embed_file(f))
# Load grading if present
grading = None
for candidate in [run_dir / "grading.json", run_dir.parent / "grading.json"]:
if candidate.exists():
try:
grading = json.loads(candidate.read_text())
except (json.JSONDecodeError, OSError):
pass
if grading:
break
return {
"id": run_id,
"prompt": prompt,
"eval_id": eval_id,
"outputs": output_files,
"grading": grading,
}
def embed_file(path: Path) -> dict:
"""Read a file and return an embedded representation."""
ext = path.suffix.lower()
mime = get_mime_type(path)
if ext in TEXT_EXTENSIONS:
try:
content = path.read_text(errors="replace")
except OSError:
content = "(Error reading file)"
return {
"name": path.name,
"type": "text",
"content": content,
}
elif ext in IMAGE_EXTENSIONS:
try:
raw = path.read_bytes()
b64 = base64.b64encode(raw).decode("ascii")
except OSError:
return {"name": path.name, "type": "error", "content": "(Error reading file)"}
return {
"name": path.name,
"type": "image",
"mime": mime,
"data_uri": f"data:{mime};base64,{b64}",
}
elif ext == ".pdf":
try:
raw = path.read_bytes()
b64 = base64.b64encode(raw).decode("ascii")
except OSError:
return {"name": path.name, "type": "error", "content": "(Error reading file)"}
return {
"name": path.name,
"type": "pdf",
"data_uri": f"data:{mime};base64,{b64}",
}
elif ext == ".xlsx":
try:
raw = path.read_bytes()
b64 = base64.b64encode(raw).decode("ascii")
except OSError:
return {"name": path.name, "type": "error", "content": "(Error reading file)"}
return {
"name": path.name,
"type": "xlsx",
"data_b64": b64,
}
else:
# Binary / unknown — base64 download link
try:
raw = path.read_bytes()
b64 = base64.b64encode(raw).decode("ascii")
except OSError:
return {"name": path.name, "type": "error", "content": "(Error reading file)"}
return {
"name": path.name,
"type": "binary",
"mime": mime,
"data_uri": f"data:{mime};base64,{b64}",
}
def load_previous_iteration(workspace: Path) -> dict[str, dict]:
"""Load previous iteration's feedback and outputs.
Returns a map of run_id -> {"feedback": str, "outputs": list[dict]}.
"""
result: dict[str, dict] = {}
# Load feedback
feedback_map: dict[str, str] = {}
feedback_path = workspace / "feedback.json"
if feedback_path.exists():
try:
data = json.loads(feedback_path.read_text())
feedback_map = {
r["run_id"]: r["feedback"]
for r in data.get("reviews", [])
if r.get("feedback", "").strip()
}
except (json.JSONDecodeError, OSError, KeyError):
pass
# Load runs (to get outputs)
prev_runs = find_runs(workspace)
for run in prev_runs:
result[run["id"]] = {
"feedback": feedback_map.get(run["id"], ""),
"outputs": run.get("outputs", []),
}
# Also add feedback for run_ids that had feedback but no matching run
for run_id, fb in feedback_map.items():
if run_id not in result:
result[run_id] = {"feedback": fb, "outputs": []}
return result
def generate_html(
runs: list[dict],
skill_name: str,
previous: dict[str, dict] | None = None,
benchmark: dict | None = None,
) -> str:
"""Generate the complete standalone HTML page with embedded data."""
template_path = Path(__file__).parent / "viewer.html"
template = template_path.read_text()
# Build previous_feedback and previous_outputs maps for the template
previous_feedback: dict[str, str] = {}
previous_outputs: dict[str, list[dict]] = {}
if previous:
for run_id, data in previous.items():
if data.get("feedback"):
previous_feedback[run_id] = data["feedback"]
if data.get("outputs"):
previous_outputs[run_id] = data["outputs"]
embedded = {
"skill_name": skill_name,
"runs": runs,
"previous_feedback": previous_feedback,
"previous_outputs": previous_outputs,
}
if benchmark:
embedded["benchmark"] = benchmark
data_json = json.dumps(embedded)
return template.replace("/*__EMBEDDED_DATA__*/", f"const EMBEDDED_DATA = {data_json};")
# ---------------------------------------------------------------------------
# HTTP server (stdlib only, zero dependencies)
# ---------------------------------------------------------------------------
def _kill_port(port: int) -> None:
"""Kill any process listening on the given port."""
try:
result = subprocess.run(
["lsof", "-ti", f":{port}"],
capture_output=True, text=True, timeout=5,
)
for pid_str in result.stdout.strip().split("\n"):
if pid_str.strip():
try:
os.kill(int(pid_str.strip()), signal.SIGTERM)
except (ProcessLookupError, ValueError):
pass
if result.stdout.strip():
time.sleep(0.5)
except subprocess.TimeoutExpired:
pass
except FileNotFoundError:
print("Note: lsof not found, cannot check if port is in use", file=sys.stderr)
class ReviewHandler(BaseHTTPRequestHandler):
"""Serves the review HTML and handles feedback saves.
Regenerates the HTML on each page load so that refreshing the browser
picks up new eval outputs without restarting the server.
"""
def __init__(
self,
workspace: Path,
skill_name: str,
feedback_path: Path,
previous: dict[str, dict],
benchmark_path: Path | None,
*args,
**kwargs,
):
self.workspace = workspace
self.skill_name = skill_name
self.feedback_path = feedback_path
self.previous = previous
self.benchmark_path = benchmark_path
super().__init__(*args, **kwargs)
def do_GET(self) -> None:
if self.path == "/" or self.path == "/index.html":
# Regenerate HTML on each request (re-scans workspace for new outputs)
runs = find_runs(self.workspace)
benchmark = None
if self.benchmark_path and self.benchmark_path.exists():
try:
benchmark = json.loads(self.benchmark_path.read_text())
except (json.JSONDecodeError, OSError):
pass
html = generate_html(runs, self.skill_name, self.previous, benchmark)
content = html.encode("utf-8")
self.send_response(200)
self.send_header("Content-Type", "text/html; charset=utf-8")
self.send_header("Content-Length", str(len(content)))
self.end_headers()
self.wfile.write(content)
elif self.path == "/api/feedback":
data = b"{}"
if self.feedback_path.exists():
data = self.feedback_path.read_bytes()
self.send_response(200)
self.send_header("Content-Type", "application/json")
self.send_header("Content-Length", str(len(data)))
self.end_headers()
self.wfile.write(data)
else:
self.send_error(404)
def do_POST(self) -> None:
if self.path == "/api/feedback":
length = int(self.headers.get("Content-Length", 0))
body = self.rfile.read(length)
try:
data = json.loads(body)
if not isinstance(data, dict) or "reviews" not in data:
raise ValueError("Expected JSON object with 'reviews' key")
self.feedback_path.write_text(json.dumps(data, indent=2) + "\n")
resp = b'{"ok":true}'
self.send_response(200)
except (json.JSONDecodeError, OSError, ValueError) as e:
resp = json.dumps({"error": str(e)}).encode()
self.send_response(500)
self.send_header("Content-Type", "application/json")
self.send_header("Content-Length", str(len(resp)))
self.end_headers()
self.wfile.write(resp)
else:
self.send_error(404)
def log_message(self, format: str, *args: object) -> None:
# Suppress request logging to keep terminal clean
pass
def main() -> None:
parser = argparse.ArgumentParser(description="Generate and serve eval review")
parser.add_argument("workspace", type=Path, help="Path to workspace directory")
parser.add_argument("--port", "-p", type=int, default=3117, help="Server port (default: 3117)")
parser.add_argument("--skill-name", "-n", type=str, default=None, help="Skill name for header")
parser.add_argument(
"--previous-workspace", type=Path, default=None,
help="Path to previous iteration's workspace (shows old outputs and feedback as context)",
)
parser.add_argument(
"--benchmark", type=Path, default=None,
help="Path to benchmark.json to show in the Benchmark tab",
)
parser.add_argument(
"--static", "-s", type=Path, default=None,
help="Write standalone HTML to this path instead of starting a server",
)
args = parser.parse_args()
workspace = args.workspace.resolve()
if not workspace.is_dir():
print(f"Error: {workspace} is not a directory", file=sys.stderr)
sys.exit(1)
runs = find_runs(workspace)
if not runs:
print(f"No runs found in {workspace}", file=sys.stderr)
sys.exit(1)
skill_name = args.skill_name or workspace.name.replace("-workspace", "")
feedback_path = workspace / "feedback.json"
previous: dict[str, dict] = {}
if args.previous_workspace:
previous = load_previous_iteration(args.previous_workspace.resolve())
benchmark_path = args.benchmark.resolve() if args.benchmark else None
benchmark = None
if benchmark_path and benchmark_path.exists():
try:
benchmark = json.loads(benchmark_path.read_text())
except (json.JSONDecodeError, OSError):
pass
if args.static:
html = generate_html(runs, skill_name, previous, benchmark)
args.static.parent.mkdir(parents=True, exist_ok=True)
args.static.write_text(html)
print(f"\n Static viewer written to: {args.static}\n")
sys.exit(0)
# Kill any existing process on the target port
port = args.port
_kill_port(port)
handler = partial(ReviewHandler, workspace, skill_name, feedback_path, previous, benchmark_path)
try:
server = HTTPServer(("127.0.0.1", port), handler)
except OSError:
# Port still in use after kill attempt — find a free one
server = HTTPServer(("127.0.0.1", 0), handler)
port = server.server_address[1]
url = f"http://localhost:{port}"
print(f"\n Eval Viewer")
print(f" ─────────────────────────────────")
print(f" URL: {url}")
print(f" Workspace: {workspace}")
print(f" Feedback: {feedback_path}")
if previous:
print(f" Previous: {args.previous_workspace} ({len(previous)} runs)")
if benchmark_path:
print(f" Benchmark: {benchmark_path}")
print(f"\n Press Ctrl+C to stop.\n")
webbrowser.open(url)
try:
server.serve_forever()
except KeyboardInterrupt:
print("\nStopped.")
server.server_close()
if __name__ == "__main__":
main()

File diff suppressed because it is too large


@ -0,0 +1,430 @@
# JSON Schemas
This document defines the JSON schemas used by skill-creator.
---
## evals.json
Defines the evals for a skill. Located at `evals/evals.json` within the skill directory.
```json
{
"skill_name": "example-skill",
"evals": [
{
"id": 1,
"prompt": "User's example prompt",
"expected_output": "Description of expected result",
"files": ["evals/files/sample1.pdf"],
"expectations": [
"The output includes X",
"The skill used script Y"
]
}
]
}
```
**Fields:**
- `skill_name`: Name matching the skill's frontmatter
- `evals[].id`: Unique integer identifier
- `evals[].prompt`: The task to execute
- `evals[].expected_output`: Human-readable description of success
- `evals[].files`: Optional list of input file paths (relative to skill root)
- `evals[].expectations`: List of verifiable statements
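A minimal loader sketch that checks these constraints (unique integer `id`s, non-empty `prompt`, optional `files` resolved relative to the skill root); the paths here are illustrative, not fixed by the schema:
```python
import json
from pathlib import Path

skill_root = Path("path/to/skill")  # illustrative
data = json.loads((skill_root / "evals" / "evals.json").read_text())
seen = set()
for ev in data["evals"]:
    assert isinstance(ev["id"], int) and ev["id"] not in seen, "ids must be unique integers"
    seen.add(ev["id"])
    assert ev["prompt"].strip(), f"eval {ev['id']} has an empty prompt"
    for f in ev.get("files", []):  # optional, relative to skill root
        assert (skill_root / f).exists(), f"missing input file: {f}"
```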
---
## history.json
Tracks version progression in Improve mode. Located at workspace root.
```json
{
"started_at": "2026-01-15T10:30:00Z",
"skill_name": "pdf",
"current_best": "v2",
"iterations": [
{
"version": "v0",
"parent": null,
"expectation_pass_rate": 0.65,
"grading_result": "baseline",
"is_current_best": false
},
{
"version": "v1",
"parent": "v0",
"expectation_pass_rate": 0.75,
"grading_result": "won",
"is_current_best": false
},
{
"version": "v2",
"parent": "v1",
"expectation_pass_rate": 0.85,
"grading_result": "won",
"is_current_best": true
}
]
}
```
**Fields:**
- `started_at`: ISO timestamp of when improvement started
- `skill_name`: Name of the skill being improved
- `current_best`: Version identifier of the best performer
- `iterations[].version`: Version identifier (v0, v1, ...)
- `iterations[].parent`: Parent version this was derived from
- `iterations[].expectation_pass_rate`: Pass rate from grading
- `iterations[].grading_result`: "baseline", "won", "lost", or "tie"
- `iterations[].is_current_best`: Whether this is the current best version
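`current_best` and the per-iteration `is_current_best` flags encode the same fact, so a consumer can cross-check them; a sketch (path illustrative):
```python
import json
from pathlib import Path

history = json.loads(Path("history.json").read_text())
best = next((i for i in history["iterations"] if i["is_current_best"]), None)
assert best and best["version"] == history["current_best"], "history.json is inconsistent"
print(f"best: {best['version']} at {best['expectation_pass_rate']:.0%} pass rate")
```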
---
## grading.json
Output from the grader agent. Located at `<run-dir>/grading.json`.
```json
{
"expectations": [
{
"text": "The output includes the name 'John Smith'",
"passed": true,
"evidence": "Found in transcript Step 3: 'Extracted names: John Smith, Sarah Johnson'"
},
{
"text": "The spreadsheet has a SUM formula in cell B10",
"passed": false,
"evidence": "No spreadsheet was created. The output was a text file."
}
],
"summary": {
"passed": 2,
"failed": 1,
"total": 3,
"pass_rate": 0.67
},
"execution_metrics": {
"tool_calls": {
"Read": 5,
"Write": 2,
"Bash": 8
},
"total_tool_calls": 15,
"total_steps": 6,
"errors_encountered": 0,
"output_chars": 12450,
"transcript_chars": 3200
},
"timing": {
"executor_duration_seconds": 165.0,
"grader_duration_seconds": 26.0,
"total_duration_seconds": 191.0
},
"claims": [
{
"claim": "The form has 12 fillable fields",
"type": "factual",
"verified": true,
"evidence": "Counted 12 fields in field_info.json"
}
],
"user_notes_summary": {
"uncertainties": ["Used 2023 data, may be stale"],
"needs_review": [],
"workarounds": ["Fell back to text overlay for non-fillable fields"]
},
"eval_feedback": {
"suggestions": [
{
"assertion": "The output includes the name 'John Smith'",
"reason": "A hallucinated document that mentions the name would also pass"
}
],
"overall": "Assertions check presence but not correctness."
}
}
```
**Fields:**
- `expectations[]`: Graded expectations with evidence
- `summary`: Aggregate pass/fail counts
- `execution_metrics`: Tool usage and output size (from executor's metrics.json)
- `timing`: Wall clock timing (from timing.json)
- `claims`: Extracted and verified claims from the output
- `user_notes_summary`: Issues flagged by the executor
- `eval_feedback`: (optional) Improvement suggestions for the evals, only present when the grader identifies issues worth raising
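The `summary` block is derivable from `expectations[]`; a sketch for recomputing it (rounding matches the two-decimal `pass_rate` shown above):
```python
import json
from pathlib import Path

grading = json.loads(Path("grading.json").read_text())
passed = sum(1 for e in grading["expectations"] if e["passed"])
total = len(grading["expectations"])
recomputed = {
    "passed": passed,
    "failed": total - passed,
    "total": total,
    "pass_rate": round(passed / total, 2) if total else 0.0,
}
```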
---
## metrics.json
Output from the executor agent. Located at `<run-dir>/outputs/metrics.json`.
```json
{
"tool_calls": {
"Read": 5,
"Write": 2,
"Bash": 8,
"Edit": 1,
"Glob": 2,
"Grep": 0
},
"total_tool_calls": 18,
"total_steps": 6,
"files_created": ["filled_form.pdf", "field_values.json"],
"errors_encountered": 0,
"output_chars": 12450,
"transcript_chars": 3200
}
```
**Fields:**
- `tool_calls`: Count per tool type
- `total_tool_calls`: Sum of all tool calls
- `total_steps`: Number of major execution steps
- `files_created`: List of output files created
- `errors_encountered`: Number of errors during execution
- `output_chars`: Total character count of output files
- `transcript_chars`: Character count of transcript
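As a sanity check, `total_tool_calls` should equal the sum of the per-tool counts; a sketch:
```python
import json
from pathlib import Path

metrics = json.loads(Path("outputs/metrics.json").read_text())
assert metrics["total_tool_calls"] == sum(metrics["tool_calls"].values())
```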
---
## timing.json
Wall clock timing for a run. Located at `<run-dir>/timing.json`.
**How to capture:** When a subagent task completes, the task notification includes `total_tokens` and `duration_ms`. Save these immediately — they are not persisted anywhere else and cannot be recovered after the fact.
```json
{
"total_tokens": 84852,
"duration_ms": 23332,
"total_duration_seconds": 23.3,
"executor_start": "2026-01-15T10:30:00Z",
"executor_end": "2026-01-15T10:32:45Z",
"executor_duration_seconds": 165.0,
"grader_start": "2026-01-15T10:32:46Z",
"grader_end": "2026-01-15T10:33:12Z",
"grader_duration_seconds": 26.0
}
```
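The `*_duration_seconds` fields are derivable from the ISO timestamps; a sketch using the values from the example above:
```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"

def seconds_between(start: str, end: str) -> float:
    return (datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds()

executor = seconds_between("2026-01-15T10:30:00Z", "2026-01-15T10:32:45Z")  # 165.0
grader = seconds_between("2026-01-15T10:32:46Z", "2026-01-15T10:33:12Z")    # 26.0
```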
---
## benchmark.json
Output from Benchmark mode. Located at `benchmarks/<timestamp>/benchmark.json`.
```json
{
"metadata": {
"skill_name": "pdf",
"skill_path": "/path/to/pdf",
"executor_model": "claude-sonnet-4-20250514",
"analyzer_model": "most-capable-model",
"timestamp": "2026-01-15T10:30:00Z",
"evals_run": [1, 2, 3],
"runs_per_configuration": 3
},
"runs": [
{
"eval_id": 1,
"eval_name": "Ocean",
"configuration": "with_skill",
"run_number": 1,
"result": {
"pass_rate": 0.85,
"passed": 6,
"failed": 1,
"total": 7,
"time_seconds": 42.5,
"tokens": 3800,
"tool_calls": 18,
"errors": 0
},
"expectations": [
{"text": "...", "passed": true, "evidence": "..."}
],
"notes": [
"Used 2023 data, may be stale",
"Fell back to text overlay for non-fillable fields"
]
}
],
"run_summary": {
"with_skill": {
"pass_rate": {"mean": 0.85, "stddev": 0.05, "min": 0.80, "max": 0.90},
"time_seconds": {"mean": 45.0, "stddev": 12.0, "min": 32.0, "max": 58.0},
"tokens": {"mean": 3800, "stddev": 400, "min": 3200, "max": 4100}
},
"without_skill": {
"pass_rate": {"mean": 0.35, "stddev": 0.08, "min": 0.28, "max": 0.45},
"time_seconds": {"mean": 32.0, "stddev": 8.0, "min": 24.0, "max": 42.0},
"tokens": {"mean": 2100, "stddev": 300, "min": 1800, "max": 2500}
},
"delta": {
"pass_rate": "+0.50",
"time_seconds": "+13.0",
"tokens": "+1700"
}
},
"notes": [
"Assertion 'Output is a PDF file' passes 100% in both configurations - may not differentiate skill value",
"Eval 3 shows high variance (50% ± 40%) - may be flaky or model-dependent",
"Without-skill runs consistently fail on table extraction expectations",
"Skill adds 13s average execution time but improves pass rate by 50%"
]
}
```
**Fields:**
- `metadata`: Information about the benchmark run
- `skill_name`: Name of the skill
- `timestamp`: When the benchmark was run
- `evals_run`: List of eval names or IDs
- `runs_per_configuration`: Number of runs per config (e.g. 3)
- `runs[]`: Individual run results
- `eval_id`: Numeric eval identifier
- `eval_name`: Human-readable eval name (used as section header in the viewer)
- `configuration`: Must be `"with_skill"` or `"without_skill"` (the viewer uses this exact string for grouping and color coding)
- `run_number`: Integer run number (1, 2, 3...)
- `result`: Nested object with `pass_rate`, `passed`, `total`, `time_seconds`, `tokens`, `errors`
- `run_summary`: Statistical aggregates per configuration
- `with_skill` / `without_skill`: Each contains `pass_rate`, `time_seconds`, `tokens` objects with `mean` and `stddev` fields
- `delta`: Difference strings like `"+0.50"`, `"+13.0"`, `"+1700"`
- `notes`: Freeform observations from the analyzer
**Important:** The viewer reads these field names exactly. Using `config` instead of `configuration`, or putting `pass_rate` at the top level of a run instead of nested under `result`, will cause the viewer to show empty/zero values. Always reference this schema when generating benchmark.json manually.
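A quick structural check along those lines, before handing a hand-written benchmark.json to the viewer (a sketch, not exhaustive):
```python
import json
from pathlib import Path

bench = json.loads(Path("benchmark.json").read_text())
for run in bench["runs"]:
    assert run["configuration"] in ("with_skill", "without_skill"), run
    assert "result" in run and "pass_rate" in run["result"], \
        "pass_rate must be nested under result, not at the top level of the run"
```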
---
## comparison.json
Output from blind comparator. Located at `<grading-dir>/comparison-N.json`.
```json
{
"winner": "A",
"reasoning": "Output A provides a complete solution with proper formatting and all required fields. Output B is missing the date field and has formatting inconsistencies.",
"rubric": {
"A": {
"content": {
"correctness": 5,
"completeness": 5,
"accuracy": 4
},
"structure": {
"organization": 4,
"formatting": 5,
"usability": 4
},
"content_score": 4.7,
"structure_score": 4.3,
"overall_score": 9.0
},
"B": {
"content": {
"correctness": 3,
"completeness": 2,
"accuracy": 3
},
"structure": {
"organization": 3,
"formatting": 2,
"usability": 3
},
"content_score": 2.7,
"structure_score": 2.7,
"overall_score": 5.4
}
},
"output_quality": {
"A": {
"score": 9,
"strengths": ["Complete solution", "Well-formatted", "All fields present"],
"weaknesses": ["Minor style inconsistency in header"]
},
"B": {
"score": 5,
"strengths": ["Readable output", "Correct basic structure"],
"weaknesses": ["Missing date field", "Formatting inconsistencies", "Partial data extraction"]
}
},
"expectation_results": {
"A": {
"passed": 4,
"total": 5,
"pass_rate": 0.80,
"details": [
{"text": "Output includes name", "passed": true}
]
},
"B": {
"passed": 3,
"total": 5,
"pass_rate": 0.60,
"details": [
{"text": "Output includes name", "passed": true}
]
}
}
}
```
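The derived scores in this example follow a simple pattern: each `*_score` is the mean of its three sub-scores rounded to one decimal, and `overall_score` is their sum. This derivation is inferred from the example values, not a normative rule:
```python
from statistics import mean

content = {"correctness": 5, "completeness": 5, "accuracy": 4}
structure = {"organization": 4, "formatting": 5, "usability": 4}
content_score = round(mean(content.values()), 1)      # 4.7
structure_score = round(mean(structure.values()), 1)  # 4.3
overall_score = content_score + structure_score       # 9.0
```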
---
## analysis.json
Output from post-hoc analyzer. Located at `<grading-dir>/analysis.json`.
```json
{
"comparison_summary": {
"winner": "A",
"winner_skill": "path/to/winner/skill",
"loser_skill": "path/to/loser/skill",
"comparator_reasoning": "Brief summary of why comparator chose winner"
},
"winner_strengths": [
"Clear step-by-step instructions for handling multi-page documents",
"Included validation script that caught formatting errors"
],
"loser_weaknesses": [
"Vague instruction 'process the document appropriately' led to inconsistent behavior",
"No script for validation, agent had to improvise"
],
"instruction_following": {
"winner": {
"score": 9,
"issues": ["Minor: skipped optional logging step"]
},
"loser": {
"score": 6,
"issues": [
"Did not use the skill's formatting template",
"Invented own approach instead of following step 3"
]
}
},
"improvement_suggestions": [
{
"priority": "high",
"category": "instructions",
"suggestion": "Replace 'process the document appropriately' with explicit steps",
"expected_impact": "Would eliminate ambiguity that caused inconsistent behavior"
}
],
"transcript_insights": {
"winner_execution_pattern": "Read skill -> Followed 5-step process -> Used validation script",
"loser_execution_pattern": "Read skill -> Unclear on approach -> Tried 3 different methods"
}
}
```
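A small sketch of consuming this file, e.g. to surface the high-priority suggestions for the next iteration (path illustrative):
```python
import json
from pathlib import Path

analysis = json.loads(Path("analysis.json").read_text())
for s in analysis["improvement_suggestions"]:
    if s["priority"] == "high":
        print(f"[{s['category']}] {s['suggestion']} -> {s['expected_impact']}")
```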


@ -0,0 +1,401 @@
#!/usr/bin/env python3
"""
Aggregate individual run results into benchmark summary statistics.
Reads grading.json files from run directories and produces:
- run_summary with mean, stddev, min, max for each metric
- delta between with_skill and without_skill configurations
Usage:
python aggregate_benchmark.py <benchmark_dir>
Example:
python aggregate_benchmark.py benchmarks/2026-01-15T10-30-00/
The script supports two directory layouts:
Workspace layout (from skill-creator iterations):
<benchmark_dir>/
eval-N/
with_skill/
run-1/grading.json
run-2/grading.json
without_skill/
run-1/grading.json
run-2/grading.json
Legacy layout (with runs/ subdirectory):
<benchmark_dir>/
runs/
eval-N/
with_skill/
run-1/grading.json
without_skill/
run-1/grading.json
"""
import argparse
import json
import math
import sys
from datetime import datetime, timezone
from pathlib import Path
def calculate_stats(values: list[float]) -> dict:
"""Calculate mean, stddev, min, max for a list of values."""
if not values:
return {"mean": 0.0, "stddev": 0.0, "min": 0.0, "max": 0.0}
n = len(values)
mean = sum(values) / n
if n > 1:
variance = sum((x - mean) ** 2 for x in values) / (n - 1)
stddev = math.sqrt(variance)
else:
stddev = 0.0
return {
"mean": round(mean, 4),
"stddev": round(stddev, 4),
"min": round(min(values), 4),
"max": round(max(values), 4)
}
def load_run_results(benchmark_dir: Path) -> dict:
"""
Load all run results from a benchmark directory.
Returns dict keyed by config name (e.g. "with_skill"/"without_skill",
or "new_skill"/"old_skill"), each containing a list of run results.
"""
# Support both layouts: eval dirs directly under benchmark_dir, or under runs/
runs_dir = benchmark_dir / "runs"
if runs_dir.exists():
search_dir = runs_dir
elif list(benchmark_dir.glob("eval-*")):
search_dir = benchmark_dir
else:
print(f"No eval directories found in {benchmark_dir} or {benchmark_dir / 'runs'}")
return {}
results: dict[str, list] = {}
for eval_idx, eval_dir in enumerate(sorted(search_dir.glob("eval-*"))):
metadata_path = eval_dir / "eval_metadata.json"
if metadata_path.exists():
try:
with open(metadata_path) as mf:
eval_id = json.load(mf).get("eval_id", eval_idx)
except (json.JSONDecodeError, OSError):
eval_id = eval_idx
else:
try:
eval_id = int(eval_dir.name.split("-")[1])
except ValueError:
eval_id = eval_idx
# Discover config directories dynamically rather than hardcoding names
for config_dir in sorted(eval_dir.iterdir()):
if not config_dir.is_dir():
continue
# Skip non-config directories (inputs, outputs, etc.)
if not list(config_dir.glob("run-*")):
continue
config = config_dir.name
if config not in results:
results[config] = []
for run_dir in sorted(config_dir.glob("run-*")):
run_number = int(run_dir.name.split("-")[1])
grading_file = run_dir / "grading.json"
if not grading_file.exists():
print(f"Warning: grading.json not found in {run_dir}")
continue
try:
with open(grading_file) as f:
grading = json.load(f)
except json.JSONDecodeError as e:
print(f"Warning: Invalid JSON in {grading_file}: {e}")
continue
# Extract metrics
result = {
"eval_id": eval_id,
"run_number": run_number,
"pass_rate": grading.get("summary", {}).get("pass_rate", 0.0),
"passed": grading.get("summary", {}).get("passed", 0),
"failed": grading.get("summary", {}).get("failed", 0),
"total": grading.get("summary", {}).get("total", 0),
}
                # Extract timing: grading.json first; timing.json supplies
                # total_tokens and a fallback duration
                timing = grading.get("timing", {})
                result["time_seconds"] = timing.get("total_duration_seconds", 0.0)
                timing_file = run_dir / "timing.json"
                if timing_file.exists():
                    try:
                        with open(timing_file) as tf:
                            timing_data = json.load(tf)
                        if result["time_seconds"] == 0.0:
                            result["time_seconds"] = timing_data.get("total_duration_seconds", 0.0)
                        result["tokens"] = timing_data.get("total_tokens", 0)
                    except json.JSONDecodeError:
                        pass
# Extract metrics if available
metrics = grading.get("execution_metrics", {})
result["tool_calls"] = metrics.get("total_tool_calls", 0)
if not result.get("tokens"):
result["tokens"] = metrics.get("output_chars", 0)
result["errors"] = metrics.get("errors_encountered", 0)
# Extract expectations — viewer requires fields: text, passed, evidence
raw_expectations = grading.get("expectations", [])
for exp in raw_expectations:
if "text" not in exp or "passed" not in exp:
print(f"Warning: expectation in {grading_file} missing required fields (text, passed, evidence): {exp}")
result["expectations"] = raw_expectations
# Extract notes from user_notes_summary
notes_summary = grading.get("user_notes_summary", {})
notes = []
notes.extend(notes_summary.get("uncertainties", []))
notes.extend(notes_summary.get("needs_review", []))
notes.extend(notes_summary.get("workarounds", []))
result["notes"] = notes
results[config].append(result)
return results
def aggregate_results(results: dict) -> dict:
"""
Aggregate run results into summary statistics.
Returns run_summary with stats for each configuration and delta.
"""
run_summary = {}
configs = list(results.keys())
for config in configs:
runs = results.get(config, [])
if not runs:
run_summary[config] = {
"pass_rate": {"mean": 0.0, "stddev": 0.0, "min": 0.0, "max": 0.0},
"time_seconds": {"mean": 0.0, "stddev": 0.0, "min": 0.0, "max": 0.0},
"tokens": {"mean": 0, "stddev": 0, "min": 0, "max": 0}
}
continue
pass_rates = [r["pass_rate"] for r in runs]
times = [r["time_seconds"] for r in runs]
tokens = [r.get("tokens", 0) for r in runs]
run_summary[config] = {
"pass_rate": calculate_stats(pass_rates),
"time_seconds": calculate_stats(times),
"tokens": calculate_stats(tokens)
}
# Calculate delta between the first two configs (if two exist)
if len(configs) >= 2:
primary = run_summary.get(configs[0], {})
baseline = run_summary.get(configs[1], {})
else:
primary = run_summary.get(configs[0], {}) if configs else {}
baseline = {}
delta_pass_rate = primary.get("pass_rate", {}).get("mean", 0) - baseline.get("pass_rate", {}).get("mean", 0)
delta_time = primary.get("time_seconds", {}).get("mean", 0) - baseline.get("time_seconds", {}).get("mean", 0)
delta_tokens = primary.get("tokens", {}).get("mean", 0) - baseline.get("tokens", {}).get("mean", 0)
run_summary["delta"] = {
"pass_rate": f"{delta_pass_rate:+.2f}",
"time_seconds": f"{delta_time:+.1f}",
"tokens": f"{delta_tokens:+.0f}"
}
return run_summary
def generate_benchmark(benchmark_dir: Path, skill_name: str = "", skill_path: str = "") -> dict:
"""
Generate complete benchmark.json from run results.
"""
results = load_run_results(benchmark_dir)
run_summary = aggregate_results(results)
# Build runs array for benchmark.json
runs = []
for config in results:
for result in results[config]:
runs.append({
"eval_id": result["eval_id"],
"configuration": config,
"run_number": result["run_number"],
"result": {
"pass_rate": result["pass_rate"],
"passed": result["passed"],
"failed": result["failed"],
"total": result["total"],
"time_seconds": result["time_seconds"],
"tokens": result.get("tokens", 0),
"tool_calls": result.get("tool_calls", 0),
"errors": result.get("errors", 0)
},
"expectations": result["expectations"],
"notes": result["notes"]
})
# Determine eval IDs from results
eval_ids = sorted(set(
r["eval_id"]
for config in results.values()
for r in config
))
benchmark = {
"metadata": {
"skill_name": skill_name or "<skill-name>",
"skill_path": skill_path or "<path/to/skill>",
"executor_model": "<model-name>",
"analyzer_model": "<model-name>",
"timestamp": datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
"evals_run": eval_ids,
"runs_per_configuration": 3
},
"runs": runs,
"run_summary": run_summary,
"notes": [] # To be filled by analyzer
}
return benchmark
def generate_markdown(benchmark: dict) -> str:
"""Generate human-readable benchmark.md from benchmark data."""
metadata = benchmark["metadata"]
run_summary = benchmark["run_summary"]
# Determine config names (excluding "delta")
configs = [k for k in run_summary if k != "delta"]
config_a = configs[0] if len(configs) >= 1 else "config_a"
config_b = configs[1] if len(configs) >= 2 else "config_b"
label_a = config_a.replace("_", " ").title()
label_b = config_b.replace("_", " ").title()
lines = [
f"# Skill Benchmark: {metadata['skill_name']}",
"",
f"**Model**: {metadata['executor_model']}",
f"**Date**: {metadata['timestamp']}",
f"**Evals**: {', '.join(map(str, metadata['evals_run']))} ({metadata['runs_per_configuration']} runs each per configuration)",
"",
"## Summary",
"",
f"| Metric | {label_a} | {label_b} | Delta |",
"|--------|------------|---------------|-------|",
]
a_summary = run_summary.get(config_a, {})
b_summary = run_summary.get(config_b, {})
delta = run_summary.get("delta", {})
# Format pass rate
a_pr = a_summary.get("pass_rate", {})
b_pr = b_summary.get("pass_rate", {})
lines.append(f"| Pass Rate | {a_pr.get('mean', 0)*100:.0f}% ± {a_pr.get('stddev', 0)*100:.0f}% | {b_pr.get('mean', 0)*100:.0f}% ± {b_pr.get('stddev', 0)*100:.0f}% | {delta.get('pass_rate', '')} |")
# Format time
a_time = a_summary.get("time_seconds", {})
b_time = b_summary.get("time_seconds", {})
lines.append(f"| Time | {a_time.get('mean', 0):.1f}s ± {a_time.get('stddev', 0):.1f}s | {b_time.get('mean', 0):.1f}s ± {b_time.get('stddev', 0):.1f}s | {delta.get('time_seconds', '')}s |")
# Format tokens
a_tokens = a_summary.get("tokens", {})
b_tokens = b_summary.get("tokens", {})
lines.append(f"| Tokens | {a_tokens.get('mean', 0):.0f} ± {a_tokens.get('stddev', 0):.0f} | {b_tokens.get('mean', 0):.0f} ± {b_tokens.get('stddev', 0):.0f} | {delta.get('tokens', '')} |")
# Notes section
if benchmark.get("notes"):
lines.extend([
"",
"## Notes",
""
])
for note in benchmark["notes"]:
lines.append(f"- {note}")
return "\n".join(lines)
def main():
parser = argparse.ArgumentParser(
description="Aggregate benchmark run results into summary statistics"
)
parser.add_argument(
"benchmark_dir",
type=Path,
help="Path to the benchmark directory"
)
parser.add_argument(
"--skill-name",
default="",
help="Name of the skill being benchmarked"
)
parser.add_argument(
"--skill-path",
default="",
help="Path to the skill being benchmarked"
)
parser.add_argument(
"--output", "-o",
type=Path,
help="Output path for benchmark.json (default: <benchmark_dir>/benchmark.json)"
)
args = parser.parse_args()
if not args.benchmark_dir.exists():
print(f"Directory not found: {args.benchmark_dir}")
sys.exit(1)
# Generate benchmark
benchmark = generate_benchmark(args.benchmark_dir, args.skill_name, args.skill_path)
# Determine output paths
output_json = args.output or (args.benchmark_dir / "benchmark.json")
output_md = output_json.with_suffix(".md")
# Write benchmark.json
with open(output_json, "w") as f:
json.dump(benchmark, f, indent=2)
print(f"Generated: {output_json}")
# Write benchmark.md
markdown = generate_markdown(benchmark)
with open(output_md, "w") as f:
f.write(markdown)
print(f"Generated: {output_md}")
# Print summary
run_summary = benchmark["run_summary"]
configs = [k for k in run_summary if k != "delta"]
delta = run_summary.get("delta", {})
print(f"\nSummary:")
for config in configs:
pr = run_summary[config]["pass_rate"]["mean"]
label = config.replace("_", " ").title()
print(f" {label}: {pr*100:.1f}% pass rate")
print(f" Delta: {delta.get('pass_rate', '')}")
if __name__ == "__main__":
main()


@ -0,0 +1,326 @@
#!/usr/bin/env python3
"""Generate an HTML report from run_loop.py output.
Takes the JSON output from run_loop.py and generates a visual HTML report
showing each description attempt with check/x for each test case.
Distinguishes between train and test queries.
"""
import argparse
import html
import json
import sys
from pathlib import Path
def generate_html(data: dict, auto_refresh: bool = False, skill_name: str = "") -> str:
"""Generate HTML report from loop output data. If auto_refresh is True, adds a meta refresh tag."""
history = data.get("history", [])
holdout = data.get("holdout", 0)
title_prefix = html.escape(skill_name + " \u2014 ") if skill_name else ""
# Get all unique queries from train and test sets, with should_trigger info
train_queries: list[dict] = []
test_queries: list[dict] = []
if history:
for r in history[0].get("train_results", history[0].get("results", [])):
train_queries.append({"query": r["query"], "should_trigger": r.get("should_trigger", True)})
if history[0].get("test_results"):
for r in history[0].get("test_results", []):
test_queries.append({"query": r["query"], "should_trigger": r.get("should_trigger", True)})
refresh_tag = ' <meta http-equiv="refresh" content="5">\n' if auto_refresh else ""
html_parts = ["""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
""" + refresh_tag + """ <title>""" + title_prefix + """Skill Description Optimization</title>
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link href="https://fonts.googleapis.com/css2?family=Poppins:wght@500;600&family=Lora:wght@400;500&display=swap" rel="stylesheet">
<style>
body {
font-family: 'Lora', Georgia, serif;
max-width: 100%;
margin: 0 auto;
padding: 20px;
background: #faf9f5;
color: #141413;
}
h1 { font-family: 'Poppins', sans-serif; color: #141413; }
.explainer {
background: white;
padding: 15px;
border-radius: 6px;
margin-bottom: 20px;
border: 1px solid #e8e6dc;
color: #b0aea5;
font-size: 0.875rem;
line-height: 1.6;
}
.summary {
background: white;
padding: 15px;
border-radius: 6px;
margin-bottom: 20px;
border: 1px solid #e8e6dc;
}
.summary p { margin: 5px 0; }
.best { color: #788c5d; font-weight: bold; }
.table-container {
overflow-x: auto;
width: 100%;
}
table {
border-collapse: collapse;
background: white;
border: 1px solid #e8e6dc;
border-radius: 6px;
font-size: 12px;
min-width: 100%;
}
th, td {
padding: 8px;
text-align: left;
border: 1px solid #e8e6dc;
white-space: normal;
word-wrap: break-word;
}
th {
font-family: 'Poppins', sans-serif;
background: #141413;
color: #faf9f5;
font-weight: 500;
}
th.test-col {
background: #6a9bcc;
}
th.query-col { min-width: 200px; }
td.description {
font-family: monospace;
font-size: 11px;
word-wrap: break-word;
max-width: 400px;
}
td.result {
text-align: center;
font-size: 16px;
min-width: 40px;
}
td.test-result {
background: #f0f6fc;
}
.pass { color: #788c5d; }
.fail { color: #c44; }
.rate {
font-size: 9px;
color: #b0aea5;
display: block;
}
tr:hover { background: #faf9f5; }
.score {
display: inline-block;
padding: 2px 6px;
border-radius: 4px;
font-weight: bold;
font-size: 11px;
}
.score-good { background: #eef2e8; color: #788c5d; }
.score-ok { background: #fef3c7; color: #d97706; }
.score-bad { background: #fceaea; color: #c44; }
.train-label { color: #b0aea5; font-size: 10px; }
.test-label { color: #6a9bcc; font-size: 10px; font-weight: bold; }
.best-row { background: #f5f8f2; }
th.positive-col { border-bottom: 3px solid #788c5d; }
th.negative-col { border-bottom: 3px solid #c44; }
th.test-col.positive-col { border-bottom: 3px solid #788c5d; }
th.test-col.negative-col { border-bottom: 3px solid #c44; }
.legend { font-family: 'Poppins', sans-serif; display: flex; gap: 20px; margin-bottom: 10px; font-size: 13px; align-items: center; }
.legend-item { display: flex; align-items: center; gap: 6px; }
.legend-swatch { width: 16px; height: 16px; border-radius: 3px; display: inline-block; }
.swatch-positive { background: #141413; border-bottom: 3px solid #788c5d; }
.swatch-negative { background: #141413; border-bottom: 3px solid #c44; }
.swatch-test { background: #6a9bcc; }
.swatch-train { background: #141413; }
</style>
</head>
<body>
<h1>""" + title_prefix + """Skill Description Optimization</h1>
<div class="explainer">
<strong>Optimizing your skill's description.</strong> This page updates automatically as Claude tests different versions of your skill's description. Each row is an iteration: a new description attempt. The columns show test queries: green checkmarks mean the skill triggered correctly (or correctly didn't trigger), red crosses mean it got it wrong. The "Train" score shows performance on queries used to improve the description; the "Test" score shows performance on held-out queries the optimizer hasn't seen. When it's done, Claude will apply the best-performing description to your skill.
</div>
"""]
# Summary section
best_test_score = data.get('best_test_score')
html_parts.append(f"""
<div class="summary">
<p><strong>Original:</strong> {html.escape(data.get('original_description', 'N/A'))}</p>
<p class="best"><strong>Best:</strong> {html.escape(data.get('best_description', 'N/A'))}</p>
<p><strong>Best Score:</strong> {data.get('best_score', 'N/A')} {'(test)' if best_test_score else '(train)'}</p>
<p><strong>Iterations:</strong> {data.get('iterations_run', 0)} | <strong>Train:</strong> {data.get('train_size', '?')} | <strong>Test:</strong> {data.get('test_size', '?')}</p>
</div>
""")
# Legend
html_parts.append("""
<div class="legend">
<span style="font-weight:600">Query columns:</span>
<span class="legend-item"><span class="legend-swatch swatch-positive"></span> Should trigger</span>
<span class="legend-item"><span class="legend-swatch swatch-negative"></span> Should NOT trigger</span>
<span class="legend-item"><span class="legend-swatch swatch-train"></span> Train</span>
<span class="legend-item"><span class="legend-swatch swatch-test"></span> Test</span>
</div>
""")
# Table header
html_parts.append("""
<div class="table-container">
<table>
<thead>
<tr>
<th>Iter</th>
<th>Train</th>
<th>Test</th>
<th class="query-col">Description</th>
""")
# Add column headers for train queries
for qinfo in train_queries:
polarity = "positive-col" if qinfo["should_trigger"] else "negative-col"
html_parts.append(f' <th class="{polarity}">{html.escape(qinfo["query"])}</th>\n')
# Add column headers for test queries (different color)
for qinfo in test_queries:
polarity = "positive-col" if qinfo["should_trigger"] else "negative-col"
html_parts.append(f' <th class="test-col {polarity}">{html.escape(qinfo["query"])}</th>\n')
html_parts.append(""" </tr>
</thead>
<tbody>
""")
    # Find best iteration for highlighting (history may be empty on first load)
    best_iter = None
    if history:
        if test_queries:
            best_iter = max(history, key=lambda h: h.get("test_passed") or 0).get("iteration")
        else:
            best_iter = max(history, key=lambda h: h.get("train_passed", h.get("passed", 0))).get("iteration")
# Add rows for each iteration
for h in history:
iteration = h.get("iteration", "?")
train_passed = h.get("train_passed", h.get("passed", 0))
train_total = h.get("train_total", h.get("total", 0))
test_passed = h.get("test_passed")
test_total = h.get("test_total")
description = h.get("description", "")
train_results = h.get("train_results", h.get("results", []))
test_results = h.get("test_results", [])
# Create lookups for results by query
train_by_query = {r["query"]: r for r in train_results}
test_by_query = {r["query"]: r for r in test_results} if test_results else {}
# Compute aggregate correct/total runs across all retries
def aggregate_runs(results: list[dict]) -> tuple[int, int]:
correct = 0
total = 0
for r in results:
runs = r.get("runs", 0)
triggers = r.get("triggers", 0)
total += runs
if r.get("should_trigger", True):
correct += triggers
else:
correct += runs - triggers
return correct, total
train_correct, train_runs = aggregate_runs(train_results)
test_correct, test_runs = aggregate_runs(test_results)
# Determine score classes
def score_class(correct: int, total: int) -> str:
if total > 0:
ratio = correct / total
if ratio >= 0.8:
return "score-good"
elif ratio >= 0.5:
return "score-ok"
return "score-bad"
train_class = score_class(train_correct, train_runs)
test_class = score_class(test_correct, test_runs)
row_class = "best-row" if iteration == best_iter else ""
html_parts.append(f""" <tr class="{row_class}">
<td>{iteration}</td>
<td><span class="score {train_class}">{train_correct}/{train_runs}</span></td>
<td><span class="score {test_class}">{test_correct}/{test_runs}</span></td>
<td class="description">{html.escape(description)}</td>
""")
# Add result for each train query
for qinfo in train_queries:
r = train_by_query.get(qinfo["query"], {})
did_pass = r.get("pass", False)
triggers = r.get("triggers", 0)
runs = r.get("runs", 0)
icon = "" if did_pass else ""
css_class = "pass" if did_pass else "fail"
html_parts.append(f' <td class="result {css_class}">{icon}<span class="rate">{triggers}/{runs}</span></td>\n')
# Add result for each test query (with different background)
for qinfo in test_queries:
r = test_by_query.get(qinfo["query"], {})
did_pass = r.get("pass", False)
triggers = r.get("triggers", 0)
runs = r.get("runs", 0)
icon = "" if did_pass else ""
css_class = "pass" if did_pass else "fail"
html_parts.append(f' <td class="result test-result {css_class}">{icon}<span class="rate">{triggers}/{runs}</span></td>\n')
html_parts.append(" </tr>\n")
html_parts.append(""" </tbody>
</table>
</div>
""")
html_parts.append("""
</body>
</html>
""")
return "".join(html_parts)
def main():
parser = argparse.ArgumentParser(description="Generate HTML report from run_loop output")
parser.add_argument("input", help="Path to JSON output from run_loop.py (or - for stdin)")
parser.add_argument("-o", "--output", default=None, help="Output HTML file (default: stdout)")
parser.add_argument("--skill-name", default="", help="Skill name to include in the report title")
args = parser.parse_args()
if args.input == "-":
data = json.load(sys.stdin)
else:
data = json.loads(Path(args.input).read_text())
html_output = generate_html(data, skill_name=args.skill_name)
if args.output:
Path(args.output).write_text(html_output)
print(f"Report written to {args.output}", file=sys.stderr)
else:
print(html_output)
if __name__ == "__main__":
main()


@ -0,0 +1,247 @@
#!/usr/bin/env python3
"""Improve a skill description based on eval results.
Takes eval results (from run_eval.py) and generates an improved description
by calling `claude -p` as a subprocess (same auth pattern as run_eval.py:
uses the session's Claude Code auth, no separate ANTHROPIC_API_KEY needed).
"""
import argparse
import json
import os
import re
import subprocess
import sys
from pathlib import Path
from scripts.utils import parse_skill_md
def _call_claude(prompt: str, model: str | None, timeout: int = 300) -> str:
"""Run `claude -p` with the prompt on stdin and return the text response.
Prompt goes over stdin (not argv) because it embeds the full SKILL.md
body and can easily exceed comfortable argv length.
"""
cmd = ["claude", "-p", "--output-format", "text"]
if model:
cmd.extend(["--model", model])
# Remove CLAUDECODE env var to allow nesting claude -p inside a
# Claude Code session. The guard is for interactive terminal conflicts;
# programmatic subprocess usage is safe. Same pattern as run_eval.py.
env = {k: v for k, v in os.environ.items() if k != "CLAUDECODE"}
result = subprocess.run(
cmd,
input=prompt,
capture_output=True,
text=True,
env=env,
timeout=timeout,
)
if result.returncode != 0:
raise RuntimeError(
f"claude -p exited {result.returncode}\nstderr: {result.stderr}"
)
return result.stdout
def improve_description(
skill_name: str,
skill_content: str,
current_description: str,
eval_results: dict,
history: list[dict],
model: str,
test_results: dict | None = None,
log_dir: Path | None = None,
iteration: int | None = None,
) -> str:
"""Call Claude to improve the description based on eval results."""
failed_triggers = [
r for r in eval_results["results"]
if r["should_trigger"] and not r["pass"]
]
false_triggers = [
r for r in eval_results["results"]
if not r["should_trigger"] and not r["pass"]
]
# Build scores summary
train_score = f"{eval_results['summary']['passed']}/{eval_results['summary']['total']}"
if test_results:
test_score = f"{test_results['summary']['passed']}/{test_results['summary']['total']}"
scores_summary = f"Train: {train_score}, Test: {test_score}"
else:
scores_summary = f"Train: {train_score}"
prompt = f"""You are optimizing a skill description for a Claude Code skill called "{skill_name}". A "skill" is sort of like a prompt, but with progressive disclosure -- there's a title and description that Claude sees when deciding whether to use the skill, and then if it does use the skill, it reads the .md file which has lots more details and potentially links to other resources in the skill folder like helper files and scripts and additional documentation or examples.
The description appears in Claude's "available_skills" list. When a user sends a query, Claude decides whether to invoke the skill based solely on the title and on this description. Your goal is to write a description that triggers for relevant queries, and doesn't trigger for irrelevant ones.
Here's the current description:
<current_description>
"{current_description}"
</current_description>
Current scores ({scores_summary}):
<scores_summary>
"""
if failed_triggers:
prompt += "FAILED TO TRIGGER (should have triggered but didn't):\n"
for r in failed_triggers:
prompt += f' - "{r["query"]}" (triggered {r["triggers"]}/{r["runs"]} times)\n'
prompt += "\n"
if false_triggers:
prompt += "FALSE TRIGGERS (triggered but shouldn't have):\n"
for r in false_triggers:
prompt += f' - "{r["query"]}" (triggered {r["triggers"]}/{r["runs"]} times)\n'
prompt += "\n"
if history:
prompt += "PREVIOUS ATTEMPTS (do NOT repeat these — try something structurally different):\n\n"
for h in history:
train_s = f"{h.get('train_passed', h.get('passed', 0))}/{h.get('train_total', h.get('total', 0))}"
test_s = f"{h.get('test_passed', '?')}/{h.get('test_total', '?')}" if h.get('test_passed') is not None else None
score_str = f"train={train_s}" + (f", test={test_s}" if test_s else "")
prompt += f'<attempt {score_str}>\n'
prompt += f'Description: "{h["description"]}"\n'
if "results" in h:
prompt += "Train results:\n"
for r in h["results"]:
status = "PASS" if r["pass"] else "FAIL"
prompt += f' [{status}] "{r["query"][:80]}" (triggered {r["triggers"]}/{r["runs"]})\n'
if h.get("note"):
prompt += f'Note: {h["note"]}\n'
prompt += "</attempt>\n\n"
prompt += f"""</scores_summary>
Skill content (for context on what the skill does):
<skill_content>
{skill_content}
</skill_content>
Based on the failures, write a new and improved description that is more likely to trigger correctly. When I say "based on the failures", it's a bit of a tricky line to walk because we don't want to overfit to the specific cases you're seeing. So what I DON'T want you to do is produce an ever-expanding list of specific queries that this skill should or shouldn't trigger for. Instead, try to generalize from the failures to broader categories of user intent and situations where this skill would be useful or not useful. The reason for this is twofold:
1. Avoid overfitting
2. The list might get loooong and it's injected into ALL queries and there might be a lot of skills, so we don't want to blow too much space on any given description.
Concretely, your description should not be more than about 100-200 words, even if that comes at the cost of accuracy. There is a hard limit of 1024 characters; descriptions over that limit will be truncated, so stay comfortably under it.
Here are some tips that we've found to work well in writing these descriptions:
- The skill should be phrased in the imperative -- "Use this skill for" rather than "this skill does"
- The skill description should focus on the user's intent, what they are trying to achieve, vs. the implementation details of how the skill works.
- The description competes with other skills for Claude's attention — make it distinctive and immediately recognizable.
- If you're getting lots of failures after repeated attempts, change things up. Try different sentence structures or wordings.
I'd encourage you to be creative and mix up the style in different iterations since you'll have multiple opportunities to try different approaches and we'll just grab the highest-scoring one at the end.
Please respond with only the new description text in <new_description> tags, nothing else."""
text = _call_claude(prompt, model)
match = re.search(r"<new_description>(.*?)</new_description>", text, re.DOTALL)
description = match.group(1).strip().strip('"') if match else text.strip().strip('"')
transcript: dict = {
"iteration": iteration,
"prompt": prompt,
"response": text,
"parsed_description": description,
"char_count": len(description),
"over_limit": len(description) > 1024,
}
# Safety net: the prompt already states the 1024-char hard limit, but if
# the model blew past it anyway, make one fresh single-turn call that
# quotes the too-long version and asks for a shorter rewrite. (The old
# SDK path did this as a true multi-turn; `claude -p` is one-shot, so we
# inline the prior output into the new prompt instead.)
if len(description) > 1024:
shorten_prompt = (
f"{prompt}\n\n"
f"---\n\n"
f"A previous attempt produced this description, which at "
f"{len(description)} characters is over the 1024-character hard limit:\n\n"
f'"{description}"\n\n'
f"Rewrite it to be under 1024 characters while keeping the most "
f"important trigger words and intent coverage. Respond with only "
f"the new description in <new_description> tags."
)
shorten_text = _call_claude(shorten_prompt, model)
match = re.search(r"<new_description>(.*?)</new_description>", shorten_text, re.DOTALL)
shortened = match.group(1).strip().strip('"') if match else shorten_text.strip().strip('"')
transcript["rewrite_prompt"] = shorten_prompt
transcript["rewrite_response"] = shorten_text
transcript["rewrite_description"] = shortened
transcript["rewrite_char_count"] = len(shortened)
description = shortened
transcript["final_description"] = description
if log_dir:
log_dir.mkdir(parents=True, exist_ok=True)
log_file = log_dir / f"improve_iter_{iteration or 'unknown'}.json"
log_file.write_text(json.dumps(transcript, indent=2))
return description
def main():
parser = argparse.ArgumentParser(description="Improve a skill description based on eval results")
parser.add_argument("--eval-results", required=True, help="Path to eval results JSON (from run_eval.py)")
parser.add_argument("--skill-path", required=True, help="Path to skill directory")
parser.add_argument("--history", default=None, help="Path to history JSON (previous attempts)")
parser.add_argument("--model", required=True, help="Model for improvement")
parser.add_argument("--verbose", action="store_true", help="Print thinking to stderr")
args = parser.parse_args()
skill_path = Path(args.skill_path)
if not (skill_path / "SKILL.md").exists():
print(f"Error: No SKILL.md found at {skill_path}", file=sys.stderr)
sys.exit(1)
eval_results = json.loads(Path(args.eval_results).read_text())
history = []
if args.history:
history = json.loads(Path(args.history).read_text())
name, _, content = parse_skill_md(skill_path)
current_description = eval_results["description"]
if args.verbose:
print(f"Current: {current_description}", file=sys.stderr)
print(f"Score: {eval_results['summary']['passed']}/{eval_results['summary']['total']}", file=sys.stderr)
new_description = improve_description(
skill_name=name,
skill_content=content,
current_description=current_description,
eval_results=eval_results,
history=history,
model=args.model,
)
if args.verbose:
print(f"Improved: {new_description}", file=sys.stderr)
# Output as JSON with both the new description and updated history
output = {
"description": new_description,
"history": history + [{
"description": current_description,
"passed": eval_results["summary"]["passed"],
"failed": eval_results["summary"]["failed"],
"total": eval_results["summary"]["total"],
"results": eval_results["results"],
}],
}
print(json.dumps(output, indent=2))
if __name__ == "__main__":
main()


@ -0,0 +1,136 @@
#!/usr/bin/env python3
"""
Skill Packager - Creates a distributable .skill file of a skill folder
Usage:
python utils/package_skill.py <path/to/skill-folder> [output-directory]
Example:
python utils/package_skill.py skills/public/my-skill
python utils/package_skill.py skills/public/my-skill ./dist
"""
import fnmatch
import sys
import zipfile
from pathlib import Path
from scripts.quick_validate import validate_skill
# Patterns to exclude when packaging skills.
EXCLUDE_DIRS = {"__pycache__", "node_modules"}
EXCLUDE_GLOBS = {"*.pyc"}
EXCLUDE_FILES = {".DS_Store"}
# Directories excluded only at the skill root (not when nested deeper).
ROOT_EXCLUDE_DIRS = {"evals"}
def should_exclude(rel_path: Path) -> bool:
"""Check if a path should be excluded from packaging."""
parts = rel_path.parts
if any(part in EXCLUDE_DIRS for part in parts):
return True
# rel_path is relative to skill_path.parent, so parts[0] is the skill
# folder name and parts[1] (if present) is the first subdir.
if len(parts) > 1 and parts[1] in ROOT_EXCLUDE_DIRS:
return True
name = rel_path.name
if name in EXCLUDE_FILES:
return True
return any(fnmatch.fnmatch(name, pat) for pat in EXCLUDE_GLOBS)
def package_skill(skill_path, output_dir=None):
"""
Package a skill folder into a .skill file.
Args:
skill_path: Path to the skill folder
output_dir: Optional output directory for the .skill file (defaults to current directory)
Returns:
Path to the created .skill file, or None if error
"""
skill_path = Path(skill_path).resolve()
# Validate skill folder exists
if not skill_path.exists():
print(f"❌ Error: Skill folder not found: {skill_path}")
return None
if not skill_path.is_dir():
print(f"❌ Error: Path is not a directory: {skill_path}")
return None
# Validate SKILL.md exists
skill_md = skill_path / "SKILL.md"
if not skill_md.exists():
print(f"❌ Error: SKILL.md not found in {skill_path}")
return None
# Run validation before packaging
print("🔍 Validating skill...")
valid, message = validate_skill(skill_path)
if not valid:
print(f"❌ Validation failed: {message}")
print(" Please fix the validation errors before packaging.")
return None
print(f"{message}\n")
# Determine output location
skill_name = skill_path.name
if output_dir:
output_path = Path(output_dir).resolve()
output_path.mkdir(parents=True, exist_ok=True)
else:
output_path = Path.cwd()
skill_filename = output_path / f"{skill_name}.skill"
# Create the .skill file (zip format)
try:
with zipfile.ZipFile(skill_filename, 'w', zipfile.ZIP_DEFLATED) as zipf:
# Walk through the skill directory, excluding build artifacts
for file_path in skill_path.rglob('*'):
if not file_path.is_file():
continue
arcname = file_path.relative_to(skill_path.parent)
if should_exclude(arcname):
print(f" Skipped: {arcname}")
continue
zipf.write(file_path, arcname)
print(f" Added: {arcname}")
print(f"\n✅ Successfully packaged skill to: {skill_filename}")
return skill_filename
except Exception as e:
print(f"❌ Error creating .skill file: {e}")
return None
def main():
if len(sys.argv) < 2:
print("Usage: python utils/package_skill.py <path/to/skill-folder> [output-directory]")
print("\nExample:")
print(" python utils/package_skill.py skills/public/my-skill")
print(" python utils/package_skill.py skills/public/my-skill ./dist")
sys.exit(1)
skill_path = sys.argv[1]
output_dir = sys.argv[2] if len(sys.argv) > 2 else None
print(f"📦 Packaging skill: {skill_path}")
if output_dir:
print(f" Output directory: {output_dir}")
print()
result = package_skill(skill_path, output_dir)
if result:
sys.exit(0)
else:
sys.exit(1)
if __name__ == "__main__":
main()


@ -0,0 +1,103 @@
#!/usr/bin/env python3
"""
Quick validation script for skills - minimal version
"""
import sys
import re
import yaml
from pathlib import Path
def validate_skill(skill_path):
"""Basic validation of a skill"""
skill_path = Path(skill_path)
# Check SKILL.md exists
skill_md = skill_path / 'SKILL.md'
if not skill_md.exists():
return False, "SKILL.md not found"
# Read and validate frontmatter
content = skill_md.read_text()
if not content.startswith('---'):
return False, "No YAML frontmatter found"
# Extract frontmatter
match = re.match(r'^---\n(.*?)\n---', content, re.DOTALL)
if not match:
return False, "Invalid frontmatter format"
frontmatter_text = match.group(1)
# Parse YAML frontmatter
try:
frontmatter = yaml.safe_load(frontmatter_text)
if not isinstance(frontmatter, dict):
return False, "Frontmatter must be a YAML dictionary"
except yaml.YAMLError as e:
return False, f"Invalid YAML in frontmatter: {e}"
# Define allowed properties
ALLOWED_PROPERTIES = {'name', 'description', 'license', 'allowed-tools', 'metadata', 'compatibility'}
# Check for unexpected properties (excluding nested keys under metadata)
unexpected_keys = set(frontmatter.keys()) - ALLOWED_PROPERTIES
if unexpected_keys:
return False, (
f"Unexpected key(s) in SKILL.md frontmatter: {', '.join(sorted(unexpected_keys))}. "
f"Allowed properties are: {', '.join(sorted(ALLOWED_PROPERTIES))}"
)
# Check required fields
if 'name' not in frontmatter:
return False, "Missing 'name' in frontmatter"
if 'description' not in frontmatter:
return False, "Missing 'description' in frontmatter"
# Extract name for validation
name = frontmatter.get('name', '')
if not isinstance(name, str):
return False, f"Name must be a string, got {type(name).__name__}"
name = name.strip()
if name:
# Check naming convention (kebab-case: lowercase with hyphens)
if not re.match(r'^[a-z0-9-]+$', name):
return False, f"Name '{name}' should be kebab-case (lowercase letters, digits, and hyphens only)"
if name.startswith('-') or name.endswith('-') or '--' in name:
return False, f"Name '{name}' cannot start/end with hyphen or contain consecutive hyphens"
# Check name length (max 64 characters per spec)
if len(name) > 64:
return False, f"Name is too long ({len(name)} characters). Maximum is 64 characters."
# Extract and validate description
description = frontmatter.get('description', '')
if not isinstance(description, str):
return False, f"Description must be a string, got {type(description).__name__}"
description = description.strip()
if description:
# Check for angle brackets
if '<' in description or '>' in description:
return False, "Description cannot contain angle brackets (< or >)"
# Check description length (max 1024 characters per spec)
if len(description) > 1024:
return False, f"Description is too long ({len(description)} characters). Maximum is 1024 characters."
# Validate compatibility field if present (optional)
compatibility = frontmatter.get('compatibility', '')
if compatibility:
if not isinstance(compatibility, str):
return False, f"Compatibility must be a string, got {type(compatibility).__name__}"
if len(compatibility) > 500:
return False, f"Compatibility is too long ({len(compatibility)} characters). Maximum is 500 characters."
return True, "Skill is valid!"
if __name__ == "__main__":
if len(sys.argv) != 2:
print("Usage: python quick_validate.py <skill_directory>")
sys.exit(1)
valid, message = validate_skill(sys.argv[1])
print(message)
sys.exit(0 if valid else 1)

View File

@ -0,0 +1,310 @@
#!/usr/bin/env python3
"""Run trigger evaluation for a skill description.
Tests whether a skill's description causes Claude to trigger (read the skill)
for a set of queries. Outputs results as JSON.
"""
import argparse
import json
import os
import select
import subprocess
import sys
import time
import uuid
from concurrent.futures import ProcessPoolExecutor, as_completed
from pathlib import Path
from scripts.utils import parse_skill_md
def find_project_root() -> Path:
"""Find the project root by walking up from cwd looking for .claude/.
Mimics how Claude Code discovers its project root, so the command file
we create ends up where claude -p will look for it.
"""
current = Path.cwd()
for parent in [current, *current.parents]:
if (parent / ".claude").is_dir():
return parent
return current
def run_single_query(
query: str,
skill_name: str,
skill_description: str,
timeout: int,
project_root: str,
model: str | None = None,
) -> bool:
"""Run a single query and return whether the skill was triggered.
Creates a command file in .claude/commands/ so it appears in Claude's
available_skills list, then runs `claude -p` with the raw query.
Uses --include-partial-messages to detect triggering early from
stream events (content_block_start) rather than waiting for the
full assistant message, which only arrives after tool execution.
"""
unique_id = uuid.uuid4().hex[:8]
clean_name = f"{skill_name}-skill-{unique_id}"
project_commands_dir = Path(project_root) / ".claude" / "commands"
command_file = project_commands_dir / f"{clean_name}.md"
try:
project_commands_dir.mkdir(parents=True, exist_ok=True)
# Use YAML block scalar to avoid breaking on quotes in description
indented_desc = "\n ".join(skill_description.split("\n"))
command_content = (
f"---\n"
f"description: |\n"
f" {indented_desc}\n"
f"---\n\n"
f"# {skill_name}\n\n"
f"This skill handles: {skill_description}\n"
)
command_file.write_text(command_content)
cmd = [
"claude",
"-p", query,
"--output-format", "stream-json",
"--verbose",
"--include-partial-messages",
]
if model:
cmd.extend(["--model", model])
# Remove CLAUDECODE env var to allow nesting claude -p inside a
# Claude Code session. The guard is for interactive terminal conflicts;
# programmatic subprocess usage is safe.
env = {k: v for k, v in os.environ.items() if k != "CLAUDECODE"}
process = subprocess.Popen(
cmd,
stdout=subprocess.PIPE,
stderr=subprocess.DEVNULL,
cwd=project_root,
env=env,
)
triggered = False
start_time = time.time()
buffer = ""
# Track state for stream event detection
pending_tool_name = None
accumulated_json = ""
try:
while time.time() - start_time < timeout:
if process.poll() is not None:
remaining = process.stdout.read()
if remaining:
buffer += remaining.decode("utf-8", errors="replace")
break
ready, _, _ = select.select([process.stdout], [], [], 1.0)
if not ready:
continue
chunk = os.read(process.stdout.fileno(), 8192)
if not chunk:
break
buffer += chunk.decode("utf-8", errors="replace")
while "\n" in buffer:
line, buffer = buffer.split("\n", 1)
line = line.strip()
if not line:
continue
try:
event = json.loads(line)
except json.JSONDecodeError:
continue
# Early detection via stream events
if event.get("type") == "stream_event":
se = event.get("event", {})
se_type = se.get("type", "")
if se_type == "content_block_start":
cb = se.get("content_block", {})
if cb.get("type") == "tool_use":
tool_name = cb.get("name", "")
if tool_name in ("Skill", "Read"):
pending_tool_name = tool_name
accumulated_json = ""
else:
return False
elif se_type == "content_block_delta" and pending_tool_name:
delta = se.get("delta", {})
if delta.get("type") == "input_json_delta":
accumulated_json += delta.get("partial_json", "")
if clean_name in accumulated_json:
return True
elif se_type in ("content_block_stop", "message_stop"):
if pending_tool_name:
return clean_name in accumulated_json
if se_type == "message_stop":
return False
# Fallback: full assistant message
elif event.get("type") == "assistant":
message = event.get("message", {})
for content_item in message.get("content", []):
if content_item.get("type") != "tool_use":
continue
tool_name = content_item.get("name", "")
tool_input = content_item.get("input", {})
if tool_name == "Skill" and clean_name in tool_input.get("skill", ""):
triggered = True
elif tool_name == "Read" and clean_name in tool_input.get("file_path", ""):
triggered = True
return triggered
elif event.get("type") == "result":
return triggered
finally:
# Clean up process on any exit path (return, exception, timeout)
if process.poll() is None:
process.kill()
process.wait()
return triggered
finally:
if command_file.exists():
command_file.unlink()
def run_eval(
eval_set: list[dict],
skill_name: str,
description: str,
num_workers: int,
timeout: int,
project_root: Path,
runs_per_query: int = 1,
trigger_threshold: float = 0.5,
model: str | None = None,
) -> dict:
"""Run the full eval set and return results."""
results = []
with ProcessPoolExecutor(max_workers=num_workers) as executor:
future_to_info = {}
for item in eval_set:
for run_idx in range(runs_per_query):
future = executor.submit(
run_single_query,
item["query"],
skill_name,
description,
timeout,
str(project_root),
model,
)
future_to_info[future] = (item, run_idx)
query_triggers: dict[str, list[bool]] = {}
query_items: dict[str, dict] = {}
for future in as_completed(future_to_info):
item, _ = future_to_info[future]
query = item["query"]
query_items[query] = item
if query not in query_triggers:
query_triggers[query] = []
try:
query_triggers[query].append(future.result())
except Exception as e:
print(f"Warning: query failed: {e}", file=sys.stderr)
query_triggers[query].append(False)
for query, triggers in query_triggers.items():
item = query_items[query]
trigger_rate = sum(triggers) / len(triggers)
should_trigger = item["should_trigger"]
if should_trigger:
did_pass = trigger_rate >= trigger_threshold
else:
did_pass = trigger_rate < trigger_threshold
results.append({
"query": query,
"should_trigger": should_trigger,
"trigger_rate": trigger_rate,
"triggers": sum(triggers),
"runs": len(triggers),
"pass": did_pass,
})
passed = sum(1 for r in results if r["pass"])
total = len(results)
return {
"skill_name": skill_name,
"description": description,
"results": results,
"summary": {
"total": total,
"passed": passed,
"failed": total - passed,
},
}
def main():
parser = argparse.ArgumentParser(description="Run trigger evaluation for a skill description")
parser.add_argument("--eval-set", required=True, help="Path to eval set JSON file")
parser.add_argument("--skill-path", required=True, help="Path to skill directory")
parser.add_argument("--description", default=None, help="Override description to test")
parser.add_argument("--num-workers", type=int, default=10, help="Number of parallel workers")
parser.add_argument("--timeout", type=int, default=30, help="Timeout per query in seconds")
parser.add_argument("--runs-per-query", type=int, default=3, help="Number of runs per query")
parser.add_argument("--trigger-threshold", type=float, default=0.5, help="Trigger rate threshold")
parser.add_argument("--model", default=None, help="Model to use for claude -p (default: user's configured model)")
parser.add_argument("--verbose", action="store_true", help="Print progress to stderr")
args = parser.parse_args()
eval_set = json.loads(Path(args.eval_set).read_text())
skill_path = Path(args.skill_path)
if not (skill_path / "SKILL.md").exists():
print(f"Error: No SKILL.md found at {skill_path}", file=sys.stderr)
sys.exit(1)
name, original_description, content = parse_skill_md(skill_path)
description = args.description or original_description
project_root = find_project_root()
if args.verbose:
print(f"Evaluating: {description}", file=sys.stderr)
output = run_eval(
eval_set=eval_set,
skill_name=name,
description=description,
num_workers=args.num_workers,
timeout=args.timeout,
project_root=project_root,
runs_per_query=args.runs_per_query,
trigger_threshold=args.trigger_threshold,
model=args.model,
)
if args.verbose:
summary = output["summary"]
print(f"Results: {summary['passed']}/{summary['total']} passed", file=sys.stderr)
for r in output["results"]:
status = "PASS" if r["pass"] else "FAIL"
rate_str = f"{r['triggers']}/{r['runs']}"
print(f" [{status}] rate={rate_str} expected={r['should_trigger']}: {r['query'][:70]}", file=sys.stderr)
print(json.dumps(output, indent=2))
if __name__ == "__main__":
main()

View File

@ -0,0 +1,328 @@
#!/usr/bin/env python3
"""Run the eval + improve loop until all pass or max iterations reached.
Combines run_eval.py and improve_description.py in a loop, tracking history
and returning the best description found. Supports train/test split to prevent
overfitting.
"""
import argparse
import json
import random
import sys
import tempfile
import time
import webbrowser
from pathlib import Path
from scripts.generate_report import generate_html
from scripts.improve_description import improve_description
from scripts.run_eval import find_project_root, run_eval
from scripts.utils import parse_skill_md
def split_eval_set(eval_set: list[dict], holdout: float, seed: int = 42) -> tuple[list[dict], list[dict]]:
"""Split eval set into train and test sets, stratified by should_trigger."""
random.seed(seed)
# Separate by should_trigger
trigger = [e for e in eval_set if e["should_trigger"]]
no_trigger = [e for e in eval_set if not e["should_trigger"]]
# Shuffle each group
random.shuffle(trigger)
random.shuffle(no_trigger)
# Calculate split points
n_trigger_test = max(1, int(len(trigger) * holdout))
n_no_trigger_test = max(1, int(len(no_trigger) * holdout))
# Split
test_set = trigger[:n_trigger_test] + no_trigger[:n_no_trigger_test]
train_set = trigger[n_trigger_test:] + no_trigger[n_no_trigger_test:]
return train_set, test_set
def run_loop(
eval_set: list[dict],
skill_path: Path,
description_override: str | None,
num_workers: int,
timeout: int,
max_iterations: int,
runs_per_query: int,
trigger_threshold: float,
holdout: float,
model: str,
verbose: bool,
live_report_path: Path | None = None,
log_dir: Path | None = None,
) -> dict:
"""Run the eval + improvement loop."""
project_root = find_project_root()
name, original_description, content = parse_skill_md(skill_path)
current_description = description_override or original_description
# Split into train/test if holdout > 0
if holdout > 0:
train_set, test_set = split_eval_set(eval_set, holdout)
if verbose:
print(f"Split: {len(train_set)} train, {len(test_set)} test (holdout={holdout})", file=sys.stderr)
else:
train_set = eval_set
test_set = []
history = []
exit_reason = "unknown"
for iteration in range(1, max_iterations + 1):
if verbose:
print(f"\n{'='*60}", file=sys.stderr)
print(f"Iteration {iteration}/{max_iterations}", file=sys.stderr)
print(f"Description: {current_description}", file=sys.stderr)
print(f"{'='*60}", file=sys.stderr)
# Evaluate train + test together in one batch for parallelism
all_queries = train_set + test_set
t0 = time.time()
all_results = run_eval(
eval_set=all_queries,
skill_name=name,
description=current_description,
num_workers=num_workers,
timeout=timeout,
project_root=project_root,
runs_per_query=runs_per_query,
trigger_threshold=trigger_threshold,
model=model,
)
eval_elapsed = time.time() - t0
# Split results back into train/test by matching queries
train_queries_set = {q["query"] for q in train_set}
train_result_list = [r for r in all_results["results"] if r["query"] in train_queries_set]
test_result_list = [r for r in all_results["results"] if r["query"] not in train_queries_set]
train_passed = sum(1 for r in train_result_list if r["pass"])
train_total = len(train_result_list)
train_summary = {"passed": train_passed, "failed": train_total - train_passed, "total": train_total}
train_results = {"results": train_result_list, "summary": train_summary}
if test_set:
test_passed = sum(1 for r in test_result_list if r["pass"])
test_total = len(test_result_list)
test_summary = {"passed": test_passed, "failed": test_total - test_passed, "total": test_total}
test_results = {"results": test_result_list, "summary": test_summary}
else:
test_results = None
test_summary = None
history.append({
"iteration": iteration,
"description": current_description,
"train_passed": train_summary["passed"],
"train_failed": train_summary["failed"],
"train_total": train_summary["total"],
"train_results": train_results["results"],
"test_passed": test_summary["passed"] if test_summary else None,
"test_failed": test_summary["failed"] if test_summary else None,
"test_total": test_summary["total"] if test_summary else None,
"test_results": test_results["results"] if test_results else None,
# For backward compat with report generator
"passed": train_summary["passed"],
"failed": train_summary["failed"],
"total": train_summary["total"],
"results": train_results["results"],
})
# Write live report if path provided
if live_report_path:
partial_output = {
"original_description": original_description,
"best_description": current_description,
"best_score": "in progress",
"iterations_run": len(history),
"holdout": holdout,
"train_size": len(train_set),
"test_size": len(test_set),
"history": history,
}
live_report_path.write_text(generate_html(partial_output, auto_refresh=True, skill_name=name))
if verbose:
def print_eval_stats(label, results, elapsed):
pos = [r for r in results if r["should_trigger"]]
neg = [r for r in results if not r["should_trigger"]]
tp = sum(r["triggers"] for r in pos)
pos_runs = sum(r["runs"] for r in pos)
fn = pos_runs - tp
fp = sum(r["triggers"] for r in neg)
neg_runs = sum(r["runs"] for r in neg)
tn = neg_runs - fp
total = tp + tn + fp + fn
precision = tp / (tp + fp) if (tp + fp) > 0 else 1.0
recall = tp / (tp + fn) if (tp + fn) > 0 else 1.0
accuracy = (tp + tn) / total if total > 0 else 0.0
print(f"{label}: {tp+tn}/{total} correct, precision={precision:.0%} recall={recall:.0%} accuracy={accuracy:.0%} ({elapsed:.1f}s)", file=sys.stderr)
for r in results:
status = "PASS" if r["pass"] else "FAIL"
rate_str = f"{r['triggers']}/{r['runs']}"
print(f" [{status}] rate={rate_str} expected={r['should_trigger']}: {r['query'][:60]}", file=sys.stderr)
print_eval_stats("Train", train_results["results"], eval_elapsed)
if test_summary:
print_eval_stats("Test ", test_results["results"], 0)
if train_summary["failed"] == 0:
exit_reason = f"all_passed (iteration {iteration})"
if verbose:
print(f"\nAll train queries passed on iteration {iteration}!", file=sys.stderr)
break
if iteration == max_iterations:
exit_reason = f"max_iterations ({max_iterations})"
if verbose:
print(f"\nMax iterations reached ({max_iterations}).", file=sys.stderr)
break
# Improve the description based on train results
if verbose:
print(f"\nImproving description...", file=sys.stderr)
t0 = time.time()
# Strip test scores from history so improvement model can't see them
blinded_history = [
{k: v for k, v in h.items() if not k.startswith("test_")}
for h in history
]
new_description = improve_description(
skill_name=name,
skill_content=content,
current_description=current_description,
eval_results=train_results,
history=blinded_history,
model=model,
log_dir=log_dir,
iteration=iteration,
)
improve_elapsed = time.time() - t0
if verbose:
print(f"Proposed ({improve_elapsed:.1f}s): {new_description}", file=sys.stderr)
current_description = new_description
# Find the best iteration by TEST score (or train if no test set)
if test_set:
best = max(history, key=lambda h: h["test_passed"] or 0)
best_score = f"{best['test_passed']}/{best['test_total']}"
else:
best = max(history, key=lambda h: h["train_passed"])
best_score = f"{best['train_passed']}/{best['train_total']}"
if verbose:
print(f"\nExit reason: {exit_reason}", file=sys.stderr)
print(f"Best score: {best_score} (iteration {best['iteration']})", file=sys.stderr)
return {
"exit_reason": exit_reason,
"original_description": original_description,
"best_description": best["description"],
"best_score": best_score,
"best_train_score": f"{best['train_passed']}/{best['train_total']}",
"best_test_score": f"{best['test_passed']}/{best['test_total']}" if test_set else None,
"final_description": current_description,
"iterations_run": len(history),
"holdout": holdout,
"train_size": len(train_set),
"test_size": len(test_set),
"history": history,
}
def main():
parser = argparse.ArgumentParser(description="Run eval + improve loop")
parser.add_argument("--eval-set", required=True, help="Path to eval set JSON file")
parser.add_argument("--skill-path", required=True, help="Path to skill directory")
parser.add_argument("--description", default=None, help="Override starting description")
parser.add_argument("--num-workers", type=int, default=10, help="Number of parallel workers")
parser.add_argument("--timeout", type=int, default=30, help="Timeout per query in seconds")
parser.add_argument("--max-iterations", type=int, default=5, help="Max improvement iterations")
parser.add_argument("--runs-per-query", type=int, default=3, help="Number of runs per query")
parser.add_argument("--trigger-threshold", type=float, default=0.5, help="Trigger rate threshold")
parser.add_argument("--holdout", type=float, default=0.4, help="Fraction of eval set to hold out for testing (0 to disable)")
parser.add_argument("--model", required=True, help="Model for improvement")
parser.add_argument("--verbose", action="store_true", help="Print progress to stderr")
parser.add_argument("--report", default="auto", help="Generate HTML report at this path (default: 'auto' for temp file, 'none' to disable)")
parser.add_argument("--results-dir", default=None, help="Save all outputs (results.json, report.html, log.txt) to a timestamped subdirectory here")
args = parser.parse_args()
eval_set = json.loads(Path(args.eval_set).read_text())
skill_path = Path(args.skill_path)
if not (skill_path / "SKILL.md").exists():
print(f"Error: No SKILL.md found at {skill_path}", file=sys.stderr)
sys.exit(1)
name, _, _ = parse_skill_md(skill_path)
# Set up live report path
if args.report != "none":
if args.report == "auto":
timestamp = time.strftime("%Y%m%d_%H%M%S")
live_report_path = Path(tempfile.gettempdir()) / f"skill_description_report_{skill_path.name}_{timestamp}.html"
else:
live_report_path = Path(args.report)
# Open the report immediately so the user can watch
live_report_path.write_text("<html><body><h1>Starting optimization loop...</h1><meta http-equiv='refresh' content='5'></body></html>")
webbrowser.open(str(live_report_path))
else:
live_report_path = None
# Determine output directory (create before run_loop so logs can be written)
if args.results_dir:
timestamp = time.strftime("%Y-%m-%d_%H%M%S")
results_dir = Path(args.results_dir) / timestamp
results_dir.mkdir(parents=True, exist_ok=True)
else:
results_dir = None
log_dir = results_dir / "logs" if results_dir else None
output = run_loop(
eval_set=eval_set,
skill_path=skill_path,
description_override=args.description,
num_workers=args.num_workers,
timeout=args.timeout,
max_iterations=args.max_iterations,
runs_per_query=args.runs_per_query,
trigger_threshold=args.trigger_threshold,
holdout=args.holdout,
model=args.model,
verbose=args.verbose,
live_report_path=live_report_path,
log_dir=log_dir,
)
# Save JSON output
json_output = json.dumps(output, indent=2)
print(json_output)
if results_dir:
(results_dir / "results.json").write_text(json_output)
# Write final HTML report (without auto-refresh)
if live_report_path:
live_report_path.write_text(generate_html(output, auto_refresh=False, skill_name=name))
print(f"\nReport: {live_report_path}", file=sys.stderr)
if results_dir and live_report_path:
(results_dir / "report.html").write_text(generate_html(output, auto_refresh=False, skill_name=name))
if results_dir:
print(f"Results saved to: {results_dir}", file=sys.stderr)
if __name__ == "__main__":
main()

View File

@ -0,0 +1,47 @@
"""Shared utilities for skill-creator scripts."""
from pathlib import Path
def parse_skill_md(skill_path: Path) -> tuple[str, str, str]:
"""Parse a SKILL.md file, returning (name, description, full_content)."""
content = (skill_path / "SKILL.md").read_text()
lines = content.split("\n")
if lines[0].strip() != "---":
raise ValueError("SKILL.md missing frontmatter (no opening ---)")
end_idx = None
for i, line in enumerate(lines[1:], start=1):
if line.strip() == "---":
end_idx = i
break
if end_idx is None:
raise ValueError("SKILL.md missing frontmatter (no closing ---)")
name = ""
description = ""
frontmatter_lines = lines[1:end_idx]
i = 0
while i < len(frontmatter_lines):
line = frontmatter_lines[i]
if line.startswith("name:"):
name = line[len("name:"):].strip().strip('"').strip("'")
elif line.startswith("description:"):
value = line[len("description:"):].strip()
# Handle YAML multiline indicators (>, |, >-, |-)
if value in (">", "|", ">-", "|-"):
continuation_lines: list[str] = []
i += 1
while i < len(frontmatter_lines) and (frontmatter_lines[i].startswith(" ") or frontmatter_lines[i].startswith("\t")):
continuation_lines.append(frontmatter_lines[i].strip())
i += 1
description = " ".join(continuation_lines)
continue
else:
description = value.strip('"').strip("'")
i += 1
return name, description, content

View File

@ -0,0 +1,285 @@
---
name: zeroclaw
description: "Help users operate and interact with their ZeroClaw agent instance — through both the CLI (`zeroclaw` commands) and the REST/WebSocket gateway API. Use this skill whenever the user wants to: send messages to ZeroClaw, manage memory or cron jobs, check system status, configure channels or providers, hit the gateway API, troubleshoot their ZeroClaw setup, build from source, or do anything involving the `zeroclaw` binary or its HTTP endpoints. Trigger this even if the user just says things like 'check my agent status', 'schedule a reminder', 'store this in memory', 'list my cron jobs', 'send a message to my bot', 'set up Telegram', 'build zeroclaw', or 'my bot is broken' — these are all ZeroClaw operations."
---
# ZeroClaw Skill
You are helping a user operate their ZeroClaw agent instance. ZeroClaw is an autonomous agent runtime with a CLI and an HTTP/WebSocket gateway.
Your job is to understand what the user wants to accomplish and then **execute it** — run the command, make the API call, report the result. Do not just show commands for the user to copy-paste. Actually run them via the Bash tool and tell the user what happened. The only exception is destructive operations (clearing all memory, estop kill-all) where you should confirm first.
## Adaptive Expertise
Pay attention to how the user talks. Someone who says "can you hit the webhook endpoint with a POST" is telling you they know what they're doing — be concise, skip explanations, just execute. Someone who says "how do I make my bot remember things" needs more context about what's happening under the hood.
Signals of technical comfort: mentions specific endpoints, HTTP methods, JSON fields, talks about tokens/auth, uses CLI flags fluently, references config files directly.
Signals of less familiarity: asks "what does X do", uses casual language about the bot/agent, describes goals rather than mechanisms ("I want it to check something every morning").
Default to a middle ground — brief explanation of what you're about to do, then do it. Dial up or down from there based on cues.
## Discovery — Before You Act
Before running any ZeroClaw operation, make sure you know where things are:
1. **Find the binary.** Search in this order:
- `which zeroclaw` (PATH)
- The current project's build output: `./target/release/zeroclaw` or `./target/debug/zeroclaw` — this is the right choice when the user is working inside the ZeroClaw source tree and may have local changes
- Common install locations: `~/.cargo/bin/zeroclaw`, `~/Downloads/zeroclaw-bin/zeroclaw`
If no binary is found anywhere, offer to build from source (see "Building from Source" below). If the user is a developer working on ZeroClaw itself, they'll likely want the local build — watch for cues like them editing source files, mentioning PRs, or being in the project directory.
2. **Check if the gateway is running** (only needed for REST/WebSocket operations). A quick `curl -sf http://127.0.0.1:42617/health` tells you. If it's not running and the user wants REST access, let them know and offer to start it (`zeroclaw gateway` or `zeroclaw daemon`).
3. **Check auth status.** If the gateway requires pairing (`require_pairing = true` is the default), REST calls need a bearer token. Run `zeroclaw status` to see the current state, or check `~/.zeroclaw/config.toml` for a stored token under `[gateway]`.
Cache these findings for the conversation — don't re-discover every time.
## Important: REPL Limitation
`zeroclaw agent` (interactive REPL) requires interactive stdin, which doesn't work through the Bash tool. When the user wants to chat with their agent, use single-message mode instead:
```bash
zeroclaw agent -m "the message"
```
Each `-m` invocation is independent (no conversation history between calls). If the user needs multi-turn conversation, let them know they can run `zeroclaw agent` directly in their terminal, or use the WebSocket endpoint for programmatic streaming.
## First-Time Setup
If the user hasn't set up ZeroClaw yet (no `~/.zeroclaw/config.toml` exists), guide them through onboarding:
```bash
zeroclaw onboard                        # Guided wizard (default provider: OpenRouter)
zeroclaw onboard --provider anthropic   # Use Anthropic directly
```
After onboarding, verify everything works:
```bash
zeroclaw status
zeroclaw doctor
```
If they already have a config but something is broken, `zeroclaw onboard --channels-only` repairs just the channel configuration without overwriting everything else.
## Building from Source
If the user wants to build ZeroClaw (or no binary is installed):
```bash
cargo build --release
```
This produces `target/release/zeroclaw`. For faster iteration during development, `cargo build` (debug mode) is quicker but produces a slower binary at `target/debug/zeroclaw`.
You can also run directly without a separate build step:
```bash
cargo run --release -- <subcommand> [args]
```
Before building, `cargo check` gives a quick compile validation without the full build.
## Choosing CLI vs REST
Both surfaces can do most things. Rules of thumb:
- **CLI is simpler** for one-off operations from the terminal. It handles auth internally and formats output nicely. Prefer CLI when the user is working locally.
- **REST is needed** when the user is building an integration, scripting from another language, or accessing a remote ZeroClaw instance. Also needed for streaming (WebSocket, SSE).
- If unclear, **default to CLI** — it's less setup.
## Core Operations
### Sending Messages
**CLI:** `zeroclaw agent -m "your message here"` — remember, always use `-m` mode, not bare `zeroclaw agent`.
**REST:**
```bash
curl -X POST http://127.0.0.1:42617/webhook \
-H "Authorization: Bearer <token>" \
-H "Content-Type: application/json" \
-d '{"message": "your message here"}'
```
Response: `{"response": "...", "model": "..."}`
**WebSocket** (for streaming): connect to `ws://127.0.0.1:42617/ws/chat?token=<token>`, send `{"type": "message", "content": "..."}`, receive `{"type": "done", "full_response": "..."}`.
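For programmatic streaming, a minimal Python sketch of that WebSocket exchange (assuming the third-party `websockets` package; `YOUR_TOKEN` is a placeholder for your pairing token):
```python
import asyncio
import json

import websockets  # assumed third-party WS client: pip install websockets

async def chat(token: str, content: str) -> str:
    url = f"ws://127.0.0.1:42617/ws/chat?token={token}"
    async with websockets.connect(url) as ws:
        # Message shapes follow the documented protocol above.
        await ws.send(json.dumps({"type": "message", "content": content}))
        while True:
            event = json.loads(await ws.recv())
            if event.get("type") == "done":
                return event["full_response"]
            if event.get("type") == "error":
                raise RuntimeError(event.get("message", "unknown error"))

print(asyncio.run(chat("YOUR_TOKEN", "hello")))
```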
### System Status
Run `zeroclaw status` to see provider, model, uptime, channels, memory backend. For deeper diagnostics: `zeroclaw doctor`.
**REST:** `GET /api/status` (same info as JSON), `GET /health` (no auth, quick ok/not-ok).
### Memory
The CLI can list, get, and clear memories but **cannot store** them directly. To store a memory:
- Via agent: `zeroclaw agent -m "remember that my favorite color is blue"`
- Via REST: `POST /api/memory` with `{"key": "...", "content": "...", "category": "core"}`
**CLI (read/delete):**
- `zeroclaw memory list` — list all entries
- `zeroclaw memory list --category core --limit 10` — filtered
- `zeroclaw memory get "key-name"` — get specific entry
- `zeroclaw memory stats` — usage statistics
- `zeroclaw memory clear --key "prefix" --yes` — delete entries (confirm with user first)
**REST (full CRUD):**
- `GET /api/memory` — list all (optional: `?query=search+text&category=core`)
- `POST /api/memory` — store: `{"key": "...", "content": "...", "category": "core"}`
- `DELETE /api/memory/{key}` — delete entry
Categories: `core`, `daily`, `conversation`, or any custom string.
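For scripted use, a small Python sketch of the REST memory CRUD (assuming the `requests` package; the key and content values are hypothetical):
```python
import requests

BASE = "http://127.0.0.1:42617"
HEADERS = {"Authorization": "Bearer YOUR_TOKEN"}  # token from pairing

# Store an entry (category defaults to "core" if omitted)
requests.post(f"{BASE}/api/memory", headers=HEADERS,
              json={"key": "favorite-color", "content": "blue", "category": "core"})

# List or search entries
entries = requests.get(f"{BASE}/api/memory", headers=HEADERS,
                       params={"query": "color"}).json()["entries"]

# Delete an entry
requests.delete(f"{BASE}/api/memory/favorite-color", headers=HEADERS)
```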
### Cron / Scheduling
**CLI:**
- `zeroclaw cron list` — show all jobs
- `zeroclaw cron add '0 9 * * 1-5' 'Good morning' --tz America/New_York` — recurring
- `zeroclaw cron add-at '2026-03-11T10:00:00Z' 'Remind me'` — one-time at specific time
- `zeroclaw cron add-every 3600000 'Check health'` — interval in ms
- `zeroclaw cron once 30m 'Follow up'` — delay from now
- `zeroclaw cron pause <id>` / `zeroclaw cron resume <id>` / `zeroclaw cron remove <id>`
**REST:**
- `GET /api/cron` — list jobs
- `POST /api/cron` — add: `{"name": "...", "schedule": "0 9 * * *", "command": "..."}`
- `DELETE /api/cron/{id}` — remove job
### Tools
Tools are used automatically by the agent during conversations (shell, file ops, memory, browser, HTTP, web search, git, etc. — 30+ tools gated by security policy).
To see what's available: `GET /api/tools` (REST) lists all registered tools with descriptions and parameter schemas.
### Configuration
Edit `~/.zeroclaw/config.toml` directly, or re-run `zeroclaw onboard` to reconfigure.
**REST:**
- `GET /api/config` — get current config (secrets masked as `***MASKED***`)
- `PUT /api/config` — update config (send raw TOML as body, 1MB limit)
### Providers & Models
- `zeroclaw providers` — list all supported providers
- `zeroclaw models list` — cached model catalog
- `zeroclaw models refresh --all` — refresh from providers
- `zeroclaw models set anthropic/claude-sonnet-4-6` — set default model
Override per-message: `zeroclaw agent -p anthropic --model claude-sonnet-4-6 -m "hello"`
### Real-Time Events (SSE)
REST only — useful for building dashboards or monitoring:
```bash
curl -N -H "Authorization: Bearer <token>" http://127.0.0.1:42617/api/events
```
Streams JSON events: `llm_request`, `tool_call_start`, `tool_call`, `agent_start`, `agent_end`, `error`.
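As a sketch, one way to consume that stream from Python (assuming the `requests` package and standard `data:`-prefixed SSE framing):
```python
import json

import requests

resp = requests.get("http://127.0.0.1:42617/api/events",
                    headers={"Authorization": "Bearer YOUR_TOKEN"},
                    stream=True)
for line in resp.iter_lines():
    # SSE data lines are conventionally prefixed with "data: ";
    # blank lines separate events, so skip anything else.
    if not line or not line.startswith(b"data: "):
        continue
    event = json.loads(line[len(b"data: "):])
    if event.get("type") == "tool_call":
        print(f"{event['tool']} finished in {event['duration_ms']}ms")
```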
### Cost Tracking
`GET /api/cost` — returns session/daily/monthly costs, token counts, per-model breakdown.
### Emergency Stop
Confirm with the user before running any estop command — these are disruptive.
- `zeroclaw estop --level kill-all` — stop everything
- `zeroclaw estop --level network-kill` — block all network
- `zeroclaw estop --level tool-freeze --tool shell` — freeze specific tool
- `zeroclaw estop status` — check current estop state
- `zeroclaw estop resume --network` — resume
### Gateway Lifecycle
- `zeroclaw gateway` — start HTTP gateway (foreground)
- `zeroclaw gateway -p 8080 --host 127.0.0.1` — custom bind
- `zeroclaw daemon` — start gateway + channels + scheduler + heartbeat
- `zeroclaw service install/start/stop/status/uninstall` — OS service management
### Channels
ZeroClaw supports 21 messaging channels. To add one, you need to edit `~/.zeroclaw/config.toml`. For example, to set up Telegram:
```toml
[channels]
telegram = true
[channels_config.telegram]
bot_token = "your-bot-token-from-botfather"
allowed_users = [123456789]
```
Then restart the daemon. Check channel health with `zeroclaw channels doctor`.
For the full list of channels and their config fields, read `references/cli-reference.md` (Channels section).
### Pairing (Authentication Setup)
When `require_pairing = true` (default), REST clients need a bearer token:
```bash
curl -X POST http://127.0.0.1:42617/pair -H "X-Pairing-Code: <code>"
```
Response includes `{"token": "..."}` — save this for subsequent requests.
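If the user is scripting this, a Python sketch of the same handshake (assuming the `requests` package; the pairing code is whatever the gateway issued):
```python
import requests

BASE = "http://127.0.0.1:42617"

# Exchange the one-time pairing code for a bearer token
resp = requests.post(f"{BASE}/pair",
                     headers={"X-Pairing-Code": "YOUR_PAIRING_CODE"})
resp.raise_for_status()
token = resp.json()["token"]

# Use the token on subsequent requests
status = requests.get(f"{BASE}/api/status",
                      headers={"Authorization": f"Bearer {token}"}).json()
print(status["model"])
```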
## Common Workflows
Here are multi-step sequences you're likely to need:
**"Is my agent healthy?"**
1. Run `zeroclaw status` — check provider, model, channels
2. Run `zeroclaw doctor` — check connectivity, diagnose issues
3. If gateway needed: `curl -sf http://127.0.0.1:42617/health`
**"Set up a new channel"**
1. Read the current config: `cat ~/.zeroclaw/config.toml`
2. Add the channel config (edit the TOML)
3. Restart: `zeroclaw service restart` (or restart daemon manually)
4. Verify: `zeroclaw channels doctor`
**"Switch to a different model"**
1. Check available: `zeroclaw models list`
2. Set it: `zeroclaw models set <provider/model>`
3. Verify: `zeroclaw status`
4. Test: `zeroclaw agent -m "hello, what model are you?"`
## Gateway Defaults
- **Port:** 42617
- **Host:** 127.0.0.1
- **Auth:** Pairing required (bearer token)
- **Rate limits:** 60 webhook requests/min, 10 pairing attempts/min
- **Body limit:** 64KB (1MB for config updates)
- **Timeout:** 30 seconds
- **Idempotency:** Optional `X-Idempotency-Key` header on `/webhook` (300s TTL)
- **Config location:** `~/.zeroclaw/config.toml`
## Reference Files
For the complete API specification with every endpoint, field, and edge case, read `references/rest-api.md`.
For the full CLI command tree with all flags and options, read `references/cli-reference.md`.
Only load these when you need precise details beyond what's in this file — for most operations, the quick references above are sufficient.
## Troubleshooting
**"zeroclaw: command not found"** — Binary not in PATH. Check `./target/release/zeroclaw`, `~/.cargo/bin/zeroclaw`, or build from source with `cargo build --release`.
**"Connection refused" on REST calls** — Gateway isn't running. Start it with `zeroclaw gateway` or `zeroclaw daemon`.
**"Unauthorized" (401/403)** — Bearer token is missing or invalid. Re-pair via `POST /pair` with the pairing code, or check `~/.zeroclaw/config.toml` for the stored token.
**"LLM request failed" (500)** — Provider issue. Run `zeroclaw doctor` to check connectivity. Common causes: expired API key, provider outage, rate limiting on the provider side.
**"Too many requests" (429)** — You're hitting ZeroClaw's rate limit. Back off — the response includes `retry_after` with the number of seconds to wait.
**Agent not using tools / acting limited** — Check autonomy settings in config.toml under `[autonomy]`. `level = "read_only"` disables most tools. Try `level = "supervised"` or `level = "full"`.
**Memory not persisting** — Check `[memory]` config. If `backend = "none"`, nothing is stored. Switch to `"sqlite"` or `"markdown"`. Also verify `auto_save = true`.
**Channel not responding** — Run `zeroclaw channels doctor` for the specific channel. Common issues: expired bot token, wrong allowed_users list, channel not enabled in `[channels]`.
Report errors to the user with context appropriate to their expertise level. For beginners, explain what went wrong and suggest the fix. For experts, just show the error and the fix.

View File

@ -0,0 +1,23 @@
{
"skill_name": "zeroclaw",
"evals": [
{
"id": 0,
"prompt": "how do i make my bot remember my name",
"expected_output": "Executes a zeroclaw command to store a memory, explains what happened in beginner-friendly language",
"files": []
},
{
"id": 1,
"prompt": "I want to schedule a daily health check on my ZeroClaw instance every morning at 9am ET",
"expected_output": "Executes zeroclaw cron add with correct cron expression and timezone flag",
"files": []
},
{
"id": 2,
"prompt": "Set up a Python script that monitors my ZeroClaw agent's activity via SSE and logs tool calls to a file",
"expected_output": "Writes a Python script that connects to /api/events SSE endpoint with auth, filters for tool_call events, and logs to a file",
"files": []
}
]
}

View File

@ -0,0 +1,277 @@
# ZeroClaw CLI Reference
Complete command reference for the `zeroclaw` binary.
## Table of Contents
1. [Agent](#agent)
2. [Onboarding](#onboarding)
3. [Status & Diagnostics](#status--diagnostics)
4. [Memory](#memory)
5. [Cron](#cron)
6. [Providers & Models](#providers--models)
7. [Gateway & Daemon](#gateway--daemon)
8. [Service Management](#service-management)
9. [Channels](#channels)
10. [Security & Emergency Stop](#security--emergency-stop)
11. [Hardware Peripherals](#hardware-peripherals)
12. [Skills](#skills)
13. [Shell Completions](#shell-completions)
---
## Agent
Interactive chat or single-message mode.
```bash
zeroclaw agent # Interactive REPL
zeroclaw agent -m "Summarize today's logs" # Single message
zeroclaw agent -p anthropic --model claude-sonnet-4-6 # Override provider/model
zeroclaw agent -t 0.3 # Set temperature
zeroclaw agent --peripheral nucleo-f401re:/dev/ttyACM0 # Attach hardware
```
**Key flags:**
- `-m <message>` — single message mode (no REPL)
- `-p <provider>` — override provider (openrouter, anthropic, openai, ollama)
- `--model <model>` — override model
- `-t <float>` — temperature (0.0 to 2.0)
- `--peripheral <name>:<port>` — attach hardware peripheral
The agent has access to 30+ tools gated by security policy: shell, file_read, file_write, file_edit, glob_search, content_search, memory_store, memory_recall, memory_forget, browser, http_request, web_fetch, web_search, cron, delegate, git, and more. Max tool iterations defaults to 10.
---
## Onboarding
First-time setup or reconfiguration.
```bash
zeroclaw onboard # Guided wizard (default provider: openrouter)
zeroclaw onboard --provider anthropic # Use a specific provider
zeroclaw onboard --memory sqlite # Set memory backend
zeroclaw onboard --force # Overwrite existing config
zeroclaw onboard --channels-only # Repair channels only
```
**Key flags:**
- `--provider <name>` — openrouter (default), anthropic, openai, ollama
- `--model <model>` — default model
- `--memory <backend>` — sqlite, markdown, lucid, none
- `--force` — overwrite existing config.toml
- `--channels-only` — only repair channel configuration
- `--reinit` — start fresh (backs up existing config)
Creates `~/.zeroclaw/config.toml` with `0600` permissions.
---
## Status & Diagnostics
```bash
zeroclaw status # System overview
zeroclaw doctor # Run all diagnostic checks
zeroclaw doctor models # Probe model connectivity
zeroclaw doctor traces # Query execution traces
```
---
## Memory
```bash
zeroclaw memory list # List all entries
zeroclaw memory list --category core --limit 10 # Filtered list
zeroclaw memory get "some-key" # Get specific entry
zeroclaw memory stats # Usage statistics
zeroclaw memory clear --key "prefix" --yes # Delete entries (requires --yes)
```
**Key flags:**
- `--category <name>` — filter by category (core, daily, conversation, custom)
- `--limit <n>` — limit results
- `--key <prefix>` — key prefix for clear operations
- `--yes` — skip confirmation (required for clear)
---
## Cron
```bash
zeroclaw cron list # List all jobs
zeroclaw cron add '0 9 * * 1-5' 'Good morning' --tz America/New_York # Recurring (cron expr)
zeroclaw cron add-at '2026-03-11T10:00:00Z' 'Remind me about meeting' # One-time at specific time
zeroclaw cron add-every 3600000 'Check server health' # Interval in milliseconds
zeroclaw cron once 30m 'Follow up on that task' # Delay from now
zeroclaw cron pause <id> # Pause job
zeroclaw cron resume <id> # Resume job
zeroclaw cron remove <id> # Delete job
```
**Subcommands:**
- `add <cron-expr> <command>` — standard cron expression (5-field)
- `add-at <iso-datetime> <command>` — fire once at exact time
- `add-every <ms> <command>` — repeating interval
- `once <duration> <command>` — delay from now (e.g., `30m`, `2h`, `1d`)
---
## Providers & Models
```bash
zeroclaw providers # List all 40+ supported providers
zeroclaw models list # Show cached model catalog
zeroclaw models refresh --all # Refresh catalogs from all providers
zeroclaw models set anthropic/claude-sonnet-4-6 # Set default model
zeroclaw models status # Current model info
```
Model routing in config.toml:
```toml
[[model_routes]]
hint = "reasoning"
provider = "openrouter"
model = "anthropic/claude-sonnet-4-6"
```
---
## Gateway & Daemon
```bash
zeroclaw gateway # Start HTTP gateway (foreground)
zeroclaw gateway -p 8080 --host 127.0.0.1 # Custom port/host
zeroclaw daemon # Gateway + channels + scheduler + heartbeat
zeroclaw daemon -p 8080 --host 0.0.0.0 # Custom bind
```
**Gateway defaults:**
- Port: 42617
- Host: 127.0.0.1
- Pairing required: true
- Public bind allowed: false
---
## Service Management
OS service lifecycle (systemd on Linux, launchd on macOS).
```bash
zeroclaw service install # Install as system service
zeroclaw service start # Start the service
zeroclaw service status # Check service status
zeroclaw service stop # Stop the service
zeroclaw service restart # Restart the service
zeroclaw service uninstall # Remove the service
```
**Logs:**
- macOS: `~/.zeroclaw/logs/daemon.stdout.log`
- Linux: `journalctl -u zeroclaw`
---
## Channels
Channels are configured in `config.toml` under `[channels]` and `[channels_config.*]`.
```bash
zeroclaw channels list # List configured channels
zeroclaw channels doctor # Check channel health
```
Supported channels (21 total): Telegram, Discord, Slack, WhatsApp (Meta), WATI, Linq (iMessage/RCS/SMS), Email (IMAP/SMTP), IRC, Matrix, Nostr, Signal, Nextcloud Talk, and more.
Channel config example (Telegram):
```toml
[channels]
telegram = true
[channels_config.telegram]
bot_token = "..."
allowed_users = [123456789]
```
---
## Security & Emergency Stop
```bash
zeroclaw estop --level kill-all # Stop everything
zeroclaw estop --level network-kill # Block all network access
zeroclaw estop --level domain-block --domain "*.example.com" # Block specific domains
zeroclaw estop --level tool-freeze --tool shell # Freeze specific tool
zeroclaw estop status # Check estop state
zeroclaw estop resume --network # Resume (may require OTP)
```
**Estop levels:**
- `kill-all` — nuclear option, stops all agent activity
- `network-kill` — blocks all outbound network
- `domain-block` — blocks specific domain patterns
- `tool-freeze` — freezes individual tools
Autonomy config in config.toml:
```toml
[autonomy]
level = "supervised" # read_only | supervised | full
workspace_only = true
allowed_commands = ["git", "cargo", "python"]
forbidden_paths = ["/etc", "/root", "~/.ssh"]
max_actions_per_hour = 20
max_cost_per_day_cents = 500
```
---
## Hardware Peripherals
```bash
zeroclaw hardware discover # Find USB devices
zeroclaw hardware introspect /dev/ttyACM0 # Probe device capabilities
zeroclaw peripheral list # List configured peripherals
zeroclaw peripheral add nucleo-f401re /dev/ttyACM0 # Add peripheral
zeroclaw peripheral flash-nucleo # Flash STM32 firmware
zeroclaw peripheral flash --port /dev/cu.usbmodem101 # Flash Arduino firmware
```
**Supported boards:** STM32 Nucleo-F401RE, Arduino Uno R4, Raspberry Pi GPIO, ESP32.
Attach to agent session: `zeroclaw agent --peripheral nucleo-f401re:/dev/ttyACM0`
---
## Skills
```bash
zeroclaw skills list # List installed skills
zeroclaw skills install <path-or-url> # Install a skill
zeroclaw skills audit # Audit installed skills
zeroclaw skills remove <name> # Remove a skill
```
---
## Shell Completions
```bash
zeroclaw completions zsh # Generate Zsh completions
zeroclaw completions bash # Generate Bash completions
zeroclaw completions fish # Generate Fish completions
```
---
## Config File
Default location: `~/.zeroclaw/config.toml`
Config resolution order (first match wins):
1. `ZEROCLAW_CONFIG_DIR` environment variable
2. `ZEROCLAW_WORKSPACE` environment variable
3. `~/.zeroclaw/active_workspace.toml` marker file
4. `~/.zeroclaw/config.toml` (default)

View File

@ -0,0 +1,505 @@
# ZeroClaw REST API Reference
Complete endpoint reference for the ZeroClaw gateway HTTP API.
## Table of Contents
1. [Authentication](#authentication)
2. [Public Endpoints](#public-endpoints)
3. [Webhook](#webhook)
4. [WebSocket Chat](#websocket-chat)
5. [Status & Health](#status--health)
6. [Memory](#memory)
7. [Cron](#cron)
8. [Tools](#tools)
9. [Configuration](#configuration)
10. [Integrations](#integrations)
11. [Cost](#cost)
12. [Events (SSE)](#events-sse)
13. [Channel Webhooks](#channel-webhooks)
14. [Rate Limiting](#rate-limiting)
15. [Error Responses](#error-responses)
---
## Authentication
Three authentication mechanisms:
### Bearer Token (Primary)
```
Authorization: Bearer <token>
```
Obtained via `POST /pair`. Required for all `/api/*` endpoints when `require_pairing = true` (default).
### Webhook Secret
```
X-Webhook-Secret: <raw_secret>
```
Optional additional auth for `/webhook`. Server SHA-256 hashes and compares using constant-time comparison.
### WebSocket Token
```
ws://host:port/ws/chat?token=<bearer_token>
```
WebSocket connections pass the token as a query parameter (browsers can't set custom headers on WS handshake).
---
## Public Endpoints
### GET /health
No authentication required.
**Response 200:**
```json
{
"status": "ok",
"paired": true,
"require_pairing": true,
"runtime": {}
}
```
### GET /metrics
Prometheus text exposition format.
**Response 200:**
```
Content-Type: text/plain; version=0.0.4; charset=utf-8
```
### POST /pair
Exchange a one-time pairing code for a bearer token.
**Rate Limit:** Configurable per-minute limit per IP (default: 10/min).
**Headers:**
- `X-Pairing-Code: <code>` (required)
**Response 200 (success):**
```json
{
"paired": true,
"persisted": true,
"token": "<bearer_token>",
"message": "Save this token — use it as Authorization: Bearer <token>"
}
```
**Response 200 (persistence failure):**
```json
{
"paired": true,
"persisted": false,
"token": "<bearer_token>",
"message": "Paired for this process, but failed to persist token to config.toml..."
}
```
**Response 403:**
```json
{"error": "Invalid pairing code"}
```
**Response 429:**
```json
{"error": "Too many pairing requests. Please retry later.", "retry_after": 60}
```
**Response 429 (lockout):**
```json
{"error": "Too many failed attempts. Try again in {lockout_secs}s.", "retry_after": 120}
```
---
## Webhook
### POST /webhook
Send a message to the agent and receive a response.
**Rate Limit:** Configurable per-minute limit per IP (default: 60/min).
**Headers:**
- `Authorization: Bearer <token>` (if pairing enabled)
- `Content-Type: application/json`
- `X-Webhook-Secret: <secret>` (optional)
- `X-Idempotency-Key: <uuid>` (optional)
**Request Body:**
```json
{"message": "your prompt here"}
```
**Response 200:**
```json
{"response": "<llm_response>", "model": "<model_name>"}
```
**Response 200 (duplicate — idempotency key match):**
```json
{"status": "duplicate", "idempotent": true, "message": "Request already processed for this idempotency key"}
```
**Response 401:**
```json
{"error": "Unauthorized — pair first via POST /pair, then send Authorization: Bearer <token>"}
```
**Response 429:**
```json
{"error": "Too many webhook requests. Please retry later.", "retry_after": 60}
```
**Response 500:**
```json
{"error": "LLM request failed"}
```
### Idempotency
- Header: `X-Idempotency-Key: <uuid>`
- TTL: configurable, default 300 seconds
- Max tracked keys: configurable, default 10,000
- Duplicate requests within TTL return `"status": "duplicate"` instead of re-processing
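For illustration, a Python sketch of a retried request with an idempotency key (assuming the `requests` package; the header and duplicate response shape are as documented above):
```python
import uuid

import requests

headers = {
    "Authorization": "Bearer YOUR_TOKEN",
    "X-Idempotency-Key": str(uuid.uuid4()),  # same key reused on the retry
}
body = {"message": "run the nightly report"}
url = "http://127.0.0.1:42617/webhook"

first = requests.post(url, headers=headers, json=body).json()
print(first.get("response"))

# A retry within the TTL is acknowledged but not re-processed
retry = requests.post(url, headers=headers, json=body).json()
assert retry.get("status") == "duplicate"
```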
---
## WebSocket Chat
### GET /ws/chat?token=<bearer_token>
Streaming agent chat over WebSocket.
**Client → Server:**
```json
{"type": "message", "content": "Hello, what's the weather?"}
```
**Server → Client (complete response):**
```json
{"type": "done", "full_response": "The weather in San Francisco is sunny..."}
```
**Server → Client (error):**
```json
{"type": "error", "message": "Error message here"}
```
The server ignores unknown message types; invalid JSON triggers an error response.
---
## Status & Health
### GET /api/status
**Response 200:**
```json
{
"provider": "openrouter",
"model": "anthropic/claude-sonnet-4",
"temperature": 0.7,
"uptime_seconds": 3600,
"gateway_port": 42617,
"locale": "en",
"memory_backend": "sqlite",
"paired": true,
"channels": {
"telegram": false,
"discord": true,
"slack": false
},
"health": {}
}
```
### GET /api/health
Component health snapshot (requires auth).
```json
{"health": {}}
```
### GET or POST /api/doctor
Run system diagnostics.
```json
{
"results": [
{"name": "provider_connectivity", "severity": "ok", "message": "OpenRouter API reachable"}
],
"summary": {"ok": 5, "warnings": 1, "errors": 0}
}
```
---
## Memory
### GET /api/memory
List or search memory entries.
**Query Parameters:**
- `query` (string, optional) — search text; triggers search mode
- `category` (string, optional) — filter by category
**Response 200:**
```json
{
"entries": [
{
"key": "memory_key",
"content": "memory content",
"category": "core",
"timestamp": "2025-01-10T12:00:00Z"
}
]
}
```
### POST /api/memory
Store a memory entry.
**Request Body:**
```json
{
"key": "unique_key",
"content": "memory content",
"category": "core"
}
```
Category defaults to `"core"` if omitted. Other values: `daily`, `conversation`, or any custom string.
**Response 200:**
```json
{"status": "ok"}
```
### DELETE /api/memory/{key}
Delete a memory entry.
**Response 200:**
```json
{"status": "ok", "deleted": true}
```
---
## Cron
### GET /api/cron
List all scheduled jobs.
**Response 200:**
```json
{
"jobs": [
{
"id": "<uuid>",
"name": "daily-backup",
"command": "backup.sh",
"next_run": "2025-01-10T15:00:00Z",
"last_run": "2025-01-09T15:00:00Z",
"last_status": "success",
"enabled": true
}
]
}
```
### POST /api/cron
Add a new job.
**Request Body:**
```json
{
"name": "job-name",
"schedule": "0 9 * * *",
"command": "command to run"
}
```
**Response 200:**
```json
{
"status": "ok",
"job": {"id": "<uuid>", "name": "job-name", "command": "command to run", "enabled": true}
}
```
### DELETE /api/cron/{id}
Remove a job.
**Response 200:**
```json
{"status": "ok"}
```
---
## Tools
### GET /api/tools
List all registered tools with descriptions and parameter schemas.
**Response 200:**
```json
{
"tools": [
{"name": "shell", "description": "Execute shell commands", "parameters": {}},
{"name": "file_read", "description": "Read file contents", "parameters": {}}
]
}
```
---
## Configuration
### GET /api/config
Get current config. Secrets are masked as `***MASKED***`.
**Response 200:**
```json
{"format": "toml", "content": "<toml_string>"}
```
### PUT /api/config
Update config from TOML body. Body limit: 1 MB.
**Request Body:** Raw TOML text.
**Response 200:**
```json
{"status": "ok"}
```
**Response 400:**
```json
{"error": "Invalid TOML: <details>"}
```
or
```json
{"error": "Invalid config: <validation_error>"}
```
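A short Python sketch of a config update under these rules (assuming the `requests` package; the gateway validates the TOML before applying it):
```python
from pathlib import Path

import requests

toml_text = (Path.home() / ".zeroclaw" / "config.toml").read_text()

resp = requests.put(
    "http://127.0.0.1:42617/api/config",
    headers={"Authorization": "Bearer YOUR_TOKEN"},
    data=toml_text.encode(),  # raw TOML body, 1 MB limit
)
resp.raise_for_status()
print(resp.json())  # {"status": "ok"} on success
```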
---
## Integrations
### GET /api/integrations
List all integrations and their status.
**Response 200:**
```json
{
"integrations": [
{"name": "openrouter", "description": "OpenRouter LLM provider", "category": "providers", "status": "ok"},
{"name": "telegram", "description": "Telegram messaging channel", "category": "channels", "status": "configured"}
]
}
```
---
## Cost
### GET /api/cost
Cost tracking summary.
**Response 200:**
```json
{
"cost": {
"session_cost_usd": 1.50,
"daily_cost_usd": 5.00,
"monthly_cost_usd": 150.00,
"total_tokens": 50000,
"request_count": 25,
"by_model": {"anthropic/claude-sonnet-4": 1.50}
}
}
```
---
## Events (SSE)
### GET /api/events
Server-Sent Events stream. Requires bearer token.
**Content-Type:** `text/event-stream`
**Event types:**
| Type | Fields | Description |
|------|--------|-------------|
| `llm_request` | provider, model, timestamp | LLM call started |
| `tool_call_start` | tool, timestamp | Tool execution started |
| `tool_call` | tool, duration_ms, success, timestamp | Tool execution completed |
| `agent_start` | provider, model, timestamp | Agent loop started |
| `agent_end` | provider, model, duration_ms, tokens_used, cost_usd, timestamp | Agent loop completed |
| `error` | component, message, timestamp | Error occurred |
**Example:**
```bash
curl -N -H "Authorization: Bearer <token>" http://127.0.0.1:42617/api/events
```
---
## Channel Webhooks
These are incoming webhook endpoints for specific messaging channels. They're set up automatically when channels are configured.
### WhatsApp (Meta Cloud API)
- `GET /whatsapp` — verification (echoes `hub.challenge`)
- `POST /whatsapp` — incoming messages (signature verified via `X-Hub-Signature-256`)
### WATI (WhatsApp Business)
- `GET /wati` — verification (echoes `challenge`)
- `POST /wati` — incoming messages
### Linq (iMessage/RCS/SMS)
- `POST /linq` — incoming messages (signature verified via `X-Webhook-Signature` + `X-Webhook-Timestamp`)
### Nextcloud Talk
- `POST /nextcloud-talk` — bot API webhook (signature verified via `X-Nextcloud-Talk-Signature`)
---
## Rate Limiting
Sliding window (60-second window), per client IP.
| Endpoint | Default Limit |
|----------|--------------|
| `POST /pair` | 10/min |
| `POST /webhook` | 60/min |
If `trust_forwarded_headers` is enabled, the gateway uses `X-Forwarded-For` to determine the client IP.
Max tracked keys: configurable (default: 10,000).
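A client-side sketch of honoring these limits (assuming the `requests` package; `retry_after` comes from the 429 body shown under Error Responses):
```python
import time

import requests

def post_with_backoff(url, headers, json_body, max_attempts=5):
    """POST, sleeping out the sliding window whenever a 429 arrives."""
    for _ in range(max_attempts):
        resp = requests.post(url, headers=headers, json=json_body)
        if resp.status_code != 429:
            return resp
        time.sleep(resp.json().get("retry_after", 60))
    raise RuntimeError("still rate limited after retries")
```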
---
## Error Responses
**Standard format:**
```json
{"error": "Human-readable error message"}
```
**With retry info:**
```json
{"error": "...", "retry_after": 60}
```
**Status codes:**
| Code | Meaning |
|------|---------|
| 200 | Success |
| 400 | Invalid JSON, missing fields, invalid TOML |
| 401 | Invalid/missing bearer token or webhook secret |
| 403 | Pairing verification failed |
| 404 | Endpoint or channel not configured |
| 408 | Request timeout (30s) |
| 429 | Rate limited (check `retry_after`) |
| 500 | LLM error, database error, internal failure |

View File

@ -20,12 +20,9 @@ reviews:
enabled: true
# Only review PRs targeting these branches
base_branches:
- main
- dev
- master
# Skip reviews for draft PRs or WIP
drafts: false
# Enable base branch analysis
base_branch_analysis: true
# Poem feature toggle (must be a boolean, not an object)
poem: false

View File

@ -64,3 +64,8 @@ LICENSE
*.profdata
coverage
lcov.info
# Application and script directories (not needed for Docker runtime)
apps/
python/
scripts/

View File

@ -1,32 +1,44 @@
# EditorConfig — https://editorconfig.org
# Provides consistent formatting defaults across editors and platforms.
# EditorConfig is awesome: https://EditorConfig.org
# top-most EditorConfig file
root = true
# All files
[*]
charset = utf-8
end_of_line = lf
insert_final_newline = true
trim_trailing_whitespace = true
indent_style = space
indent_size = 4
[*.md]
# Trailing whitespace is significant in Markdown (line breaks).
trim_trailing_whitespace = false
[*.go]
indent_style = tab
[*.{yml,yaml}]
indent_size = 2
end_of_line = lf
charset = utf-8
trim_trailing_whitespace = true
insert_final_newline = true
# Rust files - match rustfmt.toml
[*.rs]
indent_size = 4
max_line_length = 100
# Markdown files
[*.md]
trim_trailing_whitespace = false
max_line_length = 80
# TOML files
[*.toml]
indent_size = 2
[Dockerfile]
indent_size = 4
[*.nix]
indent_style = space
# YAML files
[*.{yml,yaml}]
indent_size = 2
# Python files
[*.py]
indent_size = 4
max_line_length = 100
# Shell scripts
[*.{sh,bash}]
indent_size = 2
# JSON files
[*.json]
indent_size = 2

@ -59,6 +59,7 @@ PROVIDER=openrouter
# ZAI_API_KEY=...
# SYNTHETIC_API_KEY=...
# OPENCODE_API_KEY=...
# OPENCODE_GO_API_KEY=...
# VERCEL_API_KEY=...
# CLOUDFLARE_API_KEY=...
@ -117,3 +118,7 @@ PROVIDER=openrouter
# Optional: Brave Search (requires API key from https://brave.com/search/api)
# WEB_SEARCH_PROVIDER=brave
# BRAVE_API_KEY=your-brave-search-api-key
#
# Optional: SearXNG (self-hosted, requires instance URL)
# WEB_SEARCH_PROVIDER=searxng
# SEARXNG_INSTANCE_URL=https://searx.example.com

.gitattributes — 68 lines changed

@ -1,33 +1,61 @@
# Normalize all text files
# Git attributes for ZeroClaw
# https://git-scm.com/docs/gitattributes
# Auto detect text files and perform LF normalization
* text=auto
# Force LF for scripts and build-critical files
*.sh text eol=lf
Dockerfile* text eol=lf
*.rs text eol=lf
*.toml text eol=lf
*.yml text eol=lf
*.yaml text eol=lf
# Source code
*.rs text eol=lf linguist-language=Rust
*.toml text eol=lf linguist-language=TOML
*.py text eol=lf linguist-language=Python
*.js text eol=lf linguist-language=JavaScript
*.ts text eol=lf linguist-language=TypeScript
*.html text eol=lf linguist-language=HTML
*.css text eol=lf linguist-language=CSS
*.scss text eol=lf linguist-language=SCSS
*.json text eol=lf linguist-language=JSON
*.yaml text eol=lf linguist-language=YAML
*.yml text eol=lf linguist-language=YAML
*.md text eol=lf linguist-language=Markdown
*.sh text eol=lf linguist-language=Shell
*.bash text eol=lf linguist-language=Shell
*.ps1 text eol=crlf linguist-language=PowerShell
# CI
.github/**/* text eol=lf
# Documentation
*.txt text eol=lf
LICENSE* text eol=lf
# Images
# Configuration files
.editorconfig text eol=lf
.gitattributes text eol=lf
.gitignore text eol=lf
.dockerignore text eol=lf
# Rust-specific
Cargo.lock text eol=lf linguist-generated
Cargo.toml text eol=lf
# Declare files that will always have CRLF line endings on checkout
*.sln text eol=crlf
# Denote all files that are truly binary and should not be modified
*.png binary
*.jpg binary
*.jpeg binary
*.gif binary
*.ico binary
# Archives
*.svg text
*.wasm binary
*.woff binary
*.woff2 binary
*.ttf binary
*.eot binary
*.mp3 binary
*.mp4 binary
*.webm binary
*.zip binary
*.tar binary
*.tgz binary
*.gz binary
*.bz2 binary
*.7z binary
# Compiled artifacts
*.so binary
*.dll binary
*.exe binary
*.a binary
*.db binary

.github/CODEOWNERS — 50 lines changed

@ -1,32 +1,32 @@
# Default owner for all files
* @theonlyhennygod @JordanTheJet @chumyin
* @theonlyhennygod @JordanTheJet @SimianAstronaut7
# Important functional modules
/src/agent/** @theonlyhennygod @JordanTheJet @chumyin
/src/providers/** @theonlyhennygod @JordanTheJet @chumyin
/src/channels/** @theonlyhennygod @JordanTheJet @chumyin
/src/tools/** @theonlyhennygod @JordanTheJet @chumyin
/src/gateway/** @theonlyhennygod @JordanTheJet @chumyin
/src/runtime/** @theonlyhennygod @JordanTheJet @chumyin
/src/memory/** @theonlyhennygod @JordanTheJet @chumyin
/Cargo.toml @theonlyhennygod @JordanTheJet @chumyin
/Cargo.lock @theonlyhennygod @JordanTheJet @chumyin
/src/agent/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/src/providers/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/src/channels/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/src/tools/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/src/gateway/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/src/runtime/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/src/memory/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/Cargo.toml @theonlyhennygod @JordanTheJet @SimianAstronaut7
/Cargo.lock @theonlyhennygod @JordanTheJet @SimianAstronaut7
# Security / tests / CI-CD ownership
/src/security/** @theonlyhennygod @JordanTheJet @chumyin
/tests/** @theonlyhennygod @JordanTheJet @chumyin
/.github/** @theonlyhennygod @JordanTheJet @chumyin
/.github/workflows/** @theonlyhennygod @JordanTheJet @chumyin
/.github/codeql/** @theonlyhennygod @JordanTheJet @chumyin
/.github/dependabot.yml @theonlyhennygod @JordanTheJet @chumyin
/SECURITY.md @theonlyhennygod @JordanTheJet @chumyin
/docs/actions-source-policy.md @theonlyhennygod @JordanTheJet @chumyin
/docs/ci-map.md @theonlyhennygod @JordanTheJet @chumyin
/src/security/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/tests/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/.github/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/.github/workflows/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/.github/codeql/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/.github/dependabot.yml @theonlyhennygod @JordanTheJet @SimianAstronaut7
/SECURITY.md @theonlyhennygod @JordanTheJet @SimianAstronaut7
/docs/actions-source-policy.md @theonlyhennygod @JordanTheJet @SimianAstronaut7
/docs/ci-map.md @theonlyhennygod @JordanTheJet @SimianAstronaut7
# Docs & governance
/docs/** @theonlyhennygod @JordanTheJet @chumyin
/AGENTS.md @theonlyhennygod @JordanTheJet @chumyin
/CLAUDE.md @theonlyhennygod @JordanTheJet @chumyin
/CONTRIBUTING.md @theonlyhennygod @JordanTheJet @chumyin
/docs/pr-workflow.md @theonlyhennygod @JordanTheJet @chumyin
/docs/reviewer-playbook.md @theonlyhennygod @JordanTheJet @chumyin
/docs/** @theonlyhennygod @JordanTheJet @SimianAstronaut7
/AGENTS.md @theonlyhennygod @JordanTheJet @SimianAstronaut7
/CLAUDE.md @theonlyhennygod @JordanTheJet @SimianAstronaut7
/CONTRIBUTING.md @theonlyhennygod @JordanTheJet @SimianAstronaut7
/docs/pr-workflow.md @theonlyhennygod @JordanTheJet @SimianAstronaut7
/docs/reviewer-playbook.md @theonlyhennygod @JordanTheJet @SimianAstronaut7

@ -11,15 +11,6 @@ body:
Please provide a minimal reproducible case so maintainers can triage quickly.
Do not include personal/sensitive data; redact and anonymize all logs/payloads.
- type: input
id: summary
attributes:
label: Summary
description: One-line description of the problem.
placeholder: zeroclaw daemon exits immediately when ...
validations:
required: true
- type: dropdown
id: component
attributes:
@ -72,7 +63,7 @@ body:
label: Steps to reproduce
description: Please provide exact commands/config.
placeholder: |
1. zeroclaw onboard --interactive
1. zeroclaw onboard
2. zeroclaw daemon
3. Observe crash in logs
render: bash
@ -83,13 +74,13 @@ body:
id: impact
attributes:
label: Impact
description: Who is affected, how often, and practical consequences.
description: Who is affected, how often, and practical consequences (optional but helps triage).
placeholder: |
Affected users: ...
Frequency: always/intermittent
Consequence: ...
validations:
required: true
required: false
- type: textarea
id: logs
@ -112,9 +103,10 @@ body:
id: rust
attributes:
label: Rust version
description: Required for runtime/build bugs; optional for docs/config issues.
placeholder: rustc 1.xx.x
validations:
required: true
required: false
- type: input
id: os
@ -140,9 +132,7 @@ body:
attributes:
label: Pre-flight checks
options:
- label: I reproduced this on the latest main branch or latest release.
- label: I reproduced this on the latest master branch or latest release.
required: true
- label: I redacted secrets/tokens from logs.
required: true
- label: I removed personal identifiers and replaced identity-specific data with neutral placeholders.
- label: I redacted secrets, tokens, and personal data from all submitted content.
required: true

@ -3,15 +3,9 @@ contact_links:
- name: Security vulnerability report
url: https://github.com/zeroclaw-labs/zeroclaw/security/policy
about: Please report security vulnerabilities privately via SECURITY.md policy.
- name: Private vulnerability report template
url: https://github.com/zeroclaw-labs/zeroclaw/blob/main/docs/security/private-vulnerability-report-template.md
about: Use this template when filing a private vulnerability report in Security Advisories.
- name: 私密漏洞报告模板(中文)
url: https://github.com/zeroclaw-labs/zeroclaw/blob/main/docs/security/private-vulnerability-report-template.zh-CN.md
about: 使用该中文模板通过 Security Advisories 进行私密漏洞提交。
- name: Contribution guide
url: https://github.com/zeroclaw-labs/zeroclaw/blob/main/CONTRIBUTING.md
url: https://github.com/zeroclaw-labs/zeroclaw/blob/master/CONTRIBUTING.md
about: Please read contribution and PR requirements before opening an issue.
- name: PR workflow & reviewer expectations
url: https://github.com/zeroclaw-labs/zeroclaw/blob/main/docs/pr-workflow.md
url: https://github.com/zeroclaw-labs/zeroclaw/blob/master/docs/pr-workflow.md
about: Read risk-based PR tracks, CI gates, and merge criteria before filing feature requests.

@ -42,10 +42,10 @@ body:
id: non_goals
attributes:
label: Non-goals / out of scope
description: Clarify what should not be included in the first iteration.
description: Clarify what should not be included in the first iteration (optional but helps scope discussion).
placeholder: No UI changes, no cross-provider dynamic adaptation in v1.
validations:
required: true
required: false
- type: textarea
id: alternatives
@ -60,31 +60,31 @@ body:
id: acceptance
attributes:
label: Acceptance criteria
description: What outcomes would make this request complete?
description: What outcomes would make this request complete? (optional — can be defined during triage)
placeholder: |
- Config key is documented and validated
- Runtime path uses configured retry budget
- Regression tests cover fallback and invalid config
validations:
required: true
required: false
- type: textarea
id: architecture
attributes:
label: Architecture impact
description: Which subsystem(s) are affected?
description: Which subsystem(s) are affected? (optional — maintainers will assess during triage)
placeholder: providers/, channels/, memory/, runtime/, security/, docs/ ...
validations:
required: true
required: false
- type: textarea
id: risk
attributes:
label: Risk and rollback
description: Main risk + how to disable/revert quickly.
description: Main risk + how to disable/revert quickly (optional — can be defined during planning).
placeholder: Risk is ... rollback is ...
validations:
required: true
required: false
- type: dropdown
id: breaking

@ -1,11 +1,3 @@
self-hosted-runner:
labels:
- Linux
- X64
- racknerd
- aws-india
- light
- cpu40
- codeql
- codeql-general
- blacksmith-2vcpu-ubuntu-2404

.github/assets/show-tool-calls-after.png — new binary file, 84 KiB (not shown)
[unnamed binary image — 110 KiB (not shown)]

@ -1,70 +0,0 @@
{
"version": 1,
"description": "Provider/model connectivity probe contract for scheduled CI checks.",
"consecutive_transient_failures_to_escalate": 2,
"providers": [
{
"name": "OpenAI",
"provider": "openai",
"required": true,
"secret_env": "OPENAI_API_KEY",
"timeout_sec": 90,
"retries": 2,
"notes": "Primary reference provider; validates baseline OpenAI-compatible path."
},
{
"name": "Anthropic",
"provider": "anthropic",
"required": true,
"secret_env": "ANTHROPIC_API_KEY",
"timeout_sec": 90,
"retries": 2,
"notes": "Checks non-OpenAI provider fetch path and account health."
},
{
"name": "Gemini",
"provider": "gemini",
"required": true,
"secret_env": "GEMINI_API_KEY",
"timeout_sec": 90,
"retries": 2,
"notes": "Validates Google model discovery endpoint availability."
},
{
"name": "OpenRouter",
"provider": "openrouter",
"required": true,
"secret_env": "OPENROUTER_API_KEY",
"timeout_sec": 90,
"retries": 2,
"notes": "Routes across many providers; signal for aggregator-side health."
},
{
"name": "Qwen",
"provider": "qwen",
"required": false,
"secret_env": "DASHSCOPE_API_KEY",
"timeout_sec": 90,
"retries": 2,
"notes": "Regional provider check; optional for global deployments."
},
{
"name": "NVIDIA NIM",
"provider": "nvidia",
"required": false,
"secret_env": "NVIDIA_API_KEY",
"timeout_sec": 90,
"retries": 2,
"notes": "Optional ecosystem endpoint check."
},
{
"name": "OpenAI Codex",
"provider": "openai-codex",
"required": false,
"secret_env": "OPENAI_API_KEY",
"timeout_sec": 90,
"retries": 2,
"notes": "Uses OpenAI-compatible models endpoint to verify Codex-profile discovery path."
}
]
}

@ -1,77 +0,0 @@
{
"global_timeout_seconds": 8,
"providers": [
{
"id": "openrouter",
"url": "https://openrouter.ai/api/v1/models",
"method": "GET",
"critical": true
},
{
"id": "openai",
"url": "https://api.openai.com/v1/models",
"method": "GET",
"critical": true
},
{
"id": "anthropic",
"url": "https://api.anthropic.com/v1/messages",
"method": "POST",
"critical": true
},
{
"id": "groq",
"url": "https://api.groq.com/openai/v1/models",
"method": "GET",
"critical": false
},
{
"id": "deepseek",
"url": "https://api.deepseek.com/v1/models",
"method": "GET",
"critical": false
},
{
"id": "moonshot",
"url": "https://api.moonshot.ai/v1/models",
"method": "GET",
"critical": false
},
{
"id": "qwen",
"url": "https://dashscope-intl.aliyuncs.com/compatible-mode/v1/models",
"method": "GET",
"critical": false
},
{
"id": "zai",
"url": "https://api.z.ai/api/paas/v4/models",
"method": "GET",
"critical": false
},
{
"id": "glm",
"url": "https://open.bigmodel.cn/api/paas/v4/models",
"method": "GET",
"critical": false
},
{
"id": "together",
"url": "https://api.together.xyz/v1/models",
"method": "GET",
"critical": false
},
{
"id": "fireworks",
"url": "https://api.fireworks.ai/inference/v1/models",
"method": "GET",
"critical": false
},
{
"id": "cohere",
"url": "https://api.cohere.com/v1/models",
"method": "GET",
"critical": false
}
]
}

@ -5,7 +5,7 @@ updates:
directory: "/"
schedule:
interval: daily
target-branch: main
target-branch: master
open-pull-requests-limit: 3
labels:
- "dependencies"
@ -21,7 +21,7 @@ updates:
directory: "/"
schedule:
interval: daily
target-branch: main
target-branch: master
open-pull-requests-limit: 1
labels:
- "ci"
@ -38,7 +38,7 @@ updates:
directory: "/"
schedule:
interval: daily
target-branch: main
target-branch: master
open-pull-requests-limit: 1
labels:
- "ci"

.github/labeler.yml — 301 lines changed

@ -36,6 +36,145 @@
- any-glob-to-any-file:
- "src/channels/**"
"channel:bluesky":
- changed-files:
- any-glob-to-any-file:
- "src/channels/bluesky.rs"
"channel:clawdtalk":
- changed-files:
- any-glob-to-any-file:
- "src/channels/clawdtalk.rs"
"channel:cli":
- changed-files:
- any-glob-to-any-file:
- "src/channels/cli.rs"
"channel:dingtalk":
- changed-files:
- any-glob-to-any-file:
- "src/channels/dingtalk.rs"
"channel:discord":
- changed-files:
- any-glob-to-any-file:
- "src/channels/discord.rs"
- "src/channels/discord_history.rs"
"channel:email":
- changed-files:
- any-glob-to-any-file:
- "src/channels/email_channel.rs"
- "src/channels/gmail_push.rs"
"channel:imessage":
- changed-files:
- any-glob-to-any-file:
- "src/channels/imessage.rs"
"channel:irc":
- changed-files:
- any-glob-to-any-file:
- "src/channels/irc.rs"
"channel:lark":
- changed-files:
- any-glob-to-any-file:
- "src/channels/lark.rs"
"channel:linq":
- changed-files:
- any-glob-to-any-file:
- "src/channels/linq.rs"
"channel:matrix":
- changed-files:
- any-glob-to-any-file:
- "src/channels/matrix.rs"
"channel:mattermost":
- changed-files:
- any-glob-to-any-file:
- "src/channels/mattermost.rs"
"channel:mochat":
- changed-files:
- any-glob-to-any-file:
- "src/channels/mochat.rs"
"channel:mqtt":
- changed-files:
- any-glob-to-any-file:
- "src/channels/mqtt.rs"
"channel:nextcloud-talk":
- changed-files:
- any-glob-to-any-file:
- "src/channels/nextcloud_talk.rs"
"channel:nostr":
- changed-files:
- any-glob-to-any-file:
- "src/channels/nostr.rs"
"channel:notion":
- changed-files:
- any-glob-to-any-file:
- "src/channels/notion.rs"
"channel:qq":
- changed-files:
- any-glob-to-any-file:
- "src/channels/qq.rs"
"channel:reddit":
- changed-files:
- any-glob-to-any-file:
- "src/channels/reddit.rs"
"channel:signal":
- changed-files:
- any-glob-to-any-file:
- "src/channels/signal.rs"
"channel:slack":
- changed-files:
- any-glob-to-any-file:
- "src/channels/slack.rs"
"channel:telegram":
- changed-files:
- any-glob-to-any-file:
- "src/channels/telegram.rs"
"channel:twitter":
- changed-files:
- any-glob-to-any-file:
- "src/channels/twitter.rs"
"channel:wati":
- changed-files:
- any-glob-to-any-file:
- "src/channels/wati.rs"
"channel:webhook":
- changed-files:
- any-glob-to-any-file:
- "src/channels/webhook.rs"
"channel:wecom":
- changed-files:
- any-glob-to-any-file:
- "src/channels/wecom.rs"
"channel:whatsapp":
- changed-files:
- any-glob-to-any-file:
- "src/channels/whatsapp.rs"
- "src/channels/whatsapp_storage.rs"
- "src/channels/whatsapp_web.rs"
"gateway":
- changed-files:
- any-glob-to-any-file:
@ -101,6 +240,73 @@
- any-glob-to-any-file:
- "src/providers/**"
"provider:anthropic":
- changed-files:
- any-glob-to-any-file:
- "src/providers/anthropic.rs"
"provider:azure-openai":
- changed-files:
- any-glob-to-any-file:
- "src/providers/azure_openai.rs"
"provider:bedrock":
- changed-files:
- any-glob-to-any-file:
- "src/providers/bedrock.rs"
"provider:claude-code":
- changed-files:
- any-glob-to-any-file:
- "src/providers/claude_code.rs"
"provider:compatible":
- changed-files:
- any-glob-to-any-file:
- "src/providers/compatible.rs"
"provider:copilot":
- changed-files:
- any-glob-to-any-file:
- "src/providers/copilot.rs"
"provider:gemini":
- changed-files:
- any-glob-to-any-file:
- "src/providers/gemini.rs"
- "src/providers/gemini_cli.rs"
"provider:glm":
- changed-files:
- any-glob-to-any-file:
- "src/providers/glm.rs"
"provider:kilocli":
- changed-files:
- any-glob-to-any-file:
- "src/providers/kilocli.rs"
"provider:ollama":
- changed-files:
- any-glob-to-any-file:
- "src/providers/ollama.rs"
"provider:openai":
- changed-files:
- any-glob-to-any-file:
- "src/providers/openai.rs"
- "src/providers/openai_codex.rs"
"provider:openrouter":
- changed-files:
- any-glob-to-any-file:
- "src/providers/openrouter.rs"
"provider:telnyx":
- changed-files:
- any-glob-to-any-file:
- "src/providers/telnyx.rs"
"service":
- changed-files:
- any-glob-to-any-file:
@ -121,6 +327,101 @@
- any-glob-to-any-file:
- "src/tools/**"
"tool:browser":
- changed-files:
- any-glob-to-any-file:
- "src/tools/browser.rs"
- "src/tools/browser_delegate.rs"
- "src/tools/browser_open.rs"
- "src/tools/text_browser.rs"
- "src/tools/screenshot.rs"
"tool:composio":
- changed-files:
- any-glob-to-any-file:
- "src/tools/composio.rs"
"tool:cron":
- changed-files:
- any-glob-to-any-file:
- "src/tools/cron_add.rs"
- "src/tools/cron_list.rs"
- "src/tools/cron_remove.rs"
- "src/tools/cron_run.rs"
- "src/tools/cron_runs.rs"
- "src/tools/cron_update.rs"
"tool:file":
- changed-files:
- any-glob-to-any-file:
- "src/tools/file_edit.rs"
- "src/tools/file_read.rs"
- "src/tools/file_write.rs"
- "src/tools/glob_search.rs"
- "src/tools/content_search.rs"
"tool:google-workspace":
- changed-files:
- any-glob-to-any-file:
- "src/tools/google_workspace.rs"
"tool:mcp":
- changed-files:
- any-glob-to-any-file:
- "src/tools/mcp_client.rs"
- "src/tools/mcp_deferred.rs"
- "src/tools/mcp_protocol.rs"
- "src/tools/mcp_tool.rs"
- "src/tools/mcp_transport.rs"
"tool:memory":
- changed-files:
- any-glob-to-any-file:
- "src/tools/memory_forget.rs"
- "src/tools/memory_recall.rs"
- "src/tools/memory_store.rs"
"tool:microsoft365":
- changed-files:
- any-glob-to-any-file:
- "src/tools/microsoft365/**"
"tool:shell":
- changed-files:
- any-glob-to-any-file:
- "src/tools/shell.rs"
- "src/tools/node_tool.rs"
- "src/tools/cli_discovery.rs"
"tool:sop":
- changed-files:
- any-glob-to-any-file:
- "src/tools/sop_advance.rs"
- "src/tools/sop_approve.rs"
- "src/tools/sop_execute.rs"
- "src/tools/sop_list.rs"
- "src/tools/sop_status.rs"
"tool:web":
- changed-files:
- any-glob-to-any-file:
- "src/tools/web_fetch.rs"
- "src/tools/web_search_tool.rs"
- "src/tools/web_search_provider_routing.rs"
- "src/tools/http_request.rs"
"tool:security":
- changed-files:
- any-glob-to-any-file:
- "src/tools/security_ops.rs"
- "src/tools/verifiable_intent.rs"
"tool:cloud":
- changed-files:
- any-glob-to-any-file:
- "src/tools/cloud_ops.rs"
- "src/tools/cloud_patterns.rs"
"tunnel":
- changed-files:
- any-glob-to-any-file:

@ -2,7 +2,7 @@
Describe this PR in 2-5 bullets:
- Base branch target (`main` or `dev`; direct `main` PRs are allowed):
- Base branch target (`master` for all contributions):
- Problem:
- Why it matters:
- What changed:

@ -1,39 +0,0 @@
{
"schema_version": "zeroclaw.canary-policy.v1",
"release_channel": "stable",
"observation_window_minutes": 60,
"minimum_sample_size": 500,
"cohorts": [
{
"name": "canary-5pct",
"traffic_percent": 5,
"duration_minutes": 20
},
{
"name": "canary-20pct",
"traffic_percent": 20,
"duration_minutes": 20
},
{
"name": "canary-50pct",
"traffic_percent": 50,
"duration_minutes": 20
},
{
"name": "canary-100pct",
"traffic_percent": 100,
"duration_minutes": 60
}
],
"observability_signals": [
"error_rate",
"crash_rate",
"p95_latency_ms",
"sample_size"
],
"thresholds": {
"max_error_rate": 0.02,
"max_crash_rate": 0.01,
"max_p95_latency_ms": 1200
}
}

@ -1,10 +0,0 @@
{
"schema_version": "zeroclaw.docs-deploy-policy.v1",
"production_branch": "main",
"allow_manual_production_dispatch": true,
"require_preview_evidence_on_manual_production": true,
"allow_manual_rollback_dispatch": true,
"rollback_ref_must_be_ancestor_of_production_branch": true,
"docs_preview_retention_days": 14,
"docs_guard_artifact_retention_days": 21
}

@ -1,18 +0,0 @@
{
"schema_version": "zeroclaw.ghcr-tag-policy.v1",
"release_tag_regex": "^v[0-9]+\\.[0-9]+\\.[0-9]+$",
"sha_tag_prefix": "sha-",
"sha_tag_length": 12,
"latest_tag": "latest",
"require_latest_on_release": true,
"immutable_tag_classes": [
"release",
"sha"
],
"rollback_priority": [
"sha",
"release"
],
"contract_artifact_retention_days": 21,
"scan_artifact_retention_days": 14
}

@ -1,17 +0,0 @@
{
"schema_version": "zeroclaw.ghcr-vulnerability-policy.v1",
"required_tag_classes": [
"release",
"sha",
"latest"
],
"blocking_severities": [
"HIGH",
"CRITICAL"
],
"max_blocking_findings_per_tag": 0,
"require_blocking_count_parity": true,
"require_artifact_id_parity": true,
"scan_artifact_retention_days": 14,
"audit_artifact_retention_days": 21
}

@ -1,9 +0,0 @@
{
"schema_version": "zeroclaw.nightly-owner-routing.v1",
"owners": {
"default": "@chumyin",
"whatsapp-web": "@chumyin",
"browser-native": "@chumyin",
"nightly-all-features": "@chumyin"
}
}

@ -1,33 +0,0 @@
{
"schema_version": "zeroclaw.prerelease-stage-gates.v1",
"stage_order": ["alpha", "beta", "rc", "stable"],
"required_previous_stage": {
"beta": "alpha",
"rc": "beta",
"stable": "rc"
},
"required_checks": {
"alpha": [
"CI Required Gate",
"Security Audit"
],
"beta": [
"CI Required Gate",
"Security Audit",
"Feature Matrix Summary"
],
"rc": [
"CI Required Gate",
"Security Audit",
"Feature Matrix Summary",
"Nightly Summary & Routing"
],
"stable": [
"CI Required Gate",
"Security Audit",
"Feature Matrix Summary",
"Verify Artifact Set",
"Nightly Summary & Routing"
]
}
}

@ -1,30 +0,0 @@
{
"schema_version": "zeroclaw.release-artifact-contract.v1",
"release_archive_patterns": [
"zeroclaw-x86_64-unknown-linux-gnu.tar.gz",
"zeroclaw-x86_64-unknown-linux-musl.tar.gz",
"zeroclaw-aarch64-unknown-linux-gnu.tar.gz",
"zeroclaw-aarch64-unknown-linux-musl.tar.gz",
"zeroclaw-armv7-unknown-linux-gnueabihf.tar.gz",
"zeroclaw-armv7-linux-androideabi.tar.gz",
"zeroclaw-aarch64-linux-android.tar.gz",
"zeroclaw-x86_64-unknown-freebsd.tar.gz",
"zeroclaw-x86_64-apple-darwin.tar.gz",
"zeroclaw-aarch64-apple-darwin.tar.gz",
"zeroclaw-x86_64-pc-windows-msvc.zip"
],
"required_manifest_files": [
"release-manifest.json",
"release-manifest.md",
"SHA256SUMS"
],
"required_sbom_files": [
"zeroclaw.cdx.json",
"zeroclaw.spdx.json"
],
"required_notice_files": [
"LICENSE-APACHE",
"LICENSE-MIT",
"NOTICE"
]
}

@ -1,33 +0,0 @@
{
"schema_version": "zeroclaw.deny-governance.v1",
"advisories": [
{
"id": "RUSTSEC-2025-0141",
"owner": "repo-maintainers",
"reason": "Transitive via probe-rs in current release path; tracked for replacement when probe-rs updates.",
"ticket": "RMN-21",
"expires_on": "2026-12-31"
},
{
"id": "RUSTSEC-2024-0384",
"owner": "repo-maintainers",
"reason": "Upstream rust-nostr advisory mitigation is still in progress; monitor until released fix lands.",
"ticket": "RMN-21",
"expires_on": "2026-12-31"
},
{
"id": "RUSTSEC-2024-0388",
"owner": "repo-maintainers",
"reason": "Transitive via matrix-sdk indexeddb dependency chain in current matrix release line; track removal when upstream drops derivative.",
"ticket": "RMN-21",
"expires_on": "2026-12-31"
},
{
"id": "RUSTSEC-2024-0436",
"owner": "repo-maintainers",
"reason": "Transitive via wasmtime dependency stack; tracked until upstream removes or replaces paste.",
"ticket": "RMN-21",
"expires_on": "2026-12-31"
}
]
}

@ -1,56 +0,0 @@
{
"schema_version": "zeroclaw.secrets-governance.v1",
"paths": [
{
"pattern": "src/security/leak_detector\\.rs",
"owner": "repo-maintainers",
"reason": "Fixture patterns are intentionally embedded for regression tests in leak detector logic.",
"ticket": "RMN-13",
"expires_on": "2026-12-31"
},
{
"pattern": "src/agent/loop_\\.rs",
"owner": "repo-maintainers",
"reason": "Contains escaped template snippets used for command orchestration and parser coverage.",
"ticket": "RMN-13",
"expires_on": "2026-12-31"
},
{
"pattern": "src/security/secrets\\.rs",
"owner": "repo-maintainers",
"reason": "Contains detector test vectors and redaction examples required for secret scanning tests.",
"ticket": "RMN-13",
"expires_on": "2026-12-31"
},
{
"pattern": "docs/(i18n/vi/|vi/)?zai-glm-setup\\.md",
"owner": "repo-maintainers",
"reason": "Documentation contains literal environment variable placeholders for onboarding commands.",
"ticket": "RMN-13",
"expires_on": "2026-12-31"
},
{
"pattern": "\\.github/workflows/pub-release\\.yml",
"owner": "repo-maintainers",
"reason": "Release workflow emits masked authorization header examples during registry smoke checks.",
"ticket": "RMN-13",
"expires_on": "2026-12-31"
}
],
"regexes": [
{
"pattern": "Authorization: Bearer \\$\\{[^}]+\\}",
"owner": "repo-maintainers",
"reason": "Intentional placeholder used in docs/workflow snippets for safe header examples.",
"ticket": "RMN-13",
"expires_on": "2026-12-31"
},
{
"pattern": "curl -sS -o /tmp/ghcr-release-manifest\\.json -w \"%\\{http_code\\}\"",
"owner": "repo-maintainers",
"reason": "Release smoke command string is non-secret telemetry and should not be flagged as credential leakage.",
"ticket": "RMN-13",
"expires_on": "2026-12-31"
}
]
}

@ -1,5 +0,0 @@
{
"schema_version": "zeroclaw.unsafe-audit-governance.v1",
"ignore_paths": [],
"ignore_pattern_ids": []
}

@ -10,26 +10,8 @@ Subdirectories are not valid locations for workflow entry files.
Repository convention:
1. Keep runnable workflow entry files at `.github/workflows/` root.
2. Keep workflow-only helper scripts under `.github/workflows/scripts/`.
3. Keep cross-tooling/local CI scripts under `scripts/ci/` when they are used outside Actions.
2. Keep cross-tooling/local CI scripts under `dev/` or `scripts/ci/` when used outside Actions.
Workflow behavior documentation in this directory:
- `.github/workflows/main-branch-flow.md`
Current workflow helper scripts:
- `.github/workflows/scripts/ci_license_file_owner_guard.js`
- `.github/workflows/scripts/lint_feedback.js`
- `.github/workflows/scripts/pr_auto_response_contributor_tier.js`
- `.github/workflows/scripts/pr_auto_response_labeled_routes.js`
- `.github/workflows/scripts/pr_check_status_nudge.js`
- `.github/workflows/scripts/pr_intake_checks.js`
- `.github/workflows/scripts/pr_labeler.js`
- `.github/workflows/scripts/test_benchmarks_pr_comment.js`
Release/CI policy assets introduced for advanced delivery lanes:
- `.github/release/nightly-owner-routing.json`
- `.github/release/canary-policy.json`
- `.github/release/prerelease-stage-gates.json`
- `.github/workflows/master-branch-flow.md`

.github/workflows/checks-on-pr.yml — new file, 175 lines

@ -0,0 +1,175 @@
name: Quality Gate
on:
pull_request:
branches: [master]
concurrency:
group: checks-${{ github.event.pull_request.number }}
cancel-in-progress: true
permissions:
contents: read
env:
CARGO_TERM_COLOR: always
CARGO_INCREMENTAL: 0
jobs:
lint:
name: Lint
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
components: rustfmt, clippy
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
- name: Ensure web/dist placeholder exists
run: mkdir -p web/dist && touch web/dist/.gitkeep
- name: Check formatting
run: cargo fmt --all -- --check
- name: Clippy
run: cargo clippy --all-targets -- -D warnings
test:
name: Test
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
- name: Ensure web/dist placeholder exists
run: mkdir -p web/dist && touch web/dist/.gitkeep
- name: Install mold linker
run: |
sudo apt-get update -qq
sudo apt-get install -y mold
- name: Install cargo-nextest
run: curl -LsSf https://get.nexte.st/latest/linux | tar zxf - -C ${CARGO_HOME:-~/.cargo}/bin
- name: Run tests
run: cargo nextest run --locked
env:
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER: clang
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUSTFLAGS: "-C link-arg=-fuse-ld=mold"
build:
name: Build ${{ matrix.target }}
runs-on: ${{ matrix.os }}
timeout-minutes: 40
strategy:
fail-fast: false
matrix:
include:
- os: ubuntu-latest
target: x86_64-unknown-linux-gnu
- os: macos-14
target: aarch64-apple-darwin
- os: windows-latest
target: x86_64-pc-windows-msvc
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
targets: ${{ matrix.target }}
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
if: runner.os != 'Windows'
- name: Install mold linker
if: runner.os == 'Linux'
run: |
sudo apt-get update -qq
sudo apt-get install -y mold
- name: Ensure web/dist placeholder exists
shell: bash
run: mkdir -p web/dist && touch web/dist/.gitkeep
- name: Build release
shell: bash
run: cargo build --profile ci --locked --target ${{ matrix.target }}
env:
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER: clang
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUSTFLAGS: "-C link-arg=-fuse-ld=mold"
security:
name: Security Audit
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
- name: Install cargo-audit
run: cargo install cargo-audit --locked
- name: Install cargo-deny
run: cargo install cargo-deny --locked
- name: Audit dependencies
run: cargo audit
- name: Check licenses and sources
run: cargo deny check licenses sources
check-32bit:
name: "Check (32-bit)"
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
targets: i686-unknown-linux-gnu
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
- name: Install 32-bit libs
run: sudo apt-get update && sudo apt-get install -y gcc-multilib
- name: Ensure web/dist placeholder exists
run: mkdir -p web/dist && touch web/dist/.gitkeep
- name: Cargo check (32-bit, no default features)
run: cargo check --target i686-unknown-linux-gnu --no-default-features
# Composite status check — branch protection only needs to require this
# single job instead of tracking every matrix leg individually.
gate:
name: CI Required Gate
if: always()
needs: [lint, test, build, security, check-32bit]
runs-on: ubuntu-latest
steps:
- name: Check upstream job results
run: |
if [[ "${{ contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') }}" == "true" ]]; then
echo "::error::One or more upstream jobs failed or were cancelled"
exit 1
fi
security-gate:
name: Security Required Gate
if: always()
needs: [security]
runs-on: ubuntu-latest
steps:
- name: Check security job result
run: |
if [[ "${{ needs.security.result }}" != "success" ]]; then
echo "::error::Security audit failed or was cancelled"
exit 1
fi

@ -1,330 +0,0 @@
name: CI Canary Gate
on:
workflow_dispatch:
inputs:
mode:
description: "dry-run computes decision only; execute enables canary dispatch"
required: true
default: dry-run
type: choice
options:
- dry-run
- execute
candidate_tag:
description: "Candidate release tag (e.g. v0.1.8-rc.1 or v0.1.8)"
required: false
default: ""
type: string
candidate_sha:
description: "Optional explicit candidate SHA"
required: false
default: ""
type: string
error_rate:
description: "Observed canary error rate (0.0-1.0)"
required: true
default: "0.0"
type: string
crash_rate:
description: "Observed canary crash rate (0.0-1.0)"
required: true
default: "0.0"
type: string
p95_latency_ms:
description: "Observed canary p95 latency in milliseconds"
required: true
default: "0"
type: string
sample_size:
description: "Observed canary sample size"
required: true
default: "0"
type: string
emit_repository_dispatch:
description: "Emit canary decision repository_dispatch event"
required: true
default: false
type: boolean
trigger_rollback_on_abort:
description: "Automatically dispatch CI Rollback Guard when canary decision is abort"
required: true
default: true
type: boolean
rollback_branch:
description: "Rollback integration branch used by CI Rollback Guard dispatch"
required: true
default: dev
type: choice
options:
- dev
- main
rollback_target_ref:
description: "Optional explicit rollback target ref passed to CI Rollback Guard"
required: false
default: ""
type: string
fail_on_violation:
description: "Fail on policy violations"
required: true
default: true
type: boolean
schedule:
- cron: "45 7 * * 1" # Weekly Monday 07:45 UTC
concurrency:
group: canary-gate-${{ github.event.inputs.candidate_tag || github.ref || github.run_id }}
cancel-in-progress: false
permissions:
contents: read
actions: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
canary-plan:
name: Canary Plan
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 20
outputs:
mode: ${{ steps.inputs.outputs.mode }}
candidate_tag: ${{ steps.inputs.outputs.candidate_tag }}
candidate_sha: ${{ steps.inputs.outputs.candidate_sha }}
trigger_rollback_on_abort: ${{ steps.inputs.outputs.trigger_rollback_on_abort }}
rollback_branch: ${{ steps.inputs.outputs.rollback_branch }}
rollback_target_ref: ${{ steps.inputs.outputs.rollback_target_ref }}
decision: ${{ steps.extract.outputs.decision }}
ready_to_execute: ${{ steps.extract.outputs.ready_to_execute }}
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- name: Resolve canary inputs
id: inputs
shell: bash
run: |
set -euo pipefail
mode="dry-run"
candidate_tag=""
candidate_sha=""
error_rate="0.0"
crash_rate="0.0"
p95_latency_ms="0"
sample_size="0"
trigger_rollback_on_abort="true"
rollback_branch="dev"
rollback_target_ref=""
# Scheduled audits may not have live canary telemetry; report violations without failing by default.
fail_on_violation="false"
if [ "${GITHUB_EVENT_NAME}" = "workflow_dispatch" ]; then
mode="${{ github.event.inputs.mode || 'dry-run' }}"
candidate_tag="${{ github.event.inputs.candidate_tag || '' }}"
candidate_sha="${{ github.event.inputs.candidate_sha || '' }}"
error_rate="${{ github.event.inputs.error_rate || '0.0' }}"
crash_rate="${{ github.event.inputs.crash_rate || '0.0' }}"
p95_latency_ms="${{ github.event.inputs.p95_latency_ms || '0' }}"
sample_size="${{ github.event.inputs.sample_size || '0' }}"
trigger_rollback_on_abort="${{ github.event.inputs.trigger_rollback_on_abort || 'true' }}"
rollback_branch="${{ github.event.inputs.rollback_branch || 'dev' }}"
rollback_target_ref="${{ github.event.inputs.rollback_target_ref || '' }}"
fail_on_violation="${{ github.event.inputs.fail_on_violation || 'true' }}"
else
git fetch --tags --force origin
candidate_tag="$(git tag --list 'v*' --sort=-version:refname | head -n1)"
if [ -n "$candidate_tag" ]; then
candidate_sha="$(git rev-parse "${candidate_tag}^{commit}")"
fi
fi
{
echo "mode=${mode}"
echo "candidate_tag=${candidate_tag}"
echo "candidate_sha=${candidate_sha}"
echo "error_rate=${error_rate}"
echo "crash_rate=${crash_rate}"
echo "p95_latency_ms=${p95_latency_ms}"
echo "sample_size=${sample_size}"
echo "trigger_rollback_on_abort=${trigger_rollback_on_abort}"
echo "rollback_branch=${rollback_branch}"
echo "rollback_target_ref=${rollback_target_ref}"
echo "fail_on_violation=${fail_on_violation}"
} >> "$GITHUB_OUTPUT"
- name: Run canary guard
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
args=()
if [ "${{ steps.inputs.outputs.fail_on_violation }}" = "true" ]; then
args+=(--fail-on-violation)
fi
python3 scripts/ci/canary_guard.py \
--policy-file .github/release/canary-policy.json \
--candidate-tag "${{ steps.inputs.outputs.candidate_tag }}" \
--candidate-sha "${{ steps.inputs.outputs.candidate_sha }}" \
--mode "${{ steps.inputs.outputs.mode }}" \
--error-rate "${{ steps.inputs.outputs.error_rate }}" \
--crash-rate "${{ steps.inputs.outputs.crash_rate }}" \
--p95-latency-ms "${{ steps.inputs.outputs.p95_latency_ms }}" \
--sample-size "${{ steps.inputs.outputs.sample_size }}" \
--output-json artifacts/canary-guard.json \
--output-md artifacts/canary-guard.md \
"${args[@]}"
- name: Extract canary decision outputs
id: extract
shell: bash
run: |
set -euo pipefail
decision="$(python3 - <<'PY'
import json
data = json.load(open('artifacts/canary-guard.json', encoding='utf-8'))
print(data.get('decision', 'hold'))
PY
)"
ready_to_execute="$(python3 - <<'PY'
import json
data = json.load(open('artifacts/canary-guard.json', encoding='utf-8'))
print(str(bool(data.get('ready_to_execute', False))).lower())
PY
)"
echo "decision=${decision}" >> "$GITHUB_OUTPUT"
echo "ready_to_execute=${ready_to_execute}" >> "$GITHUB_OUTPUT"
- name: Emit canary audit event
if: always()
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/emit_audit_event.py \
--event-type canary_guard \
--input-json artifacts/canary-guard.json \
--output-json artifacts/audit-event-canary-guard.json \
--artifact-name canary-guard \
--retention-days 21
- name: Publish canary summary
if: always()
shell: bash
run: |
set -euo pipefail
cat artifacts/canary-guard.md >> "$GITHUB_STEP_SUMMARY"
- name: Upload canary artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: canary-guard
path: |
artifacts/canary-guard.json
artifacts/canary-guard.md
artifacts/audit-event-canary-guard.json
if-no-files-found: error
retention-days: 21
canary-execute:
name: Canary Execute
needs: [canary-plan]
if: github.event_name == 'workflow_dispatch' && needs.canary-plan.outputs.mode == 'execute' && needs.canary-plan.outputs.ready_to_execute == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 10
permissions:
contents: write
actions: write
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Create canary marker tag
shell: bash
run: |
set -euo pipefail
marker_tag="canary-${{ needs.canary-plan.outputs.candidate_tag }}-${{ github.run_id }}"
git fetch --tags --force origin
git tag -a "$marker_tag" "${{ needs.canary-plan.outputs.candidate_sha }}" -m "Canary decision marker from run ${{ github.run_id }}"
git push origin "$marker_tag"
echo "Created marker tag: $marker_tag" >> "$GITHUB_STEP_SUMMARY"
- name: Emit canary repository dispatch
if: github.event.inputs.emit_repository_dispatch == 'true'
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
with:
script: |
await github.rest.repos.createDispatchEvent({
owner: context.repo.owner,
repo: context.repo.repo,
event_type: `canary_${{ needs.canary-plan.outputs.decision }}`,
client_payload: {
candidate_tag: "${{ needs.canary-plan.outputs.candidate_tag }}",
candidate_sha: "${{ needs.canary-plan.outputs.candidate_sha }}",
decision: "${{ needs.canary-plan.outputs.decision }}",
run_id: context.runId,
run_attempt: process.env.GITHUB_RUN_ATTEMPT,
source_sha: context.sha
}
});
- name: Trigger rollback guard workflow on abort
if: needs.canary-plan.outputs.decision == 'abort' && needs.canary-plan.outputs.trigger_rollback_on_abort == 'true'
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
with:
script: |
const rollbackBranch = "${{ needs.canary-plan.outputs.rollback_branch }}" || "dev";
const rollbackTargetRef = `${{ needs.canary-plan.outputs.rollback_target_ref }}`.trim();
const workflowRef = process.env.GITHUB_REF_NAME || "dev";
const inputs = {
branch: rollbackBranch,
mode: "execute",
allow_non_ancestor: "false",
fail_on_violation: "true",
create_marker_tag: "true",
emit_repository_dispatch: "true",
};
if (rollbackTargetRef.length > 0) {
inputs.target_ref = rollbackTargetRef;
}
await github.rest.actions.createWorkflowDispatch({
owner: context.repo.owner,
repo: context.repo.repo,
workflow_id: "ci-rollback.yml",
ref: workflowRef,
inputs,
});
- name: Publish rollback trigger summary
if: needs.canary-plan.outputs.decision == 'abort'
shell: bash
run: |
set -euo pipefail
if [ "${{ needs.canary-plan.outputs.trigger_rollback_on_abort }}" = "true" ]; then
{
echo "### Canary Abort Rollback Trigger"
echo "- CI Rollback Guard dispatch: triggered"
echo "- Rollback branch: \`${{ needs.canary-plan.outputs.rollback_branch }}\`"
if [ -n "${{ needs.canary-plan.outputs.rollback_target_ref }}" ]; then
echo "- Rollback target ref: \`${{ needs.canary-plan.outputs.rollback_target_ref }}\`"
else
echo "- Rollback target ref: _auto (latest release tag strategy)_"
fi
} >> "$GITHUB_STEP_SUMMARY"
else
{
echo "### Canary Abort Rollback Trigger"
echo "- CI Rollback Guard dispatch: skipped (trigger_rollback_on_abort=false)"
} >> "$GITHUB_STEP_SUMMARY"
fi

@ -1,160 +0,0 @@
name: CI/CD Change Audit
on:
pull_request:
branches: [dev, main]
paths:
- ".github/workflows/**"
- ".github/release/**"
- ".github/codeql/**"
- "scripts/ci/**"
- ".github/dependabot.yml"
- "deny.toml"
- ".gitleaks.toml"
push:
branches: [dev, main]
paths:
- ".github/workflows/**"
- ".github/release/**"
- ".github/codeql/**"
- "scripts/ci/**"
- ".github/dependabot.yml"
- "deny.toml"
- ".gitleaks.toml"
workflow_dispatch:
inputs:
base_sha:
description: "Optional base SHA (default: HEAD~1)"
required: false
default: ""
type: string
fail_on_policy:
description: "Fail when audit policy violations are found"
required: true
default: true
type: boolean
concurrency:
group: ci-change-audit-${{ github.event.pull_request.number || github.sha || github.run_id }}
cancel-in-progress: true
permissions:
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
audit:
name: CI Change Audit
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
timeout-minutes: 15
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- name: Setup Python
shell: bash
run: |
set -euo pipefail
python3 --version
- name: Resolve base/head commits
id: refs
shell: bash
run: |
set -euo pipefail
head_sha="$(git rev-parse HEAD)"
if [ "${GITHUB_EVENT_NAME}" = "pull_request" ]; then
# For pull_request events, checkout uses refs/pull/*/merge; HEAD^1 is the
# effective base commit for this synthesized merge and avoids stale base.sha.
if git rev-parse --verify HEAD^1 >/dev/null 2>&1; then
base_sha="$(git rev-parse HEAD^1)"
else
base_sha="${{ github.event.pull_request.base.sha }}"
fi
elif [ "${GITHUB_EVENT_NAME}" = "push" ]; then
base_sha="${{ github.event.before }}"
else
base_sha="${{ github.event.inputs.base_sha || '' }}"
if [ -z "$base_sha" ]; then
base_sha="$(git rev-parse HEAD~1)"
fi
fi
echo "base_sha=$base_sha" >> "$GITHUB_OUTPUT"
echo "head_sha=$head_sha" >> "$GITHUB_OUTPUT"
- name: Run CI helper script unit tests
shell: bash
run: |
set -euo pipefail
python3 -m unittest discover -s scripts/ci/tests -p 'test_*.py' -v
- name: Generate CI change audit
shell: bash
env:
BASE_SHA: ${{ steps.refs.outputs.base_sha }}
HEAD_SHA: ${{ steps.refs.outputs.head_sha }}
run: |
set -euo pipefail
mkdir -p artifacts
fail_on_policy="true"
if [ "${GITHUB_EVENT_NAME}" = "workflow_dispatch" ]; then
fail_on_policy="${{ github.event.inputs.fail_on_policy || 'true' }}"
fi
cmd=(python3 scripts/ci/ci_change_audit.py
--base-sha "$BASE_SHA"
--head-sha "$HEAD_SHA"
--output-json artifacts/ci-change-audit.json
--output-md artifacts/ci-change-audit.md)
if [ "$fail_on_policy" = "true" ]; then
cmd+=(--fail-on-violations)
fi
"${cmd[@]}"
- name: Emit normalized audit event
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/ci-change-audit.json ]; then
python3 scripts/ci/emit_audit_event.py \
--event-type ci_change_audit \
--input-json artifacts/ci-change-audit.json \
--output-json artifacts/audit-event-ci-change-audit.json \
--artifact-name ci-change-audit-event \
--retention-days 14
fi
- name: Upload audit artifact
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
if: always()
with:
name: ci-change-audit
path: artifacts/ci-change-audit.*
retention-days: 14
- name: Publish audit summary
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/ci-change-audit.md ]; then
cat artifacts/ci-change-audit.md >> "$GITHUB_STEP_SUMMARY"
else
echo "CI change audit report was not generated." >> "$GITHUB_STEP_SUMMARY"
fi
- name: Upload audit event artifact
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
if: always()
with:
name: ci-change-audit-event
path: artifacts/audit-event-ci-change-audit.json
if-no-files-found: ignore
retention-days: 14

@ -1,88 +0,0 @@
---
name: Post-Release Validation
on:
release:
types: ["published"]
permissions:
contents: read
jobs:
validate:
name: Validate Published Release
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 15
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Download and verify release assets
shell: bash
env:
RELEASE_TAG: ${{ github.event.release.tag_name }}
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
echo "Validating release: ${RELEASE_TAG}"
# 1. Check release exists and is not draft
release_json="$(gh api \
"repos/${GITHUB_REPOSITORY}/releases/tags/${RELEASE_TAG}")"
is_draft="$(echo "$release_json" \
| python3 -c "import sys,json; print(json.load(sys.stdin)['draft'])")"
if [ "$is_draft" = "True" ]; then
echo "::warning::Release ${RELEASE_TAG} is still in draft."
fi
# 2. Check expected assets against artifact contract
asset_count="$(echo "$release_json" \
| python3 -c "import sys,json; print(len(json.load(sys.stdin)['assets']))")"
contract=".github/release/release-artifact-contract.json"
expected_count="$(python3 -c "
import json
c = json.load(open('$contract'))
total = sum(len(c[k]) for k in c if k != 'schema_version')
print(total)
")"
echo "Release has ${asset_count} assets (contract expects ${expected_count})"
if [ "$asset_count" -lt "$expected_count" ]; then
echo "::error::Expected >=${expected_count} release assets (from ${contract}), found ${asset_count}"
exit 1
fi
# 3. Download checksum file and one archive
gh release download "${RELEASE_TAG}" \
--pattern "SHA256SUMS" \
--dir /tmp/release-check
gh release download "${RELEASE_TAG}" \
--pattern "zeroclaw-x86_64-unknown-linux-gnu.tar.gz" \
--dir /tmp/release-check
# 4. Verify checksum
cd /tmp/release-check
if sha256sum --check --ignore-missing SHA256SUMS; then
echo "SHA256 checksum verification: passed"
else
echo "::error::SHA256 checksum verification failed"
exit 1
fi
# 5. Extract binary
tar xzf zeroclaw-x86_64-unknown-linux-gnu.tar.gz
- name: Smoke-test release binary
shell: bash
env:
RELEASE_TAG: ${{ github.event.release.tag_name }}
run: |
set -euo pipefail
cd /tmp/release-check
if ./zeroclaw --version | grep -Fq "${RELEASE_TAG#v}"; then
echo "Binary version check: passed (${RELEASE_TAG})"
else
actual="$(./zeroclaw --version)"
echo "::error::Binary --version mismatch: ${actual}"
exit 1
fi
echo "Post-release validation: all checks passed"

@ -1,112 +0,0 @@
name: CI Provider Connectivity
on:
schedule:
- cron: "30 */6 * * *" # Every 6 hours
workflow_dispatch:
inputs:
fail_on_critical:
description: "Fail run when critical endpoints are unreachable"
required: true
default: false
type: boolean
pull_request:
branches: [dev, main]
paths:
- ".github/workflows/ci-provider-connectivity.yml"
- ".github/connectivity/providers.json"
- "scripts/ci/provider_connectivity_matrix.py"
push:
branches: [dev, main]
paths:
- ".github/workflows/ci-provider-connectivity.yml"
- ".github/connectivity/providers.json"
- "scripts/ci/provider_connectivity_matrix.py"
concurrency:
group: provider-connectivity-${{ github.event.pull_request.number || github.ref || github.run_id }}
cancel-in-progress: true
permissions:
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
probe:
name: Provider Connectivity Probe
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 20
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Run connectivity matrix probe
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
fail_on_critical="false"
case "${GITHUB_EVENT_NAME}" in
schedule)
fail_on_critical="true"
;;
workflow_dispatch)
fail_on_critical="${{ github.event.inputs.fail_on_critical || 'false' }}"
;;
esac
cmd=(python3 scripts/ci/provider_connectivity_matrix.py
--config .github/connectivity/providers.json
--output-json artifacts/provider-connectivity-matrix.json
--output-md artifacts/provider-connectivity-matrix.md)
if [ "$fail_on_critical" = "true" ]; then
cmd+=(--fail-on-critical)
fi
"${cmd[@]}"
- name: Emit normalized audit event
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/provider-connectivity-matrix.json ]; then
python3 scripts/ci/emit_audit_event.py \
--event-type provider_connectivity \
--input-json artifacts/provider-connectivity-matrix.json \
--output-json artifacts/audit-event-provider-connectivity.json \
--artifact-name provider-connectivity-audit-event \
--retention-days 14
fi
- name: Upload connectivity artifacts
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
if: always()
with:
name: provider-connectivity-matrix
path: artifacts/provider-connectivity-matrix.*
retention-days: 14
- name: Publish summary
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/provider-connectivity-matrix.md ]; then
cat artifacts/provider-connectivity-matrix.md >> "$GITHUB_STEP_SUMMARY"
else
echo "Provider connectivity report missing." >> "$GITHUB_STEP_SUMMARY"
fi
- name: Upload audit event artifact
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: provider-connectivity-audit-event
path: artifacts/audit-event-provider-connectivity.json
if-no-files-found: ignore
retention-days: 14

@ -1,152 +0,0 @@
name: CI Queue Hygiene
on:
schedule:
- cron: "*/5 * * * *"
workflow_dispatch:
inputs:
apply:
description: "Cancel selected queued runs (false = dry-run report only)"
required: true
default: false
type: boolean
status:
description: "Queued-run status scope"
required: true
default: queued
type: choice
options:
- queued
- in_progress
- requested
- waiting
max_cancel:
description: "Maximum runs to cancel in one execution"
required: true
default: "120"
type: string
concurrency:
group: ci-queue-hygiene
cancel-in-progress: false
permissions:
actions: write
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
hygiene:
name: Queue Hygiene
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
timeout-minutes: 15
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Run queue hygiene policy
id: hygiene
shell: bash
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
mkdir -p artifacts
status_scope="queued"
max_cancel="120"
apply_mode="true"
if [ "${GITHUB_EVENT_NAME}" = "workflow_dispatch" ]; then
status_scope="${{ github.event.inputs.status || 'queued' }}"
max_cancel="${{ github.event.inputs.max_cancel || '120' }}"
apply_mode="${{ github.event.inputs.apply || 'false' }}"
fi
cmd=(python3 scripts/ci/queue_hygiene.py
--repo "${{ github.repository }}"
--status "${status_scope}"
--max-cancel "${max_cancel}"
--dedupe-workflow "CI Run"
--dedupe-workflow "Test E2E"
--dedupe-workflow "Docs Deploy"
--dedupe-workflow "PR Intake Checks"
--dedupe-workflow "PR Labeler"
--dedupe-workflow "PR Auto Responder"
--dedupe-workflow "Workflow Sanity"
--dedupe-workflow "PR Label Policy Check"
--priority-branch-prefix "release/"
--dedupe-include-non-pr
--non-pr-key branch
--output-json artifacts/queue-hygiene-report.json
--verbose)
if [ "${apply_mode}" = "true" ]; then
cmd+=(--apply)
fi
"${cmd[@]}"
{
echo "status_scope=${status_scope}"
echo "max_cancel=${max_cancel}"
echo "apply_mode=${apply_mode}"
} >> "$GITHUB_OUTPUT"
- name: Publish queue hygiene summary
if: always()
shell: bash
run: |
set -euo pipefail
if [ ! -f artifacts/queue-hygiene-report.json ]; then
echo "Queue hygiene report not found." >> "$GITHUB_STEP_SUMMARY"
exit 0
fi
python3 - <<'PY'
from __future__ import annotations
import json
from pathlib import Path
report_path = Path("artifacts/queue-hygiene-report.json")
report = json.loads(report_path.read_text(encoding="utf-8"))
counts = report.get("counts", {})
results = report.get("results", {})
reasons = report.get("reason_counts", {})
lines = [
"### Queue Hygiene Report",
f"- Mode: `{report.get('mode', 'unknown')}`",
f"- Status scope: `{report.get('status_scope', 'queued')}`",
f"- Runs in scope: `{counts.get('runs_in_scope', 0)}`",
f"- Candidate runs before cap: `{counts.get('candidate_runs_before_cap', 0)}`",
f"- Candidate runs after cap: `{counts.get('candidate_runs_after_cap', 0)}`",
f"- Skipped by cap: `{counts.get('skipped_by_cap', 0)}`",
f"- Canceled: `{results.get('canceled', 0)}`",
f"- Cancel skipped (already terminal/conflict): `{results.get('skipped', 0)}`",
f"- Cancel failed: `{results.get('failed', 0)}`",
]
if reasons:
lines.append("")
lines.append("Reason counts:")
for reason, value in sorted(reasons.items()):
lines.append(f"- `{reason}`: `{value}`")
with Path("/tmp/queue-hygiene-summary.md").open("w", encoding="utf-8") as handle:
handle.write("\n".join(lines) + "\n")
PY
cat /tmp/queue-hygiene-summary.md >> "$GITHUB_STEP_SUMMARY"
- name: Upload queue hygiene report
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: queue-hygiene-report
path: artifacts/queue-hygiene-report.json
if-no-files-found: ignore
retention-days: 14

@ -1,149 +0,0 @@
name: CI Reproducible Build
on:
push:
branches: [dev, main]
paths:
- "Cargo.toml"
- "Cargo.lock"
- "src/**"
- "crates/**"
- "scripts/ci/ensure_c_toolchain.sh"
- "scripts/ci/ensure_cargo_component.sh"
- "scripts/ci/ensure_cc.sh"
- "scripts/ci/reproducible_build_check.sh"
- "scripts/ci/self_heal_rust_toolchain.sh"
- ".github/workflows/ci-reproducible-build.yml"
pull_request:
branches: [dev, main]
paths:
- "Cargo.toml"
- "Cargo.lock"
- "src/**"
- "crates/**"
- "scripts/ci/ensure_c_toolchain.sh"
- "scripts/ci/ensure_cargo_component.sh"
- "scripts/ci/ensure_cc.sh"
- "scripts/ci/reproducible_build_check.sh"
- "scripts/ci/self_heal_rust_toolchain.sh"
- ".github/workflows/ci-reproducible-build.yml"
schedule:
- cron: "45 5 * * 1" # Weekly Monday 05:45 UTC
workflow_dispatch:
inputs:
fail_on_drift:
description: "Fail workflow if deterministic hash drift is detected"
required: true
default: true
type: boolean
allow_build_id_drift:
description: "Treat GNU build-id-only drift as non-blocking"
required: true
default: true
type: boolean
concurrency:
group: repro-build-${{ github.event.pull_request.number || github.ref || github.run_id }}
cancel-in-progress: true
permissions:
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
CARGO_TERM_COLOR: always
jobs:
reproducibility:
name: Reproducible Build Probe
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 75
env:
CARGO_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/cargo
RUSTUP_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/rustup
CARGO_TARGET_DIR: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/target
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Self-heal Rust toolchain cache
shell: bash
run: ./scripts/ci/self_heal_rust_toolchain.sh 1.92.0
- name: Ensure C toolchain
shell: bash
run: bash ./scripts/ci/ensure_c_toolchain.sh
- name: Setup Rust
uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- name: Ensure C toolchain for Rust builds
run: ./scripts/ci/ensure_cc.sh
- name: Ensure cargo component
shell: bash
env:
ENSURE_CARGO_COMPONENT_STRICT: "true"
run: bash ./scripts/ci/ensure_cargo_component.sh 1.92.0
- name: Run reproducible build check
shell: bash
run: |
set -euo pipefail
fail_on_drift="false"
allow_build_id_drift="true"
if [ "${GITHUB_EVENT_NAME}" = "schedule" ]; then
fail_on_drift="true"
elif [ "${GITHUB_EVENT_NAME}" = "workflow_dispatch" ]; then
fail_on_drift="${{ github.event.inputs.fail_on_drift || 'true' }}"
allow_build_id_drift="${{ github.event.inputs.allow_build_id_drift || 'true' }}"
fi
FAIL_ON_DRIFT="$fail_on_drift" \
ALLOW_BUILD_ID_DRIFT="$allow_build_id_drift" \
OUTPUT_DIR="artifacts" \
./scripts/ci/reproducible_build_check.sh
- name: Emit normalized audit event
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/reproducible-build.json ]; then
python3 scripts/ci/emit_audit_event.py \
--event-type reproducible_build \
--input-json artifacts/reproducible-build.json \
--output-json artifacts/audit-event-reproducible-build.json \
--artifact-name reproducible-build-audit-event \
--retention-days 14
fi
- name: Upload reproducibility artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: reproducible-build
path: artifacts/reproducible-build*
retention-days: 14
- name: Upload audit event artifact
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: reproducible-build-audit-event
path: artifacts/audit-event-reproducible-build.json
if-no-files-found: ignore
retention-days: 14
- name: Publish summary
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/reproducible-build.md ]; then
cat artifacts/reproducible-build.md >> "$GITHUB_STEP_SUMMARY"
else
echo "Reproducible build report missing." >> "$GITHUB_STEP_SUMMARY"
fi


@@ -1,258 +0,0 @@
name: CI Rollback Guard
on:
workflow_dispatch:
inputs:
branch:
description: "Integration branch this rollback targets"
required: true
default: dev
type: choice
options:
- dev
- main
mode:
description: "dry-run only plans; execute enables rollback marker/dispatch actions"
required: true
default: dry-run
type: choice
options:
- dry-run
- execute
target_ref:
description: "Optional explicit rollback target (tag/sha/ref). Empty = latest matching tag."
required: false
default: ""
type: string
allow_non_ancestor:
description: "Allow target not being ancestor of current head (warning-only)"
required: true
default: false
type: boolean
fail_on_violation:
description: "Fail workflow when guard violations are detected"
required: true
default: true
type: boolean
create_marker_tag:
description: "In execute mode, create and push rollback marker tag"
required: true
default: false
type: boolean
emit_repository_dispatch:
description: "In execute mode, emit repository_dispatch event `rollback_execute`"
required: true
default: false
type: boolean
schedule:
- cron: "15 7 * * 1" # Weekly Monday 07:15 UTC
concurrency:
group: ci-rollback-${{ github.event_name == 'workflow_dispatch' && (github.event.inputs.branch || 'dev') || github.ref_name }}
cancel-in-progress: false
permissions:
contents: read
actions: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
rollback-plan:
name: Rollback Guard Plan
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 20
outputs:
branch: ${{ steps.plan.outputs.branch }}
mode: ${{ steps.plan.outputs.mode }}
target_sha: ${{ steps.plan.outputs.target_sha }}
target_ref: ${{ steps.plan.outputs.target_ref }}
ready_to_execute: ${{ steps.plan.outputs.ready_to_execute }}
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
ref: ${{ github.event_name == 'workflow_dispatch' && (github.event.inputs.branch || 'dev') || github.ref_name }}
- name: Build rollback plan
id: plan
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
branch_input="${GITHUB_REF_NAME}"
mode_input="dry-run"
target_ref_input=""
allow_non_ancestor="false"
# Scheduled audits can surface historical rollback violations; report without blocking by default.
fail_on_violation="false"
if [ "${GITHUB_EVENT_NAME}" = "workflow_dispatch" ]; then
branch_input="${{ github.event.inputs.branch || 'dev' }}"
mode_input="${{ github.event.inputs.mode || 'dry-run' }}"
target_ref_input="${{ github.event.inputs.target_ref || '' }}"
allow_non_ancestor="${{ github.event.inputs.allow_non_ancestor || 'false' }}"
fail_on_violation="${{ github.event.inputs.fail_on_violation || 'true' }}"
fi
cmd=(python3 scripts/ci/rollback_guard.py
--repo-root .
--branch "$branch_input"
--mode "$mode_input"
--strategy latest-release-tag
--tag-pattern "v*"
--output-json artifacts/rollback-plan.json
--output-md artifacts/rollback-plan.md)
if [ -n "$target_ref_input" ]; then
cmd+=(--target-ref "$target_ref_input")
fi
if [ "$allow_non_ancestor" = "true" ]; then
cmd+=(--allow-non-ancestor)
fi
if [ "$fail_on_violation" = "true" ]; then
cmd+=(--fail-on-violation)
fi
"${cmd[@]}"
target_sha="$(python3 - <<'PY'
import json
d = json.load(open("artifacts/rollback-plan.json", "r", encoding="utf-8"))
print(d.get("target_sha", ""))
PY
)"
target_ref="$(python3 - <<'PY'
import json
d = json.load(open("artifacts/rollback-plan.json", "r", encoding="utf-8"))
print(d.get("target_ref", ""))
PY
)"
ready_to_execute="$(python3 - <<'PY'
import json
d = json.load(open("artifacts/rollback-plan.json", "r", encoding="utf-8"))
print(str(d.get("ready_to_execute", False)).lower())
PY
)"
{
echo "branch=$branch_input"
echo "mode=$mode_input"
echo "target_sha=$target_sha"
echo "target_ref=$target_ref"
echo "ready_to_execute=$ready_to_execute"
} >> "$GITHUB_OUTPUT"
- name: Emit rollback audit event
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/rollback-plan.json ]; then
python3 scripts/ci/emit_audit_event.py \
--event-type rollback_guard \
--input-json artifacts/rollback-plan.json \
--output-json artifacts/audit-event-rollback-guard.json \
--artifact-name ci-rollback-plan \
--retention-days 21
fi
- name: Upload rollback artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: ci-rollback-plan
path: |
artifacts/rollback-plan.*
artifacts/audit-event-rollback-guard.json
if-no-files-found: ignore
retention-days: 21
- name: Publish rollback summary
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/rollback-plan.md ]; then
cat artifacts/rollback-plan.md >> "$GITHUB_STEP_SUMMARY"
else
echo "Rollback plan markdown report missing." >> "$GITHUB_STEP_SUMMARY"
fi
rollback-execute:
name: Rollback Execute Actions
needs: [rollback-plan]
if: github.event_name == 'workflow_dispatch' && needs.rollback-plan.outputs.mode == 'execute' && needs.rollback-plan.outputs.ready_to_execute == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 15
permissions:
contents: write
actions: read
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
ref: ${{ needs.rollback-plan.outputs.branch }}
- name: Fetch tags
shell: bash
run: |
set -euo pipefail
git fetch --tags --force origin
- name: Create rollback marker tag
id: marker
if: github.event.inputs.create_marker_tag == 'true'
shell: bash
run: |
set -euo pipefail
target_sha="${{ needs.rollback-plan.outputs.target_sha }}"
if [ -z "$target_sha" ]; then
echo "Rollback guard did not resolve target_sha."
exit 1
fi
marker_tag="rollback-${{ needs.rollback-plan.outputs.branch }}-${{ github.run_id }}"
git tag -a "$marker_tag" "$target_sha" -m "Rollback marker from run ${{ github.run_id }}"
git push origin "$marker_tag"
echo "marker_tag=$marker_tag" >> "$GITHUB_OUTPUT"
- name: Emit rollback repository dispatch
if: github.event.inputs.emit_repository_dispatch == 'true'
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
with:
script: |
await github.rest.repos.createDispatchEvent({
owner: context.repo.owner,
repo: context.repo.repo,
event_type: "rollback_execute",
client_payload: {
branch: "${{ needs.rollback-plan.outputs.branch }}",
target_ref: "${{ needs.rollback-plan.outputs.target_ref }}",
target_sha: "${{ needs.rollback-plan.outputs.target_sha }}",
run_id: context.runId,
run_attempt: process.env.GITHUB_RUN_ATTEMPT,
source_sha: context.sha
}
});
- name: Publish execute summary
if: always()
shell: bash
run: |
set -euo pipefail
{
echo "### Rollback Execute Actions"
echo "- Branch: \`${{ needs.rollback-plan.outputs.branch }}\`"
echo "- Target ref: \`${{ needs.rollback-plan.outputs.target_ref }}\`"
echo "- Target sha: \`${{ needs.rollback-plan.outputs.target_sha }}\`"
if [ -n "${{ steps.marker.outputs.marker_tag || '' }}" ]; then
echo "- Marker tag: \`${{ steps.marker.outputs.marker_tag }}\`"
fi
} >> "$GITHUB_STEP_SUMMARY"


@@ -1,466 +1,193 @@
name: CI Run
name: CI
on:
push:
branches: [dev, main]
pull_request:
branches: [dev, main]
merge_group:
branches: [dev, main]
push:
branches: [master]
pull_request:
branches: [master]
concurrency:
group: ci-run-${{ github.event_name }}-${{ github.event.pull_request.number || github.ref_name || github.sha }}
cancel-in-progress: true
group: ci-${{ github.event.pull_request.number || 'push-master' }}
cancel-in-progress: true
permissions:
contents: read
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
CARGO_TERM_COLOR: always
CARGO_TERM_COLOR: always
CARGO_INCREMENTAL: 0
jobs:
changes:
name: Detect Change Scope
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
outputs:
docs_only: ${{ steps.scope.outputs.docs_only }}
docs_changed: ${{ steps.scope.outputs.docs_changed }}
rust_changed: ${{ steps.scope.outputs.rust_changed }}
workflow_changed: ${{ steps.scope.outputs.workflow_changed }}
ci_cd_changed: ${{ steps.scope.outputs.ci_cd_changed }}
docs_files: ${{ steps.scope.outputs.docs_files }}
base_sha: ${{ steps.scope.outputs.base_sha }}
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
lint:
name: Lint
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
components: rustfmt, clippy
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
- name: Detect docs-only changes
id: scope
shell: bash
env:
EVENT_NAME: ${{ github.event_name }}
BASE_SHA: ${{ github.event_name == 'pull_request' && github.event.pull_request.base.sha || github.event_name == 'merge_group' && github.event.merge_group.base_sha || github.event.before }}
run: ./scripts/ci/detect_change_scope.sh
- name: Ensure web/dist placeholder exists
run: mkdir -p web/dist && touch web/dist/.gitkeep
lint:
name: Lint Gate (Format + Clippy + Strict Delta)
needs: [changes]
if: needs.changes.outputs.rust_changed == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 75
- name: Check formatting
run: cargo fmt --all -- --check
- name: Clippy
run: cargo clippy --all-targets -- -D warnings
lint-strict-delta:
name: Strict Delta Lint
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
components: clippy
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
- name: Ensure web/dist placeholder exists
run: mkdir -p web/dist && touch web/dist/.gitkeep
- name: Run strict delta lint gate
run: bash scripts/ci/rust_strict_delta_gate.sh
env:
CARGO_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/cargo
RUSTUP_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/rustup
CARGO_TARGET_DIR: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/target
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- name: Self-heal Rust toolchain cache
shell: bash
run: ./scripts/ci/self_heal_rust_toolchain.sh 1.92.0
- name: Ensure C toolchain
shell: bash
run: bash ./scripts/ci/ensure_c_toolchain.sh
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
components: rustfmt, clippy
- name: Ensure C toolchain for Rust builds
run: ./scripts/ci/ensure_cc.sh
- name: Ensure cargo component
shell: bash
run: bash ./scripts/ci/ensure_cargo_component.sh 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v3
with:
prefix-key: ci-run-check
cache-bin: false
- name: Run rust quality gate
run: ./scripts/ci/rust_quality_gate.sh
- name: Run strict lint delta gate
env:
BASE_SHA: ${{ needs.changes.outputs.base_sha }}
run: ./scripts/ci/rust_strict_delta_gate.sh
BASE_SHA: ${{ github.event.pull_request.base.sha || github.event.before }}
workspace-check:
name: Workspace Check
needs: [changes]
if: needs.changes.outputs.rust_changed == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 45
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Self-heal Rust toolchain cache
shell: bash
run: ./scripts/ci/self_heal_rust_toolchain.sh 1.92.0
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v3
with:
prefix-key: ci-run-workspace-check
cache-bin: false
- name: Check workspace
run: cargo check --workspace --locked
test:
name: Test
runs-on: ubuntu-latest
timeout-minutes: 30
needs: [lint]
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
package-check:
name: Package Check (${{ matrix.package }})
needs: [changes]
if: needs.changes.outputs.rust_changed == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 25
strategy:
fail-fast: false
matrix:
package: [zeroclaw-types, zeroclaw-core]
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Self-heal Rust toolchain cache
shell: bash
run: ./scripts/ci/self_heal_rust_toolchain.sh 1.92.0
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v3
with:
prefix-key: ci-run-package-check
cache-bin: false
- name: Check package
run: cargo check -p ${{ matrix.package }} --locked
- name: Ensure web/dist placeholder exists
run: mkdir -p web/dist && touch web/dist/.gitkeep
test:
name: Test
needs: [changes]
if: needs.changes.outputs.rust_changed == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 120
- name: Install mold linker
run: |
sudo apt-get update -qq
sudo apt-get install -y mold
- name: Install cargo-nextest
run: curl -LsSf https://get.nexte.st/latest/linux | tar zxf - -C ${CARGO_HOME:-~/.cargo}/bin
- name: Run tests
run: cargo nextest run --locked
env:
CARGO_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/cargo
RUSTUP_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/rustup
CARGO_TARGET_DIR: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/target
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Ensure C toolchain
shell: bash
run: bash ./scripts/ci/ensure_c_toolchain.sh
- name: Self-heal Rust toolchain cache
shell: bash
run: ./scripts/ci/self_heal_rust_toolchain.sh 1.92.0
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- name: Ensure C toolchain for Rust builds
run: ./scripts/ci/ensure_cc.sh
- name: Ensure cargo component
shell: bash
run: bash ./scripts/ci/ensure_cargo_component.sh 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v3
with:
prefix-key: ci-run-check
cache-bin: false
- name: Run tests with flake detection
shell: bash
env:
BLOCK_ON_FLAKE: ${{ vars.CI_BLOCK_ON_FLAKE_SUSPECTED || 'false' }}
run: |
set -euo pipefail
mkdir -p artifacts
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER: clang
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUSTFLAGS: "-C link-arg=-fuse-ld=mold"
toolchain_bin=""
if [ -n "${CARGO:-}" ]; then
toolchain_bin="$(dirname "${CARGO}")"
elif [ -n "${RUSTC:-}" ]; then
toolchain_bin="$(dirname "${RUSTC}")"
fi
build:
name: Build ${{ matrix.target }}
runs-on: ${{ matrix.os }}
timeout-minutes: 40
needs: [lint]
strategy:
fail-fast: false
matrix:
include:
- os: ubuntu-latest
target: x86_64-unknown-linux-gnu
- os: macos-14
target: aarch64-apple-darwin
- os: windows-latest
target: x86_64-pc-windows-msvc
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
targets: ${{ matrix.target }}
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
if: runner.os != 'Windows'
if [ -n "${toolchain_bin}" ] && [ -d "${toolchain_bin}" ]; then
case ":$PATH:" in
*":${toolchain_bin}:"*) ;;
*) export PATH="${toolchain_bin}:$PATH" ;;
esac
fi
- name: Install mold linker
if: runner.os == 'Linux'
run: |
sudo apt-get update -qq
sudo apt-get install -y mold
if cargo test --locked --verbose; then
echo '{"flake_suspected":false,"status":"success"}' > artifacts/flake-probe.json
exit 0
fi
- name: Ensure web/dist placeholder exists
shell: bash
run: mkdir -p web/dist && touch web/dist/.gitkeep
echo "::warning::First test run failed. Retrying for flake detection..."
if cargo test --locked --verbose; then
echo '{"flake_suspected":true,"status":"flake"}' > artifacts/flake-probe.json
echo "::warning::Flake suspected — test passed on retry"
if [ "${BLOCK_ON_FLAKE}" = "true" ]; then
echo "BLOCK_ON_FLAKE is set; failing on suspected flake."
exit 1
fi
exit 0
fi
echo '{"flake_suspected":false,"status":"failure"}' > artifacts/flake-probe.json
exit 1
- name: Publish flake probe summary
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/flake-probe.json ]; then
status=$(python3 -c "import json; print(json.load(open('artifacts/flake-probe.json'))['status'])")
flake=$(python3 -c "import json; print(json.load(open('artifacts/flake-probe.json'))['flake_suspected'])")
{
echo "### Test Flake Probe"
echo "- Status: \`${status}\`"
echo "- Flake suspected: \`${flake}\`"
} >> "$GITHUB_STEP_SUMMARY"
fi
- name: Upload flake probe artifact
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: test-flake-probe
path: artifacts/flake-probe.*
if-no-files-found: ignore
retention-days: 14
build:
name: Build (Smoke)
needs: [changes]
if: needs.changes.outputs.rust_changed == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 90
- name: Build release
shell: bash
run: cargo build --profile ci --locked --target ${{ matrix.target }}
env:
CARGO_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/cargo
RUSTUP_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/rustup
CARGO_TARGET_DIR: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}/target
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_LINKER: clang
CARGO_TARGET_X86_64_UNKNOWN_LINUX_GNU_RUSTFLAGS: "-C link-arg=-fuse-ld=mold"
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Ensure C toolchain
shell: bash
run: bash ./scripts/ci/ensure_c_toolchain.sh
- name: Self-heal Rust toolchain cache
shell: bash
run: ./scripts/ci/self_heal_rust_toolchain.sh 1.92.0
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- name: Ensure C toolchain for Rust builds
run: ./scripts/ci/ensure_cc.sh
- name: Ensure cargo component
shell: bash
run: bash ./scripts/ci/ensure_cargo_component.sh 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v3
with:
prefix-key: ci-run-build
cache-targets: true
cache-bin: false
- name: Build binary (smoke check)
env:
CARGO_BUILD_JOBS: 2
CI_SMOKE_BUILD_ATTEMPTS: 3
run: bash scripts/ci/smoke_build_retry.sh
- name: Check binary size
env:
BINARY_SIZE_HARD_LIMIT_MB: 28
BINARY_SIZE_ADVISORY_MB: 20
BINARY_SIZE_TARGET_MB: 5
run: bash scripts/ci/check_binary_size.sh target/release-fast/zeroclaw
check-all-features:
name: Check (all features)
runs-on: ubuntu-latest
timeout-minutes: 20
needs: [lint]
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
docs-only:
name: Docs-Only Fast Path
needs: [changes]
if: needs.changes.outputs.docs_only == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
steps:
- name: Skip heavy jobs for docs-only change
run: echo "Docs-only change detected. Rust lint/test/build skipped."
- name: Install system dependencies
run: sudo apt-get update -qq && sudo apt-get install -y libudev-dev
non-rust:
name: Non-Rust Fast Path
needs: [changes]
if: needs.changes.outputs.docs_only != 'true' && needs.changes.outputs.rust_changed != 'true'
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
steps:
- name: Skip Rust jobs for non-Rust change scope
run: echo "No Rust-impacting files changed. Rust lint/test/build skipped."
- name: Ensure web/dist placeholder exists
run: mkdir -p web/dist && touch web/dist/.gitkeep
docs-quality:
name: Docs Quality
needs: [changes]
if: needs.changes.outputs.docs_changed == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
timeout-minutes: 15
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- name: Setup Node.js for markdown lint
uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: "22"
- name: Check all features
run: cargo check --features ci-all --locked
- name: Markdown lint (changed lines only)
env:
BASE_SHA: ${{ needs.changes.outputs.base_sha }}
DOCS_FILES: ${{ needs.changes.outputs.docs_files }}
run: ./scripts/ci/docs_quality_gate.sh
docs-quality:
name: Docs Quality
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- uses: actions/setup-node@1d0ff469b7ec7b3cb9d8673fde0c81c44821de2a # v4
with:
node-version: 20
- uses: actions/setup-python@a26af69be951a213d495a4c3e4e4022e16d87065 # v5
with:
python-version: "3.12"
- name: Collect added links
id: collect_links
shell: bash
env:
BASE_SHA: ${{ needs.changes.outputs.base_sha }}
DOCS_FILES: ${{ needs.changes.outputs.docs_files }}
run: |
set -euo pipefail
python3 ./scripts/ci/collect_changed_links.py \
--base "$BASE_SHA" \
--docs-files "$DOCS_FILES" \
--output .ci-added-links.txt
count=$(wc -l < .ci-added-links.txt | tr -d ' ')
echo "count=$count" >> "$GITHUB_OUTPUT"
if [ "$count" -gt 0 ]; then
echo "Added links queued for check:"
cat .ci-added-links.txt
else
echo "No added links found in changed docs lines."
fi
- name: Run docs quality gate
run: bash scripts/ci/docs_quality_gate.sh
env:
BASE_SHA: ${{ github.event.pull_request.base.sha || github.event.before }}
- name: Link check (offline, added links only)
if: steps.collect_links.outputs.count != '0'
uses: lycheeverse/lychee-action@8646ba30535128ac92d33dfc9133794bfdd9b411 # v2
with:
fail: true
args: >-
--offline
--no-progress
--format detailed
.ci-added-links.txt
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Skip link check (no added links)
if: steps.collect_links.outputs.count == '0'
run: echo "No added links in changed docs lines. Link check skipped."
lint-feedback:
name: Lint Feedback
if: github.event_name == 'pull_request'
needs: [changes, lint, docs-quality]
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
permissions:
contents: read
pull-requests: write
issues: write
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Post actionable lint failure summary
if: always()
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
env:
RUST_CHANGED: ${{ needs.changes.outputs.rust_changed }}
DOCS_CHANGED: ${{ needs.changes.outputs.docs_changed }}
LINT_RESULT: ${{ needs.lint.result }}
LINT_DELTA_RESULT: ${{ needs.lint.result }}
DOCS_RESULT: ${{ needs.docs-quality.result }}
with:
script: |
const script = require('./.github/workflows/scripts/lint_feedback.js');
await script({github, context, core});
license-file-owner-guard:
name: License File Owner Guard
needs: [changes]
if: github.event_name == 'pull_request'
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
permissions:
contents: read
pull-requests: read
steps:
- name: Checkout repository
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Enforce owner-only edits for root license files
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
with:
script: |
const script = require('./.github/workflows/scripts/ci_license_file_owner_guard.js');
await script({ github, context, core });
ci-required:
name: CI Required Gate
if: always()
needs: [changes, lint, workspace-check, package-check, test, build, docs-only, non-rust, docs-quality, lint-feedback, license-file-owner-guard]
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
steps:
- name: Enforce required status
shell: bash
run: |
set -euo pipefail
event_name="${{ github.event_name }}"
rust_changed="${{ needs.changes.outputs.rust_changed }}"
docs_changed="${{ needs.changes.outputs.docs_changed }}"
docs_result="${{ needs.docs-quality.result }}"
license_owner_result="${{ needs.license-file-owner-guard.result }}"
# --- Helper: enforce PR governance gates ---
check_pr_governance() {
if [ "$event_name" != "pull_request" ]; then return 0; fi
if [ "$license_owner_result" != "success" ]; then
echo "License file owner guard did not pass."
exit 1
fi
}
check_docs_quality() {
if [ "$docs_changed" = "true" ] && [ "$docs_result" != "success" ]; then
echo "Docs changed but docs-quality did not pass."
exit 1
fi
}
# --- Docs-only fast path ---
if [ "${{ needs.changes.outputs.docs_only }}" = "true" ]; then
check_pr_governance
check_docs_quality
echo "Docs-only fast path passed."
exit 0
fi
# --- Non-rust fast path ---
if [ "$rust_changed" != "true" ]; then
check_pr_governance
check_docs_quality
echo "Non-rust fast path passed."
exit 0
fi
# --- Rust change path ---
lint_result="${{ needs.lint.result }}"
workspace_check_result="${{ needs.workspace-check.result }}"
package_check_result="${{ needs.package-check.result }}"
test_result="${{ needs.test.result }}"
build_result="${{ needs.build.result }}"
echo "lint=${lint_result}"
echo "workspace-check=${workspace_check_result}"
echo "package-check=${package_check_result}"
echo "test=${test_result}"
echo "build=${build_result}"
echo "docs=${docs_result}"
echo "license_file_owner_guard=${license_owner_result}"
check_pr_governance
if [ "$lint_result" != "success" ] || [ "$workspace_check_result" != "success" ] || [ "$package_check_result" != "success" ] || [ "$test_result" != "success" ] || [ "$build_result" != "success" ]; then
echo "Required CI jobs did not pass: lint=${lint_result} workspace-check=${workspace_check_result} package-check=${package_check_result} test=${test_result} build=${build_result}"
exit 1
fi
check_docs_quality
echo "All required checks passed."
# Composite status check — branch protection requires this single job.
gate:
name: CI Required Gate
if: always()
needs: [lint, lint-strict-delta, test, build, docs-quality, check-all-features]
runs-on: ubuntu-latest
steps:
- name: Check upstream job results
env:
HAS_FAILURE: ${{ contains(needs.*.result, 'failure') || contains(needs.*.result, 'cancelled') }}
run: |
if [[ "$HAS_FAILURE" == "true" ]]; then
echo "::error::One or more upstream jobs failed or were cancelled"
exit 1
fi


@@ -1,150 +0,0 @@
name: CI Supply Chain Provenance
on:
push:
branches: [dev, main]
paths:
- "Cargo.toml"
- "Cargo.lock"
- "src/**"
- "crates/**"
- "scripts/ci/ensure_cc.sh"
- "scripts/ci/generate_provenance.py"
- ".github/workflows/ci-supply-chain-provenance.yml"
workflow_dispatch:
schedule:
- cron: "20 6 * * 1" # Weekly Monday 06:20 UTC
concurrency:
group: supply-chain-provenance-${{ github.ref || github.run_id }}
cancel-in-progress: true
permissions:
contents: read
id-token: write
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
CARGO_TERM_COLOR: always
jobs:
provenance:
name: Build + Provenance Bundle
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 60
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Setup Rust
uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- name: Ensure cargo component
shell: bash
env:
ENSURE_CARGO_COMPONENT_STRICT: "true"
run: bash ./scripts/ci/ensure_cargo_component.sh 1.92.0
- name: Activate toolchain binaries on PATH
shell: bash
run: |
set -euo pipefail
toolchain_bin="$(dirname "$(rustup which --toolchain 1.92.0 cargo)")"
echo "$toolchain_bin" >> "$GITHUB_PATH"
- name: Resolve host target
id: rust-meta
shell: bash
run: |
set -euo pipefail
host_target="$(rustup run 1.92.0 rustc -vV | sed -n 's/^host: //p')"
if [ -z "${host_target}" ]; then
echo "::error::Unable to resolve Rust host target."
exit 1
fi
echo "host_target=${host_target}" >> "$GITHUB_OUTPUT"
- name: Runner preflight (compiler + disk)
shell: bash
run: |
set -euo pipefail
./scripts/ci/ensure_cc.sh
echo "Runner: ${RUNNER_NAME:-unknown} (${RUNNER_OS:-unknown}/${RUNNER_ARCH:-unknown})"
free_kb="$(df -Pk . | awk 'NR==2 {print $4}')"
min_kb=$((10 * 1024 * 1024))
if [ "${free_kb}" -lt "${min_kb}" ]; then
echo "::error::Insufficient disk space on runner (<10 GiB free)."
df -h .
exit 1
fi
- name: Build release-fast artifact
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
host_target="${{ steps.rust-meta.outputs.host_target }}"
cargo build --profile release-fast --locked --target "$host_target"
cp "target/${host_target}/release-fast/zeroclaw" "artifacts/zeroclaw-${host_target}"
sha256sum "artifacts/zeroclaw-${host_target}" > "artifacts/zeroclaw-${host_target}.sha256"
- name: Generate provenance statement
shell: bash
run: |
set -euo pipefail
host_target="${{ steps.rust-meta.outputs.host_target }}"
python3 scripts/ci/generate_provenance.py \
--artifact "artifacts/zeroclaw-${host_target}" \
--subject-name "zeroclaw-${host_target}" \
--output "artifacts/provenance-${host_target}.intoto.json"
- name: Install cosign
uses: sigstore/cosign-installer@faadad0cce49287aee09b3a48701e75088a2c6ad # v4.0.0
- name: Sign provenance bundle
shell: bash
run: |
set -euo pipefail
host_target="${{ steps.rust-meta.outputs.host_target }}"
statement="artifacts/provenance-${host_target}.intoto.json"
cosign sign-blob --yes \
--bundle="${statement}.sigstore.json" \
--output-signature="${statement}.sig" \
--output-certificate="${statement}.pem" \
"${statement}"
- name: Emit normalized audit event
shell: bash
run: |
set -euo pipefail
host_target="${{ steps.rust-meta.outputs.host_target }}"
python3 scripts/ci/emit_audit_event.py \
--event-type supply_chain_provenance \
--input-json "artifacts/provenance-${host_target}.intoto.json" \
--output-json "artifacts/audit-event-supply-chain-provenance.json" \
--artifact-name supply-chain-provenance \
--retention-days 30
- name: Upload provenance artifacts
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: supply-chain-provenance
path: artifacts/*
retention-days: 30
- name: Publish summary
shell: bash
run: |
set -euo pipefail
host_target="${{ steps.rust-meta.outputs.host_target }}"
{
echo "### Supply Chain Provenance"
echo "- Target: \`${host_target}\`"
echo "- Artifact: \`artifacts/zeroclaw-${host_target}\`"
echo "- Statement: \`artifacts/provenance-${host_target}.intoto.json\`"
echo "- Signature: \`artifacts/provenance-${host_target}.intoto.json.sig\`"
} >> "$GITHUB_STEP_SUMMARY"


@@ -0,0 +1,82 @@
name: Cross-Platform Build
on:
workflow_dispatch:
permissions:
contents: read
env:
CARGO_TERM_COLOR: always
CARGO_INCREMENTAL: 0
jobs:
web:
name: Build Web Dashboard
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
cache-dependency-path: web/package-lock.json
- name: Build web dashboard
run: cd web && npm ci && npm run build
- uses: actions/upload-artifact@v4
with:
name: web-dist
path: web/dist/
retention-days: 1
build:
name: Build ${{ matrix.target }}
needs: [web]
runs-on: ${{ matrix.os }}
timeout-minutes: 40
strategy:
fail-fast: false
matrix:
include:
- os: ubuntu-latest
target: aarch64-unknown-linux-gnu
cross_compiler: gcc-aarch64-linux-gnu
linker_env: CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER
linker: aarch64-linux-gnu-gcc
- os: ubuntu-latest
target: armv7-unknown-linux-gnueabihf
cross_compiler: gcc-arm-linux-gnueabihf
linker_env: CARGO_TARGET_ARMV7_UNKNOWN_LINUX_GNUEABIHF_LINKER
linker: arm-linux-gnueabihf-gcc
- os: macos-15-intel
target: x86_64-apple-darwin
- os: windows-latest
target: x86_64-pc-windows-msvc
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
targets: ${{ matrix.target }}
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
if: runner.os != 'Windows'
- uses: actions/download-artifact@v8
with:
name: web-dist
path: web/dist/
- name: Install cross compiler
if: matrix.cross_compiler
run: |
sudo apt-get update -qq
sudo apt-get install -y ${{ matrix.cross_compiler }}
- name: Build release
shell: bash
run: |
if [ -n "${{ matrix.linker_env || '' }}" ] && [ -n "${{ matrix.linker || '' }}" ]; then
export "${{ matrix.linker_env }}=${{ matrix.linker }}"
fi
cargo build --release --locked --features channel-matrix,channel-lark,memory-postgres --target ${{ matrix.target }}


@@ -1,56 +0,0 @@
name: Deploy Web to GitHub Pages
on:
push:
branches: [main]
paths:
- 'web/**'
workflow_dispatch:
permissions:
contents: read
pages: write
id-token: write
concurrency:
group: "pages"
cancel-in-progress: false
jobs:
build:
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Setup Node
uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: '20'
- name: Install dependencies
working-directory: ./web
run: npm ci
- name: Build
working-directory: ./web
run: npm run build
- name: Setup Pages
uses: actions/configure-pages@983d7736d9b0ae728b81ab479565c72886d7745b # v5
- name: Upload artifact
uses: actions/upload-pages-artifact@7b1f4a764d45c48632c6b24a0339c27f5614fb0b # v4
with:
path: ./web/dist
deploy:
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
needs: build
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@d6db90164ac5ed86f2b6aed7e0febac5b3c0c03e # v4


@@ -1,301 +0,0 @@
name: Docs Deploy
on:
pull_request:
branches: [dev, main]
paths:
- "docs/**"
- "README*.md"
- ".github/workflows/docs-deploy.yml"
- "scripts/ci/docs_quality_gate.sh"
- "scripts/ci/collect_changed_links.py"
- ".github/release/docs-deploy-policy.json"
- "scripts/ci/docs_deploy_guard.py"
push:
branches: [dev, main]
paths:
- "docs/**"
- "README*.md"
- ".github/workflows/docs-deploy.yml"
- "scripts/ci/docs_quality_gate.sh"
- "scripts/ci/collect_changed_links.py"
- ".github/release/docs-deploy-policy.json"
- "scripts/ci/docs_deploy_guard.py"
workflow_dispatch:
inputs:
deploy_target:
description: "preview uploads artifact only; production deploys to Pages"
required: true
default: preview
type: choice
options:
- preview
- production
preview_evidence_run_url:
description: "Required for manual production deploys when policy enforces preview promotion evidence"
required: false
default: ""
rollback_ref:
description: "Optional rollback source ref (tag/sha/ref) for manual production dispatch"
required: false
default: ""
concurrency:
group: docs-deploy-${{ github.event_name }}-${{ github.event.pull_request.number || github.ref_name || github.sha }}
cancel-in-progress: true
permissions:
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
docs-quality:
name: Docs Quality Gate
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 20
outputs:
docs_files: ${{ steps.scope.outputs.docs_files }}
base_sha: ${{ steps.scope.outputs.base_sha }}
deploy_target: ${{ steps.deploy_guard.outputs.deploy_target }}
deploy_mode: ${{ steps.deploy_guard.outputs.deploy_mode }}
source_ref: ${{ steps.deploy_guard.outputs.source_ref }}
production_branch_ref: ${{ steps.deploy_guard.outputs.production_branch_ref }}
ready_to_deploy: ${{ steps.deploy_guard.outputs.ready_to_deploy }}
docs_preview_retention_days: ${{ steps.deploy_guard.outputs.docs_preview_retention_days }}
docs_guard_artifact_retention_days: ${{ steps.deploy_guard.outputs.docs_guard_artifact_retention_days }}
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- name: Setup Node.js
uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: "22"
- name: Resolve docs diff scope
id: scope
shell: bash
run: |
set -euo pipefail
base_sha=""
docs_files=""
if [ "${GITHUB_EVENT_NAME}" = "pull_request" ]; then
base_sha="${{ github.event.pull_request.base.sha }}"
docs_files="$(git diff --name-only "$base_sha" HEAD | awk '/\.md$|\.mdx$|^README/ {print}')"
elif [ "${GITHUB_EVENT_NAME}" = "push" ]; then
base_sha="${{ github.event.before }}"
if [ -n "$base_sha" ] && [ "$base_sha" != "0000000000000000000000000000000000000000" ]; then
docs_files="$(git diff --name-only "$base_sha" HEAD | awk '/\.md$|\.mdx$|^README/ {print}')"
fi
else
docs_files="$(git ls-files 'docs/**/*.md' 'README*.md')"
fi
{
echo "base_sha=${base_sha}"
echo "docs_files<<EOF"
printf '%s\n' "$docs_files"
echo "EOF"
} >> "$GITHUB_OUTPUT"
- name: Validate docs deploy contract
id: deploy_guard
shell: bash
env:
INPUT_DEPLOY_TARGET: ${{ github.event.inputs.deploy_target || '' }}
INPUT_PREVIEW_EVIDENCE_RUN_URL: ${{ github.event.inputs.preview_evidence_run_url || '' }}
INPUT_ROLLBACK_REF: ${{ github.event.inputs.rollback_ref || '' }}
run: |
set -euo pipefail
mkdir -p artifacts
python3 scripts/ci/docs_deploy_guard.py \
--repo-root "$PWD" \
--event-name "${GITHUB_EVENT_NAME}" \
--git-ref "${GITHUB_REF}" \
--git-sha "${GITHUB_SHA}" \
--input-deploy-target "${INPUT_DEPLOY_TARGET}" \
--input-preview-evidence-run-url "${INPUT_PREVIEW_EVIDENCE_RUN_URL}" \
--input-rollback-ref "${INPUT_ROLLBACK_REF}" \
--policy-file .github/release/docs-deploy-policy.json \
--output-json artifacts/docs-deploy-guard.json \
--output-md artifacts/docs-deploy-guard.md \
--github-output-file "$GITHUB_OUTPUT" \
--fail-on-violation
- name: Emit docs deploy guard audit event
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/docs-deploy-guard.json ]; then
python3 scripts/ci/emit_audit_event.py \
--event-type docs_deploy_guard \
--input-json artifacts/docs-deploy-guard.json \
--output-json artifacts/audit-event-docs-deploy-guard.json \
--artifact-name docs-deploy-guard \
--retention-days 21
fi
- name: Publish docs deploy guard summary
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/docs-deploy-guard.md ]; then
cat artifacts/docs-deploy-guard.md >> "$GITHUB_STEP_SUMMARY"
fi
- name: Upload docs deploy guard artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: docs-deploy-guard
path: |
artifacts/docs-deploy-guard.json
artifacts/docs-deploy-guard.md
artifacts/audit-event-docs-deploy-guard.json
if-no-files-found: ignore
retention-days: ${{ steps.deploy_guard.outputs.docs_guard_artifact_retention_days || 21 }}
- name: Setup Node.js for markdown lint
uses: actions/setup-node@49933ea5288caeca8642d1e84afbd3f7d6820020 # v4
with:
node-version: "22"
- name: Markdown quality gate
env:
BASE_SHA: ${{ steps.scope.outputs.base_sha }}
DOCS_FILES: ${{ steps.scope.outputs.docs_files }}
run: ./scripts/ci/docs_quality_gate.sh
- name: Collect added links
id: links
if: github.event_name != 'workflow_dispatch'
shell: bash
env:
BASE_SHA: ${{ steps.scope.outputs.base_sha }}
DOCS_FILES: ${{ steps.scope.outputs.docs_files }}
run: |
set -euo pipefail
python3 ./scripts/ci/collect_changed_links.py \
--base "$BASE_SHA" \
--docs-files "$DOCS_FILES" \
--output .ci-added-links.txt
count=$(wc -l < .ci-added-links.txt | tr -d ' ')
echo "count=$count" >> "$GITHUB_OUTPUT"
- name: Link check (added links)
if: github.event_name != 'workflow_dispatch' && steps.links.outputs.count != '0'
uses: lycheeverse/lychee-action@8646ba30535128ac92d33dfc9133794bfdd9b411 # v2
with:
fail: true
args: >-
--offline
--no-progress
--format detailed
.ci-added-links.txt
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
- name: Skip link check (none added)
if: github.event_name != 'workflow_dispatch' && steps.links.outputs.count == '0'
run: echo "No added links detected in changed docs lines."
docs-preview:
name: Docs Preview Artifact
needs: [docs-quality]
if: github.event_name == 'pull_request' || (github.event_name == 'workflow_dispatch' && github.event.inputs.deploy_target == 'preview')
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 15
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Build preview bundle
shell: bash
run: |
set -euo pipefail
rm -rf site
mkdir -p site/docs
cp -R docs/. site/docs/
cp README.md site/README.md
cat > site/index.md <<'EOF'
# ZeroClaw Docs Preview
This preview bundle is produced by `.github/workflows/docs-deploy.yml`.
- [Repository README](./README.md)
- [Docs Home](./docs/README.md)
EOF
- name: Upload preview artifact
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: docs-preview
path: site/**
if-no-files-found: error
retention-days: ${{ needs.docs-quality.outputs.docs_preview_retention_days || 14 }}
docs-deploy:
name: Deploy Docs to GitHub Pages
needs: [docs-quality]
if: needs.docs-quality.outputs.deploy_target == 'production' && needs.docs-quality.outputs.ready_to_deploy == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 20
permissions:
contents: read
pages: write
id-token: write
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
ref: ${{ needs.docs-quality.outputs.source_ref }}
- name: Build deploy bundle
shell: bash
run: |
set -euo pipefail
rm -rf site
mkdir -p site/docs
cp -R docs/. site/docs/
cp README.md site/README.md
cat > site/index.md <<'EOF'
# ZeroClaw Documentation
This site is deployed automatically from `main` by `.github/workflows/docs-deploy.yml`.
- [Repository README](./README.md)
- [Docs Home](./docs/README.md)
EOF
- name: Publish deploy source summary
shell: bash
run: |
{
echo "## Docs Deploy Source"
echo "- Deploy mode: \`${{ needs.docs-quality.outputs.deploy_mode }}\`"
echo "- Source ref: \`${{ needs.docs-quality.outputs.source_ref }}\`"
echo "- Production branch ref: \`${{ needs.docs-quality.outputs.production_branch_ref }}\`"
} >> "$GITHUB_STEP_SUMMARY"
- name: Setup Pages
uses: actions/configure-pages@983d7736d9b0ae728b81ab479565c72886d7745b # v5
- name: Upload Pages artifact
uses: actions/upload-pages-artifact@7b1f4a764d45c48632c6b24a0339c27f5614fb0b # v4
with:
path: site
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@d6db90164ac5ed86f2b6aed7e0febac5b3c0c03e # v4


@@ -1,380 +0,0 @@
name: Feature Matrix
on:
push:
branches: [dev]
paths:
- "Cargo.toml"
- "Cargo.lock"
- "src/**"
- "crates/**"
- "scripts/ci/nightly_matrix_report.py"
- ".github/release/nightly-owner-routing.json"
- ".github/workflows/feature-matrix.yml"
pull_request:
branches: [dev, main]
types: [labeled]
merge_group:
branches: [dev, main]
schedule:
- cron: "30 4 * * 1" # Weekly Monday 04:30 UTC
- cron: "15 3 * * *" # Daily 03:15 UTC (nightly profile)
workflow_dispatch:
inputs:
profile:
description: "compile = merge-gate matrix, nightly = integration-oriented lane commands"
required: true
default: compile
type: choice
options:
- compile
- nightly
fail_on_failure:
description: "Fail summary job when any lane fails"
required: true
default: true
type: boolean
concurrency:
group: feature-matrix-${{ github.event.pull_request.number || github.ref || github.run_id }}-${{ github.event.inputs.profile || 'auto' }}
cancel-in-progress: true
permissions:
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
CARGO_TERM_COLOR: always
jobs:
resolve-profile:
name: Resolve Matrix Profile
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
outputs:
profile: ${{ steps.resolve.outputs.profile }}
lane_job_prefix: ${{ steps.resolve.outputs.lane_job_prefix }}
summary_job_name: ${{ steps.resolve.outputs.summary_job_name }}
lane_retention_days: ${{ steps.resolve.outputs.lane_retention_days }}
lane_timeout_minutes: ${{ steps.resolve.outputs.lane_timeout_minutes }}
max_attempts: ${{ steps.resolve.outputs.max_attempts }}
summary_artifact_name: ${{ steps.resolve.outputs.summary_artifact_name }}
summary_json_name: ${{ steps.resolve.outputs.summary_json_name }}
summary_md_name: ${{ steps.resolve.outputs.summary_md_name }}
lane_artifact_prefix: ${{ steps.resolve.outputs.lane_artifact_prefix }}
fail_on_failure: ${{ steps.resolve.outputs.fail_on_failure }}
collect_history: ${{ steps.resolve.outputs.collect_history }}
steps:
- name: Resolve effective profile
id: resolve
shell: bash
run: |
set -euo pipefail
profile="compile"
fail_on_failure="true"
lane_job_prefix="Matrix Lane"
summary_job_name="Feature Matrix Summary"
lane_retention_days="21"
lane_timeout_minutes="55"
max_attempts="1"
summary_artifact_name="feature-matrix-summary"
summary_json_name="feature-matrix-summary.json"
summary_md_name="feature-matrix-summary.md"
lane_artifact_prefix="feature-matrix"
collect_history="false"
if [ "${GITHUB_EVENT_NAME}" = "schedule" ] && [ "${{ github.event.schedule }}" = "15 3 * * *" ]; then
profile="nightly"
elif [ "${GITHUB_EVENT_NAME}" = "workflow_dispatch" ]; then
profile="${{ github.event.inputs.profile || 'compile' }}"
fail_on_failure="${{ github.event.inputs.fail_on_failure || 'true' }}"
fi
if [ "$profile" = "nightly" ]; then
lane_job_prefix="Nightly Lane"
summary_job_name="Nightly Summary & Routing"
lane_retention_days="30"
lane_timeout_minutes="70"
max_attempts="2"
summary_artifact_name="nightly-all-features-summary"
summary_json_name="nightly-summary.json"
summary_md_name="nightly-summary.md"
lane_artifact_prefix="nightly-lane"
collect_history="true"
fi
{
echo "profile=${profile}"
echo "lane_job_prefix=${lane_job_prefix}"
echo "summary_job_name=${summary_job_name}"
echo "lane_retention_days=${lane_retention_days}"
echo "lane_timeout_minutes=${lane_timeout_minutes}"
echo "max_attempts=${max_attempts}"
echo "summary_artifact_name=${summary_artifact_name}"
echo "summary_json_name=${summary_json_name}"
echo "summary_md_name=${summary_md_name}"
echo "lane_artifact_prefix=${lane_artifact_prefix}"
echo "fail_on_failure=${fail_on_failure}"
echo "collect_history=${collect_history}"
} >> "$GITHUB_OUTPUT"
feature-check:
name: ${{ needs.resolve-profile.outputs.lane_job_prefix }} (${{ matrix.name }})
needs: [resolve-profile]
if: >-
github.event_name != 'pull_request' ||
contains(github.event.pull_request.labels.*.name, 'ci:full') ||
contains(github.event.pull_request.labels.*.name, 'ci:feature-matrix')
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: ${{ fromJSON(needs.resolve-profile.outputs.lane_timeout_minutes) }}
strategy:
fail-fast: false
matrix:
include:
- name: default
compile_command: cargo check --locked
nightly_command: cargo test --locked --test agent_e2e --verbose
install_libudev: false
- name: whatsapp-web
compile_command: cargo check --locked --no-default-features --features whatsapp-web
nightly_command: cargo check --locked --no-default-features --features whatsapp-web --verbose
install_libudev: false
- name: browser-native
compile_command: cargo check --locked --no-default-features --features browser-native
nightly_command: cargo check --locked --no-default-features --features browser-native --verbose
install_libudev: false
- name: nightly-all-features
compile_command: cargo check --locked --all-features
nightly_command: cargo test --locked --all-features --test agent_e2e --verbose
install_libudev: true
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- name: Ensure cargo component
shell: bash
env:
ENSURE_CARGO_COMPONENT_STRICT: "true"
run: bash ./scripts/ci/ensure_cargo_component.sh 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v3
with:
prefix-key: feature-matrix-${{ matrix.name }}
- name: Ensure Linux deps for all-features lane
if: matrix.install_libudev
shell: bash
run: |
set -euo pipefail
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libudev; then
echo "libudev development headers already available; skipping apt install."
exit 0
fi
echo "Installing missing libudev build dependencies..."
for attempt in 1 2 3; do
if sudo apt-get update -qq -o DPkg::Lock::Timeout=300 && \
sudo apt-get install -y --no-install-recommends --no-upgrade -o DPkg::Lock::Timeout=300 libudev-dev pkg-config; then
echo "Dependency installation succeeded on attempt ${attempt}."
exit 0
fi
if [ "$attempt" -eq 3 ]; then
echo "Failed to install libudev-dev/pkg-config after ${attempt} attempts." >&2
exit 1
fi
echo "Dependency installation failed on attempt ${attempt}; retrying in 10s..."
sleep 10
done
- name: Run matrix lane command
id: lane
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
profile="${{ needs.resolve-profile.outputs.profile }}"
lane_command="${{ matrix.compile_command }}"
if [ "$profile" = "nightly" ]; then
lane_command="${{ matrix.nightly_command }}"
fi
max_attempts="${{ needs.resolve-profile.outputs.max_attempts }}"
attempt=1
status=1
started_at="$(date +%s)"
while [ "$attempt" -le "$max_attempts" ]; do
echo "Running lane command (attempt ${attempt}/${max_attempts}): ${lane_command}"
set +e
bash -lc "${lane_command}"
status=$?
set -e
if [ "$status" -eq 0 ]; then
break
fi
if [ "$attempt" -lt "$max_attempts" ]; then
sleep 5
fi
attempt="$((attempt + 1))"
done
finished_at="$(date +%s)"
duration="$((finished_at - started_at))"
lane_status="success"
if [ "$status" -ne 0 ]; then
lane_status="failure"
fi
cat > "artifacts/nightly-result-${{ matrix.name }}.json" <<EOF
{
"lane": "${{ matrix.name }}",
"mode": "${profile}",
"status": "${lane_status}",
"exit_code": ${status},
"duration_seconds": ${duration},
"command": "${lane_command}",
"attempts_used": ${attempt},
"max_attempts": ${max_attempts}
}
EOF
{
echo "### ${{ needs.resolve-profile.outputs.lane_job_prefix }}: ${{ matrix.name }}"
echo "- Profile: \`${profile}\`"
echo "- Command: \`${lane_command}\`"
echo "- Status: ${lane_status}"
echo "- Exit code: ${status}"
echo "- Duration (s): ${duration}"
echo "- Attempts: ${attempt}/${max_attempts}"
} >> "$GITHUB_STEP_SUMMARY"
echo "lane_status=${lane_status}" >> "$GITHUB_OUTPUT"
echo "lane_exit_code=${status}" >> "$GITHUB_OUTPUT"
- name: Upload lane report
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: ${{ needs.resolve-profile.outputs.lane_artifact_prefix }}-${{ matrix.name }}
path: artifacts/nightly-result-${{ matrix.name }}.json
if-no-files-found: error
retention-days: ${{ fromJSON(needs.resolve-profile.outputs.lane_retention_days) }}
- name: Enforce lane success
if: steps.lane.outputs.lane_status != 'success'
shell: bash
run: |
set -euo pipefail
code="${{ steps.lane.outputs.lane_exit_code }}"
if [[ "$code" =~ ^[0-9]+$ ]]; then
# shellcheck disable=SC2242
exit "$code"
fi
echo "Invalid lane exit code: $code" >&2
exit 1
summary:
name: ${{ needs.resolve-profile.outputs.summary_job_name }}
needs: [resolve-profile, feature-check]
if: always()
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Download lane reports
uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0
with:
path: artifacts
- name: Collect recent nightly history
if: needs.resolve-profile.outputs.collect_history == 'true'
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
with:
script: |
const fs = require("fs");
const path = require("path");
const workflowId = "feature-matrix.yml";
const owner = context.repo.owner;
const repo = context.repo.repo;
const events = ["schedule", "workflow_dispatch"];
let runs = [];
for (const event of events) {
const resp = await github.rest.actions.listWorkflowRuns({
owner,
repo,
workflow_id: workflowId,
branch: "dev",
event,
per_page: 20,
});
runs = runs.concat(resp.data.workflow_runs || []);
}
const currentRunId = context.runId;
runs = runs
.filter((run) => run.id !== currentRunId && run.status === "completed")
.sort((a, b) => new Date(b.created_at).getTime() - new Date(a.created_at).getTime())
.slice(0, 3)
.map((run) => ({
run_id: run.id,
url: run.html_url,
event: run.event,
conclusion: run.conclusion || "unknown",
created_at: run.created_at,
head_sha: run.head_sha,
display_title: run.display_title || "",
}));
fs.mkdirSync("artifacts", { recursive: true });
fs.writeFileSync(
path.join("artifacts", "nightly-history.json"),
`${JSON.stringify(runs, null, 2)}\n`,
{ encoding: "utf8" }
);
- name: Aggregate matrix summary
shell: bash
run: |
set -euo pipefail
args=(
--input-dir artifacts
--owners-file .github/release/nightly-owner-routing.json
--output-json "artifacts/${{ needs.resolve-profile.outputs.summary_json_name }}"
--output-md "artifacts/${{ needs.resolve-profile.outputs.summary_md_name }}"
)
if [ "${{ needs.resolve-profile.outputs.collect_history }}" = "true" ] && [ -f artifacts/nightly-history.json ]; then
args+=(--history-file artifacts/nightly-history.json)
fi
if [ "${{ needs.resolve-profile.outputs.fail_on_failure }}" = "true" ]; then
args+=(--fail-on-failure)
fi
python3 scripts/ci/nightly_matrix_report.py "${args[@]}"
- name: Publish summary
shell: bash
run: |
set -euo pipefail
cat "artifacts/${{ needs.resolve-profile.outputs.summary_md_name }}" >> "$GITHUB_STEP_SUMMARY"
- name: Upload summary artifact
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: ${{ needs.resolve-profile.outputs.summary_artifact_name }}
path: |
artifacts/${{ needs.resolve-profile.outputs.summary_json_name }}
artifacts/${{ needs.resolve-profile.outputs.summary_md_name }}
artifacts/nightly-history.json
if-no-files-found: error
retention-days: ${{ fromJSON(needs.resolve-profile.outputs.lane_retention_days) }}

@@ -1,263 +0,0 @@
# Main Branch Delivery Flows
This document explains what runs when code is proposed to `dev`/`main`, merged to `main`, and released.
Use this with:
- [`docs/ci-map.md`](../../docs/ci-map.md)
- [`docs/pr-workflow.md`](../../docs/pr-workflow.md)
- [`docs/release-process.md`](../../docs/release-process.md)
## Event Summary
| Event | Main workflows |
| --- | --- |
| PR activity (`pull_request_target`) | `pr-intake-checks.yml`, `pr-labeler.yml`, `pr-auto-response.yml` |
| PR activity (`pull_request`) | `ci-run.yml`, `sec-audit.yml`, plus path-scoped workflows |
| Push to `dev`/`main` | `ci-run.yml`, `sec-audit.yml`, plus path-scoped workflows |
| Tag push (`v*`) | `pub-release.yml` publish mode, `pub-docker-img.yml` publish job |
| Scheduled/manual | `pub-release.yml` verification mode, `sec-codeql.yml`, `feature-matrix.yml`, `test-fuzz.yml`, `pr-check-stale.yml`, `pr-check-status.yml`, `ci-queue-hygiene.yml`, `sync-contributors.yml`, `test-benchmarks.yml`, `test-e2e.yml` |
## Runtime and Docker Matrix
Observed averages below are from recent completed runs (sampled from GitHub Actions on February 17, 2026). Values are directional, not SLAs.
| Workflow | Typical trigger in main flow | Avg runtime | Docker build? | Docker run? | Docker push? |
| --- | --- | ---:| --- | --- | --- |
| `pr-intake-checks.yml` | PR open/update (`pull_request_target`) | 14.5s | No | No | No |
| `pr-labeler.yml` | PR open/update (`pull_request_target`) | 53.7s | No | No | No |
| `pr-auto-response.yml` | PR/issue automation | 24.3s | No | No | No |
| `ci-run.yml` | PR + push to `dev`/`main` | 74.7s | No | No | No |
| `sec-audit.yml` | PR + push to `dev`/`main` | 127.2s | No | No | No |
| `workflow-sanity.yml` | Workflow-file changes | 34.2s | No | No | No |
| `pr-label-policy-check.yml` | Label policy/automation changes | 14.7s | No | No | No |
| `pub-docker-img.yml` (`pull_request`) | Docker build-input PR changes | 240.4s | Yes | Yes | No |
| `pub-docker-img.yml` (`push`) | tag push `v*` | 139.9s | Yes | No | Yes |
| `pub-release.yml` | Tag push `v*` (publish) + manual/scheduled verification (no publish) | N/A in recent sample | No | No | No |
Notes:
1. `pub-docker-img.yml` is the only workflow in the main PR/push path that builds Docker images.
2. Container runtime verification (`docker run`) occurs in PR smoke only.
3. Container registry push occurs on tag pushes (`v*`) only.
4. `ci-run.yml` "Build (Smoke)" builds Rust binaries, not Docker images.
## Step-By-Step
### 1) PR from branch in this repository -> `dev`
1. Contributor opens or updates PR against `dev`.
2. `pull_request_target` automation runs:
- `pr-intake-checks.yml` posts intake warnings/errors.
- `pr-labeler.yml` sets size/risk/scope labels.
- `pr-auto-response.yml` runs first-interaction and label routes.
3. `pull_request` CI workflows start:
- `ci-run.yml`
- `feature-matrix.yml` (Rust/workflow path scope)
- `sec-audit.yml`
- `sec-codeql.yml` (if Rust/codeql paths changed)
- path-scoped workflows if matching files changed:
- `pub-docker-img.yml` (Docker build-input paths only)
- `docs-deploy.yml` (docs + README markdown paths; deploy contract guard enforces promotion + rollback ref policy)
- `workflow-sanity.yml` (workflow files only)
- `pr-label-policy-check.yml` (label-policy files only)
- `ci-change-audit.yml` (CI/security path changes)
- `ci-provider-connectivity.yml` (probe config/script/workflow changes)
- `ci-reproducible-build.yml` (Rust/build reproducibility paths)
4. In `ci-run.yml`, `changes` computes the following outputs (a hypothetical sketch follows this list):
- `docs_only`
- `docs_changed`
- `rust_changed`
- `workflow_changed`
5. `build` runs for Rust-impacting changes.
6. On PRs, full lint/test/docs checks run when PR has label `ci:full`:
- `lint`
- `lint-strict-delta`
- `test`
- `flake-probe` (single-retry telemetry; optional block via `CI_BLOCK_ON_FLAKE_SUSPECTED`)
- `docs-quality`
7. If root license files (`LICENSE-APACHE`, `LICENSE-MIT`) changed, `license-file-owner-guard` allows only PR author `willsarg`.
8. `lint-feedback` posts actionable comment if lint/docs gates fail.
9. `CI Required Gate` aggregates results to final pass/fail.
10. Maintainer merges PR once checks and review policy are satisfied.
11. Merge emits a `push` event on `dev` (see scenario 4).
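The real path filters live in `scripts/ci/detect_change_scope.sh`; as a hypothetical sketch only (the path patterns below are illustrative assumptions, not the script's actual logic), the `changes` outputs could be derived like this:
```bash
# Hypothetical sketch of the `changes` outputs; see scripts/ci/detect_change_scope.sh
# for the real logic. Path patterns here are illustrative assumptions.
changed="$(git diff --name-only "origin/${GITHUB_BASE_REF}...HEAD")"
rust_changed=false; docs_changed=false; workflow_changed=false
if grep -qE '\.rs$|^Cargo\.(toml|lock)$' <<<"$changed"; then rust_changed=true; fi
if grep -qE '^docs/|\.md$' <<<"$changed"; then docs_changed=true; fi
if grep -q '^\.github/workflows/' <<<"$changed"; then workflow_changed=true; fi
docs_only=false
if $docs_changed && ! $rust_changed && ! $workflow_changed; then docs_only=true; fi
echo "docs_only=${docs_only} docs_changed=${docs_changed} rust_changed=${rust_changed} workflow_changed=${workflow_changed}"
```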
### 2) PR from fork -> `dev`
1. External contributor opens PR from `fork/<branch>` into `zeroclaw:dev`.
2. Immediately on `opened`:
- `pull_request_target` workflows start with base-repo context and base-repo token:
- `pr-intake-checks.yml`
- `pr-labeler.yml`
- `pr-auto-response.yml`
- `pull_request` workflows are queued for the fork head commit:
- `ci-run.yml`
- `sec-audit.yml`
- path-scoped workflows (`pub-docker-img.yml`, `workflow-sanity.yml`, `pr-label-policy-check.yml`) if changed files match.
3. Fork-specific permission behavior in `pull_request` workflows:
- token is restricted (read-focused), so jobs that try to write PR comments/status extras can be limited.
- secrets from the base repo are not exposed to fork PR `pull_request` jobs.
4. Approval gate possibility:
- if Actions settings require maintainer approval for fork workflows, the `pull_request` run stays in `action_required`/waiting state until approved.
5. Event fan-out after labeling:
- manual label changes emit `labeled`/`unlabeled` events.
- those events retrigger only label-driven `pull_request_target` automation (`pr-auto-response.yml`); `pr-labeler.yml` now runs only on PR lifecycle events (`opened`/`reopened`/`synchronize`/`ready_for_review`) to reduce churn.
6. When contributor pushes new commits to fork branch (`synchronize`):
- reruns: `pr-intake-checks.yml`, `pr-labeler.yml`, `ci-run.yml`, `sec-audit.yml`, and matching path-scoped PR workflows.
- does not rerun `pr-auto-response.yml` unless label/open events occur.
7. `ci-run.yml` execution details for fork PR:
- `changes` computes `docs_only`, `docs_changed`, `rust_changed`, `workflow_changed`.
- `build` runs for Rust-impacting changes.
- `lint`/`lint-strict-delta`/`test`/`docs-quality` run on PR when `ci:full` label exists.
- `CI Required Gate` emits final pass/fail for the PR head.
8. Fork PR merge blockers to check first when diagnosing stalls:
- run approval pending for fork workflows.
- `license-file-owner-guard` failing when root license files are modified by non-owner PR author.
- `CI Required Gate` failure caused by upstream jobs.
- repeated `pull_request_target` reruns from label churn causing noisy signals.
9. After merge, normal `push` workflows on `dev` execute (scenario 4).
### 3) PR to `main` (direct or from `dev`)
1. Contributor or maintainer opens PR with base `main`.
2. `ci-run.yml` and `sec-audit.yml` run on the PR, plus any path-scoped workflows.
3. Maintainer merges PR once checks and review policy pass.
4. Merge emits a `push` event on `main`.
### 4) Push/Merge Queue to `dev` or `main` (including after merge)
1. Commit reaches `dev` or `main` (usually from a merged PR), or merge queue creates a `merge_group` validation commit.
2. `ci-run.yml` runs on `push` and `merge_group`.
3. `feature-matrix.yml` runs on `push` to `dev` for Rust/workflow paths and on `merge_group`.
4. `sec-audit.yml` runs on `push` and `merge_group`.
5. `sec-codeql.yml` runs on `push`/`merge_group` when Rust/codeql paths change (path-scoped on push).
6. `ci-supply-chain-provenance.yml` runs on push when Rust/build provenance paths change.
7. Path-filtered workflows run only if touched files match their filters.
8. In `ci-run.yml`, push/merge-group behavior differs from PR behavior:
- Rust path: `lint`, `lint-strict-delta`, `test`, `build` are expected.
- Docs/non-rust paths: fast-path behavior applies.
9. `CI Required Gate` computes overall push/merge-group result.
## Docker Publish Logic
Workflow: `.github/workflows/pub-docker-img.yml`
### PR behavior
1. Triggered on `pull_request` to `dev` or `main` when Docker build-input paths change.
2. Runs `PR Docker Smoke` job:
- Builds local smoke image with Buildx builder.
- Verifies container with `docker run ... --version` (see the sketch after this list).
3. Typical runtime in recent sample: ~240.4s.
4. No registry push happens on PR events.
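A minimal local equivalent of that smoke check (`zeroclaw-pr-smoke` mirrors the throwaway tag the job uses):
```bash
# Build the image locally and confirm the binary starts, mirroring the PR smoke job.
docker build -t zeroclaw-pr-smoke:latest .
docker run --rm zeroclaw-pr-smoke:latest --version
```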
### Push behavior
1. `publish` job runs on tag pushes `v*` only.
2. Workflow trigger includes semantic version tag pushes (`v*`) only.
3. Login to `ghcr.io` uses `${{ github.actor }}` and `${{ secrets.GITHUB_TOKEN }}`.
4. Tag computation includes semantic tag from pushed git tag (`vX.Y.Z`) + SHA tag (`sha-<12>`) + `latest` (a sketch follows at the end of this section).
5. Multi-platform publish is used for tag pushes (`linux/amd64,linux/arm64`).
6. `scripts/ci/ghcr_publish_contract_guard.py` validates anonymous pullability and digest parity across `vX.Y.Z`, `sha-<12>`, and `latest`, then emits rollback candidate mapping evidence.
7. Trivy scans are emitted for version, SHA, and latest references.
8. `scripts/ci/ghcr_vulnerability_gate.py` validates Trivy JSON outputs against `.github/release/ghcr-vulnerability-policy.json` and emits audit-event evidence.
9. Typical runtime in recent sample: ~139.9s.
10. Result: pushed image tags under `ghcr.io/<owner>/<repo>` with publish-contract + vulnerability-gate + scan artifacts.
Important: Docker publish now requires a `v*` tag push; regular `dev`/`main` branch pushes do not publish images.
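For reference, the tag set for a `v*` push reduces to roughly the following (a minimal sketch of the publish job's `Compute tags` step):
```bash
# Sketch of the published tag set for a tag push (mirrors the publish job's tag computation).
IMAGE="ghcr.io/${GITHUB_REPOSITORY}"
RELEASE_TAG="${GITHUB_REF#refs/tags/}"    # strips refs/tags/ to leave vX.Y.Z
RELEASE_SHA="$(git rev-parse HEAD)"
echo "${IMAGE}:${RELEASE_TAG},${IMAGE}:sha-${RELEASE_SHA::12},${IMAGE}:latest"
```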
## Release Logic
Workflow: `.github/workflows/pub-release.yml`
1. Trigger modes:
- Tag push `v*` -> publish mode.
- Manual dispatch -> verification-only or publish mode (input-driven).
- Weekly schedule -> verification-only mode.
2. `prepare` resolves release context (`release_ref`, `release_tag`, publish/draft mode) and runs `scripts/ci/release_trigger_guard.py`.
- publish mode enforces actor authorization, stable annotated tag policy, `origin/main` ancestry, and `release_tag` == `Cargo.toml` version at the tag commit.
- trigger provenance is emitted as `release-trigger-guard` artifacts.
3. `build-release` builds matrix artifacts across Linux/macOS/Windows targets.
4. `verify-artifacts` runs `scripts/ci/release_artifact_guard.py` against `.github/release/release-artifact-contract.json` in verify-stage mode (archive contract required; manifest/SBOM/notice checks intentionally skipped) and uploads `release-artifact-guard-verify` evidence.
5. In publish mode, workflow generates SBOM (`CycloneDX` + `SPDX`), `SHA256SUMS`, and a checksum provenance statement (`zeroclaw.sha256sums.intoto.json`) plus an audit-event envelope (checksum generation is sketched after this list).
6. In publish mode, after manifest generation, workflow reruns `release_artifact_guard.py` in full-contract mode and emits `release-artifact-guard.publish.json` plus `audit-event-release-artifact-guard-publish.json`.
7. In publish mode, workflow keyless-signs release artifacts and composes a supply-chain release-notes preface via `release_notes_with_supply_chain_refs.py`.
8. In publish mode, workflow verifies GHCR release-tag availability.
9. In publish mode, workflow creates/updates the GitHub Release for the resolved tag and commit-ish, combining generated supply-chain preface with GitHub auto-generated commit notes.
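Checksum generation is the simplest of those artifacts; a minimal sketch, assuming the release archives have been staged into a `dist/` directory (the workflow's actual staging layout may differ):
```bash
# Generate and verify the checksum manifest over staged release archives.
cd dist
sha256sum * > SHA256SUMS
sha256sum -c SHA256SUMS   # the same command consumers run to verify downloads
```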
Pre-release path:
1. Pre-release tags (`vX.Y.Z-alpha.N`, `vX.Y.Z-beta.N`, `vX.Y.Z-rc.N`) trigger `.github/workflows/pub-prerelease.yml`.
2. `scripts/ci/prerelease_guard.py` enforces stage progression, `origin/main` ancestry, and Cargo version/tag alignment.
3. In publish mode, prerelease assets are attached to a GitHub prerelease for the stage tag.
Canary policy lane:
1. `.github/workflows/ci-canary-gate.yml` runs weekly or manually.
2. `scripts/ci/canary_guard.py` evaluates metrics against `.github/release/canary-policy.json`.
3. Decision output is explicit (`promote`, `hold`, `abort`) with auditable artifacts and optional dispatch signal.
## Merge/Policy Notes
1. Workflow-file changes (`.github/workflows/**`) are validated through `pr-intake-checks.yml`, `ci-change-audit.yml`, and `CI Required Gate` without a dedicated owner-approval gate.
2. PR lint/test strictness is intentionally controlled by `ci:full` label.
3. `sec-audit.yml` runs on PR/push/merge queue (`merge_group`), plus scheduled weekly.
4. `ci-change-audit.yml` enforces pinned `uses:` references for CI/security workflow changes.
5. `sec-audit.yml` includes deny policy hygiene checks (`deny_policy_guard.py`) before cargo-deny.
6. `sec-audit.yml` includes gitleaks allowlist governance checks (`secrets_governance_guard.py`) against `.github/security/gitleaks-allowlist-governance.json`.
7. `ci-reproducible-build.yml` and `ci-supply-chain-provenance.yml` provide scheduled supply-chain assurance signals outside release-only windows.
8. Some workflows are operational and non-merge-path (`pr-check-stale`, `pr-check-status`, `sync-contributors`, etc.).
9. Workflow-specific JavaScript helpers are organized under `.github/workflows/scripts/`.
10. `ci-run.yml` includes cache partitioning (`prefix-key`) across lint/test/build/flake-probe lanes to reduce cache contention.
11. `ci-rollback.yml` provides a guarded rollback planning lane (scheduled dry-run + manual execute controls) with audit artifacts.
12. `ci-queue-hygiene.yml` periodically deduplicates superseded queued runs for lightweight PR automation workflows to reduce queue pressure.
## Mermaid Diagrams
### PR to Dev
```mermaid
flowchart TD
A["PR opened or updated -> dev"] --> B["pull_request_target lane"]
B --> B1["pr-intake-checks.yml"]
B --> B2["pr-labeler.yml"]
B --> B3["pr-auto-response.yml"]
A --> C["pull_request CI lane"]
C --> C1["ci-run.yml"]
C --> C2["sec-audit.yml"]
C --> C3["pub-docker-img.yml (if Docker paths changed)"]
C --> C4["workflow-sanity.yml (if workflow files changed)"]
C --> C5["pr-label-policy-check.yml (if policy files changed)"]
C1 --> D["CI Required Gate"]
D --> E{"Checks + review policy pass?"}
E -->|No| F["PR stays open"]
E -->|Yes| G["Merge PR"]
G --> H["push event on dev"]
```
### Main Delivery and Release
```mermaid
flowchart TD
D0["Commit reaches dev"] --> B0["ci-run.yml"]
D0 --> C0["sec-audit.yml"]
PRM["PR to main"] --> QM["ci-run.yml + sec-audit.yml (+ path-scoped)"]
QM --> M["Merge to main"]
M --> A["Commit reaches main"]
A --> B["ci-run.yml"]
A --> C["sec-audit.yml"]
A --> D["path-scoped workflows (if matched)"]
T["Tag push v*"] --> R["pub-release.yml"]
W["Manual/Scheduled release verify"] --> R
T --> DP["pub-docker-img.yml publish job"]
R --> R1["Artifacts + SBOM + checksums + signatures + GitHub Release"]
W --> R2["Verification build only (no GitHub Release publish)"]
DP --> P1["Push ghcr image tags (version + sha + latest)"]
```
## Quick Troubleshooting
1. Unexpected skipped jobs: inspect `scripts/ci/detect_change_scope.sh` outputs.
2. CI/CD-change PR blocked: verify that an approving review from `@chumyin` is present.
3. Fork PR appears stalled: check whether Actions run approval is pending.
4. Docker not published: confirm a `v*` tag was pushed to the intended commit.

.github/workflows/master-branch-flow.md
@@ -0,0 +1,130 @@
# Master Branch Delivery Flows
This document explains what runs when code is proposed to `master` and released.
Use this with:
- [`docs/contributing/ci-map.md`](../../docs/contributing/ci-map.md)
- [`docs/contributing/pr-workflow.md`](../../docs/contributing/pr-workflow.md)
- [`docs/contributing/release-process.md`](../../docs/contributing/release-process.md)
## Branching Model
ZeroClaw uses a single default branch: `master`. All contributor PRs target `master` directly. There is no `dev` or promotion branch.
Current maintainers with PR approval authority: `theonlyhennygod`, `JordanTheJet`, and `SimianAstronaut7`.
## Active Workflows
| File | Trigger | Purpose |
| --- | --- | --- |
| `checks-on-pr.yml` | `pull_request` → `master` | Lint + test + build + security audit on every PR |
| `cross-platform-build-manual.yml` | `workflow_dispatch` | Full platform build matrix (manual) |
| `release-beta-on-push.yml` | `push` → `master` | Beta release on every master commit |
| `release-stable-manual.yml` | `workflow_dispatch` | Stable release (manual, version-gated) |
## Event Summary
| Event | Workflows triggered |
| --- | --- |
| PR opened or updated against `master` | `checks-on-pr.yml` |
| Push to `master` (including after merge) | `release-beta-on-push.yml` |
| Manual dispatch | `cross-platform-build-manual.yml`, `release-stable-manual.yml` |
## Step-By-Step
### 1) PR → `master`
1. Contributor opens or updates a PR against `master`.
2. `checks-on-pr.yml` starts:
- `lint` job: runs `cargo fmt --check` and `cargo clippy -D warnings`.
- `test` job: runs `cargo nextest run --locked` on `ubuntu-latest` with Rust 1.92.0 and mold linker.
- `build` job (matrix): compiles release binary on `x86_64-unknown-linux-gnu` and `aarch64-apple-darwin`.
- `security` job: runs `cargo audit` and `cargo deny check licenses sources`.
- Concurrency group cancels in-progress runs for the same PR on new pushes.
3. All jobs must pass before merge.
4. Maintainer (`theonlyhennygod`, `JordanTheJet`, or `SimianAstronaut7`) merges PR once checks and review policy are satisfied.
5. Merge emits a `push` event on `master` (see section 2).
### 2) Push to `master` (including after merge)
1. Commit reaches `master`.
2. `release-beta-on-push.yml` (Release Beta) starts:
- `version` job: computes the beta tag as `v{cargo_version}-beta.{run_number}` (see the sketch after this list).
- `build` job (matrix, 4 targets): `x86_64-linux`, `aarch64-linux`, `aarch64-darwin`, `x86_64-windows`.
- `publish` job: generates `SHA256SUMS`, creates a GitHub pre-release with all artifacts. Artifact retention: 7 days.
- `docker` job: builds multi-platform image (`linux/amd64,linux/arm64`) and pushes to `ghcr.io` with `:beta` and the versioned beta tag.
3. This runs on every push to `master` without filtering. Every merged PR produces a beta pre-release.
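A minimal sketch of that version computation, reading the crate version straight from `Cargo.toml` (`GITHUB_RUN_NUMBER` is the standard Actions run counter):
```bash
# Compose the beta tag v{cargo_version}-beta.{run_number} from Cargo.toml and the run counter.
cargo_version="$(sed -n 's/^version = "\([^"]*\)"/\1/p' Cargo.toml | head -n1)"
echo "tag=v${cargo_version}-beta.${GITHUB_RUN_NUMBER}" >> "$GITHUB_OUTPUT"
```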
### 3) Stable Release (manual)
1. Maintainer runs `release-stable-manual.yml` via `workflow_dispatch` with a version input (e.g. `0.2.0`).
2. `validate` job checks (sketched after this list):
- Input matches semver `X.Y.Z` format.
- `Cargo.toml` version matches input exactly.
- Tag `vX.Y.Z` does not already exist on the remote.
3. `build` job (matrix, same 4 targets as beta): compiles release binary.
4. `publish` job: generates `SHA256SUMS`, creates a stable GitHub Release (not pre-release). Artifact retention: 14 days.
5. `docker` job: pushes to `ghcr.io` with `:latest` and `:vX.Y.Z`.
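A minimal sketch of those three validation checks, assuming the dispatch input is exposed as `VERSION`:
```bash
# Validate the stable release input: semver shape, Cargo.toml alignment, tag uniqueness.
set -euo pipefail
[[ "$VERSION" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]] || { echo "::error::version must be X.Y.Z"; exit 1; }
cargo_version="$(sed -n 's/^version = "\([^"]*\)"/\1/p' Cargo.toml | head -n1)"
[[ "$cargo_version" == "$VERSION" ]] || { echo "::error::Cargo.toml has ${cargo_version}"; exit 1; }
if git ls-remote --exit-code --tags origin "refs/tags/v${VERSION}" >/dev/null 2>&1; then
  echo "::error::tag v${VERSION} already exists on the remote"; exit 1
fi
```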
### 4) Full Platform Build (manual)
1. Maintainer runs `cross-platform-build-manual.yml` via `workflow_dispatch`.
2. `build` job (matrix, 3 targets): `aarch64-linux-gnu`, `x86_64-darwin` (macOS 15 Intel), `x86_64-windows-msvc`.
3. Build-only, no tests, no publish. Used to verify cross-compilation on platforms not covered by `checks-on-pr.yml`.
## Build Targets by Workflow
| Target | `checks-on-pr.yml` | `cross-platform-build-manual.yml` | `release-beta-on-push.yml` | `release-stable-manual.yml` |
| --- | :---: | :---: | :---: | :---: |
| `x86_64-unknown-linux-gnu` | ✓ | | ✓ | ✓ |
| `aarch64-unknown-linux-gnu` | | ✓ | ✓ | ✓ |
| `aarch64-apple-darwin` | ✓ | | ✓ | ✓ |
| `x86_64-apple-darwin` | | ✓ | | |
| `x86_64-pc-windows-msvc` | | ✓ | ✓ | ✓ |
## Mermaid Diagrams
### PR to Master
```mermaid
flowchart TD
A["PR opened or updated → master"] --> B["checks-on-pr.yml"]
B --> B0["lint: fmt + clippy"]
B --> B1["test: cargo nextest (ubuntu-latest)"]
B --> B2["build: x86_64-linux + aarch64-darwin"]
B --> B3["security: audit + deny"]
B0 & B1 & B2 & B3 --> C{"Checks pass?"}
C -->|No| D["PR stays open"]
C -->|Yes| E["Maintainer merges"]
E --> F["push event on master"]
```
### Beta Release (on every master push)
```mermaid
flowchart TD
A["Push to master"] --> B["release-beta-on-push.yml"]
B --> B1["version: compute v{x.y.z}-beta.{N}"]
B1 --> B2["build: 4 targets"]
B2 --> B3["publish: GitHub pre-release + SHA256SUMS"]
B2 --> B4["docker: push ghcr.io :beta + versioned tag"]
```
### Stable Release (manual)
```mermaid
flowchart TD
A["workflow_dispatch: version=X.Y.Z"] --> B["release-stable-manual.yml"]
B --> B1["validate: semver + Cargo.toml + tag uniqueness"]
B1 --> B2["build: 4 targets"]
B2 --> B3["publish: GitHub stable release + SHA256SUMS"]
B2 --> B4["docker: push ghcr.io :latest + :vX.Y.Z"]
```
## Quick Troubleshooting
1. **Quality gate failing on PR**: check `lint` job for formatting/clippy issues; check `test` job for test failures; check `build` job for compile errors; check `security` job for audit/deny failures.
2. **Beta release not appearing**: confirm the push landed on `master` (not another branch); check `release-beta-on-push.yml` run status.
3. **Stable release failing at validate**: ensure `Cargo.toml` version matches the input version and the tag does not already exist.
4. **Full matrix build needed**: run `cross-platform-build-manual.yml` manually from the Actions tab.

@@ -1,192 +0,0 @@
name: Nightly All-Features
on:
schedule:
- cron: "15 3 * * *" # Daily 03:15 UTC
workflow_dispatch:
inputs:
fail_on_failure:
description: "Fail workflow when any nightly lane fails"
required: true
default: true
type: boolean
concurrency:
group: nightly-all-features-${{ github.ref || github.run_id }}
cancel-in-progress: true
permissions:
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
CARGO_TERM_COLOR: always
jobs:
nightly-lanes:
name: Nightly Lane (${{ matrix.name }})
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 70
strategy:
fail-fast: false
matrix:
include:
- name: default
command: cargo test --locked --test agent_e2e --verbose
install_libudev: false
- name: whatsapp-web
command: cargo check --locked --no-default-features --features whatsapp-web --verbose
install_libudev: false
- name: browser-native
command: cargo check --locked --no-default-features --features browser-native --verbose
install_libudev: false
- name: nightly-all-features
command: cargo test --locked --all-features --test agent_e2e --verbose
install_libudev: true
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- name: Ensure cargo component
shell: bash
env:
ENSURE_CARGO_COMPONENT_STRICT: "true"
run: bash ./scripts/ci/ensure_cargo_component.sh 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v3
with:
prefix-key: nightly-all-features-${{ matrix.name }}
- name: Ensure Linux deps for all-features lane
if: matrix.install_libudev
shell: bash
run: |
set -euo pipefail
if command -v pkg-config >/dev/null 2>&1 && pkg-config --exists libudev; then
echo "libudev development headers already available; skipping apt install."
exit 0
fi
echo "Installing missing libudev build dependencies..."
for attempt in 1 2 3; do
if sudo apt-get update -qq -o DPkg::Lock::Timeout=300 && \
sudo apt-get install -y --no-install-recommends --no-upgrade -o DPkg::Lock::Timeout=300 libudev-dev pkg-config; then
echo "Dependency installation succeeded on attempt ${attempt}."
exit 0
fi
if [ "$attempt" -eq 3 ]; then
echo "Failed to install libudev-dev/pkg-config after ${attempt} attempts." >&2
exit 1
fi
echo "Dependency installation failed on attempt ${attempt}; retrying in 10s..."
sleep 10
done
- name: Run nightly lane command
id: lane
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
started_at="$(date +%s)"
set +e
bash -lc "${{ matrix.command }}"
status=$?
set -e
finished_at="$(date +%s)"
duration="$((finished_at - started_at))"
lane_status="success"
if [ "$status" -ne 0 ]; then
lane_status="failure"
fi
cat > "artifacts/nightly-result-${{ matrix.name }}.json" <<EOF
{
"lane": "${{ matrix.name }}",
"status": "${lane_status}",
"exit_code": ${status},
"duration_seconds": ${duration},
"command": "${{ matrix.command }}"
}
EOF
{
echo "### Nightly Lane: ${{ matrix.name }}"
echo "- Command: \`${{ matrix.command }}\`"
echo "- Status: ${lane_status}"
echo "- Exit code: ${status}"
echo "- Duration (s): ${duration}"
} >> "$GITHUB_STEP_SUMMARY"
echo "lane_status=${lane_status}" >> "$GITHUB_OUTPUT"
echo "lane_exit_code=${status}" >> "$GITHUB_OUTPUT"
- name: Upload nightly lane artifact
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: nightly-lane-${{ matrix.name }}
path: artifacts/nightly-result-${{ matrix.name }}.json
if-no-files-found: error
retention-days: 30
nightly-summary:
name: Nightly Summary & Routing
needs: [nightly-lanes]
if: always()
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Download nightly artifacts
uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0
with:
path: artifacts
- name: Aggregate nightly report
shell: bash
env:
FAIL_ON_FAILURE_INPUT: ${{ github.event.inputs.fail_on_failure || 'true' }}
run: |
set -euo pipefail
fail_on_failure="true"
if [ "${GITHUB_EVENT_NAME}" = "workflow_dispatch" ]; then
fail_on_failure="${FAIL_ON_FAILURE_INPUT}"
fi
args=()
if [ "$fail_on_failure" = "true" ]; then
args+=(--fail-on-failure)
fi
python3 scripts/ci/nightly_matrix_report.py \
--input-dir artifacts \
--owners-file .github/release/nightly-owner-routing.json \
--output-json artifacts/nightly-summary.json \
--output-md artifacts/nightly-summary.md \
"${args[@]}"
- name: Publish nightly summary
shell: bash
run: |
set -euo pipefail
cat artifacts/nightly-summary.md >> "$GITHUB_STEP_SUMMARY"
- name: Upload nightly summary artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: nightly-all-features-summary
path: |
artifacts/nightly-summary.json
artifacts/nightly-summary.md
if-no-files-found: error
retention-days: 30

@@ -1,64 +0,0 @@
name: Deploy GitHub Pages
on:
push:
branches:
- main
paths:
- site/**
- docs/**
- README.md
- .github/workflows/pages-deploy.yml
workflow_dispatch:
permissions:
contents: read
pages: write
id-token: write
concurrency:
group: github-pages
cancel-in-progress: true
jobs:
build:
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
steps:
- name: Checkout
uses: actions/checkout@v4
- name: Setup Node.js
uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
cache-dependency-path: site/package-lock.json
- name: Install Dependencies
working-directory: site
run: npm ci
- name: Build Site
working-directory: site
run: npm run build
- name: Configure Pages
uses: actions/configure-pages@v5
- name: Upload Artifact
uses: actions/upload-pages-artifact@v3
with:
path: gh-pages
deploy:
needs: build
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
environment:
name: github-pages
url: ${{ steps.deployment.outputs.page_url }}
steps:
- name: Deploy to GitHub Pages
id: deployment
uses: actions/deploy-pages@v4

@@ -1,94 +0,0 @@
name: PR Auto Responder
on:
issues:
types: [opened, reopened, labeled, unlabeled]
pull_request_target:
branches: [dev, main]
types: [opened, labeled, unlabeled]
concurrency:
# Keep cancellation within the same lifecycle action to avoid `labeled`
# events canceling an in-flight `opened` run for the same issue/PR.
group: pr-auto-response-${{ github.event.pull_request.number || github.event.issue.number || github.run_id }}-${{ github.event.action || 'unknown' }}
cancel-in-progress: true
permissions: {}
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
LABEL_POLICY_PATH: .github/label-policy.json
jobs:
contributor-tier-issues:
# Only run for opened/reopened events to avoid duplicate runs with labeled-routes job
if: >-
(github.event_name == 'issues' &&
(github.event.action == 'opened' || github.event.action == 'reopened'))
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
permissions:
contents: read
issues: write
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Apply contributor tier label for issue author
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
env:
LABEL_POLICY_PATH: .github/label-policy.json
with:
script: |
const script = require('./.github/workflows/scripts/pr_auto_response_contributor_tier.js');
await script({ github, context, core });
first-interaction:
if: github.event.action == 'opened'
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
permissions:
issues: write
pull-requests: write
steps:
- name: Greet first-time contributors
uses: actions/first-interaction@a1db7729b356323c7988c20ed6f0d33fe31297be # v1
with:
repo_token: ${{ secrets.GITHUB_TOKEN }}
issue_message: |
Thanks for opening this issue.
Before maintainers triage it, please confirm:
- Repro steps are complete and run on latest `main`
- Environment details are included (OS, Rust version, ZeroClaw version)
- Sensitive values are redacted
This helps us keep issue throughput high and response latency low.
pr_message: |
Thanks for contributing to ZeroClaw.
For faster review, please ensure:
- PR template sections are fully completed
- `cargo fmt --all -- --check`, `cargo clippy --all-targets -- -D warnings`, and `cargo test` are included
- If automation/agents were used heavily, add brief workflow notes
- Scope is focused (prefer one concern per PR)
See `CONTRIBUTING.md` and `docs/pr-workflow.md` for full collaboration rules.
labeled-routes:
if: github.event.action == 'labeled'
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
permissions:
contents: read
issues: write
pull-requests: write
steps:
- name: Checkout repository
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Handle label-driven responses
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
with:
script: |
const script = require('./.github/workflows/scripts/pr_auto_response_labeled_routes.js');
await script({ github, context, core });

@@ -1,50 +0,0 @@
name: PR Check Stale
on:
schedule:
- cron: "20 2 * * *"
workflow_dispatch:
permissions: {}
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
stale:
permissions:
issues: write
pull-requests: write
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
timeout-minutes: 10
steps:
- name: Mark stale issues and pull requests
uses: actions/stale@b5d41d4e1d5dceea10e7104786b73624c18a190f # v10.2.0
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
days-before-issue-stale: 21
days-before-issue-close: 7
days-before-pr-stale: 14
days-before-pr-close: 7
stale-issue-label: stale
stale-pr-label: stale
exempt-issue-labels: security,pinned,no-stale,no-pr-hygiene,maintainer
exempt-pr-labels: no-stale,no-pr-hygiene,maintainer
remove-stale-when-updated: true
exempt-all-assignees: true
operations-per-run: 300
stale-issue-message: |
This issue was automatically marked as stale due to inactivity.
Please provide an update, reproduction details, or current status to keep it open.
close-issue-message: |
Closing this issue due to inactivity.
If the problem still exists on the latest `main`, please open a new issue with fresh repro steps.
close-issue-reason: not_planned
stale-pr-message: |
This PR was automatically marked as stale due to inactivity.
Please rebase/update and post the latest validation results.
close-pr-message: |
Closing this PR due to inactivity.
Maintainers can reopen once the branch is updated and validation is provided.

@@ -1,37 +0,0 @@
name: PR Check Status
on:
schedule:
- cron: "15 8 * * *" # Once daily at 8:15am UTC
workflow_dispatch:
permissions: {}
concurrency:
group: pr-check-status
cancel-in-progress: true
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
nudge-stale-prs:
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
timeout-minutes: 10
permissions:
contents: read
pull-requests: write
issues: write
env:
STALE_HOURS: "48"
steps:
- name: Checkout repository
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Nudge PRs that need rebase or CI refresh
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
with:
script: |
const script = require('./.github/workflows/scripts/pr_check_status_nudge.js');
await script({ github, context, core });

@@ -1,37 +0,0 @@
name: PR Intake Checks
on:
pull_request_target:
branches: [dev, main]
types: [opened, reopened, synchronize, ready_for_review]
concurrency:
group: pr-intake-checks-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: true
permissions:
contents: read
pull-requests: write
issues: write
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
intake:
name: Intake Checks
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
timeout-minutes: 10
steps:
- name: Checkout repository
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Run safe PR intake checks
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
with:
script: |
const script = require('./.github/workflows/scripts/pr_intake_checks.js');
await script({ github, context, core });

@@ -1,81 +0,0 @@
name: PR Label Policy Check
on:
pull_request:
paths:
- ".github/label-policy.json"
- ".github/workflows/pr-labeler.yml"
- ".github/workflows/pr-auto-response.yml"
push:
branches: [dev, main]
paths:
- ".github/label-policy.json"
- ".github/workflows/pr-labeler.yml"
- ".github/workflows/pr-auto-response.yml"
concurrency:
group: pr-label-policy-check-${{ github.event.pull_request.number || github.sha }}
cancel-in-progress: true
permissions:
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
jobs:
contributor-tier-consistency:
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
timeout-minutes: 10
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Verify shared label policy and workflow wiring
shell: bash
run: |
set -euo pipefail
python3 - <<'PY'
import json
import re
from pathlib import Path
policy_path = Path('.github/label-policy.json')
policy = json.loads(policy_path.read_text(encoding='utf-8'))
color = str(policy.get('contributor_tier_color', '')).upper()
rules = policy.get('contributor_tiers', [])
if not re.fullmatch(r'[0-9A-F]{6}', color):
raise SystemExit('invalid contributor_tier_color in .github/label-policy.json')
if not rules:
raise SystemExit('contributor_tiers must not be empty in .github/label-policy.json')
labels = set()
prev_min = None
for entry in rules:
label = str(entry.get('label', '')).strip().lower()
min_merged = int(entry.get('min_merged_prs', 0))
if not label.endswith('contributor'):
raise SystemExit(f'invalid contributor tier label: {label}')
if label in labels:
raise SystemExit(f'duplicate contributor tier label: {label}')
if prev_min is not None and min_merged > prev_min:
raise SystemExit('contributor_tiers must be sorted descending by min_merged_prs')
labels.add(label)
prev_min = min_merged
workflow_paths = [
Path('.github/workflows/pr-labeler.yml'),
Path('.github/workflows/pr-auto-response.yml'),
]
for workflow in workflow_paths:
text = workflow.read_text(encoding='utf-8')
if '.github/label-policy.json' not in text:
raise SystemExit(f'{workflow} must load .github/label-policy.json')
if re.search(r'contributorTierColor\s*=\s*"[0-9A-Fa-f]{6}"', text):
raise SystemExit(f'{workflow} contains hardcoded contributorTierColor')
print('label policy file is valid and workflow consumers are wired to shared policy')
PY

@@ -1,56 +0,0 @@
name: PR Labeler
on:
pull_request_target:
branches: [dev, main]
types: [opened, reopened, synchronize, ready_for_review]
workflow_dispatch:
inputs:
mode:
description: "Run mode for managed-label governance"
required: true
default: "audit"
type: choice
options:
- audit
- repair
concurrency:
group: pr-labeler-${{ github.event.pull_request.number || github.run_id }}
cancel-in-progress: true
permissions:
contents: read
pull-requests: write
issues: write
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
LABEL_POLICY_PATH: .github/label-policy.json
jobs:
label:
runs-on: [self-hosted, Linux, X64, aws-india, light, cpu40]
steps:
- name: Checkout repository
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Apply path labels
if: github.event_name == 'pull_request_target'
uses: actions/labeler@634933edcd8ababfe52f92936142cc22ac488b1b # v6.0.1
continue-on-error: true
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
sync-labels: true
- name: Apply size/risk/module labels
uses: actions/github-script@ed597411d8f924073f98dfc5c65a23a2325f34cd # v8
continue-on-error: true
env:
LABEL_POLICY_PATH: .github/label-policy.json
with:
script: |
const script = require('./.github/workflows/scripts/pr_labeler.js');
await script({ github, context, core });

.github/workflows/pr-path-labeler.yml
@@ -0,0 +1,19 @@
name: PR Path Labeler
on:
pull_request_target:
types: [opened, synchronize, reopened]
permissions:
contents: read
pull-requests: write
jobs:
label:
name: Apply path labels
runs-on: ubuntu-latest
timeout-minutes: 5
steps:
- uses: actions/labeler@8558fd74291d67161a8a78ce36a881fa63b766a9 # v5
with:
sync-labels: true

.github/workflows/pub-aur.yml
@@ -0,0 +1,181 @@
name: Pub AUR Package
on:
workflow_call:
inputs:
release_tag:
description: "Existing release tag (vX.Y.Z)"
required: true
type: string
dry_run:
description: "Generate PKGBUILD only (no push)"
required: false
default: false
type: boolean
secrets:
AUR_SSH_KEY:
required: false
workflow_dispatch:
inputs:
release_tag:
description: "Existing release tag (vX.Y.Z)"
required: true
type: string
dry_run:
description: "Generate PKGBUILD only (no push)"
required: false
default: true
type: boolean
concurrency:
group: aur-publish-${{ github.run_id }}
cancel-in-progress: false
permissions:
contents: read
jobs:
publish-aur:
name: Update AUR Package
runs-on: ubuntu-latest
env:
RELEASE_TAG: ${{ inputs.release_tag }}
DRY_RUN: ${{ inputs.dry_run }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Validate and compute metadata
id: meta
shell: bash
run: |
set -euo pipefail
if [[ ! "$RELEASE_TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "::error::release_tag must be vX.Y.Z format."
exit 1
fi
version="${RELEASE_TAG#v}"
tarball_url="https://github.com/${GITHUB_REPOSITORY}/archive/refs/tags/${RELEASE_TAG}.tar.gz"
tarball_sha="$(curl -fsSL "$tarball_url" | sha256sum | awk '{print $1}')"
if [[ -z "$tarball_sha" ]]; then
echo "::error::Could not compute SHA256 for source tarball."
exit 1
fi
{
echo "version=$version"
echo "tarball_url=$tarball_url"
echo "tarball_sha=$tarball_sha"
} >> "$GITHUB_OUTPUT"
{
echo "### AUR Package Metadata"
echo "- version: \`${version}\`"
echo "- tarball_url: \`${tarball_url}\`"
echo "- tarball_sha: \`${tarball_sha}\`"
} >> "$GITHUB_STEP_SUMMARY"
- name: Generate PKGBUILD
id: pkgbuild
shell: bash
env:
VERSION: ${{ steps.meta.outputs.version }}
TARBALL_SHA: ${{ steps.meta.outputs.tarball_sha }}
run: |
set -euo pipefail
pkgbuild_file="$(mktemp)"
sed -e "s/^pkgver=.*/pkgver=${VERSION}/" \
-e "s/^sha256sums=.*/sha256sums=('${TARBALL_SHA}')/" \
dist/aur/PKGBUILD > "$pkgbuild_file"
echo "pkgbuild_file=$pkgbuild_file" >> "$GITHUB_OUTPUT"
echo "### Generated PKGBUILD" >> "$GITHUB_STEP_SUMMARY"
echo '```bash' >> "$GITHUB_STEP_SUMMARY"
cat "$pkgbuild_file" >> "$GITHUB_STEP_SUMMARY"
echo '```' >> "$GITHUB_STEP_SUMMARY"
- name: Generate .SRCINFO
id: srcinfo
shell: bash
env:
VERSION: ${{ steps.meta.outputs.version }}
TARBALL_SHA: ${{ steps.meta.outputs.tarball_sha }}
run: |
set -euo pipefail
srcinfo_file="$(mktemp)"
sed -e "s/pkgver = .*/pkgver = ${VERSION}/" \
-e "s/sha256sums = .*/sha256sums = ${TARBALL_SHA}/" \
-e "s|zeroclaw-[0-9.]*.tar.gz|zeroclaw-${VERSION}.tar.gz|g" \
-e "s|/v[0-9.]*\.tar\.gz|/v${VERSION}.tar.gz|g" \
dist/aur/.SRCINFO > "$srcinfo_file"
echo "srcinfo_file=$srcinfo_file" >> "$GITHUB_OUTPUT"
- name: Push to AUR
if: inputs.dry_run == false
shell: bash
env:
AUR_SSH_KEY: ${{ secrets.AUR_SSH_KEY }}
PKGBUILD_FILE: ${{ steps.pkgbuild.outputs.pkgbuild_file }}
SRCINFO_FILE: ${{ steps.srcinfo.outputs.srcinfo_file }}
VERSION: ${{ steps.meta.outputs.version }}
run: |
set -euo pipefail
if [[ -z "${AUR_SSH_KEY}" ]]; then
echo "::error::Secret AUR_SSH_KEY is required for non-dry-run."
exit 1
fi
# Set up SSH key: normalize line endings and ensure a trailing newline
mkdir -p ~/.ssh
chmod 700 ~/.ssh
printf '%s\n' "$AUR_SSH_KEY" | tr -d '\r' > ~/.ssh/aur
chmod 600 ~/.ssh/aur
cat > ~/.ssh/config <<'SSH_CONFIG'
Host aur.archlinux.org
IdentityFile ~/.ssh/aur
User aur
StrictHostKeyChecking accept-new
SSH_CONFIG
chmod 600 ~/.ssh/config
# Verify key is valid and print fingerprint for debugging
echo "::group::SSH key diagnostics"
ssh-keygen -l -f ~/.ssh/aur || { echo "::error::AUR_SSH_KEY is not a valid SSH private key"; exit 1; }
echo "::endgroup::"
# Test SSH connectivity before attempting clone
ssh -T -o BatchMode=yes -o ConnectTimeout=10 aur@aur.archlinux.org 2>&1 || true
tmp_dir="$(mktemp -d)"
git clone ssh://aur@aur.archlinux.org/zeroclaw.git "$tmp_dir/aur"
cp "$PKGBUILD_FILE" "$tmp_dir/aur/PKGBUILD"
cp "$SRCINFO_FILE" "$tmp_dir/aur/.SRCINFO"
cd "$tmp_dir/aur"
git config user.name "zeroclaw-bot"
git config user.email "bot@zeroclaw.dev"
git add PKGBUILD .SRCINFO
git commit -m "zeroclaw ${VERSION}"
git push origin HEAD
echo "AUR package updated to ${VERSION}"
- name: Summary
shell: bash
run: |
if [[ "$DRY_RUN" == "true" ]]; then
echo "Dry run complete: PKGBUILD generated, no push performed."
else
echo "Publish complete: AUR package pushed."
fi

@@ -1,423 +0,0 @@
name: Pub Docker Img
on:
push:
tags: ["v*"]
pull_request:
branches: [dev, main]
paths:
- "Dockerfile"
- ".dockerignore"
- "docker-compose.yml"
- "rust-toolchain.toml"
- "dev/config.template.toml"
- ".github/workflows/pub-docker-img.yml"
- ".github/release/ghcr-tag-policy.json"
- ".github/release/ghcr-vulnerability-policy.json"
- "scripts/ci/ghcr_publish_contract_guard.py"
- "scripts/ci/ghcr_vulnerability_gate.py"
workflow_dispatch:
inputs:
release_tag:
description: "Existing release tag to publish (e.g. v0.2.0). Leave empty for smoke-only run."
required: false
type: string
concurrency:
group: docker-${{ github.event.pull_request.number || github.ref }}
cancel-in-progress: true
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
jobs:
pr-smoke:
name: PR Docker Smoke
if: (github.event_name == 'pull_request' && github.event.pull_request.head.repo.full_name == github.repository) || (github.event_name == 'workflow_dispatch' && inputs.release_tag == '')
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 25
permissions:
contents: read
steps:
- name: Checkout repository
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Resolve Docker API version
shell: bash
run: |
set -euo pipefail
server_api="$(docker version --format '{{.Server.APIVersion}}')"
min_api="$(docker version --format '{{.Server.MinAPIVersion}}' 2>/dev/null || true)"
if [[ -z "${server_api}" || "${server_api}" == "<no value>" ]]; then
echo "::error::Unable to detect Docker server API version."
docker version || true
exit 1
fi
echo "DOCKER_API_VERSION=${server_api}" >> "$GITHUB_ENV"
echo "Using Docker API version ${server_api} (server min: ${min_api:-unknown})"
- name: Setup Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
- name: Extract metadata (tags, labels)
if: github.event_name == 'pull_request'
id: meta
uses: docker/metadata-action@c299e40c65443455700f0fdfc63efafe5b349051 # v5
with:
images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}
tags: |
type=ref,event=pr
- name: Build smoke image
uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
with:
context: .
push: false
load: true
provenance: false
sbom: false
tags: zeroclaw-pr-smoke:latest
labels: ${{ steps.meta.outputs.labels || '' }}
platforms: linux/amd64
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Verify image
run: docker run --rm zeroclaw-pr-smoke:latest --version
publish:
name: Build and Push Docker Image
if: github.repository == 'zeroclaw-labs/zeroclaw' && ((github.event_name == 'push' && startsWith(github.ref, 'refs/tags/v')) || (github.event_name == 'workflow_dispatch' && inputs.release_tag != ''))
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 90
permissions:
contents: read
packages: write
security-events: write
steps:
- name: Checkout repository
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
ref: ${{ github.event_name == 'workflow_dispatch' && format('refs/tags/{0}', inputs.release_tag) || github.ref }}
- name: Resolve Docker API version
shell: bash
run: |
set -euo pipefail
server_api="$(docker version --format '{{.Server.APIVersion}}')"
min_api="$(docker version --format '{{.Server.MinAPIVersion}}' 2>/dev/null || true)"
if [[ -z "${server_api}" || "${server_api}" == "<no value>" ]]; then
echo "::error::Unable to detect Docker server API version."
docker version || true
exit 1
fi
echo "DOCKER_API_VERSION=${server_api}" >> "$GITHUB_ENV"
echo "Using Docker API version ${server_api} (server min: ${min_api:-unknown})"
- name: Setup Buildx
uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
- name: Log in to Container Registry
uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Compute tags
id: meta
shell: bash
run: |
set -euo pipefail
IMAGE="${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}"
if [[ "${GITHUB_EVENT_NAME}" == "push" ]]; then
if [[ "${GITHUB_REF}" != refs/tags/v* ]]; then
echo "::error::Docker publish is restricted to v* tag pushes."
exit 1
fi
RELEASE_TAG="${GITHUB_REF#refs/tags/}"
elif [[ "${GITHUB_EVENT_NAME}" == "workflow_dispatch" ]]; then
RELEASE_TAG="${{ inputs.release_tag }}"
if [[ -z "${RELEASE_TAG}" ]]; then
echo "::error::workflow_dispatch publish requires inputs.release_tag"
exit 1
fi
if [[ ! "${RELEASE_TAG}" =~ ^v[0-9]+\.[0-9]+\.[0-9]+([.-][0-9A-Za-z.-]+)?$ ]]; then
echo "::error::release_tag must be vX.Y.Z or vX.Y.Z-suffix (received: ${RELEASE_TAG})"
exit 1
fi
if ! git rev-parse --verify "refs/tags/${RELEASE_TAG}" >/dev/null 2>&1; then
echo "::error::release tag not found in checkout: ${RELEASE_TAG}"
exit 1
fi
else
echo "::error::Unsupported event for publish: ${GITHUB_EVENT_NAME}"
exit 1
fi
RELEASE_SHA="$(git rev-parse HEAD)"
SHA_SUFFIX="sha-${RELEASE_SHA::12}"
SHA_TAG="${IMAGE}:${SHA_SUFFIX}"
LATEST_SUFFIX="latest"
LATEST_TAG="${IMAGE}:${LATEST_SUFFIX}"
VERSION_TAG="${IMAGE}:${RELEASE_TAG}"
TAGS="${VERSION_TAG},${SHA_TAG},${LATEST_TAG}"
{
echo "tags=${TAGS}"
echo "release_tag=${RELEASE_TAG}"
echo "release_sha=${RELEASE_SHA}"
echo "sha_tag=${SHA_SUFFIX}"
echo "latest_tag=${LATEST_SUFFIX}"
} >> "$GITHUB_OUTPUT"
- name: Build and push Docker image
uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
with:
context: .
push: true
build-args: |
ZEROCLAW_CARGO_ALL_FEATURES=true
tags: ${{ steps.meta.outputs.tags }}
platforms: linux/amd64,linux/arm64
cache-from: type=gha
cache-to: type=gha,mode=max
- name: Set GHCR package visibility to public
shell: bash
env:
GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
run: |
set -euo pipefail
owner="${GITHUB_REPOSITORY_OWNER,,}"
repo="${GITHUB_REPOSITORY#*/}"
# Package path can vary depending on repository/package linkage.
candidates=(
"$repo"
"${owner}%2F${repo}"
)
for scope in orgs users; do
for pkg in "${candidates[@]}"; do
code="$(curl -sS -o /tmp/ghcr-visibility.json -w "%{http_code}" \
-X PATCH \
-H "Authorization: Bearer ${GH_TOKEN}" \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
"https://api.github.com/${scope}/${owner}/packages/container/${pkg}/visibility" \
-d '{"visibility":"public"}' || true)"
if [ "$code" = "200" ] || [ "$code" = "204" ]; then
echo "GHCR package visibility is public (${scope}/${owner}/${pkg})."
exit 0
fi
echo "Visibility attempt ${scope}/${owner}/${pkg} returned HTTP ${code}."
done
done
echo "::warning::Unable to update GHCR visibility via API in this run; proceeding to GHCR publish contract verification."
- name: Validate GHCR publish contract
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
python3 scripts/ci/ghcr_publish_contract_guard.py \
--repository "${GITHUB_REPOSITORY,,}" \
--release-tag "${{ steps.meta.outputs.release_tag }}" \
--sha "${{ steps.meta.outputs.release_sha }}" \
--policy-file .github/release/ghcr-tag-policy.json \
--output-json artifacts/ghcr-publish-contract.json \
--output-md artifacts/ghcr-publish-contract.md \
--fail-on-violation
- name: Emit GHCR publish contract audit event
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/ghcr-publish-contract.json ]; then
python3 scripts/ci/emit_audit_event.py \
--event-type ghcr_publish_contract \
--input-json artifacts/ghcr-publish-contract.json \
--output-json artifacts/audit-event-ghcr-publish-contract.json \
--artifact-name ghcr-publish-contract \
--retention-days 21
fi
- name: Publish GHCR contract summary
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/ghcr-publish-contract.md ]; then
cat artifacts/ghcr-publish-contract.md >> "$GITHUB_STEP_SUMMARY"
fi
- name: Upload GHCR publish contract artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: ghcr-publish-contract
path: |
artifacts/ghcr-publish-contract.json
artifacts/ghcr-publish-contract.md
artifacts/audit-event-ghcr-publish-contract.json
if-no-files-found: ignore
retention-days: 21
- name: Scan published image for vulnerabilities (Trivy)
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
TAG_NAME="${{ steps.meta.outputs.release_tag }}"
SHA_TAG="${{ steps.meta.outputs.sha_tag }}"
LATEST_TAG="${{ steps.meta.outputs.latest_tag }}"
IMAGE_BASE="${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}"
VERSION_REF="${IMAGE_BASE}:${TAG_NAME}"
SHA_REF="${IMAGE_BASE}:${SHA_TAG}"
LATEST_REF="${IMAGE_BASE}:${LATEST_TAG}"
SARIF_OUT="artifacts/trivy-${TAG_NAME}.sarif"
TABLE_OUT="artifacts/trivy-${TAG_NAME}.txt"
JSON_OUT="artifacts/trivy-${TAG_NAME}.json"
SHA_TABLE_OUT="artifacts/trivy-${SHA_TAG}.txt"
SHA_JSON_OUT="artifacts/trivy-${SHA_TAG}.json"
LATEST_TABLE_OUT="artifacts/trivy-${LATEST_TAG}.txt"
LATEST_JSON_OUT="artifacts/trivy-${LATEST_TAG}.json"
scan_trivy() {
local image_ref="$1"
local output_prefix="$2"
docker run --rm \
-v "$PWD/artifacts:/work" \
aquasec/trivy:0.58.2 image \
--quiet \
--ignore-unfixed \
--severity HIGH,CRITICAL \
--format json \
--output "/work/${output_prefix}.json" \
"${image_ref}"
docker run --rm \
-v "$PWD/artifacts:/work" \
aquasec/trivy:0.58.2 image \
--quiet \
--ignore-unfixed \
--severity HIGH,CRITICAL \
--format table \
--output "/work/${output_prefix}.txt" \
"${image_ref}"
}
docker run --rm \
-v "$PWD/artifacts:/work" \
aquasec/trivy:0.58.2 image \
--quiet \
--ignore-unfixed \
--severity HIGH,CRITICAL \
--format sarif \
--output "/work/trivy-${TAG_NAME}.sarif" \
"${VERSION_REF}"
scan_trivy "${VERSION_REF}" "trivy-${TAG_NAME}"
scan_trivy "${SHA_REF}" "trivy-${SHA_TAG}"
scan_trivy "${LATEST_REF}" "trivy-${LATEST_TAG}"
echo "Generated Trivy reports:"
ls -1 "$SARIF_OUT" "$TABLE_OUT" "$JSON_OUT" "$SHA_TABLE_OUT" "$SHA_JSON_OUT" "$LATEST_TABLE_OUT" "$LATEST_JSON_OUT"
- name: Validate GHCR vulnerability gate
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/ghcr_vulnerability_gate.py \
--release-tag "${{ steps.meta.outputs.release_tag }}" \
--sha-tag "${{ steps.meta.outputs.sha_tag }}" \
--latest-tag "${{ steps.meta.outputs.latest_tag }}" \
--release-report-json "artifacts/trivy-${{ steps.meta.outputs.release_tag }}.json" \
--sha-report-json "artifacts/trivy-${{ steps.meta.outputs.sha_tag }}.json" \
--latest-report-json "artifacts/trivy-${{ steps.meta.outputs.latest_tag }}.json" \
--policy-file .github/release/ghcr-vulnerability-policy.json \
--output-json artifacts/ghcr-vulnerability-gate.json \
--output-md artifacts/ghcr-vulnerability-gate.md \
--fail-on-violation
- name: Emit GHCR vulnerability gate audit event
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/ghcr-vulnerability-gate.json ]; then
python3 scripts/ci/emit_audit_event.py \
--event-type ghcr_vulnerability_gate \
--input-json artifacts/ghcr-vulnerability-gate.json \
--output-json artifacts/audit-event-ghcr-vulnerability-gate.json \
--artifact-name ghcr-vulnerability-gate \
--retention-days 21
fi
- name: Publish GHCR vulnerability summary
if: always()
shell: bash
run: |
set -euo pipefail
if [ -f artifacts/ghcr-vulnerability-gate.md ]; then
cat artifacts/ghcr-vulnerability-gate.md >> "$GITHUB_STEP_SUMMARY"
fi
- name: Upload GHCR vulnerability gate artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: ghcr-vulnerability-gate
path: |
artifacts/ghcr-vulnerability-gate.json
artifacts/ghcr-vulnerability-gate.md
artifacts/audit-event-ghcr-vulnerability-gate.json
if-no-files-found: ignore
retention-days: 21
- name: Detect Trivy SARIF report
id: trivy-sarif
if: always()
shell: bash
run: |
set -euo pipefail
sarif_path="artifacts/trivy-${{ steps.meta.outputs.release_tag }}.sarif"
if [ -f "${sarif_path}" ]; then
echo "exists=true" >> "$GITHUB_OUTPUT"
else
echo "exists=false" >> "$GITHUB_OUTPUT"
echo "::notice::Trivy SARIF report not found at ${sarif_path}; skipping SARIF upload."
fi
- name: Upload Trivy SARIF
if: always() && steps.trivy-sarif.outputs.exists == 'true'
uses: github/codeql-action/upload-sarif@89a39a4e59826350b863aa6b6252a07ad50cf83e # v4
with:
sarif_file: artifacts/trivy-${{ steps.meta.outputs.release_tag }}.sarif
category: ghcr-trivy
- name: Upload Trivy report artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: ghcr-trivy-report
path: |
artifacts/trivy-${{ steps.meta.outputs.release_tag }}.sarif
artifacts/trivy-${{ steps.meta.outputs.release_tag }}.txt
artifacts/trivy-${{ steps.meta.outputs.release_tag }}.json
artifacts/trivy-sha-*.txt
artifacts/trivy-sha-*.json
artifacts/trivy-latest.txt
artifacts/trivy-latest.json
if-no-files-found: ignore
retention-days: 14

.github/workflows/pub-homebrew-core.yml
@@ -0,0 +1,228 @@
name: Pub Homebrew Core
on:
workflow_call:
inputs:
release_tag:
description: "Existing release tag to publish (vX.Y.Z)"
required: true
type: string
dry_run:
description: "Patch formula only (no push/PR)"
required: false
default: false
type: boolean
secrets:
HOMEBREW_UPSTREAM_PR_TOKEN:
required: false
HOMEBREW_CORE_BOT_TOKEN:
required: false
workflow_dispatch:
inputs:
release_tag:
description: "Existing release tag to publish (vX.Y.Z)"
required: true
type: string
dry_run:
description: "Patch formula only (no push/PR)"
required: false
default: true
type: boolean
concurrency:
group: homebrew-core-${{ github.run_id }}
cancel-in-progress: false
permissions:
contents: read
jobs:
publish-homebrew-core:
name: Publish Homebrew Core PR
runs-on: ubuntu-latest
env:
UPSTREAM_REPO: Homebrew/homebrew-core
FORMULA_PATH: Formula/z/zeroclaw.rb
RELEASE_TAG: ${{ inputs.release_tag }}
DRY_RUN: ${{ inputs.dry_run }}
BOT_FORK_REPO: ${{ vars.HOMEBREW_CORE_BOT_FORK_REPO }}
BOT_EMAIL: ${{ vars.HOMEBREW_CORE_BOT_EMAIL }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Validate release tag and version alignment
id: release_meta
shell: bash
run: |
set -euo pipefail
semver_pattern='^v[0-9]+\.[0-9]+\.[0-9]+([.-][0-9A-Za-z.-]+)?$'
if [[ ! "$RELEASE_TAG" =~ $semver_pattern ]]; then
echo "::error::release_tag must match semver-like format (vX.Y.Z[-suffix])."
exit 1
fi
if ! git rev-parse "refs/tags/${RELEASE_TAG}" >/dev/null 2>&1; then
git fetch --tags origin
git rev-parse "refs/tags/${RELEASE_TAG}" >/dev/null 2>&1 \
|| { echo "::error::Tag ${RELEASE_TAG} not found after fetching from origin."; exit 1; }
fi
tag_version="${RELEASE_TAG#v}"
cargo_version="$(git show "${RELEASE_TAG}:Cargo.toml" \
| sed -n 's/^version = "\([^"]*\)"/\1/p' | head -n1)"
if [[ -z "$cargo_version" ]]; then
echo "::error::Unable to read Cargo.toml version from tag ${RELEASE_TAG}."
exit 1
fi
if [[ "$cargo_version" != "$tag_version" ]]; then
echo "::error::Tag ${RELEASE_TAG} does not match Cargo.toml version (${cargo_version})."
exit 1
fi
tarball_url="https://github.com/${GITHUB_REPOSITORY}/archive/refs/tags/${RELEASE_TAG}.tar.gz"
tarball_sha="$(curl -fsSL "$tarball_url" | sha256sum | awk '{print $1}')"
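# The sha256 is computed from the tarball as served right now, so the
# release tag must already exist and be final before this workflow runs.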
{
echo "tag_version=$tag_version"
echo "tarball_url=$tarball_url"
echo "tarball_sha=$tarball_sha"
} >> "$GITHUB_OUTPUT"
{
echo "### Release Metadata"
echo "- release_tag: \`${RELEASE_TAG}\`"
echo "- cargo_version: \`${cargo_version}\`"
echo "- tarball_sha256: \`${tarball_sha}\`"
echo "- dry_run: ${DRY_RUN}"
} >> "$GITHUB_STEP_SUMMARY"
- name: Patch Homebrew formula
id: patch_formula
shell: bash
env:
HOMEBREW_CORE_BOT_TOKEN: ${{ secrets.HOMEBREW_UPSTREAM_PR_TOKEN || secrets.HOMEBREW_CORE_BOT_TOKEN }}
GH_TOKEN: ${{ secrets.HOMEBREW_UPSTREAM_PR_TOKEN || secrets.HOMEBREW_CORE_BOT_TOKEN }}
run: |
set -euo pipefail
tmp_repo="$(mktemp -d)"
echo "tmp_repo=$tmp_repo" >> "$GITHUB_OUTPUT"
if [[ "$DRY_RUN" == "true" ]]; then
git clone --depth=1 "https://github.com/${UPSTREAM_REPO}.git" "$tmp_repo/homebrew-core"
else
if [[ -z "${BOT_FORK_REPO}" ]]; then
echo "::error::Repository variable HOMEBREW_CORE_BOT_FORK_REPO is required when dry_run=false."
exit 1
fi
if [[ -z "${HOMEBREW_CORE_BOT_TOKEN}" ]]; then
echo "::error::Repository secret HOMEBREW_CORE_BOT_TOKEN is required when dry_run=false."
exit 1
fi
if [[ "$BOT_FORK_REPO" != */* ]]; then
echo "::error::HOMEBREW_CORE_BOT_FORK_REPO must be in owner/repo format."
exit 1
fi
if ! gh api "repos/${BOT_FORK_REPO}" >/dev/null 2>&1; then
echo "::error::HOMEBREW_CORE_BOT_TOKEN cannot access ${BOT_FORK_REPO}."
exit 1
fi
gh repo clone "${BOT_FORK_REPO}" "$tmp_repo/homebrew-core" -- --depth=1
fi
repo_dir="$tmp_repo/homebrew-core"
formula_file="$repo_dir/$FORMULA_PATH"
if [[ ! -f "$formula_file" ]]; then
echo "::error::Formula file not found: $FORMULA_PATH"
exit 1
fi
if [[ "$DRY_RUN" == "false" ]]; then
if git -C "$repo_dir" remote get-url upstream >/dev/null 2>&1; then
git -C "$repo_dir" remote set-url upstream "https://github.com/${UPSTREAM_REPO}.git"
else
git -C "$repo_dir" remote add upstream "https://github.com/${UPSTREAM_REPO}.git"
fi
if git -C "$repo_dir" ls-remote --exit-code --heads upstream main >/dev/null 2>&1; then
upstream_ref="main"
else
upstream_ref="master"
fi
git -C "$repo_dir" fetch --depth=1 upstream "$upstream_ref"
branch_name="zeroclaw-${RELEASE_TAG}-${GITHUB_RUN_ID}"
git -C "$repo_dir" checkout -B "$branch_name" "upstream/$upstream_ref"
echo "branch_name=$branch_name" >> "$GITHUB_OUTPUT"
fi
tarball_url="$(grep 'tarball_url=' "$GITHUB_OUTPUT" | head -1 | cut -d= -f2-)"
tarball_sha="$(grep 'tarball_sha=' "$GITHUB_OUTPUT" | head -1 | cut -d= -f2-)"
perl -0pi -e "s|^ url \".*\"| url \"${tarball_url}\"|m" "$formula_file"
perl -0pi -e "s|^ sha256 \".*\"| sha256 \"${tarball_sha}\"|m" "$formula_file"
perl -0pi -e "s|^ license \".*\"| license \"Apache-2.0 OR MIT\"|m" "$formula_file"
# Ensure Node.js build dependency is declared so that build.rs can
# run `npm ci && npm run build` to produce the web frontend assets.
if ! grep -q 'depends_on "node" => :build' "$formula_file"; then
perl -0pi -e 's|( depends_on "rust" => :build\n)|\1 depends_on "node" => :build\n|m' "$formula_file"
fi
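# Sketch of the resulting formula stanza (hypothetical version and sha256):
#   url "https://github.com/zeroclaw-labs/zeroclaw/archive/refs/tags/v0.6.1.tar.gz"
#   sha256 "3f2a9c...d41e"
#   license "Apache-2.0 OR MIT"
#   depends_on "rust" => :build
#   depends_on "node" => :build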
git -C "$repo_dir" diff -- "$FORMULA_PATH" > "$tmp_repo/formula.diff"
if [[ ! -s "$tmp_repo/formula.diff" ]]; then
echo "::error::No formula changes generated. Nothing to publish."
exit 1
fi
{
echo "### Formula Diff"
echo '```diff'
cat "$tmp_repo/formula.diff"
echo '```'
} >> "$GITHUB_STEP_SUMMARY"
- name: Push branch and open Homebrew PR
if: inputs.dry_run == false
shell: bash
env:
GH_TOKEN: ${{ secrets.HOMEBREW_UPSTREAM_PR_TOKEN || secrets.HOMEBREW_CORE_BOT_TOKEN }}
TMP_REPO: ${{ steps.patch_formula.outputs.tmp_repo }}
BRANCH_NAME: ${{ steps.patch_formula.outputs.branch_name }}
UPSTREAM_REF: ${{ steps.patch_formula.outputs.upstream_ref }}
TAG_VERSION: ${{ steps.release_meta.outputs.tag_version }}
TARBALL_URL: ${{ steps.release_meta.outputs.tarball_url }}
TARBALL_SHA: ${{ steps.release_meta.outputs.tarball_sha }}
run: |
set -euo pipefail
repo_dir="${TMP_REPO}/homebrew-core"
fork_owner="${BOT_FORK_REPO%%/*}"
bot_email="${BOT_EMAIL:-${fork_owner}@users.noreply.github.com}"
git -C "$repo_dir" config user.name "$fork_owner"
git -C "$repo_dir" config user.email "$bot_email"
git -C "$repo_dir" add "$FORMULA_PATH"
git -C "$repo_dir" commit -m "zeroclaw ${TAG_VERSION}"
gh auth setup-git
git -C "$repo_dir" push --set-upstream origin "$BRANCH_NAME"
pr_body="Automated formula bump from ZeroClaw release workflow.
- Release tag: ${RELEASE_TAG}
- Source tarball: ${TARBALL_URL}
- Source sha256: ${TARBALL_SHA}"
gh pr create \
--repo "$UPSTREAM_REPO" \
--base "${UPSTREAM_REF:-main}" \
--head "${fork_owner}:${BRANCH_NAME}" \
--title "zeroclaw ${TAG_VERSION}" \
--body "$pr_body"
- name: Summary
shell: bash
run: |
if [[ "$DRY_RUN" == "true" ]]; then
echo "Dry run complete: formula diff generated, no push/PR performed."
else
echo "Publish complete: branch pushed and PR opened from bot fork."
fi


@@ -1,261 +0,0 @@
name: Pub Pre-release
on:
push:
tags:
- "v*-alpha.*"
- "v*-beta.*"
- "v*-rc.*"
workflow_dispatch:
inputs:
tag:
description: "Existing pre-release tag (e.g. v0.1.8-rc.1)"
required: true
default: ""
type: string
mode:
description: "dry-run validates/builds only; publish creates prerelease"
required: true
default: dry-run
type: choice
options:
- dry-run
- publish
draft:
description: "Create prerelease as draft"
required: true
default: true
type: boolean
concurrency:
group: prerelease-${{ github.ref || github.run_id }}
cancel-in-progress: false
permissions:
contents: write
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
CARGO_TERM_COLOR: always
jobs:
prerelease-guard:
name: Pre-release Guard
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 20
outputs:
release_tag: ${{ steps.vars.outputs.release_tag }}
mode: ${{ steps.vars.outputs.mode }}
draft: ${{ steps.vars.outputs.draft }}
ready_to_publish: ${{ steps.extract.outputs.ready_to_publish }}
stage: ${{ steps.extract.outputs.stage }}
transition_outcome: ${{ steps.extract.outputs.transition_outcome }}
latest_stage: ${{ steps.extract.outputs.latest_stage }}
latest_stage_tag: ${{ steps.extract.outputs.latest_stage_tag }}
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- name: Resolve prerelease inputs
id: vars
shell: bash
run: |
set -euo pipefail
if [ "${GITHUB_EVENT_NAME}" = "push" ]; then
release_tag="${GITHUB_REF_NAME}"
mode="publish"
draft="false"
else
release_tag="${{ inputs.tag }}"
mode="${{ inputs.mode }}"
draft="${{ inputs.draft }}"
fi
{
echo "release_tag=${release_tag}"
echo "mode=${mode}"
echo "draft=${draft}"
} >> "$GITHUB_OUTPUT"
- name: Validate prerelease stage gate
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
python3 scripts/ci/prerelease_guard.py \
--repo-root . \
--tag "${{ steps.vars.outputs.release_tag }}" \
--stage-config-file .github/release/prerelease-stage-gates.json \
--mode "${{ steps.vars.outputs.mode }}" \
--output-json artifacts/prerelease-guard.json \
--output-md artifacts/prerelease-guard.md \
--fail-on-violation
- name: Extract prerelease outputs
id: extract
shell: bash
run: |
set -euo pipefail
ready_to_publish="$(python3 - <<'PY'
import json
data = json.load(open('artifacts/prerelease-guard.json', encoding='utf-8'))
print(str(bool(data.get('ready_to_publish', False))).lower())
PY
)"
stage="$(python3 - <<'PY'
import json
data = json.load(open('artifacts/prerelease-guard.json', encoding='utf-8'))
print(data.get('stage', 'unknown'))
PY
)"
transition_outcome="$(python3 - <<'PY'
import json
data = json.load(open('artifacts/prerelease-guard.json', encoding='utf-8'))
transition = data.get('transition') or {}
print(transition.get('outcome', 'unknown'))
PY
)"
latest_stage="$(python3 - <<'PY'
import json
data = json.load(open('artifacts/prerelease-guard.json', encoding='utf-8'))
history = data.get('stage_history') or {}
print(history.get('latest_stage', 'unknown'))
PY
)"
latest_stage_tag="$(python3 - <<'PY'
import json
data = json.load(open('artifacts/prerelease-guard.json', encoding='utf-8'))
history = data.get('stage_history') or {}
print(history.get('latest_tag', 'unknown'))
PY
)"
{
echo "ready_to_publish=${ready_to_publish}"
echo "stage=${stage}"
echo "transition_outcome=${transition_outcome}"
echo "latest_stage=${latest_stage}"
echo "latest_stage_tag=${latest_stage_tag}"
} >> "$GITHUB_OUTPUT"
- name: Emit prerelease audit event
if: always()
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/emit_audit_event.py \
--event-type prerelease_guard \
--input-json artifacts/prerelease-guard.json \
--output-json artifacts/audit-event-prerelease-guard.json \
--artifact-name prerelease-guard \
--retention-days 21
- name: Publish prerelease summary
if: always()
shell: bash
run: |
set -euo pipefail
cat artifacts/prerelease-guard.md >> "$GITHUB_STEP_SUMMARY"
- name: Upload prerelease guard artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: prerelease-guard
path: |
artifacts/prerelease-guard.json
artifacts/prerelease-guard.md
artifacts/audit-event-prerelease-guard.json
if-no-files-found: error
retention-days: 21
build-prerelease:
name: Build Pre-release Artifact
needs: [prerelease-guard]
# Keep GNU Linux prerelease artifacts on a runner whose GLIBC baseline
# stays compatible with Debian 12 / Ubuntu 22.04 hosts.
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 45
steps:
- name: Checkout tag
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
ref: ${{ needs.prerelease-guard.outputs.release_tag }}
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v3
with:
prefix-key: prerelease-${{ needs.prerelease-guard.outputs.release_tag }}
cache-targets: true
- name: Build release-fast binary
shell: bash
run: |
set -euo pipefail
cargo build --profile release-fast --locked --target x86_64-unknown-linux-gnu
- name: Package prerelease artifact
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
cp target/x86_64-unknown-linux-gnu/release-fast/zeroclaw artifacts/zeroclaw
tar czf artifacts/zeroclaw-x86_64-unknown-linux-gnu.tar.gz -C artifacts zeroclaw
rm artifacts/zeroclaw
- name: Generate manifest + checksums
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/release_manifest.py \
--artifacts-dir artifacts \
--release-tag "${{ needs.prerelease-guard.outputs.release_tag }}" \
--output-json artifacts/prerelease-manifest.json \
--output-md artifacts/prerelease-manifest.md \
--checksums-path artifacts/SHA256SUMS \
--fail-empty
- name: Publish prerelease build summary
shell: bash
run: |
set -euo pipefail
cat artifacts/prerelease-manifest.md >> "$GITHUB_STEP_SUMMARY"
- name: Upload prerelease build artifacts
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: prerelease-artifacts
path: artifacts/*
if-no-files-found: error
retention-days: 14
publish-prerelease:
name: Publish GitHub Pre-release
needs: [prerelease-guard, build-prerelease]
if: needs.prerelease-guard.outputs.ready_to_publish == 'true'
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 15
steps:
- name: Download prerelease artifacts
uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0
with:
name: prerelease-artifacts
path: artifacts
- name: Create or update GitHub pre-release
uses: softprops/action-gh-release@a06a81a03ee405af7f2048a818ed3f03bbf83c7b # v2
with:
tag_name: ${{ needs.prerelease-guard.outputs.release_tag }}
prerelease: true
draft: ${{ needs.prerelease-guard.outputs.draft == 'true' }}
generate_release_notes: true
files: |
artifacts/**/*
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}


@@ -1,722 +0,0 @@
name: Pub Release
on:
push:
tags: ["v*"]
workflow_dispatch:
inputs:
release_ref:
description: "Git ref (branch, tag, or SHA) to build"
required: false
default: "main"
type: string
publish_release:
description: "Publish a GitHub release (false = verification build only)"
required: false
default: false
type: boolean
release_tag:
description: "Existing release tag (required when publish_release=true), e.g. v0.1.1"
required: false
default: ""
type: string
draft:
description: "Create release as draft (manual publish only)"
required: false
default: true
type: boolean
schedule:
# Weekly release-readiness verification on default branch (no publish)
- cron: "17 8 * * 1"
concurrency:
group: release-${{ github.ref || github.run_id }}
cancel-in-progress: false
permissions:
contents: write
packages: read
id-token: write # Required for cosign keyless signing via OIDC
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
CARGO_TERM_COLOR: always
jobs:
prepare:
name: Prepare Release Context
if: github.event_name != 'push' || !contains(github.ref_name, '-')
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
outputs:
release_ref: ${{ steps.vars.outputs.release_ref }}
release_tag: ${{ steps.vars.outputs.release_tag }}
publish_release: ${{ steps.vars.outputs.publish_release }}
draft_release: ${{ steps.vars.outputs.draft_release }}
steps:
- name: Resolve release inputs
id: vars
shell: bash
run: |
set -euo pipefail
event_name="${GITHUB_EVENT_NAME}"
publish_release="false"
draft_release="false"
if [[ "$event_name" == "push" ]]; then
release_ref="${GITHUB_REF_NAME}"
release_tag="${GITHUB_REF_NAME}"
publish_release="true"
elif [[ "$event_name" == "workflow_dispatch" ]]; then
release_ref="${{ inputs.release_ref }}"
publish_release="${{ inputs.publish_release }}"
draft_release="${{ inputs.draft }}"
if [[ "$publish_release" == "true" ]]; then
release_tag="${{ inputs.release_tag }}"
if [[ -z "$release_tag" ]]; then
echo "::error::release_tag is required when publish_release=true"
exit 1
fi
release_ref="$release_tag"
else
release_tag="verify-${GITHUB_SHA::12}"
fi
else
# schedule
release_ref="main"
release_tag="verify-${GITHUB_SHA::12}"
fi
{
echo "release_ref=${release_ref}"
echo "release_tag=${release_tag}"
echo "publish_release=${publish_release}"
echo "draft_release=${draft_release}"
} >> "$GITHUB_OUTPUT"
{
echo "### Release Context"
echo "- event: ${event_name}"
echo "- release_ref: ${release_ref}"
echo "- release_tag: ${release_tag}"
echo "- publish_release: ${publish_release}"
echo "- draft_release: ${draft_release}"
} >> "$GITHUB_STEP_SUMMARY"
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Install gh CLI
shell: bash
run: |
set -euo pipefail
if command -v gh &>/dev/null; then
echo "gh already available: $(gh --version | head -1)"
exit 0
fi
echo "Installing gh CLI..."
curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg \
| sudo dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg
echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \
| sudo tee /etc/apt/sources.list.d/github-cli.list > /dev/null
for i in {1..60}; do
if sudo fuser /var/lib/apt/lists/lock >/dev/null 2>&1 \
|| sudo fuser /var/lib/dpkg/lock-frontend >/dev/null 2>&1 \
|| sudo fuser /var/lib/dpkg/lock >/dev/null 2>&1; then
echo "apt/dpkg locked; waiting ($i/60)..."
sleep 5
else
break
fi
done
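# The loop above waits up to 60 x 5s = 5 minutes for other apt/dpkg
# consumers (e.g. unattended-upgrades) to release their locks.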
sudo apt-get -o DPkg::Lock::Timeout=600 -o Acquire::Retries=3 update -qq
sudo apt-get -o DPkg::Lock::Timeout=600 -o Acquire::Retries=3 install -y gh
env:
GH_TOKEN: ${{ github.token }}
- name: Validate release trigger and authorization guard
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
python3 scripts/ci/release_trigger_guard.py \
--repo-root . \
--repository "${GITHUB_REPOSITORY}" \
--event-name "${GITHUB_EVENT_NAME}" \
--actor "${GITHUB_ACTOR}" \
--release-ref "${{ steps.vars.outputs.release_ref }}" \
--release-tag "${{ steps.vars.outputs.release_tag }}" \
--publish-release "${{ steps.vars.outputs.publish_release }}" \
--authorized-actors "${{ vars.RELEASE_AUTHORIZED_ACTORS || 'willsarg,theonlyhennygod,chumyin' }}" \
--authorized-tagger-emails "${{ vars.RELEASE_AUTHORIZED_TAGGER_EMAILS || '' }}" \
--require-annotated-tag true \
--output-json artifacts/release-trigger-guard.json \
--output-md artifacts/release-trigger-guard.md \
--fail-on-violation
env:
GH_TOKEN: ${{ github.token }}
- name: Emit release trigger audit event
if: always()
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/emit_audit_event.py \
--event-type release_trigger_guard \
--input-json artifacts/release-trigger-guard.json \
--output-json artifacts/audit-event-release-trigger-guard.json \
--artifact-name release-trigger-guard \
--retention-days 30
- name: Publish release trigger guard summary
if: always()
shell: bash
run: |
set -euo pipefail
cat artifacts/release-trigger-guard.md >> "$GITHUB_STEP_SUMMARY"
- name: Upload release trigger guard artifacts
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: release-trigger-guard
path: |
artifacts/release-trigger-guard.json
artifacts/release-trigger-guard.md
artifacts/audit-event-release-trigger-guard.json
if-no-files-found: error
retention-days: 30
build-release:
name: Build ${{ matrix.target }}
needs: [prepare]
runs-on: ${{ matrix.os }}
timeout-minutes: 40
env:
CARGO_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}-${{ matrix.target }}/cargo
RUSTUP_HOME: ${{ github.workspace }}/.ci-rust/${{ github.run_id }}-${{ github.run_attempt }}-${{ github.job }}-${{ matrix.target }}/rustup
CARGO_TARGET_DIR: ${{ github.workspace }}/target
strategy:
fail-fast: false
matrix:
include:
# Keep GNU Linux release artifacts on a broadly compatible GLIBC
# baseline for user distributions.
- os: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
target: x86_64-unknown-linux-gnu
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: ""
linker_env: ""
linker: ""
- os: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
target: x86_64-unknown-linux-musl
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: ""
linker_env: ""
linker: ""
use_cross: true
- os: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
target: aarch64-unknown-linux-gnu
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: gcc-aarch64-linux-gnu
linker_env: CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER
linker: aarch64-linux-gnu-gcc
- os: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
target: aarch64-unknown-linux-musl
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: ""
linker_env: ""
linker: ""
use_cross: true
- os: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
target: armv7-unknown-linux-gnueabihf
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: gcc-arm-linux-gnueabihf
linker_env: CARGO_TARGET_ARMV7_UNKNOWN_LINUX_GNUEABIHF_LINKER
linker: arm-linux-gnueabihf-gcc
- os: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
target: armv7-linux-androideabi
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: ""
linker_env: ""
linker: ""
android_ndk: true
android_api: 21
- os: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
target: aarch64-linux-android
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: ""
linker_env: ""
linker: ""
android_ndk: true
android_api: 21
- os: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
target: x86_64-unknown-freebsd
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: ""
linker_env: ""
linker: ""
use_cross: true
- os: macos-15-intel
target: x86_64-apple-darwin
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: ""
linker_env: ""
linker: ""
- os: macos-14
target: aarch64-apple-darwin
artifact: zeroclaw
archive_ext: tar.gz
cross_compiler: ""
linker_env: ""
linker: ""
- os: windows-latest
target: x86_64-pc-windows-msvc
artifact: zeroclaw.exe
archive_ext: zip
cross_compiler: ""
linker_env: ""
linker: ""
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
ref: ${{ needs.prepare.outputs.release_ref }}
- name: Self-heal Rust toolchain cache
shell: bash
run: ./scripts/ci/self_heal_rust_toolchain.sh 1.92.0
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
targets: ${{ matrix.target }}
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v3
if: runner.os != 'Windows'
- name: Install cross for cross-built targets
if: matrix.use_cross
shell: bash
run: |
set -euo pipefail
echo "${CARGO_HOME:-$HOME/.cargo}/bin" >> "$GITHUB_PATH"
cargo install cross --locked --version 0.2.5
command -v cross
cross --version
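# cross 0.2.x drives the build inside pinned Docker images that already
# ship the musl / FreeBSD toolchains, so nothing extra is installed on
# the host for use_cross targets.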
- name: Install cross-compilation toolchain (Linux)
if: runner.os == 'Linux' && matrix.cross_compiler != ''
run: |
set -euo pipefail
for i in {1..60}; do
if sudo fuser /var/lib/apt/lists/lock >/dev/null 2>&1 \
|| sudo fuser /var/lib/dpkg/lock-frontend >/dev/null 2>&1 \
|| sudo fuser /var/lib/dpkg/lock >/dev/null 2>&1; then
echo "apt/dpkg locked; waiting ($i/60)..."
sleep 5
else
break
fi
done
sudo apt-get -o DPkg::Lock::Timeout=600 -o Acquire::Retries=3 update -qq
sudo apt-get -o DPkg::Lock::Timeout=600 -o Acquire::Retries=3 install -y "${{ matrix.cross_compiler }}"
# Install matching libc dev headers for cross targets
# (required by ring/aws-lc-sys C compilation)
case "${{ matrix.target }}" in
armv7-unknown-linux-gnueabihf)
sudo apt-get -o DPkg::Lock::Timeout=600 -o Acquire::Retries=3 install -y libc6-dev-armhf-cross ;;
aarch64-unknown-linux-gnu)
sudo apt-get -o DPkg::Lock::Timeout=600 -o Acquire::Retries=3 install -y libc6-dev-arm64-cross ;;
esac
- name: Setup Android NDK
if: matrix.android_ndk
shell: bash
run: |
set -euo pipefail
NDK_VERSION="r26d"
NDK_ZIP="android-ndk-${NDK_VERSION}-linux.zip"
NDK_URL="https://dl.google.com/android/repository/${NDK_ZIP}"
NDK_ROOT="${RUNNER_TEMP}/android-ndk"
NDK_HOME="${NDK_ROOT}/android-ndk-${NDK_VERSION}"
for i in {1..60}; do
if sudo fuser /var/lib/apt/lists/lock >/dev/null 2>&1 \
|| sudo fuser /var/lib/dpkg/lock-frontend >/dev/null 2>&1 \
|| sudo fuser /var/lib/dpkg/lock >/dev/null 2>&1; then
echo "apt/dpkg locked; waiting ($i/60)..."
sleep 5
else
break
fi
done
sudo apt-get -o DPkg::Lock::Timeout=600 -o Acquire::Retries=3 update -qq
sudo apt-get -o DPkg::Lock::Timeout=600 -o Acquire::Retries=3 install -y unzip
mkdir -p "${NDK_ROOT}"
curl -fsSL "${NDK_URL}" -o "${RUNNER_TEMP}/${NDK_ZIP}"
unzip -q "${RUNNER_TEMP}/${NDK_ZIP}" -d "${NDK_ROOT}"
echo "ANDROID_NDK_HOME=${NDK_HOME}" >> "$GITHUB_ENV"
echo "${NDK_HOME}/toolchains/llvm/prebuilt/linux-x86_64/bin" >> "$GITHUB_PATH"
- name: Configure Android toolchain
if: matrix.android_ndk
shell: bash
run: |
echo "Setting up Android NDK toolchain for ${{ matrix.target }}"
NDK_HOME="${ANDROID_NDK_HOME:-}"
if [[ -z "$NDK_HOME" ]]; then
echo "::error::ANDROID_NDK_HOME was not configured."
exit 1
fi
TOOLCHAIN="$NDK_HOME/toolchains/llvm/prebuilt/linux-x86_64/bin"
# Add to path for linker resolution
echo "$TOOLCHAIN" >> "$GITHUB_PATH"
# Set linker environment variables
if [[ "${{ matrix.target }}" == "armv7-linux-androideabi" ]]; then
ARMV7_CC="${TOOLCHAIN}/armv7a-linux-androideabi${{ matrix.android_api }}-clang"
ARMV7_CXX="${TOOLCHAIN}/armv7a-linux-androideabi${{ matrix.android_api }}-clang++"
# Some crates still probe legacy compiler names (arm-linux-androideabi-clang).
ln -sf "$ARMV7_CC" "${TOOLCHAIN}/arm-linux-androideabi-clang"
ln -sf "$ARMV7_CXX" "${TOOLCHAIN}/arm-linux-androideabi-clang++"
{
echo "CARGO_TARGET_ARMV7_LINUX_ANDROIDEABI_LINKER=${ARMV7_CC}"
echo "CC_armv7_linux_androideabi=${ARMV7_CC}"
echo "CXX_armv7_linux_androideabi=${ARMV7_CXX}"
echo "AR_armv7_linux_androideabi=${TOOLCHAIN}/llvm-ar"
} >> "$GITHUB_ENV"
elif [[ "${{ matrix.target }}" == "aarch64-linux-android" ]]; then
AARCH64_CC="${TOOLCHAIN}/aarch64-linux-android${{ matrix.android_api }}-clang"
AARCH64_CXX="${TOOLCHAIN}/aarch64-linux-android${{ matrix.android_api }}-clang++"
{
echo "CARGO_TARGET_AARCH64_LINUX_ANDROID_LINKER=${AARCH64_CC}"
echo "CC_aarch64_linux_android=${AARCH64_CC}"
echo "CXX_aarch64_linux_android=${AARCH64_CXX}"
echo "AR_aarch64_linux_android=${TOOLCHAIN}/llvm-ar"
} >> "$GITHUB_ENV"
fi
- name: Build release
shell: bash
env:
LINKER_ENV: ${{ matrix.linker_env }}
LINKER: ${{ matrix.linker }}
USE_CROSS: ${{ matrix.use_cross }}
run: |
if [ -n "$LINKER_ENV" ] && [ -n "$LINKER" ]; then
echo "Using linker override: $LINKER_ENV=$LINKER"
export "$LINKER_ENV=$LINKER"
fi
if [ "$USE_CROSS" = "true" ]; then
echo "Using cross for MUSL target"
cross build --profile release-fast --locked --target ${{ matrix.target }}
else
cargo build --profile release-fast --locked --target ${{ matrix.target }}
fi
- name: Check binary size (Unix)
if: runner.os != 'Windows'
env:
BINARY_SIZE_HARD_LIMIT_MB: 28
BINARY_SIZE_ADVISORY_MB: 20
BINARY_SIZE_TARGET_MB: 5
run: bash scripts/ci/check_binary_size.sh "target/${{ matrix.target }}/release-fast/${{ matrix.artifact }}" "${{ matrix.target }}"
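# Threshold semantics are an assumption from the env names (the logic
# lives in scripts/ci/check_binary_size.sh): 28 MB hard-fails the job,
# 20 MB prints a warning, 5 MB is the aspirational target.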
- name: Package (Unix)
if: runner.os != 'Windows'
run: |
cd target/${{ matrix.target }}/release-fast
tar czf ../../../zeroclaw-${{ matrix.target }}.${{ matrix.archive_ext }} ${{ matrix.artifact }}
- name: Package (Windows)
if: runner.os == 'Windows'
run: |
cd target/${{ matrix.target }}/release-fast
7z a ../../../zeroclaw-${{ matrix.target }}.${{ matrix.archive_ext }} ${{ matrix.artifact }}
- name: Upload artifact
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6
with:
name: zeroclaw-${{ matrix.target }}
path: zeroclaw-${{ matrix.target }}.${{ matrix.archive_ext }}
retention-days: 7
verify-artifacts:
name: Verify Artifact Set
needs: [prepare, build-release]
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
ref: ${{ needs.prepare.outputs.release_ref }}
- name: Download all artifacts
uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0
with:
path: artifacts
- name: Validate release archive contract (verify stage)
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/release_artifact_guard.py \
--artifacts-dir artifacts \
--contract-file .github/release/release-artifact-contract.json \
--output-json artifacts/release-artifact-guard.verify.json \
--output-md artifacts/release-artifact-guard.verify.md \
--allow-extra-archives \
--skip-manifest-files \
--skip-sbom-files \
--skip-notice-files \
--fail-on-violation
- name: Emit verify-stage artifact guard audit event
if: always()
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/emit_audit_event.py \
--event-type release_artifact_guard_verify \
--input-json artifacts/release-artifact-guard.verify.json \
--output-json artifacts/audit-event-release-artifact-guard-verify.json \
--artifact-name release-artifact-guard-verify \
--retention-days 21
- name: Publish verify-stage artifact guard summary
if: always()
shell: bash
run: |
set -euo pipefail
cat artifacts/release-artifact-guard.verify.md >> "$GITHUB_STEP_SUMMARY"
- name: Upload verify-stage artifact guard reports
if: always()
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: release-artifact-guard-verify
path: |
artifacts/release-artifact-guard.verify.json
artifacts/release-artifact-guard.verify.md
artifacts/audit-event-release-artifact-guard-verify.json
if-no-files-found: error
retention-days: 21
publish:
name: Publish Release
if: needs.prepare.outputs.publish_release == 'true'
needs: [prepare, verify-artifacts]
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 45
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
ref: ${{ needs.prepare.outputs.release_ref }}
- name: Download all artifacts
uses: actions/download-artifact@37930b1c2abaa49bbe596cd826c3c89aef350131 # v7.0.0
with:
path: artifacts
- name: Install syft
shell: bash
run: |
set -euo pipefail
mkdir -p "${RUNNER_TEMP}/bin"
./scripts/ci/install_syft.sh "${RUNNER_TEMP}/bin"
echo "${RUNNER_TEMP}/bin" >> "$GITHUB_PATH"
- name: Generate SBOM (CycloneDX)
run: |
syft dir:. --source-name zeroclaw -o cyclonedx-json=artifacts/zeroclaw.cdx.json -o spdx-json=artifacts/zeroclaw.spdx.json
{
echo "### SBOM Generated"
echo "- CycloneDX: zeroclaw.cdx.json"
echo "- SPDX: zeroclaw.spdx.json"
} >> "$GITHUB_STEP_SUMMARY"
- name: Attach license and notice files
run: |
cp LICENSE-APACHE artifacts/LICENSE-APACHE
cp LICENSE-MIT artifacts/LICENSE-MIT
cp NOTICE artifacts/NOTICE
- name: Generate release manifest + checksums
shell: bash
env:
RELEASE_TAG: ${{ needs.prepare.outputs.release_tag }}
run: |
set -euo pipefail
python3 scripts/ci/release_manifest.py \
--artifacts-dir artifacts \
--release-tag "${RELEASE_TAG}" \
--output-json artifacts/release-manifest.json \
--output-md artifacts/release-manifest.md \
--checksums-path artifacts/SHA256SUMS \
--fail-empty
- name: Generate SHA256SUMS provenance statement
shell: bash
env:
RELEASE_TAG: ${{ needs.prepare.outputs.release_tag }}
run: |
set -euo pipefail
python3 scripts/ci/generate_provenance.py \
--artifact artifacts/SHA256SUMS \
--subject-name "zeroclaw-${RELEASE_TAG}-sha256sums" \
--output artifacts/zeroclaw.sha256sums.intoto.json
- name: Emit SHA256SUMS provenance audit event
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/emit_audit_event.py \
--event-type release_sha256sums_provenance \
--input-json artifacts/zeroclaw.sha256sums.intoto.json \
--output-json artifacts/audit-event-release-sha256sums-provenance.json \
--artifact-name release-sha256sums-provenance \
--retention-days 30
- name: Validate release artifact contract (publish stage)
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/release_artifact_guard.py \
--artifacts-dir artifacts \
--contract-file .github/release/release-artifact-contract.json \
--output-json artifacts/release-artifact-guard.publish.json \
--output-md artifacts/release-artifact-guard.publish.md \
--allow-extra-archives \
--allow-extra-manifest-files \
--allow-extra-sbom-files \
--allow-extra-notice-files \
--fail-on-violation
- name: Emit publish-stage artifact guard audit event
if: always()
shell: bash
run: |
set -euo pipefail
python3 scripts/ci/emit_audit_event.py \
--event-type release_artifact_guard_publish \
--input-json artifacts/release-artifact-guard.publish.json \
--output-json artifacts/audit-event-release-artifact-guard-publish.json \
--artifact-name release-artifact-guard-publish \
--retention-days 30
- name: Publish artifact guard summary
shell: bash
run: |
set -euo pipefail
cat artifacts/release-artifact-guard.publish.md >> "$GITHUB_STEP_SUMMARY"
- name: Publish release manifest summary
shell: bash
run: |
set -euo pipefail
cat artifacts/release-manifest.md >> "$GITHUB_STEP_SUMMARY"
- name: Install cosign
uses: sigstore/cosign-installer@faadad0cce49287aee09b3a48701e75088a2c6ad # v4.0.0
- name: Sign artifacts with cosign (keyless)
shell: bash
run: |
set -euo pipefail
while IFS= read -r -d '' file; do
cosign sign-blob --yes \
--bundle="${file}.sigstore.json" \
--output-signature="${file}.sig" \
--output-certificate="${file}.pem" \
"$file"
done < <(find artifacts -type f ! -name '*.sig' ! -name '*.pem' ! -name '*.sigstore.json' -print0)
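# A downloader can verify any asset against its bundle, e.g. (sketch;
# the identity regexp is an assumption for this repository's workflows):
#   cosign verify-blob \
#     --bundle zeroclaw-x86_64-unknown-linux-gnu.tar.gz.sigstore.json \
#     --certificate-oidc-issuer https://token.actions.githubusercontent.com \
#     --certificate-identity-regexp 'github.com/zeroclaw-labs/zeroclaw' \
#     zeroclaw-x86_64-unknown-linux-gnu.tar.gz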
- name: Compose release-notes supply-chain references
shell: bash
env:
RELEASE_TAG: ${{ needs.prepare.outputs.release_tag }}
run: |
set -euo pipefail
python3 scripts/ci/release_notes_with_supply_chain_refs.py \
--artifacts-dir artifacts \
--repository "${GITHUB_REPOSITORY}" \
--release-tag "${RELEASE_TAG}" \
--output-json artifacts/release-notes-supply-chain.json \
--output-md artifacts/release-notes-supply-chain.md \
--fail-on-missing
- name: Publish release-notes supply-chain summary
shell: bash
run: |
set -euo pipefail
cat artifacts/release-notes-supply-chain.md >> "$GITHUB_STEP_SUMMARY"
- name: Verify GHCR release tag availability
shell: bash
env:
RELEASE_TAG: ${{ needs.prepare.outputs.release_tag }}
run: |
set -euo pipefail
repo="${GITHUB_REPOSITORY,,}"
manifest_url="https://ghcr.io/v2/${repo}/manifests/${RELEASE_TAG}"
accept_header="application/vnd.oci.image.index.v1+json, application/vnd.docker.distribution.manifest.v2+json"
max_attempts=75
sleep_seconds=20
for attempt in $(seq 1 "$max_attempts"); do
token_resp="$(curl -sS "https://ghcr.io/token?scope=repository:${repo}:pull" || true)"
token="$(echo "$token_resp" | sed -n 's/.*"token":"\([^"]*\)".*/\1/p')"
if [ -z "$token" ]; then
code="000"
else
code="$(curl -sS -o /tmp/ghcr-release-manifest.json -w "%{http_code}" \
-H "Authorization: Bearer ${token}" \
-H "Accept: ${accept_header}" \
"${manifest_url}" || true)"
fi
if [ "$code" = "200" ]; then
echo "GHCR release tag is available: ${repo}:${RELEASE_TAG}"
exit 0
fi
if [ "$attempt" -lt "$max_attempts" ]; then
echo "Waiting for GHCR tag ${repo}:${RELEASE_TAG} (attempt ${attempt}/${max_attempts}, HTTP ${code})..."
sleep "$sleep_seconds"
fi
done
echo "::error::GHCR tag ${repo}:${RELEASE_TAG} was not available before release publish timeout."
cat /tmp/ghcr-release-manifest.json || true
exit 1
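# Polling budget above: 75 attempts x 20s sleep, i.e. up to ~25 minutes
# of waiting for GHCR to expose the tag before the publish fails.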
- name: Create GitHub Release
uses: softprops/action-gh-release@a06a81a03ee405af7f2048a818ed3f03bbf83c7b # v2
with:
tag_name: ${{ needs.prepare.outputs.release_tag }}
draft: ${{ needs.prepare.outputs.draft_release == 'true' }}
body_path: artifacts/release-notes-supply-chain.md
generate_release_notes: true
files: |
artifacts/**/*
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}

.github/workflows/pub-scoop.yml

@@ -0,0 +1,165 @@
name: Pub Scoop Manifest
on:
workflow_call:
inputs:
release_tag:
description: "Existing release tag (vX.Y.Z)"
required: true
type: string
dry_run:
description: "Generate manifest only (no push)"
required: false
default: false
type: boolean
secrets:
SCOOP_BUCKET_TOKEN:
required: false
workflow_dispatch:
inputs:
release_tag:
description: "Existing release tag (vX.Y.Z)"
required: true
type: string
dry_run:
description: "Generate manifest only (no push)"
required: false
default: true
type: boolean
concurrency:
group: scoop-publish-${{ github.run_id }}
cancel-in-progress: false
permissions:
contents: read
jobs:
publish-scoop:
name: Update Scoop Manifest
runs-on: ubuntu-latest
env:
RELEASE_TAG: ${{ inputs.release_tag }}
DRY_RUN: ${{ inputs.dry_run }}
SCOOP_BUCKET_REPO: ${{ vars.SCOOP_BUCKET_REPO }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Validate and compute metadata
id: meta
shell: bash
run: |
set -euo pipefail
if [[ ! "$RELEASE_TAG" =~ ^v[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "::error::release_tag must be vX.Y.Z format."
exit 1
fi
version="${RELEASE_TAG#v}"
zip_url="https://github.com/${GITHUB_REPOSITORY}/releases/download/${RELEASE_TAG}/zeroclaw-x86_64-pc-windows-msvc.zip"
sums_url="https://github.com/${GITHUB_REPOSITORY}/releases/download/${RELEASE_TAG}/SHA256SUMS"
sha256="$(curl -fsSL "$sums_url" | grep 'zeroclaw-x86_64-pc-windows-msvc.zip' | awk '{print $1}')"
if [[ -z "$sha256" ]]; then
echo "::error::Could not find Windows binary hash in SHA256SUMS for ${RELEASE_TAG}."
exit 1
fi
{
echo "version=$version"
echo "zip_url=$zip_url"
echo "sha256=$sha256"
} >> "$GITHUB_OUTPUT"
{
echo "### Scoop Manifest Metadata"
echo "- version: \`${version}\`"
echo "- zip_url: \`${zip_url}\`"
echo "- sha256: \`${sha256}\`"
} >> "$GITHUB_STEP_SUMMARY"
- name: Generate manifest
id: manifest
shell: bash
env:
VERSION: ${{ steps.meta.outputs.version }}
ZIP_URL: ${{ steps.meta.outputs.zip_url }}
SHA256: ${{ steps.meta.outputs.sha256 }}
run: |
set -euo pipefail
manifest_file="$(mktemp)"
cat > "$manifest_file" <<MANIFEST
{
"version": "${VERSION}",
"description": "Zero overhead. Zero compromise. 100% Rust. The fastest, smallest AI assistant.",
"homepage": "https://github.com/zeroclaw-labs/zeroclaw",
"license": "MIT|Apache-2.0",
"architecture": {
"64bit": {
"url": "${ZIP_URL}",
"hash": "${SHA256}",
"bin": "zeroclaw.exe"
}
},
"checkver": {
"github": "https://github.com/zeroclaw-labs/zeroclaw"
},
"autoupdate": {
"architecture": {
"64bit": {
"url": "https://github.com/zeroclaw-labs/zeroclaw/releases/download/v\$version/zeroclaw-x86_64-pc-windows-msvc.zip"
}
},
"hash": {
"url": "https://github.com/zeroclaw-labs/zeroclaw/releases/download/v\$version/SHA256SUMS",
"regex": "([a-f0-9]{64})\\\\s+zeroclaw-x86_64-pc-windows-msvc\\\\.zip"
}
}
}
MANIFEST
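# Escaping note: in this unquoted heredoc bash halves each "\\", so the
# "\\\\s" above reaches the JSON file as "\\s" (the regex token \s once
# JSON-decoded), while "v\$version" preserves Scoop's literal $version
# autoupdate placeholder.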
jq '.' "$manifest_file" > "${manifest_file}.formatted"
mv "${manifest_file}.formatted" "$manifest_file"
echo "manifest_file=$manifest_file" >> "$GITHUB_OUTPUT"
echo "### Generated Manifest" >> "$GITHUB_STEP_SUMMARY"
echo '```json' >> "$GITHUB_STEP_SUMMARY"
cat "$manifest_file" >> "$GITHUB_STEP_SUMMARY"
echo '```' >> "$GITHUB_STEP_SUMMARY"
- name: Push to Scoop bucket
if: inputs.dry_run == false
shell: bash
env:
GH_TOKEN: ${{ secrets.SCOOP_BUCKET_TOKEN }}
MANIFEST_FILE: ${{ steps.manifest.outputs.manifest_file }}
VERSION: ${{ steps.meta.outputs.version }}
run: |
set -euo pipefail
if [[ -z "${SCOOP_BUCKET_REPO}" ]]; then
echo "::error::Repository variable SCOOP_BUCKET_REPO is required (e.g. zeroclaw-labs/scoop-zeroclaw)."
exit 1
fi
tmp_dir="$(mktemp -d)"
gh repo clone "${SCOOP_BUCKET_REPO}" "$tmp_dir/bucket" -- --depth=1
mkdir -p "$tmp_dir/bucket/bucket"
cp "$MANIFEST_FILE" "$tmp_dir/bucket/bucket/zeroclaw.json"
cd "$tmp_dir/bucket"
git config user.name "zeroclaw-bot"
git config user.email "bot@zeroclaw.dev"
git add bucket/zeroclaw.json
git commit -m "zeroclaw ${VERSION}"
gh auth setup-git
git push origin HEAD
echo "Scoop manifest updated to ${VERSION}"


@@ -0,0 +1,160 @@
name: Auto-sync crates.io
on:
push:
branches: [master]
paths:
- "Cargo.toml"
concurrency:
group: publish-crates-auto
cancel-in-progress: false
permissions:
contents: read
env:
CARGO_TERM_COLOR: always
jobs:
detect-version-change:
name: Detect Version Bump
if: github.repository == 'zeroclaw-labs/zeroclaw'
runs-on: ubuntu-latest
outputs:
changed: ${{ steps.check.outputs.changed }}
version: ${{ steps.check.outputs.version }}
steps:
- uses: actions/checkout@v4
with:
fetch-depth: 2
- name: Check if version changed
id: check
shell: bash
run: |
set -euo pipefail
current=$(sed -n 's/^version = "\([^"]*\)"/\1/p' Cargo.toml | head -1)
previous=$(git show HEAD~1:Cargo.toml 2>/dev/null | sed -n 's/^version = "\([^"]*\)"/\1/p' | head -1 || echo "")
echo "Current version: ${current}"
echo "Previous version: ${previous}"
# Skip if stable release workflow will handle this version
# (indicated by an existing or imminent stable tag)
if git ls-remote --exit-code --tags origin "refs/tags/v${current}" >/dev/null 2>&1; then
echo "Stable tag v${current} exists — stable release workflow handles crates.io"
echo "changed=false" >> "$GITHUB_OUTPUT"
exit 0
fi
if [[ "$current" != "$previous" && -n "$current" ]]; then
echo "changed=true" >> "$GITHUB_OUTPUT"
echo "version=${current}" >> "$GITHUB_OUTPUT"
echo "Version bumped from ${previous} to ${current} — will publish"
else
echo "changed=false" >> "$GITHUB_OUTPUT"
echo "Version unchanged (${current}) — skipping publish"
fi
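# Example (hypothetical versions): previous=0.6.0, current=0.6.1, and no
# v0.6.1 tag on origin => changed=true, and the jobs below publish 0.6.1.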
check-registry:
name: Check if Already Published
needs: [detect-version-change]
if: needs.detect-version-change.outputs.changed == 'true'
runs-on: ubuntu-latest
outputs:
should_publish: ${{ steps.check.outputs.should_publish }}
steps:
- name: Check crates.io for existing version
id: check
shell: bash
env:
VERSION: ${{ needs.detect-version-change.outputs.version }}
run: |
set -euo pipefail
status=$(curl -s -o /dev/null -w "%{http_code}" \
"https://crates.io/api/v1/crates/zeroclawlabs/${VERSION}")
if [[ "$status" == "200" ]]; then
echo "Version ${VERSION} already exists on crates.io — skipping"
echo "should_publish=false" >> "$GITHUB_OUTPUT"
else
echo "Version ${VERSION} not yet published — proceeding"
echo "should_publish=true" >> "$GITHUB_OUTPUT"
fi
publish:
name: Publish to crates.io
needs: [detect-version-change, check-registry]
if: needs.check-registry.outputs.should_publish == 'true'
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@v2
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
cache-dependency-path: web/package-lock.json
- name: Build web dashboard
run: cd web && npm ci && npm run build
- name: Clean web build artifacts
run: rm -rf web/node_modules web/src web/package.json web/package-lock.json web/tsconfig*.json web/vite.config.ts web/index.html
- name: Publish aardvark-sys to crates.io
shell: bash
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
run: |
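# Publish idiom: if cargo publish succeeds, "&& exit 0" ends the step
# immediately; otherwise we fall through, print the captured output, and
# treat only "already exists" (version previously published) as success.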
OUTPUT=$(cargo publish --locked --allow-dirty --no-verify -p aardvark-sys 2>&1) && exit 0
echo "$OUTPUT"
if echo "$OUTPUT" | grep -q 'already exists'; then
echo "::notice::aardvark-sys already on crates.io — skipping"
exit 0
fi
exit 1
- name: Wait for aardvark-sys to index
run: sleep 15
- name: Publish to crates.io
shell: bash
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
VERSION: ${{ needs.detect-version-change.outputs.version }}
run: |
# Publish to crates.io; treat "already exists" as success
# (manual publish or stable workflow may have already published)
OUTPUT=$(cargo publish --locked --allow-dirty --no-verify 2>&1) && exit 0
echo "$OUTPUT"
if echo "$OUTPUT" | grep -q 'already exists'; then
echo "::notice::zeroclawlabs@${VERSION} already on crates.io — skipping"
exit 0
fi
exit 1
- name: Verify published
shell: bash
env:
VERSION: ${{ needs.detect-version-change.outputs.version }}
run: |
echo "Waiting for crates.io to index..."
sleep 15
status=$(curl -s -o /dev/null -w "%{http_code}" \
"https://crates.io/api/v1/crates/zeroclawlabs/${VERSION}")
if [[ "$status" == "200" ]]; then
echo "zeroclawlabs v${VERSION} is live on crates.io"
echo "Install: cargo install zeroclawlabs"
else
echo "::warning::Version may still be indexing — check https://crates.io/crates/zeroclawlabs"
fi

.github/workflows/publish-crates.yml

@@ -0,0 +1,108 @@
name: Publish to crates.io
on:
workflow_dispatch:
inputs:
version:
description: "Version to publish (e.g. 0.2.0) — must match Cargo.toml"
required: true
type: string
dry_run:
description: "Dry run (validate without publishing)"
required: false
type: boolean
default: false
concurrency:
group: publish-crates
cancel-in-progress: false
permissions:
contents: read
env:
CARGO_TERM_COLOR: always
jobs:
validate:
name: Validate
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v4
- name: Check version matches Cargo.toml
shell: bash
env:
INPUT_VERSION: ${{ inputs.version }}
run: |
set -euo pipefail
cargo_version=$(sed -n 's/^version = "\([^"]*\)"/\1/p' Cargo.toml | head -1)
if [[ "$cargo_version" != "$INPUT_VERSION" ]]; then
echo "::error::Cargo.toml version (${cargo_version}) does not match input (${INPUT_VERSION})"
exit 1
fi
publish:
name: Publish to crates.io
needs: [validate]
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@v2
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
cache-dependency-path: web/package-lock.json
- name: Build web dashboard
run: cd web && npm ci && npm run build
- name: Clean web build artifacts
run: rm -rf web/node_modules web/src web/package.json web/package-lock.json web/tsconfig*.json web/vite.config.ts web/index.html
- name: Publish aardvark-sys to crates.io
if: "!inputs.dry_run"
shell: bash
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
run: |
OUTPUT=$(cargo publish --locked --allow-dirty --no-verify -p aardvark-sys 2>&1) && exit 0
echo "$OUTPUT"
if echo "$OUTPUT" | grep -q 'already exists'; then
echo "::notice::aardvark-sys already on crates.io — skipping"
exit 0
fi
exit 1
- name: Wait for aardvark-sys to index
if: "!inputs.dry_run"
run: sleep 15
- name: Publish (dry run)
if: inputs.dry_run
run: cargo publish --dry-run --locked --allow-dirty --no-verify
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
- name: Publish to crates.io
if: "!inputs.dry_run"
shell: bash
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
VERSION: ${{ inputs.version }}
run: |
# Publish to crates.io; treat "already exists" as success
OUTPUT=$(cargo publish --locked --allow-dirty --no-verify 2>&1) && exit 0
echo "$OUTPUT"
if echo "$OUTPUT" | grep -q 'already exists'; then
echo "::notice::zeroclawlabs@${VERSION} already on crates.io — skipping"
exit 0
fi
exit 1


@@ -0,0 +1,458 @@
name: Release Beta
on:
push:
branches: [master]
concurrency:
group: release-beta
cancel-in-progress: true
permissions:
contents: write
packages: write
env:
CARGO_TERM_COLOR: always
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
RELEASE_CARGO_FEATURES: channel-matrix,channel-lark,memory-postgres
jobs:
version:
name: Resolve Version
if: github.repository == 'zeroclaw-labs/zeroclaw'
runs-on: ubuntu-latest
outputs:
version: ${{ steps.ver.outputs.version }}
tag: ${{ steps.ver.outputs.tag }}
skip: ${{ steps.ver.outputs.skip }}
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 2
- name: Compute beta version
id: ver
shell: bash
run: |
set -euo pipefail
base_version=$(sed -n 's/^version = "\([^"]*\)"/\1/p' Cargo.toml | head -1)
# Skip beta if this is a version bump commit (stable release handles it)
commit_msg=$(git log -1 --pretty=format:"%s")
if [[ "$commit_msg" =~ ^chore:\ bump\ version ]]; then
echo "Version bump commit detected — skipping beta release"
echo "skip=true" >> "$GITHUB_OUTPUT"
exit 0
fi
# Skip beta if a stable tag already exists for this version
if git ls-remote --exit-code --tags origin "refs/tags/v${base_version}" >/dev/null 2>&1; then
echo "Stable tag v${base_version} exists — skipping beta release"
echo "skip=true" >> "$GITHUB_OUTPUT"
exit 0
fi
beta_tag="v${base_version}-beta.${GITHUB_RUN_NUMBER}"
echo "version=${base_version}" >> "$GITHUB_OUTPUT"
echo "tag=${beta_tag}" >> "$GITHUB_OUTPUT"
echo "skip=false" >> "$GITHUB_OUTPUT"
echo "Beta release: ${beta_tag}"
release-notes:
name: Generate Release Notes
needs: [version]
if: github.repository == 'zeroclaw-labs/zeroclaw' && needs.version.outputs.skip != 'true'
runs-on: ubuntu-latest
outputs:
notes: ${{ steps.notes.outputs.body }}
features: ${{ steps.notes.outputs.features }}
contributors: ${{ steps.notes.outputs.contributors }}
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- name: Build release notes
id: notes
shell: bash
run: |
set -euo pipefail
# Use a wider range — find the previous stable tag to capture all
# contributors across the full release cycle, not just one beta bump
PREV_TAG=$(git tag --sort=-creatordate \
| grep -vE '\-(alpha|beta|rc)\.' \
| head -1 || echo "")
if [ -z "$PREV_TAG" ]; then
RANGE="HEAD"
else
RANGE="${PREV_TAG}..HEAD"
fi
# Extract features only (feat commits) — skip bug fixes for clean notes
FEATURES=$(git log "$RANGE" --pretty=format:"%s" --no-merges \
| grep -iE '^feat(\(|:)' \
| sed 's/^feat(\([^)]*\)): /\1: /' \
| sed 's/^feat: //' \
| sed 's/ (#[0-9]*)$//' \
| sort -uf \
| while IFS= read -r line; do echo "- ${line}"; done || true)
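# Example (hypothetical commit): the subject
# "feat(cli): add status command (#1234)" passes the grep, and the sed
# chain rewrites it to the bullet "- cli: add status command".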
if [ -z "$FEATURES" ]; then
FEATURES="- Incremental improvements and polish"
fi
# Collect ALL unique contributors: git authors + Co-Authored-By
GIT_AUTHORS=$(git log "$RANGE" --pretty=format:"%an" --no-merges | sort -uf || true)
CO_AUTHORS=$(git log "$RANGE" --pretty=format:"%b" --no-merges \
| grep -ioE 'Co-Authored-By: *[^<]+' \
| sed 's/Co-Authored-By: *//i' \
| sed 's/ *$//' \
| sort -uf || true)
# Merge, deduplicate, and filter out bots
ALL_CONTRIBUTORS=$(printf "%s\n%s" "$GIT_AUTHORS" "$CO_AUTHORS" \
| sort -uf \
| grep -v '^$' \
| grep -viE '\[bot\]$|^dependabot|^github-actions|^copilot|^ZeroClaw Bot|^ZeroClaw Runner|^ZeroClaw Agent|^blacksmith' \
| while IFS= read -r name; do echo "- ${name}"; done || true)
# Build release body
BODY=$(cat <<NOTES_EOF
## What's New
${FEATURES}
## Contributors
${ALL_CONTRIBUTORS}
---
*Full changelog: ${PREV_TAG}...HEAD*
NOTES_EOF
)
# Output multiline values
{
echo "body<<BODY_EOF"
echo "$BODY"
echo "BODY_EOF"
} >> "$GITHUB_OUTPUT"
{
echo "features<<FEAT_EOF"
echo "$FEATURES"
echo "FEAT_EOF"
} >> "$GITHUB_OUTPUT"
{
echo "contributors<<CONTRIB_EOF"
echo "$ALL_CONTRIBUTORS"
echo "CONTRIB_EOF"
} >> "$GITHUB_OUTPUT"
web:
name: Build Web Dashboard
needs: [version]
if: github.repository == 'zeroclaw-labs/zeroclaw' && needs.version.outputs.skip != 'true'
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
cache-dependency-path: web/package-lock.json
- name: Build web dashboard
run: cd web && npm ci && npm run build
- uses: actions/upload-artifact@v4
with:
name: web-dist
path: web/dist/
retention-days: 1
build:
name: Build ${{ matrix.target }}
needs: [version, web]
runs-on: ${{ matrix.os }}
timeout-minutes: 40
strategy:
fail-fast: false
matrix:
include:
# Use ubuntu-22.04 for Linux builds to link against glibc 2.35,
# ensuring compatibility with Ubuntu 22.04+ (#3573).
- os: ubuntu-22.04
target: x86_64-unknown-linux-gnu
artifact: zeroclaw
ext: tar.gz
- os: ubuntu-22.04
target: aarch64-unknown-linux-gnu
artifact: zeroclaw
ext: tar.gz
cross_compiler: gcc-aarch64-linux-gnu
linker_env: CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER
linker: aarch64-linux-gnu-gcc
- os: ubuntu-22.04
target: armv7-unknown-linux-gnueabihf
artifact: zeroclaw
ext: tar.gz
cross_compiler: gcc-arm-linux-gnueabihf
linker_env: CARGO_TARGET_ARMV7_UNKNOWN_LINUX_GNUEABIHF_LINKER
linker: arm-linux-gnueabihf-gcc
- os: macos-14
target: aarch64-apple-darwin
artifact: zeroclaw
ext: tar.gz
- os: ubuntu-latest
target: aarch64-linux-android
artifact: zeroclaw
ext: tar.gz
ndk: true
- os: windows-latest
target: x86_64-pc-windows-msvc
artifact: zeroclaw.exe
ext: zip
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
targets: ${{ matrix.target }}
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
if: runner.os != 'Windows'
with:
prefix-key: ${{ matrix.os }}-${{ matrix.target }}
- uses: actions/download-artifact@v4
with:
name: web-dist
path: web/dist/
- name: Install cross compiler
if: matrix.cross_compiler
run: |
sudo apt-get update -qq
sudo apt-get install -y ${{ matrix.cross_compiler }}
- name: Setup Android NDK
if: matrix.ndk
run: echo "$ANDROID_NDK/toolchains/llvm/prebuilt/linux-x86_64/bin" >> "$GITHUB_PATH"
- name: Build release
shell: bash
run: |
if [ -n "${{ matrix.linker_env || '' }}" ] && [ -n "${{ matrix.linker || '' }}" ]; then
export "${{ matrix.linker_env }}=${{ matrix.linker }}"
fi
cargo build --release --locked --features "${{ env.RELEASE_CARGO_FEATURES }}" --target ${{ matrix.target }}
- name: Package (Unix)
if: runner.os != 'Windows'
run: |
cd target/${{ matrix.target }}/release
tar czf ../../../zeroclaw-${{ matrix.target }}.${{ matrix.ext }} ${{ matrix.artifact }}
- name: Package (Windows)
if: runner.os == 'Windows'
run: |
cd target/${{ matrix.target }}/release
7z a ../../../zeroclaw-${{ matrix.target }}.${{ matrix.ext }} ${{ matrix.artifact }}
- uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
with:
name: zeroclaw-${{ matrix.target }}
path: zeroclaw-${{ matrix.target }}.${{ matrix.ext }}
retention-days: 7
build-desktop:
name: Build Desktop App (macOS Universal)
needs: [version]
if: needs.version.outputs.skip != 'true'
runs-on: macos-14
timeout-minutes: 40
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
targets: aarch64-apple-darwin,x86_64-apple-darwin
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
with:
prefix-key: macos-tauri
- uses: actions/setup-node@v4
with:
node-version: 22
- name: Install Tauri CLI
run: cargo install tauri-cli --locked
- name: Sync Tauri version with Cargo.toml
shell: bash
run: |
VERSION=$(sed -n 's/^version = "\([^"]*\)"/\1/p' Cargo.toml | head -1)
cd apps/tauri
if command -v jq >/dev/null 2>&1; then
jq --arg v "$VERSION" '.version = $v' tauri.conf.json > tmp.json && mv tmp.json tauri.conf.json
else
sed -i '' "s/\"version\": \"[^\"]*\"/\"version\": \"$VERSION\"/" tauri.conf.json
fi
echo "Tauri version set to: $VERSION"
- name: Build Tauri app (universal binary)
working-directory: apps/tauri
run: cargo tauri build --target universal-apple-darwin
- name: Prepare desktop release assets
run: |
mkdir -p desktop-assets
find target -name '*.dmg' -exec cp {} desktop-assets/ZeroClaw.dmg \; 2>/dev/null || true
find target -name '*.app.tar.gz' -exec cp {} desktop-assets/ZeroClaw-macos.app.tar.gz \; 2>/dev/null || true
find target -name '*.app.tar.gz.sig' -exec cp {} desktop-assets/ZeroClaw-macos.app.tar.gz.sig \; 2>/dev/null || true
echo "--- Desktop assets ---"
ls -lh desktop-assets/
- uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
with:
name: desktop-macos
path: desktop-assets/*
retention-days: 7
publish:
name: Publish Beta Release
needs: [version, release-notes, build, build-desktop]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
pattern: zeroclaw-*
path: artifacts
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: desktop-macos
path: artifacts/desktop-macos
- name: Generate checksums
run: |
cd artifacts
find . -type f \( -name '*.tar.gz' -o -name '*.zip' -o -name '*.dmg' \) -exec sha256sum {} + | sed 's| \./[^/]*/| |' > SHA256SUMS
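# The sed strips the "./<artifact-dir>/" prefix emitted by find, so
# SHA256SUMS lists bare asset file names as they appear on the release.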
cat SHA256SUMS
- name: Collect release assets
run: |
mkdir -p release-assets
find artifacts -type f \( -name '*.tar.gz' -o -name '*.zip' -o -name '*.dmg' -o -name 'SHA256SUMS' \) -exec cp {} release-assets/ \;
cp install.sh release-assets/
echo "--- Assets ---"
ls -lh release-assets/
- name: Write release notes
env:
NOTES: ${{ needs.release-notes.outputs.notes }}
run: printf '%s\n' "$NOTES" > release-notes.md
- name: Create GitHub Release
env:
GH_TOKEN: ${{ secrets.RELEASE_TOKEN }}
TAG: ${{ needs.version.outputs.tag }}
run: |
gh release create "$TAG" release-assets/* \
--repo "${{ github.repository }}" \
--title "$TAG" \
--notes-file release-notes.md \
--prerelease
redeploy-website:
name: Trigger Website Redeploy
needs: [publish]
runs-on: ubuntu-latest
steps:
- name: Trigger website redeploy
env:
PAT: ${{ secrets.WEBSITE_REPO_PAT }}
run: |
curl -fsSL -X POST \
-H "Authorization: token $PAT" \
-H "Accept: application/vnd.github+json" \
https://api.github.com/repos/zeroclaw-labs/zeroclaw-website/dispatches \
-d '{"event_type":"new-release","client_payload":{"install_script_url":"https://raw.githubusercontent.com/zeroclaw-labs/zeroclaw/master/install.sh"}}'
docker:
name: Push Docker Image
needs: [version, build]
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: zeroclaw-x86_64-unknown-linux-gnu
path: artifacts/
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: zeroclaw-aarch64-unknown-linux-gnu
path: artifacts/
- name: Prepare Docker context with pre-built binaries
run: |
mkdir -p docker-ctx/bin/amd64 docker-ctx/bin/arm64
tar xzf artifacts/zeroclaw-x86_64-unknown-linux-gnu.tar.gz -C docker-ctx/bin/amd64
tar xzf artifacts/zeroclaw-aarch64-unknown-linux-gnu.tar.gz -C docker-ctx/bin/arm64
mkdir -p docker-ctx/zeroclaw-data/.zeroclaw docker-ctx/zeroclaw-data/workspace
printf '%s\n' \
'workspace_dir = "/zeroclaw-data/workspace"' \
'config_path = "/zeroclaw-data/.zeroclaw/config.toml"' \
'api_key = ""' \
'default_provider = "openrouter"' \
'default_model = "anthropic/claude-sonnet-4-20250514"' \
'default_temperature = 0.7' \
'' \
'[gateway]' \
'port = 42617' \
'host = "[::]"' \
'allow_public_bind = true' \
> docker-ctx/zeroclaw-data/.zeroclaw/config.toml
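# host "[::]" binds the IPv6 wildcard (dual-stack on typical Docker
# hosts); allow_public_bind, by its name (an assumption about zeroclaw's
# config semantics), opts in to listening beyond loopback.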
cp Dockerfile.ci docker-ctx/Dockerfile
cp Dockerfile.debian.ci docker-ctx/Dockerfile.debian
- uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
- uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push
uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
with:
context: docker-ctx
push: true
tags: |
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.version.outputs.tag }}
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:beta
platforms: linux/amd64,linux/arm64
- name: Build and push Debian compatibility image
uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
with:
context: docker-ctx
file: docker-ctx/Dockerfile.debian
push: true
tags: |
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.version.outputs.tag }}-debian
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:beta-debian
platforms: linux/amd64,linux/arm64
# Tweet removed — only stable releases should tweet (see tweet-release.yml).


@@ -1,102 +0,0 @@
name: Production Release Build
on:
push:
branches: ["main"]
tags: ["v*"]
workflow_dispatch:
concurrency:
group: production-release-build-${{ github.ref || github.run_id }}
cancel-in-progress: false
permissions:
contents: read
env:
GIT_CONFIG_COUNT: "1"
GIT_CONFIG_KEY_0: core.hooksPath
GIT_CONFIG_VALUE_0: /dev/null
CARGO_TERM_COLOR: always
jobs:
build-and-test:
name: Build and Test (Linux x86_64)
runs-on: [self-hosted, Linux, X64, aws-india, blacksmith-2vcpu-ubuntu-2404]
timeout-minutes: 120
steps:
- name: Checkout
uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Ensure C toolchain
shell: bash
run: bash ./scripts/ci/ensure_c_toolchain.sh
- name: Self-heal Rust toolchain cache
shell: bash
run: ./scripts/ci/self_heal_rust_toolchain.sh 1.92.0
- name: Setup Rust
uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
components: rustfmt, clippy
- name: Ensure C toolchain for Rust builds
shell: bash
run: ./scripts/ci/ensure_cc.sh
- name: Ensure cargo component
shell: bash
env:
ENSURE_CARGO_COMPONENT_STRICT: "true"
run: bash ./scripts/ci/ensure_cargo_component.sh 1.92.0
- name: Ensure rustfmt and clippy components
shell: bash
run: rustup component add rustfmt clippy --toolchain 1.92.0
- name: Activate toolchain binaries on PATH
shell: bash
run: |
set -euo pipefail
toolchain_bin="$(dirname "$(rustup which --toolchain 1.92.0 cargo)")"
echo "$toolchain_bin" >> "$GITHUB_PATH"
- name: Cache Cargo registry and target
uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
with:
prefix-key: production-release-build
shared-key: ${{ runner.os }}-${{ hashFiles('Cargo.lock') }}
cache-targets: true
cache-bin: false
- name: Rust quality gates
shell: bash
run: |
set -euo pipefail
./scripts/ci/rust_quality_gate.sh
cargo test --locked --lib --bins --verbose
- name: Build production binary (canonical)
shell: bash
run: cargo build --release --locked
- name: Prepare artifact bundle
shell: bash
run: |
set -euo pipefail
mkdir -p artifacts
cp target/release/zeroclaw artifacts/zeroclaw
sha256sum artifacts/zeroclaw > artifacts/zeroclaw.sha256
- name: Upload production artifact
uses: actions/upload-artifact@b7c566a772e6b6bfb58ed0dc250532a479d7789f # v6.0.0
with:
name: zeroclaw-linux-amd64
path: |
artifacts/zeroclaw
artifacts/zeroclaw.sha256
if-no-files-found: error
retention-days: 21


@@ -0,0 +1,570 @@
name: Release Stable
on:
push:
tags:
- "v[0-9]+.[0-9]+.[0-9]+" # stable tags only (no -beta suffix)
workflow_dispatch:
inputs:
version:
description: "Stable version to release (e.g. 0.2.0)"
required: true
type: string
concurrency:
group: promote-release
cancel-in-progress: false
permissions:
contents: write
packages: write
env:
CARGO_TERM_COLOR: always
REGISTRY: ghcr.io
IMAGE_NAME: ${{ github.repository }}
RELEASE_CARGO_FEATURES: channel-matrix,channel-lark,memory-postgres
jobs:
validate:
name: Validate Version
runs-on: ubuntu-latest
outputs:
tag: ${{ steps.check.outputs.tag }}
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- name: Validate semver and Cargo.toml match
id: check
shell: bash
env:
INPUT_VERSION: ${{ inputs.version || '' }}
REF_NAME: ${{ github.ref_name }}
EVENT_NAME: ${{ github.event_name }}
run: |
set -euo pipefail
cargo_version=$(sed -n 's/^version = "\([^"]*\)"/\1/p' Cargo.toml | head -1)
# Resolve version from tag push or manual input
if [[ "$EVENT_NAME" == "push" ]]; then
# Tag push: extract version from tag name (v0.5.9 -> 0.5.9)
input_version="${REF_NAME#v}"
else
input_version="$INPUT_VERSION"
fi
if [[ ! "$input_version" =~ ^[0-9]+\.[0-9]+\.[0-9]+$ ]]; then
echo "::error::Version must be semver (X.Y.Z). Got: ${input_version}"
exit 1
fi
if [[ "$cargo_version" != "$input_version" ]]; then
echo "::error::Cargo.toml version (${cargo_version}) does not match input (${input_version}). Bump Cargo.toml first."
exit 1
fi
tag="v${input_version}"
# Only check tag existence for manual dispatch (tag push means it already exists)
if [[ "$EVENT_NAME" != "push" ]]; then
if git ls-remote --exit-code --tags origin "refs/tags/${tag}" >/dev/null 2>&1; then
echo "::error::Tag ${tag} already exists."
exit 1
fi
fi
echo "tag=${tag}" >> "$GITHUB_OUTPUT"
web:
name: Build Web Dashboard
runs-on: ubuntu-latest
timeout-minutes: 10
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
cache-dependency-path: web/package-lock.json
- name: Build web dashboard
run: cd web && npm ci && npm run build
- uses: actions/upload-artifact@v4
with:
name: web-dist
path: web/dist/
retention-days: 1
release-notes:
name: Generate Release Notes
runs-on: ubuntu-latest
outputs:
notes: ${{ steps.notes.outputs.body }}
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
with:
fetch-depth: 0
- name: Build release notes
id: notes
shell: bash
env:
INPUT_VERSION: ${{ inputs.version }}
run: |
set -euo pipefail
# Resolve the release version: manual dispatch input, else the pushed tag name (v0.6.1 -> 0.6.1).
# On tag pushes inputs.version is empty, so falling back to GITHUB_REF_NAME is required here.
VERSION="${INPUT_VERSION:-${GITHUB_REF_NAME#v}}"
# Find the previous stable tag (exclude beta tags and the current release tag)
PREV_TAG=$(git tag --sort=-creatordate | grep -vE '\-beta\.' | grep -v "^v${VERSION}$" | head -1 || echo "")
if [ -z "$PREV_TAG" ]; then
RANGE="HEAD"
else
RANGE="${PREV_TAG}..HEAD"
fi
# Extract features only — skip bug fixes for clean release notes
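# e.g. "feat(gateway): add pairing mode (#4321)" becomes "- gateway: add pairing mode"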
FEATURES=$(git log "$RANGE" --pretty=format:"%s" --no-merges \
| grep -iE '^feat(\(|:)' \
| sed 's/^feat(\([^)]*\)): /\1: /' \
| sed 's/^feat: //' \
| sed 's/ (#[0-9]*)$//' \
| sort -uf \
| while IFS= read -r line; do echo "- ${line}"; done || true)
if [ -z "$FEATURES" ]; then
FEATURES="- Incremental improvements and polish"
fi
# Collect ALL unique contributors: git authors + Co-Authored-By
GIT_AUTHORS=$(git log "$RANGE" --pretty=format:"%an" --no-merges | sort -uf || true)
CO_AUTHORS=$(git log "$RANGE" --pretty=format:"%b" --no-merges \
| grep -ioE 'Co-Authored-By: *[^<]+' \
| sed 's/Co-Authored-By: *//i' \
| sed 's/ *$//' \
| sort -uf || true)
# Merge, deduplicate, and filter out bots
ALL_CONTRIBUTORS=$(printf "%s\n%s" "$GIT_AUTHORS" "$CO_AUTHORS" \
| sort -uf \
| grep -v '^$' \
| grep -viE '\[bot\]$|^dependabot|^github-actions|^copilot|^ZeroClaw Bot|^ZeroClaw Runner|^ZeroClaw Agent|^blacksmith' \
| while IFS= read -r name; do echo "- ${name}"; done || true)
BODY=$(cat <<NOTES_EOF
## What's New
${FEATURES}
## Contributors
${ALL_CONTRIBUTORS}
---
*Full changelog: ${PREV_TAG}...v${VERSION}*
NOTES_EOF
)
{
echo "body<<BODY_EOF"
echo "$BODY"
echo "BODY_EOF"
} >> "$GITHUB_OUTPUT"
build:
name: Build ${{ matrix.target }}
needs: [validate, web]
runs-on: ${{ matrix.os }}
timeout-minutes: 40
strategy:
fail-fast: false
matrix:
include:
# Use ubuntu-22.04 for Linux builds to link against glibc 2.35,
# ensuring compatibility with Ubuntu 22.04+ (#3573).
- os: ubuntu-22.04
target: x86_64-unknown-linux-gnu
artifact: zeroclaw
ext: tar.gz
- os: ubuntu-22.04
target: aarch64-unknown-linux-gnu
artifact: zeroclaw
ext: tar.gz
cross_compiler: gcc-aarch64-linux-gnu
linker_env: CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER
linker: aarch64-linux-gnu-gcc
- os: ubuntu-22.04
target: armv7-unknown-linux-gnueabihf
artifact: zeroclaw
ext: tar.gz
cross_compiler: gcc-arm-linux-gnueabihf
linker_env: CARGO_TARGET_ARMV7_UNKNOWN_LINUX_GNUEABIHF_LINKER
linker: arm-linux-gnueabihf-gcc
skip_prometheus: true
- os: ubuntu-22.04
target: arm-unknown-linux-gnueabihf
artifact: zeroclaw
ext: tar.gz
cross_compiler: gcc-arm-linux-gnueabihf
linker_env: CARGO_TARGET_ARM_UNKNOWN_LINUX_GNUEABIHF_LINKER
linker: arm-linux-gnueabihf-gcc
skip_prometheus: true
- os: macos-14
target: aarch64-apple-darwin
artifact: zeroclaw
ext: tar.gz
- os: ubuntu-latest
target: aarch64-linux-android
artifact: zeroclaw
ext: tar.gz
ndk: true
- os: windows-latest
target: x86_64-pc-windows-msvc
artifact: zeroclaw.exe
ext: zip
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
targets: ${{ matrix.target }}
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
if: runner.os != 'Windows'
with:
prefix-key: ${{ matrix.os }}-${{ matrix.target }}
- uses: actions/download-artifact@v4
with:
name: web-dist
path: web/dist/
- name: Install cross compiler
if: matrix.cross_compiler
run: |
sudo apt-get update -qq
sudo apt-get install -y ${{ matrix.cross_compiler }}
- name: Setup Android NDK
if: matrix.ndk
run: echo "$ANDROID_NDK/toolchains/llvm/prebuilt/linux-x86_64/bin" >> "$GITHUB_PATH"
- name: Build release
shell: bash
run: |
if [ -n "${{ matrix.linker_env || '' }}" ] && [ -n "${{ matrix.linker || '' }}" ]; then
export "${{ matrix.linker_env }}=${{ matrix.linker }}"
fi
if [ "${{ matrix.skip_prometheus || 'false' }}" = "true" ]; then
cargo build --release --locked --no-default-features --features "${{ env.RELEASE_CARGO_FEATURES }},channel-nostr,skill-creation" --target ${{ matrix.target }}
else
cargo build --release --locked --features "${{ env.RELEASE_CARGO_FEATURES }}" --target ${{ matrix.target }}
fi
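# For armv7 (skip_prometheus=true) the build line expands to, roughly:
#   CARGO_TARGET_ARMV7_UNKNOWN_LINUX_GNUEABIHF_LINKER=arm-linux-gnueabihf-gcc \
#   cargo build --release --locked --no-default-features \
#     --features "channel-matrix,channel-lark,memory-postgres,channel-nostr,skill-creation" \
#     --target armv7-unknown-linux-gnueabihf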
- name: Package (Unix)
if: runner.os != 'Windows'
run: |
cd target/${{ matrix.target }}/release
tar czf ../../../zeroclaw-${{ matrix.target }}.${{ matrix.ext }} ${{ matrix.artifact }}
- name: Package (Windows)
if: runner.os == 'Windows'
run: |
cd target/${{ matrix.target }}/release
7z a ../../../zeroclaw-${{ matrix.target }}.${{ matrix.ext }} ${{ matrix.artifact }}
- uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
with:
name: zeroclaw-${{ matrix.target }}
path: zeroclaw-${{ matrix.target }}.${{ matrix.ext }}
retention-days: 14
build-desktop:
name: Build Desktop App (macOS Universal)
needs: [validate]
runs-on: macos-14
timeout-minutes: 40
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: dtolnay/rust-toolchain@631a55b12751854ce901bb631d5902ceb48146f7 # stable
with:
toolchain: 1.92.0
targets: aarch64-apple-darwin,x86_64-apple-darwin
- uses: Swatinem/rust-cache@779680da715d629ac1d338a641029a2f4372abb5 # v2
with:
prefix-key: macos-tauri
- uses: actions/setup-node@v4
with:
node-version: 22
- name: Install Tauri CLI
run: cargo install tauri-cli --locked
- name: Sync Tauri version with Cargo.toml
shell: bash
run: |
VERSION=$(sed -n 's/^version = "\([^"]*\)"/\1/p' Cargo.toml | head -1)
cd apps/tauri
if command -v jq >/dev/null 2>&1; then
jq --arg v "$VERSION" '.version = $v' tauri.conf.json > tmp.json && mv tmp.json tauri.conf.json
else
sed -i '' "s/\"version\": \"[^\"]*\"/\"version\": \"$VERSION\"/" tauri.conf.json
fi
echo "Tauri version set to: $VERSION"
- name: Build Tauri app (universal binary)
working-directory: apps/tauri
run: cargo tauri build --target universal-apple-darwin
- name: Prepare desktop release assets
run: |
mkdir -p desktop-assets
find target -name '*.dmg' -exec cp {} desktop-assets/ZeroClaw.dmg \; 2>/dev/null || true
find target -name '*.app.tar.gz' -exec cp {} desktop-assets/ZeroClaw-macos.app.tar.gz \; 2>/dev/null || true
find target -name '*.app.tar.gz.sig' -exec cp {} desktop-assets/ZeroClaw-macos.app.tar.gz.sig \; 2>/dev/null || true
echo "--- Desktop assets ---"
ls -lh desktop-assets/
- uses: actions/upload-artifact@ea165f8d65b6e75b540449e92b4886f43607fa02 # v4
with:
name: desktop-macos
path: desktop-assets/*
retention-days: 14
publish:
name: Publish Stable Release
needs: [validate, release-notes, build, build-desktop]
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
pattern: zeroclaw-*
path: artifacts
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: desktop-macos
path: artifacts/desktop-macos
- name: Generate checksums
run: |
cd artifacts
find . -type f \( -name '*.tar.gz' -o -name '*.zip' -o -name '*.dmg' \) -exec sha256sum {} + | sed 's| \./[^/]*/| |' > SHA256SUMS
cat SHA256SUMS
- name: Collect release assets
run: |
mkdir -p release-assets
find artifacts -type f \( -name '*.tar.gz' -o -name '*.zip' -o -name '*.dmg' -o -name 'SHA256SUMS' \) -exec cp {} release-assets/ \;
cp install.sh release-assets/
echo "--- Assets ---"
ls -lh release-assets/
- name: Write release notes
env:
NOTES: ${{ needs.release-notes.outputs.notes }}
run: printf '%s\n' "$NOTES" > release-notes.md
- name: Create tag if manual dispatch
if: github.event_name == 'workflow_dispatch'
env:
TAG: ${{ needs.validate.outputs.tag }}
run: |
# Annotated tags need a committer identity; runners do not set one by default.
git config user.name "github-actions[bot]"
git config user.email "github-actions[bot]@users.noreply.github.com"
git tag -a "$TAG" -m "zeroclaw $TAG"
git push origin "$TAG"
- name: Create GitHub Release
env:
GH_TOKEN: ${{ secrets.RELEASE_TOKEN }}
TAG: ${{ needs.validate.outputs.tag }}
run: |
gh release create "$TAG" release-assets/* \
--repo "${{ github.repository }}" \
--title "$TAG" \
--notes-file release-notes.md \
--latest
crates-io:
name: Publish to crates.io
needs: [validate, publish]
runs-on: ubuntu-latest
timeout-minutes: 30
steps:
- uses: actions/checkout@v4
- uses: dtolnay/rust-toolchain@stable
with:
toolchain: 1.92.0
- uses: Swatinem/rust-cache@v2
- uses: actions/setup-node@v4
with:
node-version: 22
cache: npm
cache-dependency-path: web/package-lock.json
- name: Build web dashboard
run: cd web && npm ci && npm run build
- name: Clean web build artifacts
run: rm -rf web/node_modules web/src web/package.json web/package-lock.json web/tsconfig*.json web/vite.config.ts web/index.html
- name: Publish aardvark-sys to crates.io
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
run: |
OUTPUT=$(cargo publish --locked --allow-dirty --no-verify -p aardvark-sys 2>&1) && exit 0
echo "$OUTPUT"
if echo "$OUTPUT" | grep -q 'already exists'; then
echo "::notice::aardvark-sys already on crates.io — skipping"
exit 0
fi
exit 1
- name: Wait for aardvark-sys to index
run: sleep 15
- name: Publish to crates.io
env:
CARGO_REGISTRY_TOKEN: ${{ secrets.CARGO_REGISTRY_TOKEN }}
VERSION: ${{ inputs.version }}
run: |
# Publish to crates.io; treat "already exists" as success
# (auto-publish workflow may have already published this version)
CRATE_NAME=$(sed -n 's/^name = "\([^"]*\)"/\1/p' Cargo.toml | head -1)
OUTPUT=$(cargo publish --locked --allow-dirty --no-verify 2>&1) && exit 0
echo "$OUTPUT"
if echo "$OUTPUT" | grep -q 'already exists'; then
echo "::notice::${CRATE_NAME}@${VERSION} already on crates.io — skipping"
exit 0
fi
exit 1
redeploy-website:
name: Trigger Website Redeploy
needs: [publish]
runs-on: ubuntu-latest
steps:
- name: Trigger website redeploy
env:
PAT: ${{ secrets.WEBSITE_REPO_PAT }}
run: |
curl -fsSL -X POST \
-H "Authorization: token $PAT" \
-H "Accept: application/vnd.github+json" \
https://api.github.com/repos/zeroclaw-labs/zeroclaw-website/dispatches \
-d '{"event_type":"new-release","client_payload":{"install_script_url":"https://raw.githubusercontent.com/zeroclaw-labs/zeroclaw/master/install.sh"}}'
docker:
name: Push Docker Image
needs: [validate, build]
runs-on: ubuntu-latest
timeout-minutes: 15
steps:
- uses: actions/checkout@34e114876b0b11c390a56381ad16ebd13914f8d5 # v4
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: zeroclaw-x86_64-unknown-linux-gnu
path: artifacts/
- uses: actions/download-artifact@d3f86a106a0bac45b974a628896c90dbdf5c8093 # v4
with:
name: zeroclaw-aarch64-unknown-linux-gnu
path: artifacts/
- name: Prepare Docker context with pre-built binaries
run: |
mkdir -p docker-ctx/bin/amd64 docker-ctx/bin/arm64
tar xzf artifacts/zeroclaw-x86_64-unknown-linux-gnu.tar.gz -C docker-ctx/bin/amd64
tar xzf artifacts/zeroclaw-aarch64-unknown-linux-gnu.tar.gz -C docker-ctx/bin/arm64
mkdir -p docker-ctx/zeroclaw-data/.zeroclaw docker-ctx/zeroclaw-data/workspace
printf '%s\n' \
'workspace_dir = "/zeroclaw-data/workspace"' \
'config_path = "/zeroclaw-data/.zeroclaw/config.toml"' \
'api_key = ""' \
'default_provider = "openrouter"' \
'default_model = "anthropic/claude-sonnet-4-20250514"' \
'default_temperature = 0.7' \
'' \
'[gateway]' \
'port = 42617' \
'host = "[::]"' \
'allow_public_bind = true' \
> docker-ctx/zeroclaw-data/.zeroclaw/config.toml
cp Dockerfile.ci docker-ctx/Dockerfile
cp Dockerfile.debian.ci docker-ctx/Dockerfile.debian
- uses: docker/setup-buildx-action@8d2750c68a42422c14e847fe6c8ac0403b4cbd6f # v3
- uses: docker/login-action@c94ce9fb468520275223c153574b00df6fe4bcc9 # v3
with:
registry: ${{ env.REGISTRY }}
username: ${{ github.actor }}
password: ${{ secrets.GITHUB_TOKEN }}
- name: Build and push
uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
with:
context: docker-ctx
push: true
tags: |
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.validate.outputs.tag }}
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:latest
platforms: linux/amd64,linux/arm64
- name: Build and push Debian compatibility image
uses: docker/build-push-action@10e90e3645eae34f1e60eeb005ba3a3d33f178e8 # v6
with:
context: docker-ctx
file: docker-ctx/Dockerfile.debian
push: true
tags: |
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:${{ needs.validate.outputs.tag }}-debian
${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:debian
platforms: linux/amd64,linux/arm64
# ── Post-publish: package manager auto-sync ─────────────────────────
scoop:
name: Update Scoop Manifest
needs: [validate, publish]
if: ${{ !cancelled() && needs.publish.result == 'success' }}
uses: ./.github/workflows/pub-scoop.yml
with:
release_tag: ${{ needs.validate.outputs.tag }}
dry_run: false
secrets: inherit
aur:
name: Update AUR Package
needs: [validate, publish]
if: ${{ !cancelled() && needs.publish.result == 'success' }}
uses: ./.github/workflows/pub-aur.yml
with:
release_tag: ${{ needs.validate.outputs.tag }}
dry_run: false
secrets: inherit
homebrew:
name: Update Homebrew Core
needs: [validate, publish]
if: ${{ !cancelled() && needs.publish.result == 'success' }}
uses: ./.github/workflows/pub-homebrew-core.yml
with:
release_tag: ${{ needs.validate.outputs.tag }}
dry_run: false
secrets: inherit
# ── Post-publish: tweet after release + website are live ──────────────
# Docker push can be slow; don't let it block the tweet.
tweet:
name: Tweet Release
needs: [validate, publish, redeploy-website]
if: ${{ !cancelled() && needs.publish.result == 'success' }}
uses: ./.github/workflows/tweet-release.yml
with:
release_tag: ${{ needs.validate.outputs.tag }}
release_url: https://github.com/zeroclaw-labs/zeroclaw/releases/tag/${{ needs.validate.outputs.tag }}
secrets: inherit


@@ -1,61 +0,0 @@
// Enforce at least one human approval on pull requests.
// Used by .github/workflows/ci-run.yml via actions/github-script.
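// Example wiring from a workflow step (a sketch; the script path is an assumption):
//   - uses: actions/github-script@v7
//     with:
//       script: |
//         const check = require('./.github/scripts/human-approval.js');
//         await check({ github, context, core });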
module.exports = async ({ github, context, core }) => {
const owner = context.repo.owner;
const repo = context.repo.repo;
const prNumber = context.payload.pull_request?.number;
if (!prNumber) {
core.setFailed("Missing pull_request context.");
return;
}
const botAllowlist = new Set(
(process.env.HUMAN_REVIEW_BOT_LOGINS || "github-actions[bot],dependabot[bot],coderabbitai[bot]")
.split(",")
.map((value) => value.trim().toLowerCase())
.filter(Boolean),
);
const isBotAccount = (login, accountType) => {
if (!login) return false;
if ((accountType || "").toLowerCase() === "bot") return true;
if (login.endsWith("[bot]")) return true;
return botAllowlist.has(login);
};
const reviews = await github.paginate(github.rest.pulls.listReviews, {
owner,
repo,
pull_number: prNumber,
per_page: 100,
});
const latestReviewByUser = new Map();
const decisiveStates = new Set(["APPROVED", "CHANGES_REQUESTED", "DISMISSED"]);
for (const review of reviews) {
const login = review.user?.login?.toLowerCase();
if (!login) continue;
if (!decisiveStates.has(review.state)) continue;
latestReviewByUser.set(login, {
state: review.state,
type: review.user?.type || "",
});
}
const humanApprovers = [];
for (const [login, review] of latestReviewByUser.entries()) {
if (review.state !== "APPROVED") continue;
if (isBotAccount(login, review.type)) continue;
humanApprovers.push(login);
}
if (humanApprovers.length === 0) {
core.setFailed(
"No human approving review found. At least one non-bot approval is required before merge.",
);
return;
}
core.info(`Human approval check passed. Approver(s): ${humanApprovers.join(", ")}`);
};


@@ -1,54 +0,0 @@
// Enforce ownership rules for root license files in PRs.
module.exports = async ({ github, context, core }) => {
const owner = context.repo.owner;
const repo = context.repo.repo;
const prNumber = context.payload.pull_request?.number;
const prAuthor = context.payload.pull_request?.user?.login?.toLowerCase() || "";
if (!prNumber) {
core.setFailed("Missing pull_request context.");
return;
}
const ownerAllowlist = ["willsarg"];
if (ownerAllowlist.length === 0) {
core.setFailed("License owner allowlist is empty.");
return;
}
const protectedFiles = new Set(["LICENSE-APACHE", "LICENSE-MIT"]);
const files = await github.paginate(github.rest.pulls.listFiles, {
owner,
repo,
pull_number: prNumber,
per_page: 100,
});
const changedProtectedFiles = files
.map((file) => file.filename)
.filter((name) => protectedFiles.has(name));
if (changedProtectedFiles.length === 0) {
core.info("No protected root license files changed in this PR.");
return;
}
core.info(`Protected license files changed:\n- ${changedProtectedFiles.join("\n- ")}`);
core.info(`Allowed license file editors: ${ownerAllowlist.join(", ")}`);
if (!prAuthor) {
core.setFailed("Unable to resolve PR author login.");
return;
}
if (!ownerAllowlist.includes(prAuthor)) {
core.setFailed(
`Root license files (${changedProtectedFiles.join(", ")}) can only be changed by ${ownerAllowlist.join(", ")}. PR author is @${prAuthor}.`,
);
return;
}
core.info(`License file edit authorized for PR author: @${prAuthor}`);
};


@@ -1,90 +0,0 @@
// Post actionable lint failure summary as a PR comment.
// Used by the lint-feedback CI job via actions/github-script.
//
// Required environment variables:
// RUST_CHANGED — "true" if Rust files changed
// DOCS_CHANGED — "true" if docs files changed
// LINT_RESULT — result of the lint job
// LINT_DELTA_RESULT — result of the strict delta lint job
// DOCS_RESULT — result of the docs-quality job
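// Example invocation (a sketch; the job and output names are assumptions):
//   - uses: actions/github-script@v7
//     env:
//       RUST_CHANGED: ${{ needs.changes.outputs.rust }}
//       DOCS_CHANGED: ${{ needs.changes.outputs.docs }}
//       LINT_RESULT: ${{ needs.lint.result }}
//       LINT_DELTA_RESULT: ${{ needs.lint-delta.result }}
//       DOCS_RESULT: ${{ needs.docs-quality.result }}
//     with:
//       script: |
//         const feedback = require('./.github/scripts/lint-feedback.js');
//         await feedback({ github, context, core });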
module.exports = async ({ github, context, core }) => {
const owner = context.repo.owner;
const repo = context.repo.repo;
const issueNumber = context.payload.pull_request?.number;
if (!issueNumber) return;
const marker = "<!-- ci-lint-feedback -->";
const rustChanged = process.env.RUST_CHANGED === "true";
const docsChanged = process.env.DOCS_CHANGED === "true";
const lintResult = process.env.LINT_RESULT || "skipped";
const lintDeltaResult = process.env.LINT_DELTA_RESULT || "skipped";
const docsResult = process.env.DOCS_RESULT || "skipped";
const failures = [];
if (rustChanged && !["success", "skipped"].includes(lintResult)) {
failures.push("`Lint Gate (Format + Clippy)` failed.");
}
if (rustChanged && !["success", "skipped"].includes(lintDeltaResult)) {
failures.push("`Lint Gate (Strict Delta)` failed.");
}
if (docsChanged && !["success", "skipped"].includes(docsResult)) {
failures.push("`Docs Quality` failed.");
}
const comments = await github.paginate(github.rest.issues.listComments, {
owner,
repo,
issue_number: issueNumber,
per_page: 100,
});
const existing = comments.find((comment) => (comment.body || "").includes(marker));
if (failures.length === 0) {
if (existing) {
await github.rest.issues.deleteComment({
owner,
repo,
comment_id: existing.id,
});
}
core.info("No lint/docs gate failures. No feedback comment required.");
return;
}
const runUrl = `${context.serverUrl}/${owner}/${repo}/actions/runs/${context.runId}`;
const body = [
marker,
"### CI lint feedback",
"",
"This PR failed one or more fast lint/documentation gates:",
"",
...failures.map((item) => `- ${item}`),
"",
"Open the failing logs in this run:",
`- ${runUrl}`,
"",
"Local fix commands:",
"- `./scripts/ci/rust_quality_gate.sh`",
"- `./scripts/ci/rust_strict_delta_gate.sh`",
"- `./scripts/ci/docs_quality_gate.sh`",
"",
"After fixes, push a new commit and CI will re-run automatically.",
].join("\n");
if (existing) {
await github.rest.issues.updateComment({
owner,
repo,
comment_id: existing.id,
body,
});
} else {
await github.rest.issues.createComment({
owner,
repo,
issue_number: issueNumber,
body,
});
}
};


@@ -1,132 +0,0 @@
// Extracted from pr-auto-response.yml step: Apply contributor tier label for issue author
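// The policy file parsed below is expected to look like (a sketch):
// {
//   "contributor_tier_color": "2ED9FF",
//   "contributor_tiers": [
//     { "label": "trusted contributor", "min_merged_prs": 5 },
//     { "label": "experienced contributor", "min_merged_prs": 10 }
//   ]
// }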
module.exports = async ({ github, context, core }) => {
const owner = context.repo.owner;
const repo = context.repo.repo;
const issue = context.payload.issue;
const pullRequest = context.payload.pull_request;
const target = issue ?? pullRequest;
async function loadContributorTierPolicy() {
const policyPath = process.env.LABEL_POLICY_PATH || ".github/label-policy.json";
const fallback = {
contributorTierColor: "2ED9FF",
contributorTierRules: [
{ label: "distinguished contributor", minMergedPRs: 50 },
{ label: "principal contributor", minMergedPRs: 20 },
{ label: "experienced contributor", minMergedPRs: 10 },
{ label: "trusted contributor", minMergedPRs: 5 },
],
};
try {
const { data } = await github.rest.repos.getContent({
owner,
repo,
path: policyPath,
ref: context.payload.repository?.default_branch || "main",
});
const json = JSON.parse(Buffer.from(data.content, "base64").toString("utf8"));
const contributorTierRules = (json.contributor_tiers || []).map((entry) => ({
label: String(entry.label || "").trim(),
minMergedPRs: Number(entry.min_merged_prs || 0),
}));
const contributorTierColor = String(json.contributor_tier_color || "").toUpperCase();
if (!contributorTierColor || contributorTierRules.length === 0) {
return fallback;
}
return { contributorTierColor, contributorTierRules };
} catch (error) {
core.warning(`failed to load ${policyPath}, using fallback policy: ${error.message}`);
return fallback;
}
}
const { contributorTierColor, contributorTierRules } = await loadContributorTierPolicy();
const contributorTierLabels = contributorTierRules.map((rule) => rule.label);
const managedContributorLabels = new Set(contributorTierLabels);
const action = context.payload.action;
const changedLabel = context.payload.label?.name;
if (!target) return;
if ((action === "labeled" || action === "unlabeled") && !managedContributorLabels.has(changedLabel)) {
return;
}
const author = target.user;
if (!author || author.type === "Bot") return;
function contributorTierDescription(rule) {
return `Contributor with ${rule.minMergedPRs}+ merged PRs.`;
}
async function ensureContributorTierLabels() {
for (const rule of contributorTierRules) {
const label = rule.label;
const expectedDescription = contributorTierDescription(rule);
try {
const { data: existing } = await github.rest.issues.getLabel({ owner, repo, name: label });
const currentColor = (existing.color || "").toUpperCase();
const currentDescription = (existing.description || "").trim();
if (currentColor !== contributorTierColor || currentDescription !== expectedDescription) {
await github.rest.issues.updateLabel({
owner,
repo,
name: label,
new_name: label,
color: contributorTierColor,
description: expectedDescription,
});
}
} catch (error) {
if (error.status !== 404) throw error;
await github.rest.issues.createLabel({
owner,
repo,
name: label,
color: contributorTierColor,
description: expectedDescription,
});
}
}
}
function selectContributorTier(mergedCount) {
const matchedTier = contributorTierRules.find((rule) => mergedCount >= rule.minMergedPRs);
return matchedTier ? matchedTier.label : null;
}
let contributorTierLabel = null;
try {
const { data: mergedSearch } = await github.rest.search.issuesAndPullRequests({
q: `repo:${owner}/${repo} is:pr is:merged author:${author.login}`,
per_page: 1,
});
const mergedCount = mergedSearch.total_count || 0;
contributorTierLabel = selectContributorTier(mergedCount);
} catch (error) {
core.warning(`failed to evaluate contributor tier status: ${error.message}`);
return;
}
await ensureContributorTierLabels();
const { data: currentLabels } = await github.rest.issues.listLabelsOnIssue({
owner,
repo,
issue_number: target.number,
});
const keepLabels = currentLabels
.map((label) => label.name)
.filter((label) => !contributorTierLabels.includes(label));
if (contributorTierLabel) {
keepLabels.push(contributorTierLabel);
}
await github.rest.issues.setLabels({
owner,
repo,
issue_number: target.number,
labels: [...new Set(keepLabels)],
});
};


@@ -1,94 +0,0 @@
// Extracted from pr-auto-response.yml step: Handle label-driven responses
module.exports = async ({ github, context, core }) => {
const label = context.payload.label?.name;
if (!label) return;
const issue = context.payload.issue;
const pullRequest = context.payload.pull_request;
const target = issue ?? pullRequest;
if (!target) return;
const isIssue = Boolean(issue);
const issueNumber = target.number;
const owner = context.repo.owner;
const repo = context.repo.repo;
const rules = [
{
label: "r:support",
close: true,
closeIssuesOnly: true,
closeReason: "not_planned",
message:
"This looks like a usage/support request. Please use README + docs first, then open a focused bug with repro details if behavior is incorrect.",
},
{
label: "r:needs-repro",
close: false,
message:
"Thanks for the report. Please add deterministic repro steps, exact environment, and redacted logs so maintainers can triage quickly.",
},
{
label: "invalid",
close: true,
closeIssuesOnly: true,
closeReason: "not_planned",
message:
"Closing as invalid based on current information. If this is still relevant, open a new issue with updated evidence and reproducible steps.",
},
{
label: "duplicate",
close: true,
closeIssuesOnly: true,
closeReason: "not_planned",
message:
"Closing as duplicate. Please continue discussion in the canonical linked issue/PR.",
},
];
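// e.g. applying the "duplicate" label posts the duplicate notice once (deduped
// via the marker comment) and closes the issue as not_planned, while
// "r:needs-repro" only comments and leaves the issue open.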
const rule = rules.find((entry) => entry.label === label);
if (!rule) return;
const marker = `<!-- auto-response:${rule.label} -->`;
const comments = await github.paginate(github.rest.issues.listComments, {
owner,
repo,
issue_number: issueNumber,
per_page: 100,
});
const alreadyCommented = comments.some((comment) =>
(comment.body || "").includes(marker)
);
if (!alreadyCommented) {
await github.rest.issues.createComment({
owner,
repo,
issue_number: issueNumber,
body: `${rule.message}\n\n${marker}`,
});
}
if (!rule.close) return;
if (rule.closeIssuesOnly && !isIssue) return;
if (target.state === "closed") return;
if (isIssue) {
await github.rest.issues.update({
owner,
repo,
issue_number: issueNumber,
state: "closed",
state_reason: rule.closeReason || "not_planned",
});
} else {
await github.rest.issues.update({
owner,
repo,
issue_number: issueNumber,
state: "closed",
});
}
};


@@ -1,161 +0,0 @@
// Extracted from pr-check-status.yml step: Nudge PRs that need rebase or CI refresh
module.exports = async ({ github, context, core }) => {
const staleHours = Number(process.env.STALE_HOURS || "48");
const ignoreLabels = new Set(["no-stale", "stale", "maintainer", "no-pr-hygiene"]);
const marker = "<!-- pr-hygiene-nudge -->";
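// A PR is nudged only when its head commit is older than STALE_HOURS *and* it
// is behind the base branch or its latest "CI Required Gate" run is missing or
// failing; drafts and PRs carrying any ignore label are skipped entirely.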
const owner = context.repo.owner;
const repo = context.repo.repo;
const openPrs = await github.paginate(github.rest.pulls.list, {
owner,
repo,
state: "open",
per_page: 100,
});
const activePrs = openPrs.filter((pr) => {
if (pr.draft) {
return false;
}
const labels = new Set((pr.labels || []).map((label) => label.name));
return ![...ignoreLabels].some((label) => labels.has(label));
});
core.info(`Scanning ${activePrs.length} open PR(s) for hygiene nudges.`);
let nudged = 0;
let skipped = 0;
for (const pr of activePrs) {
const { data: headCommit } = await github.rest.repos.getCommit({
owner,
repo,
ref: pr.head.sha,
});
const headCommitAt =
headCommit.commit?.committer?.date || headCommit.commit?.author?.date;
if (!headCommitAt) {
skipped += 1;
core.info(`#${pr.number}: missing head commit timestamp, skipping.`);
continue;
}
const ageHours = (Date.now() - new Date(headCommitAt).getTime()) / 3600000;
if (ageHours < staleHours) {
skipped += 1;
continue;
}
const { data: prDetail } = await github.rest.pulls.get({
owner,
repo,
pull_number: pr.number,
});
const isBehindBase = prDetail.mergeable_state === "behind";
const { data: checkRunsData } = await github.rest.checks.listForRef({
owner,
repo,
ref: pr.head.sha,
per_page: 100,
});
const ciGateRuns = (checkRunsData.check_runs || [])
.filter((run) => run.name === "CI Required Gate")
.sort((a, b) => {
const aTime = new Date(a.started_at || a.completed_at || a.created_at).getTime();
const bTime = new Date(b.started_at || b.completed_at || b.created_at).getTime();
return bTime - aTime;
});
let ciState = "missing";
if (ciGateRuns.length > 0) {
const latest = ciGateRuns[0];
if (latest.status !== "completed") {
ciState = "in_progress";
} else if (["success", "neutral", "skipped"].includes(latest.conclusion || "")) {
ciState = "success";
} else {
ciState = String(latest.conclusion || "failure");
}
}
const ciMissing = ciState === "missing";
const ciFailing = !["success", "in_progress", "missing"].includes(ciState);
if (!isBehindBase && !ciMissing && !ciFailing) {
skipped += 1;
continue;
}
const reasons = [];
if (isBehindBase) {
reasons.push("- Branch is behind `main` (please rebase or merge the latest base branch).");
}
if (ciMissing) {
reasons.push("- No `CI Required Gate` run was found for the current head commit.");
}
if (ciFailing) {
reasons.push(`- Latest \`CI Required Gate\` result is \`${ciState}\`.`);
}
const shortSha = pr.head.sha.slice(0, 12);
const body = [
marker,
`Hi @${pr.user.login}, friendly automation nudge from PR hygiene.`,
"",
`This PR has had no new commits for **${Math.floor(ageHours)}h** and still needs an update before merge:`,
"",
...reasons,
"",
"### Recommended next steps",
"1. Rebase your branch on `main`.",
"2. Push the updated branch and re-run checks (or use **Re-run failed jobs**).",
"3. Post fresh validation output in this PR thread.",
"",
"Maintainers: apply `no-stale` to opt out for accepted-but-blocked work.",
`Head SHA: \`${shortSha}\``,
].join("\n");
const { data: comments } = await github.rest.issues.listComments({
owner,
repo,
issue_number: pr.number,
per_page: 100,
});
const existing = comments.find(
(comment) => comment.user?.type === "Bot" && comment.body?.includes(marker),
);
if (existing) {
if (existing.body === body) {
skipped += 1;
continue;
}
await github.rest.issues.updateComment({
owner,
repo,
comment_id: existing.id,
body,
});
} else {
await github.rest.issues.createComment({
owner,
repo,
issue_number: pr.number,
body,
});
}
nudged += 1;
core.info(`#${pr.number}: hygiene nudge posted/updated.`);
}
core.info(`Done. Nudged=${nudged}, skipped=${skipped}`);
};


@@ -1,189 +0,0 @@
// Run safe intake checks for PR events and maintain a single sticky comment.
// Used by .github/workflows/pr-intake-checks.yml via actions/github-script.
module.exports = async ({ github, context, core }) => {
const owner = context.repo.owner;
const repo = context.repo.repo;
const pr = context.payload.pull_request;
if (!pr) return;
const marker = "<!-- pr-intake-checks -->";
const legacyMarker = "<!-- pr-intake-sanity -->";
const requiredSections = [
"## Summary",
"## Validation Evidence (required)",
"## Security Impact (required)",
"## Privacy and Data Hygiene (required)",
"## Rollback Plan (required)",
];
const body = pr.body || "";
const missingSections = requiredSections.filter((section) => !body.includes(section));
const missingFields = [];
const requiredFieldChecks = [
["summary problem", /- Problem:\s*\S+/m],
["summary why it matters", /- Why it matters:\s*\S+/m],
["summary what changed", /- What changed:\s*\S+/m],
["validation commands", /Commands and result summary:\s*[\s\S]*```/m],
["security risk/mitigation", /- New permissions\/capabilities\?\s*\(`Yes\/No`\):\s*\S+/m],
["privacy status", /- Data-hygiene status\s*\(`pass\|needs-follow-up`\):\s*\S+/m],
["rollback plan", /- Fast rollback command\/path:\s*\S+/m],
];
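// e.g. a PR body line "- Problem: gateway drops websocket frames" satisfies the
// "summary problem" check, while a bare "- Problem:" does not (no \S after it).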
for (const [name, pattern] of requiredFieldChecks) {
if (!pattern.test(body)) {
missingFields.push(name);
}
}
const files = await github.paginate(github.rest.pulls.listFiles, {
owner,
repo,
pull_number: pr.number,
per_page: 100,
});
const formatWarnings = [];
const dangerousProblems = [];
for (const file of files) {
const patch = file.patch || "";
if (!patch) continue;
const lines = patch.split("\n");
for (let idx = 0; idx < lines.length; idx += 1) {
const line = lines[idx];
if (!line.startsWith("+") || line.startsWith("+++")) continue;
const added = line.slice(1);
const lineNo = idx + 1;
if (/\t/.test(added)) {
formatWarnings.push(`${file.filename}:patch#${lineNo} contains tab characters`);
}
if (/[ \t]+$/.test(added)) {
formatWarnings.push(`${file.filename}:patch#${lineNo} contains trailing whitespace`);
}
if (/^(<<<<<<<|=======|>>>>>>>)/.test(added)) {
dangerousProblems.push(`${file.filename}:patch#${lineNo} contains merge conflict markers`);
}
}
}
const workflowFilesChanged = files
.map((file) => file.filename)
.filter((name) => name.startsWith(".github/workflows/"));
const advisoryFindings = [];
const blockingFindings = [];
if (missingSections.length > 0) {
advisoryFindings.push(`Missing required PR template sections: ${missingSections.join(", ")}`);
}
if (missingFields.length > 0) {
advisoryFindings.push(`Incomplete required PR template fields: ${missingFields.join(", ")}`);
}
if (formatWarnings.length > 0) {
advisoryFindings.push(`Formatting issues in added lines (${formatWarnings.length})`);
}
if (dangerousProblems.length > 0) {
blockingFindings.push(`Dangerous patch markers found (${dangerousProblems.length})`);
}
const comments = await github.paginate(github.rest.issues.listComments, {
owner,
repo,
issue_number: pr.number,
per_page: 100,
});
const existing = comments.find((comment) => {
const body = comment.body || "";
return body.includes(marker) || body.includes(legacyMarker);
});
if (advisoryFindings.length === 0 && blockingFindings.length === 0) {
if (existing) {
await github.rest.issues.deleteComment({
owner,
repo,
comment_id: existing.id,
});
}
core.info("PR intake sanity checks passed.");
return;
}
const runUrl = `${context.serverUrl}/${owner}/${repo}/actions/runs/${context.runId}`;
const advisoryDetails = [];
if (formatWarnings.length > 0) {
advisoryDetails.push(...formatWarnings.slice(0, 20).map((entry) => `- ${entry}`));
if (formatWarnings.length > 20) {
advisoryDetails.push(`- ...and ${formatWarnings.length - 20} more issue(s)`);
}
}
const blockingDetails = [];
if (dangerousProblems.length > 0) {
blockingDetails.push(...dangerousProblems.slice(0, 20).map((entry) => `- ${entry}`));
if (dangerousProblems.length > 20) {
blockingDetails.push(`- ...and ${dangerousProblems.length - 20} more issue(s)`);
}
}
const isBlocking = blockingFindings.length > 0;
const workflowChangeNote = workflowFilesChanged.length > 0
? [
"",
"Workflow files changed in this PR:",
...workflowFilesChanged.map((name) => `- \`${name}\``),
].join("\n")
: "";
const commentBody = [
marker,
isBlocking
? "### PR intake checks failed (blocking)"
: "### PR intake checks found warnings (non-blocking)",
"",
isBlocking
? "Fast safe checks found blocking safety issues:"
: "Fast safe checks found advisory issues. CI lint/test/build gates still enforce merge quality.",
...(blockingFindings.length > 0 ? blockingFindings.map((entry) => `- ${entry}`) : []),
...(advisoryFindings.length > 0 ? advisoryFindings.map((entry) => `- ${entry}`) : []),
"",
"Action items:",
"1. Complete required PR template sections/fields.",
"2. Remove tabs, trailing whitespace, and merge conflict markers from added lines.",
"4. Re-run local checks before pushing:",
" - `./scripts/ci/rust_quality_gate.sh`",
" - `./scripts/ci/rust_strict_delta_gate.sh`",
" - `./scripts/ci/docs_quality_gate.sh`",
"",
"",
`Run logs: ${runUrl}`,
"",
"Detected blocking line issues (sample):",
...(blockingDetails.length > 0 ? blockingDetails : ["- none"]),
"",
"Detected advisory line issues (sample):",
...(advisoryDetails.length > 0 ? advisoryDetails : ["- none"]),
workflowChangeNote,
].join("\n");
if (existing) {
await github.rest.issues.updateComment({
owner,
repo,
comment_id: existing.id,
body: commentBody,
});
} else {
await github.rest.issues.createComment({
owner,
repo,
issue_number: pr.number,
body: commentBody,
});
}
if (isBlocking) {
core.setFailed("PR intake sanity checks found blocking issues. See sticky comment for details.");
return;
}
core.info("PR intake sanity checks found advisory issues only.");
};


@@ -1,805 +0,0 @@
// Apply managed PR labels (size/risk/path/module/contributor tiers).
// Extracted from pr-labeler workflow inline github-script for maintainability.
module.exports = async ({ github, context, core }) => {
const pr = context.payload.pull_request;
const owner = context.repo.owner;
const repo = context.repo.repo;
const action = context.payload.action;
const changedLabel = context.payload.label?.name;
const sizeLabels = ["size: XS", "size: S", "size: M", "size: L", "size: XL"];
const computedRiskLabels = ["risk: low", "risk: medium", "risk: high"];
const manualRiskOverrideLabel = "risk: manual";
const managedEnforcedLabels = new Set([
...sizeLabels,
manualRiskOverrideLabel,
...computedRiskLabels,
]);
if ((action === "labeled" || action === "unlabeled") && !managedEnforcedLabels.has(changedLabel)) {
core.info(`skip non-size/risk label event: ${changedLabel || "unknown"}`);
return;
}
async function loadContributorTierPolicy() {
const policyPath = process.env.LABEL_POLICY_PATH || ".github/label-policy.json";
const fallback = {
contributorTierColor: "2ED9FF",
contributorTierRules: [
{ label: "distinguished contributor", minMergedPRs: 50 },
{ label: "principal contributor", minMergedPRs: 20 },
{ label: "experienced contributor", minMergedPRs: 10 },
{ label: "trusted contributor", minMergedPRs: 5 },
],
};
try {
const { data } = await github.rest.repos.getContent({
owner,
repo,
path: policyPath,
ref: context.payload.repository?.default_branch || "main",
});
const json = JSON.parse(Buffer.from(data.content, "base64").toString("utf8"));
const contributorTierRules = (json.contributor_tiers || []).map((entry) => ({
label: String(entry.label || "").trim(),
minMergedPRs: Number(entry.min_merged_prs || 0),
}));
const contributorTierColor = String(json.contributor_tier_color || "").toUpperCase();
if (!contributorTierColor || contributorTierRules.length === 0) {
return fallback;
}
return { contributorTierColor, contributorTierRules };
} catch (error) {
core.warning(`failed to load ${policyPath}, using fallback policy: ${error.message}`);
return fallback;
}
}
const { contributorTierColor, contributorTierRules } = await loadContributorTierPolicy();
const contributorTierLabels = contributorTierRules.map((rule) => rule.label);
const managedPathLabels = [
"docs",
"dependencies",
"ci",
"core",
"agent",
"channel",
"config",
"cron",
"daemon",
"doctor",
"gateway",
"health",
"heartbeat",
"integration",
"memory",
"observability",
"onboard",
"provider",
"runtime",
"security",
"service",
"skillforge",
"skills",
"tool",
"tunnel",
"tests",
"scripts",
"dev",
];
const managedPathLabelSet = new Set(managedPathLabels);
const moduleNamespaceRules = [
{ root: "src/agent/", prefix: "agent", coreEntries: new Set(["mod.rs"]) },
{ root: "src/channels/", prefix: "channel", coreEntries: new Set(["mod.rs", "traits.rs"]) },
{ root: "src/config/", prefix: "config", coreEntries: new Set(["mod.rs", "schema.rs"]) },
{ root: "src/cron/", prefix: "cron", coreEntries: new Set(["mod.rs"]) },
{ root: "src/daemon/", prefix: "daemon", coreEntries: new Set(["mod.rs"]) },
{ root: "src/doctor/", prefix: "doctor", coreEntries: new Set(["mod.rs"]) },
{ root: "src/gateway/", prefix: "gateway", coreEntries: new Set(["mod.rs"]) },
{ root: "src/health/", prefix: "health", coreEntries: new Set(["mod.rs"]) },
{ root: "src/heartbeat/", prefix: "heartbeat", coreEntries: new Set(["mod.rs"]) },
{ root: "src/integrations/", prefix: "integration", coreEntries: new Set(["mod.rs", "registry.rs"]) },
{ root: "src/memory/", prefix: "memory", coreEntries: new Set(["mod.rs", "traits.rs"]) },
{ root: "src/observability/", prefix: "observability", coreEntries: new Set(["mod.rs", "traits.rs"]) },
{ root: "src/onboard/", prefix: "onboard", coreEntries: new Set(["mod.rs"]) },
{ root: "src/providers/", prefix: "provider", coreEntries: new Set(["mod.rs", "traits.rs"]) },
{ root: "src/runtime/", prefix: "runtime", coreEntries: new Set(["mod.rs", "traits.rs"]) },
{ root: "src/security/", prefix: "security", coreEntries: new Set(["mod.rs"]) },
{ root: "src/service/", prefix: "service", coreEntries: new Set(["mod.rs"]) },
{ root: "src/skillforge/", prefix: "skillforge", coreEntries: new Set(["mod.rs"]) },
{ root: "src/skills/", prefix: "skills", coreEntries: new Set(["mod.rs"]) },
{ root: "src/tools/", prefix: "tool", coreEntries: new Set(["mod.rs", "traits.rs"]) },
{ root: "src/tunnel/", prefix: "tunnel", coreEntries: new Set(["mod.rs"]) },
];
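// e.g. "src/channels/matrix.rs" maps to "channel: matrix", while
// "src/channels/mod.rs" is a core entry and maps to "channel: core".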
const managedModulePrefixes = [...new Set(moduleNamespaceRules.map((rule) => `${rule.prefix}:`))];
const orderedOtherLabelStyles = [
{ label: "health", color: "8EC9B8" },
{ label: "tool", color: "7FC4B6" },
{ label: "agent", color: "86C4A2" },
{ label: "memory", color: "8FCB99" },
{ label: "channel", color: "7EB6F2" },
{ label: "service", color: "95C7B6" },
{ label: "integration", color: "8DC9AE" },
{ label: "tunnel", color: "9FC8B3" },
{ label: "config", color: "AABCD0" },
{ label: "observability", color: "84C9D0" },
{ label: "docs", color: "8FBBE0" },
{ label: "dev", color: "B9C1CC" },
{ label: "tests", color: "9DC8C7" },
{ label: "skills", color: "BFC89B" },
{ label: "skillforge", color: "C9C39B" },
{ label: "provider", color: "958DF0" },
{ label: "runtime", color: "A3ADD8" },
{ label: "heartbeat", color: "C0C88D" },
{ label: "daemon", color: "C8C498" },
{ label: "doctor", color: "C1CF9D" },
{ label: "onboard", color: "D2BF86" },
{ label: "cron", color: "D2B490" },
{ label: "ci", color: "AEB4CE" },
{ label: "dependencies", color: "9FB1DE" },
{ label: "gateway", color: "B5A8E5" },
{ label: "security", color: "E58D85" },
{ label: "core", color: "C8A99B" },
{ label: "scripts", color: "C9B49F" },
];
const otherLabelDisplayOrder = orderedOtherLabelStyles.map((entry) => entry.label);
const modulePrefixSet = new Set(moduleNamespaceRules.map((rule) => rule.prefix));
const modulePrefixPriority = otherLabelDisplayOrder.filter((label) => modulePrefixSet.has(label));
const pathLabelPriority = [...otherLabelDisplayOrder];
const riskDisplayOrder = ["risk: high", "risk: medium", "risk: low", "risk: manual"];
const sizeDisplayOrder = ["size: XS", "size: S", "size: M", "size: L", "size: XL"];
const contributorDisplayOrder = [
"distinguished contributor",
"principal contributor",
"experienced contributor",
"trusted contributor",
];
const modulePrefixPriorityIndex = new Map(
modulePrefixPriority.map((prefix, index) => [prefix, index])
);
const pathLabelPriorityIndex = new Map(
pathLabelPriority.map((label, index) => [label, index])
);
const riskPriorityIndex = new Map(
riskDisplayOrder.map((label, index) => [label, index])
);
const sizePriorityIndex = new Map(
sizeDisplayOrder.map((label, index) => [label, index])
);
const contributorPriorityIndex = new Map(
contributorDisplayOrder.map((label, index) => [label, index])
);
const otherLabelColors = Object.fromEntries(
orderedOtherLabelStyles.map((entry) => [entry.label, entry.color])
);
const staticLabelColors = {
"size: XS": "E7CDD3",
"size: S": "E1BEC7",
"size: M": "DBB0BB",
"size: L": "D4A2AF",
"size: XL": "CE94A4",
"risk: low": "97D3A6",
"risk: medium": "E4C47B",
"risk: high": "E98E88",
"risk: manual": "B7A4E0",
...otherLabelColors,
};
const staticLabelDescriptions = {
"size: XS": "Auto size: <=80 non-doc changed lines.",
"size: S": "Auto size: 81-250 non-doc changed lines.",
"size: M": "Auto size: 251-500 non-doc changed lines.",
"size: L": "Auto size: 501-1000 non-doc changed lines.",
"size: XL": "Auto size: >1000 non-doc changed lines.",
"risk: low": "Auto risk: docs/chore-only paths.",
"risk: medium": "Auto risk: src/** or dependency/config changes.",
"risk: high": "Auto risk: security/runtime/gateway/tools/workflows.",
"risk: manual": "Maintainer override: keep selected risk label.",
docs: "Auto scope: docs/markdown/template files changed.",
dependencies: "Auto scope: dependency manifest/lock/policy changed.",
ci: "Auto scope: CI/workflow/hook files changed.",
core: "Auto scope: root src/*.rs files changed.",
agent: "Auto scope: src/agent/** changed.",
channel: "Auto scope: src/channels/** changed.",
config: "Auto scope: src/config/** changed.",
cron: "Auto scope: src/cron/** changed.",
daemon: "Auto scope: src/daemon/** changed.",
doctor: "Auto scope: src/doctor/** changed.",
gateway: "Auto scope: src/gateway/** changed.",
health: "Auto scope: src/health/** changed.",
heartbeat: "Auto scope: src/heartbeat/** changed.",
integration: "Auto scope: src/integrations/** changed.",
memory: "Auto scope: src/memory/** changed.",
observability: "Auto scope: src/observability/** changed.",
onboard: "Auto scope: src/onboard/** changed.",
provider: "Auto scope: src/providers/** changed.",
runtime: "Auto scope: src/runtime/** changed.",
security: "Auto scope: src/security/** changed.",
service: "Auto scope: src/service/** changed.",
skillforge: "Auto scope: src/skillforge/** changed.",
skills: "Auto scope: src/skills/** changed.",
tool: "Auto scope: src/tools/** changed.",
tunnel: "Auto scope: src/tunnel/** changed.",
tests: "Auto scope: tests/** changed.",
scripts: "Auto scope: scripts/** changed.",
dev: "Auto scope: dev/** changed.",
};
for (const label of contributorTierLabels) {
staticLabelColors[label] = contributorTierColor;
const rule = contributorTierRules.find((entry) => entry.label === label);
if (rule) {
staticLabelDescriptions[label] = `Contributor with ${rule.minMergedPRs}+ merged PRs.`;
}
}
const modulePrefixColors = Object.fromEntries(
modulePrefixPriority.map((prefix) => [
`${prefix}:`,
otherLabelColors[prefix] || "BFDADC",
])
);
const providerKeywordHints = [
"deepseek",
"moonshot",
"kimi",
"qwen",
"mistral",
"doubao",
"baichuan",
"yi",
"siliconflow",
"vertex",
"azure",
"perplexity",
"venice",
"vercel",
"cloudflare",
"synthetic",
"opencode",
"zai",
"glm",
"minimax",
"bedrock",
"qianfan",
"groq",
"together",
"fireworks",
"novita",
"cohere",
"openai",
"openrouter",
"anthropic",
"gemini",
"ollama",
];
const channelKeywordHints = [
"telegram",
"discord",
"slack",
"whatsapp",
"matrix",
"irc",
"imessage",
"email",
"cli",
];
function isDocsLike(path) {
return (
path.startsWith("docs/") ||
path.endsWith(".md") ||
path.endsWith(".mdx") ||
path === "LICENSE" ||
path === ".markdownlint-cli2.yaml" ||
path === ".github/pull_request_template.md" ||
path.startsWith(".github/ISSUE_TEMPLATE/")
);
}
function normalizeLabelSegment(segment) {
return (segment || "")
.toLowerCase()
.replace(/\.rs$/g, "")
.replace(/[^a-z0-9_-]+/g, "-")
.replace(/^[-_]+|[-_]+$/g, "")
.slice(0, 40);
}
function containsKeyword(text, keyword) {
const escaped = keyword.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
const pattern = new RegExp(`(^|[^a-z0-9_])${escaped}([^a-z0-9_]|$)`, "i");
return pattern.test(text);
}
function formatModuleLabel(prefix, segment) {
return `${prefix}: ${segment}`;
}
function parseModuleLabel(label) {
if (typeof label !== "string") return null;
const match = label.match(/^([^:]+):\s*(.+)$/);
if (!match) return null;
const prefix = match[1].trim().toLowerCase();
const segment = (match[2] || "").trim().toLowerCase();
if (!prefix || !segment) return null;
return { prefix, segment };
}
function sortByPriority(labels, priorityIndex) {
return [...new Set(labels)].sort((left, right) => {
const leftPriority = priorityIndex.has(left) ? priorityIndex.get(left) : Number.MAX_SAFE_INTEGER;
const rightPriority = priorityIndex.has(right)
? priorityIndex.get(right)
: Number.MAX_SAFE_INTEGER;
if (leftPriority !== rightPriority) return leftPriority - rightPriority;
return left.localeCompare(right);
});
}
function sortModuleLabels(labels) {
return [...new Set(labels)].sort((left, right) => {
const leftParsed = parseModuleLabel(left);
const rightParsed = parseModuleLabel(right);
if (!leftParsed || !rightParsed) return left.localeCompare(right);
const leftPrefixPriority = modulePrefixPriorityIndex.has(leftParsed.prefix)
? modulePrefixPriorityIndex.get(leftParsed.prefix)
: Number.MAX_SAFE_INTEGER;
const rightPrefixPriority = modulePrefixPriorityIndex.has(rightParsed.prefix)
? modulePrefixPriorityIndex.get(rightParsed.prefix)
: Number.MAX_SAFE_INTEGER;
if (leftPrefixPriority !== rightPrefixPriority) {
return leftPrefixPriority - rightPrefixPriority;
}
if (leftParsed.prefix !== rightParsed.prefix) {
return leftParsed.prefix.localeCompare(rightParsed.prefix);
}
const leftIsCore = leftParsed.segment === "core";
const rightIsCore = rightParsed.segment === "core";
if (leftIsCore !== rightIsCore) return leftIsCore ? 1 : -1;
return leftParsed.segment.localeCompare(rightParsed.segment);
});
}
function refineModuleLabels(rawLabels) {
const refined = new Set(rawLabels);
const segmentsByPrefix = new Map();
for (const label of rawLabels) {
const parsed = parseModuleLabel(label);
if (!parsed) continue;
if (!segmentsByPrefix.has(parsed.prefix)) {
segmentsByPrefix.set(parsed.prefix, new Set());
}
segmentsByPrefix.get(parsed.prefix).add(parsed.segment);
}
for (const [prefix, segments] of segmentsByPrefix) {
const hasSpecificSegment = [...segments].some((segment) => segment !== "core");
if (hasSpecificSegment) {
refined.delete(formatModuleLabel(prefix, "core"));
}
}
return refined;
}
function compactModuleLabels(labels) {
const groupedSegments = new Map();
const compactedModuleLabels = new Set();
const forcePathPrefixes = new Set();
for (const label of labels) {
const parsed = parseModuleLabel(label);
if (!parsed) {
compactedModuleLabels.add(label);
continue;
}
if (!groupedSegments.has(parsed.prefix)) {
groupedSegments.set(parsed.prefix, new Set());
}
groupedSegments.get(parsed.prefix).add(parsed.segment);
}
for (const [prefix, segments] of groupedSegments) {
const uniqueSegments = [...new Set([...segments].filter(Boolean))];
if (uniqueSegments.length === 0) continue;
if (uniqueSegments.length === 1) {
compactedModuleLabels.add(formatModuleLabel(prefix, uniqueSegments[0]));
} else {
forcePathPrefixes.add(prefix);
}
}
return {
moduleLabels: compactedModuleLabels,
forcePathPrefixes,
};
}
function colorForLabel(label) {
if (staticLabelColors[label]) return staticLabelColors[label];
const matchedPrefix = Object.keys(modulePrefixColors).find((prefix) => label.startsWith(prefix));
if (matchedPrefix) return modulePrefixColors[matchedPrefix];
return "BFDADC";
}
function descriptionForLabel(label) {
if (staticLabelDescriptions[label]) return staticLabelDescriptions[label];
const parsed = parseModuleLabel(label);
if (parsed) {
if (parsed.segment === "core") {
return `Auto module: ${parsed.prefix} core files changed.`;
}
return `Auto module: ${parsed.prefix}/${parsed.segment} changed.`;
}
return "Auto-managed label.";
}
async function ensureLabel(name, existing = null) {
const expectedColor = colorForLabel(name);
const expectedDescription = descriptionForLabel(name);
try {
const current = existing || (await github.rest.issues.getLabel({ owner, repo, name })).data;
const currentColor = (current.color || "").toUpperCase();
const currentDescription = (current.description || "").trim();
if (currentColor !== expectedColor || currentDescription !== expectedDescription) {
await github.rest.issues.updateLabel({
owner,
repo,
name,
new_name: name,
color: expectedColor,
description: expectedDescription,
});
}
} catch (error) {
if (error.status !== 404) throw error;
await github.rest.issues.createLabel({
owner,
repo,
name,
color: expectedColor,
description: expectedDescription,
});
}
}
function isManagedLabel(label) {
if (label === manualRiskOverrideLabel) return true;
if (sizeLabels.includes(label) || computedRiskLabels.includes(label)) return true;
if (managedPathLabelSet.has(label)) return true;
if (contributorTierLabels.includes(label)) return true;
if (managedModulePrefixes.some((prefix) => label.startsWith(prefix))) return true;
return false;
}
  async function ensureManagedRepoLabelsMetadata() {
    const repoLabels = await github.paginate(github.rest.issues.listLabelsForRepo, {
      owner,
      repo,
      per_page: 100,
    });
    for (const existingLabel of repoLabels) {
      const labelName = existingLabel.name || "";
      if (!isManagedLabel(labelName)) continue;
      await ensureLabel(labelName, existingLabel);
    }
  }
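  // Pick the first tier whose threshold the author meets. This assumes
  // contributorTierRules (defined earlier in this script) is ordered from
  // highest minMergedPRs to lowest, so find() returns the highest tier earned.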
  function selectContributorTier(mergedCount) {
    const matchedTier = contributorTierRules.find((rule) => mergedCount >= rule.minMergedPRs);
    return matchedTier ? matchedTier.label : null;
  }
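  // Manual runs: audit (the default) reports managed-label metadata drift
  // in the job summary; repair additionally rewrites the drifted labels.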
  if (context.eventName === "workflow_dispatch") {
    const mode = (context.payload.inputs?.mode || "audit").toLowerCase();
    const shouldRepair = mode === "repair";
    const repoLabels = await github.paginate(github.rest.issues.listLabelsForRepo, {
      owner,
      repo,
      per_page: 100,
    });
    let managedScanned = 0;
    const drifts = [];
    for (const existingLabel of repoLabels) {
      const labelName = existingLabel.name || "";
      if (!isManagedLabel(labelName)) continue;
      managedScanned += 1;
      const expectedColor = colorForLabel(labelName);
      const expectedDescription = descriptionForLabel(labelName);
      const currentColor = (existingLabel.color || "").toUpperCase();
      const currentDescription = (existingLabel.description || "").trim();
      if (currentColor !== expectedColor || currentDescription !== expectedDescription) {
        drifts.push({
          name: labelName,
          currentColor,
          expectedColor,
          currentDescription,
          expectedDescription,
        });
        if (shouldRepair) {
          await ensureLabel(labelName, existingLabel);
        }
      }
    }
    core.summary
      .addHeading("Managed Label Governance", 2)
      .addRaw(`Mode: ${shouldRepair ? "repair" : "audit"}`)
      .addEOL()
      .addRaw(`Managed labels scanned: ${managedScanned}`)
      .addEOL()
      .addRaw(`Drifts found: ${drifts.length}`)
      .addEOL();
    if (drifts.length > 0) {
      const sample = drifts.slice(0, 30).map((entry) => [
        entry.name,
        `${entry.currentColor} -> ${entry.expectedColor}`,
        `${entry.currentDescription || "(blank)"} -> ${entry.expectedDescription}`,
      ]);
      core.summary.addTable([
        [{ data: "Label", header: true }, { data: "Color", header: true }, { data: "Description", header: true }],
        ...sample,
      ]);
      if (drifts.length > sample.length) {
        core.summary
          .addRaw(`Additional drifts not shown: ${drifts.length - sample.length}`)
          .addEOL();
      }
    }
    await core.summary.write();
    if (!shouldRepair && drifts.length > 0) {
      core.info(`Managed-label metadata drifts detected: ${drifts.length}. Re-run with mode=repair to auto-fix.`);
    } else if (shouldRepair) {
      core.info(`Managed-label metadata repair applied to ${drifts.length} labels.`);
    } else {
      core.info("No managed-label metadata drift detected.");
    }
    return;
  }
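  // Map each changed file onto an auto module label using the namespace
  // rules defined earlier; files in a rule's core entries collapse to "core".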
  const files = await github.paginate(github.rest.pulls.listFiles, {
    owner,
    repo,
    pull_number: pr.number,
    per_page: 100,
  });
  const detectedModuleLabels = new Set();
  for (const file of files) {
    const path = (file.filename || "").toLowerCase();
    for (const rule of moduleNamespaceRules) {
      if (!path.startsWith(rule.root)) continue;
      const relative = path.slice(rule.root.length);
      if (!relative) continue;
      const first = relative.split("/")[0];
      const firstStem = first.endsWith(".rs") ? first.slice(0, -3) : first;
      let segment = firstStem;
      if (rule.coreEntries.has(first) || rule.coreEntries.has(firstStem)) {
        segment = "core";
      }
      segment = normalizeLabelSegment(segment);
      if (!segment) continue;
      detectedModuleLabels.add(formatModuleLabel(rule.prefix, segment));
    }
  }
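  // Scan provider-adjacent changes (title, body, file paths, patches) for
  // provider keywords and emit "provider: <name>" labels.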
  const providerRelevantFiles = files.filter((file) => {
    const path = file.filename || "";
    return (
      path.startsWith("src/providers/") ||
      path.startsWith("src/integrations/") ||
      path.startsWith("src/onboard/") ||
      path.startsWith("src/config/")
    );
  });
  if (providerRelevantFiles.length > 0) {
    const searchableText = [
      pr.title || "",
      pr.body || "",
      ...providerRelevantFiles.map((file) => file.filename || ""),
      ...providerRelevantFiles.map((file) => file.patch || ""),
    ]
      .join("\n")
      .toLowerCase();
    for (const keyword of providerKeywordHints) {
      if (containsKeyword(searchableText, keyword)) {
        detectedModuleLabels.add(formatModuleLabel("provider", keyword));
      }
    }
  }
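  // Same keyword scan for channel-adjacent changes.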
  const channelRelevantFiles = files.filter((file) => {
    const path = file.filename || "";
    return (
      path.startsWith("src/channels/") ||
      path.startsWith("src/onboard/") ||
      path.startsWith("src/config/")
    );
  });
  if (channelRelevantFiles.length > 0) {
    const searchableText = [
      pr.title || "",
      pr.body || "",
      ...channelRelevantFiles.map((file) => file.filename || ""),
      ...channelRelevantFiles.map((file) => file.patch || ""),
    ]
      .join("\n")
      .toLowerCase();
    for (const keyword of channelKeywordHints) {
      if (containsKeyword(searchableText, keyword)) {
        detectedModuleLabels.add(formatModuleLabel("channel", keyword));
      }
    }
  }
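  // Refine and compact the detected labels, then diff them against the
  // labels currently on the PR so stale path labels can be dropped.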
  const refinedModuleLabels = refineModuleLabels(detectedModuleLabels);
  const compactedModuleState = compactModuleLabels(refinedModuleLabels);
  const selectedModuleLabels = compactedModuleState.moduleLabels;
  const forcePathPrefixes = compactedModuleState.forcePathPrefixes;
  const modulePrefixesWithLabels = new Set(
    [...selectedModuleLabels]
      .map((label) => parseModuleLabel(label)?.prefix)
      .filter(Boolean)
  );
  const { data: currentLabels } = await github.rest.issues.listLabelsOnIssue({
    owner,
    repo,
    issue_number: pr.number,
  });
  const currentLabelNames = currentLabels.map((label) => label.name);
  const currentPathLabels = currentLabelNames.filter((label) => managedPathLabelSet.has(label));
  const candidatePathLabels = new Set([...currentPathLabels, ...forcePathPrefixes]);
  const dedupedPathLabels = [...candidatePathLabels].filter((label) => {
    if (label === "core") return true;
    if (forcePathPrefixes.has(label)) return true;
    return !modulePrefixesWithLabels.has(label);
  });
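  // PR size counts added plus deleted lines, excluding docs-like files
  // and Cargo.lock, then buckets the total into a size label.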
  const excludedLockfiles = new Set(["Cargo.lock"]);
  const changedLines = files.reduce((total, file) => {
    const path = file.filename || "";
    if (isDocsLike(path) || excludedLockfiles.has(path)) {
      return total;
    }
    return total + (file.additions || 0) + (file.deletions || 0);
  }, 0);
  let sizeLabel = "size: XL";
  if (changedLines <= 80) sizeLabel = "size: XS";
  else if (changedLines <= 250) sizeLabel = "size: S";
  else if (changedLines <= 500) sizeLabel = "size: M";
  else if (changedLines <= 1000) sizeLabel = "size: L";
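  // Risk: high for security/runtime/gateway/tools/workflow paths, medium
  // for any other source or build-config change, low otherwise.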
  const hasHighRiskPath = files.some((file) => {
    const path = file.filename || "";
    return (
      path.startsWith("src/security/") ||
      path.startsWith("src/runtime/") ||
      path.startsWith("src/gateway/") ||
      path.startsWith("src/tools/") ||
      path.startsWith(".github/workflows/")
    );
  });
  const hasMediumRiskPath = files.some((file) => {
    const path = file.filename || "";
    return (
      path.startsWith("src/") ||
      path === "Cargo.toml" ||
      path === "Cargo.lock" ||
      path === "deny.toml" ||
      path.startsWith(".githooks/")
    );
  });
  let riskLabel = "risk: low";
  if (hasHighRiskPath) {
    riskLabel = "risk: high";
  } else if (hasMediumRiskPath) {
    riskLabel = "risk: medium";
  }
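  // Repair metadata on all managed labels, then make sure every label
  // this run may apply actually exists.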
  await ensureManagedRepoLabelsMetadata();
  const labelsToEnsure = new Set([
    ...sizeLabels,
    ...computedRiskLabels,
    manualRiskOverrideLabel,
    ...managedPathLabels,
    ...contributorTierLabels,
    ...selectedModuleLabels,
  ]);
  for (const label of labelsToEnsure) {
    await ensureLabel(label);
  }
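  // Count the author's merged PRs via the search API (only total_count is
  // needed, hence per_page: 1) to choose a contributor-tier label; bots
  // are skipped and failures only warn.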
  let contributorTierLabel = null;
  const authorLogin = pr.user?.login;
  if (authorLogin && pr.user?.type !== "Bot") {
    try {
      const { data: mergedSearch } = await github.rest.search.issuesAndPullRequests({
        q: `repo:${owner}/${repo} is:pr is:merged author:${authorLogin}`,
        per_page: 1,
      });
      const mergedCount = mergedSearch.total_count || 0;
      contributorTierLabel = selectContributorTier(mergedCount);
    } catch (error) {
      core.warning(`failed to compute contributor tier label: ${error.message}`);
    }
  }
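  // Keep labels humans applied: the manual risk override always survives,
  // and anything this workflow does not manage passes through untouched.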
  const hasManualRiskOverride = currentLabelNames.includes(manualRiskOverrideLabel);
  const keepNonManagedLabels = currentLabelNames.filter((label) => {
    if (label === manualRiskOverrideLabel) return true;
    if (contributorTierLabels.includes(label)) return false;
    if (sizeLabels.includes(label) || computedRiskLabels.includes(label)) return false;
    if (managedPathLabelSet.has(label)) return false;
    if (managedModulePrefixes.some((prefix) => label.startsWith(prefix))) return false;
    return true;
  });
  const manualRiskSelection =
    currentLabelNames.find((label) => computedRiskLabels.includes(label)) || riskLabel;
  const moduleLabelList = sortModuleLabels([...selectedModuleLabels]);
  const contributorLabelList = contributorTierLabel ? [contributorTierLabel] : [];
  const selectedRiskLabels = hasManualRiskOverride
    ? sortByPriority([manualRiskSelection, manualRiskOverrideLabel], riskPriorityIndex)
    : sortByPriority([riskLabel], riskPriorityIndex);
  const selectedSizeLabels = sortByPriority([sizeLabel], sizePriorityIndex);
  const sortedContributorLabels = sortByPriority(contributorLabelList, contributorPriorityIndex);
  const sortedPathLabels = sortByPriority(dedupedPathLabels, pathLabelPriorityIndex);
  const sortedKeepNonManagedLabels = [...new Set(keepNonManagedLabels)].sort((left, right) =>
    left.localeCompare(right)
  );
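  // Assemble the final ordered label set and replace the PR's labels in a
  // single setLabels call.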
  const nextLabels = [
    ...new Set([
      ...selectedRiskLabels,
      ...selectedSizeLabels,
      ...sortedContributorLabels,
      ...moduleLabelList,
      ...sortedPathLabels,
      ...sortedKeepNonManagedLabels,
    ]),
  ];
  await github.rest.issues.setLabels({
    owner,
    repo,
    issue_number: pr.number,
    labels: nextLabels,
  });
};


@@ -1,57 +0,0 @@
// Extracted from test-benchmarks.yml step: Post benchmark summary on PR
module.exports = async ({ github, context, core }) => {
  const fs = require('fs');
  const output = fs.readFileSync('benchmark_output.txt', 'utf8');
  // Extract Criterion result lines
  const lines = output.split('\n').filter(l =>
    l.includes('time:') || l.includes('change:') || l.includes('Performance')
  );
  if (lines.length === 0) {
    core.info('No benchmark results to post.');
    return;
  }
  const body = [
    '## 📊 Benchmark Results',
    '',
    '```',
    lines.join('\n'),
    '```',
    '',
    '<details><summary>Full output</summary>',
    '',
    '```',
    output.substring(0, 60000),
    '```',
    '</details>',
  ].join('\n');
  // Find and update or create comment
  const { data: comments } = await github.rest.issues.listComments({
    owner: context.repo.owner,
    repo: context.repo.repo,
    issue_number: context.payload.pull_request.number,
  });
  const marker = '## 📊 Benchmark Results';
  const existing = comments.find(c => c.body && c.body.startsWith(marker));
  if (existing) {
    await github.rest.issues.updateComment({
      owner: context.repo.owner,
      repo: context.repo.repo,
      comment_id: existing.id,
      body,
    });
  } else {
    await github.rest.issues.createComment({
      owner: context.repo.owner,
      repo: context.repo.repo,
      issue_number: context.payload.pull_request.number,
      body,
    });
  }
};

Some files were not shown because too many files have changed in this diff.