docs(project): add m4-5 workspace RFI baseline and benchmark harness

argenis de la rosa 2026-02-28 15:58:56 -05:00 committed by Argenis
parent 9e4ecc0ee6
commit 0321741b79
5 changed files with 223 additions and 2 deletions


@ -2,7 +2,7 @@
This file is the canonical table of contents for the documentation system.
-Last refreshed: **February 25, 2026**.
+Last refreshed: **February 28, 2026**.
## Language Entry
@ -110,5 +110,6 @@ Last refreshed: **February 25, 2026**.
- [project/README.md](project/README.md)
- [project-triage-snapshot-2026-02-18.md](project-triage-snapshot-2026-02-18.md)
- [docs-audit-2026-02-24.md](docs-audit-2026-02-24.md)
- [project/m4-5-rfi-spike-2026-02-28.md](project/m4-5-rfi-spike-2026-02-28.md)
- [i18n-gap-backlog.md](i18n-gap-backlog.md)
- [docs-inventory.md](docs-inventory.md)


@ -2,7 +2,7 @@
This inventory classifies documentation by intent and canonical location.
-Last reviewed: **February 24, 2026**.
+Last reviewed: **February 28, 2026**.
## Classification Legend
@ -124,6 +124,7 @@ These are valuable context, but **not strict runtime contracts**.
|---|---|
| `docs/project-triage-snapshot-2026-02-18.md` | Snapshot |
| `docs/docs-audit-2026-02-24.md` | Snapshot (docs architecture audit) |
| `docs/project/m4-5-rfi-spike-2026-02-28.md` | Snapshot (M4-5 workspace split RFI baseline and execution plan) |
| `docs/i18n-gap-backlog.md` | Snapshot (i18n depth gap tracking) |
## Maintenance Contract


@ -6,6 +6,7 @@ Time-bound project status snapshots for planning documentation and operations wo
- [../project-triage-snapshot-2026-02-18.md](../project-triage-snapshot-2026-02-18.md)
- [../docs-audit-2026-02-24.md](../docs-audit-2026-02-24.md)
- [m4-5-rfi-spike-2026-02-28.md](m4-5-rfi-spike-2026-02-28.md)
## Scope


@ -0,0 +1,151 @@
# M4-5 Multi-Crate Workspace RFI Spike (2026-02-28)
Status: RFI complete, extraction execution pending.
Issue: https://github.com/zeroclaw-labs/zeroclaw/issues/2263
Linear parent: RMN-243
## Scope
This spike is strictly no-behavior-change planning for the M4-5 workspace split.
Goals:
- capture reproducible compile baseline metrics
- define crate boundary and dependency contract
- define CI/feature-matrix impact and rollback posture
- define stacked PR slicing plan (XS/S/M)
Out of scope:
- broad API redesign
- feature additions bundled with structure work
- one-shot mega-PR extraction
## Baseline Compile Metrics
### Repro command
```bash
scripts/ci/m4_5_rfi_baseline.sh /tmp/zeroclaw-m4rfi-target
```
### Preflight compile blockers observed on `origin/main`
Before timing could run cleanly, two compile blockers were found:
- `src/gateway/mod.rs:2176`: `run_gateway_chat_with_tools` call missing `session_id` argument
- `src/providers/cursor.rs:233`: `ChatResponse` initializer missing `quota_metadata`
This RFI includes minimal compile-compatibility fixes for these two blockers so that the measurements are reproducible.
### Measured results (Apple Silicon macOS, local workspace)
| Phase | real(s) | status |
|---|---:|---|
| A: cold `cargo check --workspace --locked` | 306.47 | pass |
| B: cold-ish `cargo build --workspace --locked` | 219.07 | pass |
| C: warm `cargo check --workspace --locked` | 0.84 | pass |
| D: incremental `cargo check` after touching `src/main.rs` | 6.19 | pass |
Observations:
- cold check (306.47 s) is the dominant iteration cost
- warm check completes in under a second (0.84 s) once target artifacts exist
- incremental check after touching `src/main.rs` is acceptable (6.19 s) but sensitive to wide root-crate coupling
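
To put the observations above in proportion, the ratios between phases can be computed directly from the table. This is a small illustrative calculation, not part of the baseline script; the `ratio` helper is a hypothetical name introduced here.

```bash
#!/usr/bin/env bash
set -euo pipefail

# ratio is a hypothetical helper: prints numerator/denominator to one decimal place.
ratio() {
  awk -v n="$1" -v d="$2" 'BEGIN { printf "%.1f\n", n / d }'
}

# Inputs are the real(s) values from the measured-results table.
echo "cold check vs warm check: $(ratio 306.47 0.84)x"
echo "cold check vs incremental check: $(ratio 306.47 6.19)x"
echo "incremental check vs warm check: $(ratio 6.19 0.84)x"
```

On these numbers the cold check costs roughly 365 times a warm check and about 50 times an incremental check, which is why the cold path is the one worth tracking across the split.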
## Current Workspace Snapshot
Current workspace members:
- `.` (`zeroclaw` monolith crate)
- `crates/robot-kit`
Code concentration still sits in the monolith. Large hotspots include:
- `src/config/schema.rs`
- `src/channels/mod.rs`
- `src/onboard/wizard.rs`
- `src/agent/loop_.rs`
- `src/gateway/mod.rs`
## Proposed Boundary Contract
Target crate topology for staged extraction:
1. `crates/zeroclaw-types`
   - shared DTOs, enums, IDs, lightweight cross-domain traits
   - no provider/channel/network dependencies
2. `crates/zeroclaw-core`
   - config structs + validation, provider trait contracts, routing primitives, policy helpers
   - depends on `zeroclaw-types`
3. `crates/zeroclaw-memory`
   - memory traits/backends + hygiene/snapshot plumbing
   - depends on `zeroclaw-types`, and on `zeroclaw-core` contracts only where required
4. `crates/zeroclaw-channels`
   - channel adapters + inbound normalization
   - depends on `zeroclaw-types`, `zeroclaw-core`, `zeroclaw-memory`
5. `crates/zeroclaw-api`
   - gateway/webhook/http orchestration
   - depends on `zeroclaw-types`, `zeroclaw-core`, `zeroclaw-memory`, `zeroclaw-channels`
6. `crates/zeroclaw-bin` (or keep root binary package name stable)
   - CLI entrypoints + wiring only
Dependency rules:
- foundational crates must not import from higher layers; dependencies point only toward more foundational crates
- channels must not depend on the gateway/http crate (`zeroclaw-api`)
- keep provider-specific SDK deps out of `zeroclaw-types`
- maintain feature-flag parity at workspace root during migration
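
The layering rules above could be checked mechanically once the crates exist. The following is a minimal sketch under stated assumptions: the numeric ranks are derived from the order of the proposed topology, `layer_rank` and `check_layering` are hypothetical helper names, and the input is a plain "dependent dependency" edge list. Producing that edge list (for example from `cargo metadata` output) is left out.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical ranks taken from the proposed topology: lower rank = more foundational.
layer_rank() {
  case "$1" in
    zeroclaw-types) echo 0 ;;
    zeroclaw-core) echo 1 ;;
    zeroclaw-memory) echo 2 ;;
    zeroclaw-channels) echo 3 ;;
    zeroclaw-api) echo 4 ;;
    zeroclaw-bin) echo 5 ;;
    *) echo -1 ;; # not a workspace crate in this sketch
  esac
}

# Reads "dependent dependency" pairs on stdin, prints each violation,
# and returns 1 if any crate depends on a higher layer.
check_layering() {
  local violations=0 from to from_rank to_rank
  while read -r from to; do
    [[ -z "${from}" ]] && continue
    from_rank="$(layer_rank "${from}")"
    to_rank="$(layer_rank "${to}")"
    # Edges touching unknown (e.g. third-party) crates are ignored.
    if (( from_rank < 0 || to_rank < 0 )); then
      continue
    fi
    if (( to_rank > from_rank )); then
      echo "violation: ${from} -> ${to}"
      violations=1
    fi
  done
  return "${violations}"
}

# Example edge list (made up for illustration): the second edge breaks the channels rule.
printf '%s\n' \
  'zeroclaw-core zeroclaw-types' \
  'zeroclaw-channels zeroclaw-api' | check_layering || echo "layering check failed"
```

A single rank per crate is the simplest encoding that covers both the general rule and the specific channels/gateway rule. It does not express the "contracts only where required" nuance for `zeroclaw-memory`, which would still need review by hand.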
## CI / Feature-Matrix Impact
Required CI adjustments during migration:
- add workspace compile lane (`cargo check --workspace --locked`)
- add package-focused lanes for extracted crates (`-p zeroclaw-types`, `-p zeroclaw-core`, etc.)
- keep existing runtime behavior lanes (`test`, `sec-audit`, `codeql`) unchanged until final convergence
- update path filters so crate-local changes trigger only relevant crate tests plus contract smoke tests
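
The path-filter idea in the last item can be sketched as a mapping from changed paths to compile lanes. This is an illustration only: `lane_for_path` and `lanes_for_changes` are hypothetical helpers, the sketch assumes extracted crates live under `crates/<name>/`, and it emits `cargo check` commands rather than the crate tests and contract smoke tests the real lanes would run.

```bash
#!/usr/bin/env bash
set -euo pipefail

# lane_for_path (hypothetical): maps one changed path to a lane name.
# Paths under crates/<name>/ map to <name>; anything else maps to the workspace lane.
lane_for_path() {
  local path="$1" rest
  case "${path}" in
    crates/*/*)
      rest="${path#crates/}"
      echo "${rest%%/*}"
      ;;
    *)
      echo "workspace"
      ;;
  esac
}

# lanes_for_changes (hypothetical): reads changed paths on stdin and prints
# one cargo command per distinct lane.
lanes_for_changes() {
  local path lane
  while read -r path; do
    [[ -z "${path}" ]] && continue
    lane_for_path "${path}"
  done | sort -u | while read -r lane; do
    if [[ "${lane}" == "workspace" ]]; then
      echo "cargo check --workspace --locked"
    else
      echo "cargo check -p ${lane} --locked"
    fi
  done
}
```

In a CI job the input could come from something like `git diff --name-only origin/main...HEAD | lanes_for_changes`. Any change outside `crates/` falls back to the full workspace lane, which is the conservative choice while most code still sits in the root crate.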
Guardrails:
- changed-line strict-delta lint remains mandatory
- each extraction PR must include no-behavior-change assertion in PR body
- each step must include explicit rollback note
## Rollback Strategy
Per-step rollback (stack-safe):
1. revert latest extraction PR only
2. re-run workspace compile + existing CI matrix
3. keep binary entrypoint and config contract untouched until final extraction stage
Abort criteria:
- unexpected runtime behavior drift
- CI lane expansion causes recurring queue stalls without signal gain
- feature-flag compatibility regressions
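
Since each extraction PR has to carry an explicit rollback note, the per-step rollback above could be rendered from a small template. This is a sketch, not an agreed format: `render_rollback_note` is a hypothetical helper, the commit argument is a placeholder, and the function only prints commands without running git or cargo.

```bash
#!/usr/bin/env bash
set -euo pipefail

# render_rollback_note (hypothetical): prints a rollback note for a PR body.
# It prints the commands only; nothing is executed.
render_rollback_note() {
  local commit="$1"
  printf 'Rollback:\n'
  printf '1. git revert --no-edit %s\n' "${commit}"
  printf '2. cargo check --workspace --locked\n'
  printf '3. re-run the existing CI matrix\n'
}

# Placeholder argument; substitute the commit of the extraction PR being reverted.
render_rollback_note "<extraction-commit-sha>"
```

If the extraction PR lands as a merge commit, `git revert` also needs a parent selector such as `-m 1`, so the note would have to be adjusted to the repository's merge strategy.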
## Stacked PR Slicing Plan
### PR-1 (XS)
- add crate shells + workspace wiring (`types/core`), no symbol moves
- objective: establish scaffolding and CI package lanes
### PR-2 (S)
- extract low-churn shared types into `zeroclaw-types`
- add re-export shim layer to preserve existing import paths
### PR-3 (S)
- extract config/provider contracts into `zeroclaw-core`
- keep runtime call sites unchanged via compatibility re-exports
### PR-4 (M)
- extract memory subsystem crate and move wiring boundaries
- run full memory + gateway regression suite
### PR-5 (M)
- extract channels/api orchestration seams
- finalize package ownership and remove temporary re-export shims
## Next Execution Step
Open first no-behavior-change extraction PR from this RFI baseline:
- scope: workspace crate scaffolding + CI package lanes only
- no runtime behavior changes
- explicit rollback command included in PR body

scripts/ci/m4_5_rfi_baseline.sh Executable file

@ -0,0 +1,67 @@
#!/usr/bin/env bash
set -euo pipefail

if [[ "${1:-}" == "-h" || "${1:-}" == "--help" ]]; then
  cat <<'USAGE'
Usage: scripts/ci/m4_5_rfi_baseline.sh [target_dir]
Run reproducible compile-timing probes for the current workspace.
The script prints a markdown table with real-time seconds and pass/fail status
for each benchmark phase.
USAGE
  exit 0
fi

# Resolve the repository root relative to this script (scripts/ci/ -> repo root).
ROOT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")/../.." && pwd)"
TARGET_DIR="${1:-${ROOT_DIR}/target-rfi}"

cd "${ROOT_DIR}"

if [[ ! -f Cargo.toml ]]; then
  echo "error: Cargo.toml not found at ${ROOT_DIR}" >&2
  exit 1
fi

# Run a command under `time -p` and print one markdown table row:
# label, wall-clock seconds, pass/fail. Returns non-zero if the command failed.
run_timed() {
  local label="$1"
  shift

  local timing_file
  timing_file="$(mktemp)"

  # `time -p` reports on stderr, so the command's stderr and the timing
  # report share this file; the awk filter below keeps only the `real` line.
  local status="pass"
  if /usr/bin/time -p "$@" >/dev/null 2>"${timing_file}"; then
    status="pass"
  else
    status="fail"
  fi

  local real_time
  real_time="$(awk '/^real / { print $2 }' "${timing_file}")"
  rm -f "${timing_file}"

  if [[ -z "${real_time}" ]]; then
    real_time="n/a"
  fi

  printf '| %s | %s | %s |\n' "${label}" "${real_time}" "${status}"
  [[ "${status}" == "pass" ]]
}

printf '# M4-5 RFI Baseline\n\n'
printf '- Timestamp (UTC): %s\n' "$(date -u +%Y-%m-%dT%H:%M:%SZ)"
printf '- Commit: `%s`\n' "$(git rev-parse --short HEAD)"
printf '- Target dir: `%s`\n\n' "${TARGET_DIR}"
printf '| Phase | real(s) | status |\n'
printf '|---|---:|---|\n'

# Start from an empty target dir so phase A is a true cold check.
rm -rf "${TARGET_DIR}"

# Disable errexit so a failing phase still lets the remaining phases run.
set +e
run_timed "A: cold cargo check" env CARGO_TARGET_DIR="${TARGET_DIR}" cargo check --workspace --locked
run_timed "B: cold-ish cargo build" env CARGO_TARGET_DIR="${TARGET_DIR}" cargo build --workspace --locked
run_timed "C: warm cargo check" env CARGO_TARGET_DIR="${TARGET_DIR}" cargo check --workspace --locked
# Bump the mtime of the root crate entrypoint to force an incremental rebuild.
touch src/main.rs
run_timed "D: incremental cargo check after touch src/main.rs" env CARGO_TARGET_DIR="${TARGET_DIR}" cargo check --workspace --locked
set -e