Commit Graph

363 Commits

Author SHA1 Message Date
jordanthejet
5dfa722738 ci: consolidate CI/CD pipeline — 6 Rust jobs → 2, unified cache, frequency optimization
Consolidate redundant Rust compilation jobs to cut PR cycle time from 2+ hours
to ~30 minutes by reducing parallel cold compilations and upgrading runners.

CI Run (ci-run.yml):
- Merge lint + workspace-check + package-check → quality-gate (25min, 8vcpu)
- Merge test + build → test-and-build (30min, 8vcpu)
- Unify cache keys: prefix-key=zeroclaw-ci-v1, shared-key=runner.os-rust
- Update ci-required gate, lint-feedback deps to reference new job names

Security Audit (sec-audit.yml):
- Merge audit + deny + security-regressions → rust-security (25min, 8vcpu)
- Merge sbom + unsafe-debt → compliance (lightweight runner)
- Add fast-path: non-Rust PRs skip Rust compilation entirely

Frequency optimization (off PR path):
- sec-codeql.yml: push-to-main + weekly only (was PR + push)
- ci-reproducible-build.yml: push-to-main + weekly only (was PR + push)
- ci-change-audit.yml: push-to-main only (was PR + push)

Runner upgrades:
- All Rust compilation jobs: 2vcpu → blacksmith-8vcpu-ubuntu-2404
- ci-supply-chain-provenance, test-fuzz: upgraded to 8vcpu
- test-e2e: upgraded to 8vcpu, fixed env indentation bug

Feature matrix (feature-matrix.yml):
- Non-default lanes (whatsapp-web, browser-native, nightly-all-features)
  skip on compile profile, run on nightly only
- resolve-profile + summary jobs use ubuntu-latest (no Rust compilation)

Docs/scripts:
- lint_feedback.js: update job name references for quality-gate
- required-check-mapping.md: document new consolidated job names
- ci-map.md: update trigger map, triage guide, maintenance rules
- self-hosted-runner-remediation.md: update job name reference

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-03-05 15:51:07 -05:00
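The fast-path bullet above (non-Rust PRs skip Rust compilation) can be sketched roughly as follows. This is only an illustration, not the workflow's actual logic: the function name `touches_rust` and the set of Rust-relevant file names are assumptions, since the commit message does not say how changed files are classified.

```python
from pathlib import PurePosixPath

# Assumption: these are the files that should trigger Rust compilation.
# The commit message does not list the real patterns.
RUST_SUFFIXES = {".rs"}
RUST_FILENAMES = {"Cargo.toml", "Cargo.lock", "rust-toolchain.toml"}


def touches_rust(changed_files):
    """Return True if any changed path looks Rust-relevant (hypothetical helper)."""
    for changed in changed_files:
        path = PurePosixPath(changed)
        if path.suffix in RUST_SUFFIXES or path.name in RUST_FILENAMES:
            return True
    return False


if __name__ == "__main__":
    # A docs-only change would take the fast path and skip compilation.
    print(touches_rust(["docs/ci-map.md"]))
    print(touches_rust(["src/main.rs", "docs/ci-map.md"]))
```

A job could use the result to decide whether the compile steps run at all, which is where the time saving on docs-only PRs would come from.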
Argenis
d3c1ee1631 fix(ci): disable auto rust-docs component in test-e2e to fix toolchain 1.92.0 2026-03-05 14:30:17 -05:00
Argenis
bd55bdc14e fix(ci): move activate toolchain PATH before ensure_cargo_component 2026-03-05 13:50:08 -05:00
Argenis
40a1c0bd3d fix(ci): clean rewrite release-build.yml - fix YAML syntax error line 49 2026-03-05 13:47:06 -05:00
Argenis
a0945b9df2 fix(ci): move rustfmt install before ensure_cargo_component 2026-03-05 13:41:29 -05:00
Argenis
22ba4fd0ec fix(ci): pre-install rustfmt/clippy before ensure_cargo_component 2026-03-05 13:36:11 -05:00
Argenis
91bdfcb69d fix(ci): correct runs-on indentation (regex replace) 2026-03-05 13:27:13 -05:00
Argenis
d783c1ca37 fix(ci): fix runs-on indentation in release-build.yml 2026-03-05 13:27:13 -05:00
Argenis
b47a611d52 fix(ci): fix YAML indentation for ENSURE_RUST_COMPONENTS env var 2026-03-05 13:27:13 -05:00
Argenis
22781a0093 fix(ci): remove aws-india label from release-build runner 2026-03-05 13:27:13 -05:00
Argenis
4f60ef6059 feat(ci): add auto-main-release-tag workflow 2026-03-05 13:27:13 -05:00
Argenis
c14e097fe7 fix(ci): set ENSURE_RUST_COMPONENTS to exclude rust-docs in release-build 2026-03-05 13:27:13 -05:00
argenis de la rosa
2dba3b5e57 chore: remove Linear and Hetzner integrations (replay #2809) 2026-03-05 02:17:32 -05:00
Argenis
2e2045b53d chore(codeowners): align main with dev tri-owner approver routing 2026-03-04 11:38:58 -05:00
argenis de la rosa
5471be7c14 ci(docker): publish GHCR image built with all cargo features 2026-03-03 17:36:52 -05:00
argenis de la rosa
f2ba33fce8 ci: enforce cargo component in provenance job 2026-03-03 14:00:29 -05:00
argenis de la rosa
eefeb347b3 ci: skip pub release job for prerelease tags 2026-03-03 13:46:32 -05:00
argenis de la rosa
1c0e5d957a ci: stabilize release/provenance workflow execution 2026-03-03 13:35:12 -05:00
argenis de la rosa
e2ca22052f ci: scope release tests to quick sanity 2026-03-03 12:10:14 -05:00
argenis de la rosa
d689dd7e8f ci: align release quality gate with repo baseline 2026-03-03 12:10:14 -05:00
Argenis
62fdddc690 ci: activate toolchain PATH for cargo fmt/clippy in release build 2026-03-03 09:53:29 -05:00
argenis de la rosa
d214ebfa1a ci: ensure rustfmt/clippy components in production release build 2026-03-03 09:44:44 -05:00
argenis de la rosa
21616689f8 ci: add Blacksmith production release build workflow 2026-03-03 09:06:38 -05:00
Argenis
ad6a10a903 ci: align Blacksmith runner label with repository policy 2026-03-03 09:02:11 -05:00
blacksmith-sh[bot]
426fee3568 Migrate workflows to Blacksmith 2026-03-03 09:02:11 -05:00
xj
da2d0aee08 fix(ci): detect docker api version before buildx 2026-03-03 02:11:49 -08:00
xj
426b3b01c6 fix(release): enable manual GHCR publish for tagged releases 2026-03-03 02:09:04 -08:00
xj
93963566d6 fix(release): add apt lock timeout and retries in pub-release 2026-03-03 01:31:27 -08:00
xj
07ba229a46 fix(release): harden pub-release cross and apt reliability 2026-03-03 01:25:18 -08:00
xj
02f6a5cb98 fix(release): isolate rust toolchain homes in pub-release 2026-03-03 01:19:31 -08:00
xj
e77d9cf8fb fix(ci): pass GH_TOKEN to release trigger guard validation step
The gh CLI was installed with GH_TOKEN but the validate step that
actually calls it was missing the env var, causing auth failure.
2026-03-03 00:19:25 -08:00
xj
a1306384b9 fix(ci): install gh CLI on self-hosted runners for release trigger guard
The release_trigger_guard.py requires gh CLI to verify CI Required Gate
status in publish mode. Self-hosted hetzner runners don't have gh
pre-installed, causing the guard to fail with exit code 3.

Add a gh CLI install step before the guard runs, with a skip if gh is
already available.
2026-03-02 23:50:38 -08:00
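The "skip if gh is already available" condition described above could be checked as in this small sketch. The helper name `gh_install_needed` is hypothetical, and the real workflow step may well be a shell conditional instead.

```python
import shutil


def gh_install_needed(binary="gh"):
    """Return True when the binary cannot be found on PATH (hypothetical helper)."""
    return shutil.which(binary) is None


if __name__ == "__main__":
    if gh_install_needed():
        print("gh not found: the install step would run")
    else:
        print("gh already available: the install step would be skipped")
```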
xj
dcfb23d711 fix(ci): unblock release builds — bump binary size limit and add cross-compile headers
Two pre-existing issues blocking Pub Release builds:

1. x86_64-unknown-linux-gnu binary grew to 24MB, exceeding the 23MB
   hard limit. Bump Linux safeguard from 23MB to 26MB to accommodate
   recent feature growth. Binary size investigation deferred to follow-up.

2. armv7-unknown-linux-gnueabihf fails compiling ring/aws-lc-sys due to
   missing libc6-dev-armhf-cross headers. Add libc dev package install
   for armv7 and aarch64 cross-compile targets.
2026-03-02 23:45:00 -08:00
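A size safeguard like the one in item 1 above might look like the sketch below. It assumes that "26MB" means 26 * 1024 * 1024 bytes; the commit message does not say whether the real check uses binary or decimal megabytes, and the function name `check_binary_size` is hypothetical.

```python
import os

# Assumption: 1 MB = 1024 * 1024 bytes. The actual safeguard may use 10**6.
LINUX_LIMIT_BYTES = 26 * 1024 * 1024


def check_binary_size(path, limit_bytes=LINUX_LIMIT_BYTES):
    """Return (within_limit, size_in_bytes) for the file at path."""
    size = os.path.getsize(path)
    return size <= limit_bytes, size
```

Under this assumption, a 24MB binary that failed the old 23MB limit passes the new one with roughly 2MB of headroom.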
xj
8c1366dc00 fix(ci): restore GitHub-hosted runner labels for macOS and Windows release builds
The release safety gates branch inadvertently replaced all matrix os
labels with self-hosted Linux runner arrays, including macOS and Windows
targets that require GitHub-hosted runners. This caused all three
cross-platform builds to fail: macOS builds attempted C compilation with
GNU cc (missing -arch flag), and Windows MSVC builds failed without
lib.exe.

Restore the original GitHub-hosted labels:
- macos-15-intel for x86_64-apple-darwin
- macos-14 for aarch64-apple-darwin
- windows-latest for x86_64-pc-windows-msvc
2026-03-02 23:28:30 -08:00
xj
776e15e381 ci: enforce strict cargo component check for pinned toolchains 2026-03-02 22:45:48 -08:00
xj
8f4a400b60 ci: ensure cargo component before cache and e2e tests 2026-03-02 22:27:54 -08:00
xj
316e38546c fix: address Copilot review feedback on release safety gates
- Dry-run gate: use server-side query params instead of client-side jq
  filtering to avoid pagination issues
- Post-release validation: use artifact contract JSON for expected asset
  count instead of hardcoded magic number
- Post-release validation: use grep -Fq for fixed-string version match
  to avoid regex interpretation
- cut_release_tag.sh: clarify CI gate comment header
2026-03-02 20:53:50 -08:00
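The reason for the `grep -Fq` change above is that a version string such as `1.2.3` contains dots, which a regex treats as "any character". The following sketch shows the same distinction in Python rather than shell; the version number and the sample outputs are made up for illustration.

```python
import re


def regex_match(version, text):
    """Match the version as a regular expression (dots are wildcards)."""
    return re.search(version, text) is not None


def fixed_match(version, text):
    """Match the version as a literal substring, like grep -F."""
    return version in text


if __name__ == "__main__":
    version = "1.2.3"             # hypothetical version
    lookalike = "tool 1x2y3"      # hypothetical output that is NOT version 1.2.3
    print(regex_match(version, lookalike))  # regex accepts the lookalike
    print(fixed_match(version, lookalike))  # literal match rejects it
```

In a regex-based check, `re.escape(version)` would be the equivalent fix where a literal substring test is not an option.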
xj
d4c24f6a83 fix(ci): address coderabbit review findings
- Split GH_TOKEN away from binary smoke-test step to prevent token
  exfiltration via compromised release artifact
- Wrap gh subprocess calls in try/except FileNotFoundError so the
  guard degrades gracefully when gh CLI is not installed
- Remove stderr suppression from cargo check --locked so diagnostics
  are visible on failure
2026-03-02 20:40:13 -08:00
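The graceful-degradation bullet above (wrapping gh subprocess calls in try/except FileNotFoundError) could be sketched as follows. This is not the code from release_trigger_guard.py; the helper name `run_gh`, its return shape, and the messages are assumptions.

```python
import subprocess


def run_gh(args, gh_binary="gh"):
    """Run a gh CLI command and return (ok, output).

    Hypothetical helper: a missing binary is reported as a failed check
    instead of crashing the caller with an unhandled exception.
    """
    try:
        result = subprocess.run(
            [gh_binary, *args],
            capture_output=True,
            text=True,
            check=False,
        )
    except FileNotFoundError:
        # Raised by subprocess when the executable does not exist.
        return False, f"{gh_binary} not found on PATH; skipping check"
    if result.returncode != 0:
        return False, result.stderr.strip()
    return True, result.stdout.strip()
```

The caller can then decide whether a missing CLI should warn or block, rather than failing with a traceback.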
xj
fbe7a7ed35 ci(release): add automated release safety gates
- release_trigger_guard.py: block publish if CI Required Gate hasn't
  passed on the tag commit; warn if no prior dry-run exists
- cut_release_tag.sh: check CI status via gh api before creating tag;
  run cargo check --locked to catch stale Cargo.lock locally
- ci-post-release-validation.yml: new workflow triggered on release
  publish — validates asset count, SHA256 checksums, and binary version
2026-03-02 20:21:05 -08:00
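The asset-count and SHA256 validation described in the last bullet above could be sketched like this. The function names and the `expected` mapping (asset name to hex digest) are hypothetical; the commit message does not show how the workflow obtains its expected values.

```python
import hashlib
from pathlib import Path


def sha256_of(path):
    """Return the hex SHA256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def validate_assets(asset_dir, expected):
    """Compare files in asset_dir with expected {name: sha256 hex}.

    Returns a list of problem strings; an empty list means all checks passed.
    """
    problems = []
    present = {p.name for p in Path(asset_dir).iterdir() if p.is_file()}
    if len(present) != len(expected):
        problems.append(f"expected {len(expected)} assets, found {len(present)}")
    for name, digest in expected.items():
        path = Path(asset_dir) / name
        if not path.is_file():
            problems.append(f"missing asset: {name}")
        elif sha256_of(path) != digest.lower():
            problems.append(f"checksum mismatch: {name}")
    return problems
```

Returning a list of problems rather than raising on the first failure lets a validation run report every bad asset at once.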
Chummy
b22dc4875e ci: expose toolchain bin path before cargo test flake gate 2026-03-03 10:12:29 +08:00
Chummy
b21a1a91ac ci: prioritize release branch runs across queue 2026-03-03 00:14:49 +08:00
Chummy
f4df039621 ci: prioritize release codeql with dedicated hetzner lane 2026-03-03 00:14:49 +08:00
Chummy
31426d66db ci: bind codeql to dedicated hetzner lane 2026-03-02 23:57:45 +08:00
Chummy
ca2eb0d466 ci: rebalance lightweight gates to aws-india lane 2026-03-02 23:17:02 +08:00
Chummy
c37ef88d5e ci: whitelist aws light runner labels in actionlint 2026-03-02 22:47:22 +08:00
Chummy
bdb873e743 ci: route lightweight jobs to aws-india cpu40 runners 2026-03-02 22:47:22 +08:00
Chummy
27341a067b ci: offload lightweight workflows from hetzner runner lane 2026-03-02 21:17:09 +08:00
Chummy
4443406311 ci: pin docker api level for self-hosted daemon compatibility 2026-03-02 18:28:28 +08:00
Chummy
04653366b2 ci: use system python on self-hosted runners 2026-03-02 18:28:28 +08:00
Chummy
1e6d4f17f5 ci: route workflows to hetzner self-hosted runner pool 2026-03-02 18:28:28 +08:00