docs(ci): add runbooks and required-check mapping for new lanes

Chummy 2026-02-25 09:28:27 +00:00 committed by Chum Yin
parent 83d5421368
commit 3aed919c47
10 changed files with 286 additions and 11 deletions


@ -28,3 +28,9 @@ Current workflow helper scripts:
- `.github/workflows/scripts/pr_intake_checks.js`
- `.github/workflows/scripts/pr_labeler.js`
- `.github/workflows/scripts/test_benchmarks_pr_comment.js`
Release/CI policy assets introduced for advanced delivery lanes:
- `.github/release/nightly-owner-routing.json`
- `.github/release/canary-policy.json`
- `.github/release/prerelease-stage-gates.json`


@ -13,11 +13,11 @@ Use this with:
| Event | Main workflows |
| --- | --- |
| PR activity (`pull_request_target`) | `pr-intake-checks.yml`, `pr-labeler.yml`, `pr-auto-response.yml` |
-| PR activity (`pull_request`) | `ci-run.yml`, `sec-audit.yml`, `sec-codeql.yml` (when Rust/codeql paths change), `main-promotion-gate.yml` (for `main` PRs), plus path-scoped workflows |
-| Push to `dev`/`main` | `ci-run.yml`, `sec-audit.yml`, `sec-codeql.yml` (when Rust/codeql paths change), plus path-scoped workflows |
-| Merge queue (`merge_group`) | `ci-run.yml`, `sec-audit.yml`, `sec-codeql.yml` |
-| Tag push (`v*`) | `pub-release.yml` publish mode, `pub-docker-img.yml` publish job |
-| Scheduled/manual | `pub-release.yml` verification mode, `pub-homebrew-core.yml` (manual), `sec-codeql.yml`, `ci-connectivity-probes.yml`, `ci-provider-connectivity.yml`, `ci-reproducible-build.yml`, `ci-supply-chain-provenance.yml`, `ci-change-audit.yml` (manual), `ci-rollback.yml` (weekly/manual), `feature-matrix.yml`, `test-fuzz.yml`, `pr-check-stale.yml`, `pr-check-status.yml`, `sync-contributors.yml`, `test-benchmarks.yml`, `test-e2e.yml` |
+| PR activity (`pull_request`) | `ci-run.yml`, `feature-matrix.yml` (Rust/workflow paths), `sec-audit.yml`, `sec-codeql.yml` (when Rust/codeql paths change), `main-promotion-gate.yml` (for `main` PRs), plus path-scoped workflows |
+| Push to `dev`/`main` | `ci-run.yml`, `feature-matrix.yml` (Rust/workflow paths), `sec-audit.yml`, `sec-codeql.yml` (when Rust/codeql paths change), plus path-scoped workflows |
+| Merge queue (`merge_group`) | `ci-run.yml`, `feature-matrix.yml`, `sec-audit.yml`, `sec-codeql.yml` |
+| Tag push (`v*`) | `pub-release.yml` publish mode, `pub-docker-img.yml` publish job, `pub-prerelease.yml` (for `v*-alpha.*`, `v*-beta.*`, `v*-rc.*`) |
+| Scheduled/manual | `pub-release.yml` verification mode, `pub-prerelease.yml` (manual), `ci-canary-gate.yml`, `pub-homebrew-core.yml` (manual), `sec-codeql.yml`, `ci-connectivity-probes.yml`, `ci-provider-connectivity.yml`, `ci-reproducible-build.yml`, `ci-supply-chain-provenance.yml`, `ci-change-audit.yml` (manual), `ci-rollback.yml` (weekly/manual), `feature-matrix.yml`, `nightly-all-features.yml`, `docs-deploy.yml` (manual), `test-fuzz.yml`, `pr-check-stale.yml`, `pr-check-status.yml`, `sync-contributors.yml`, `test-benchmarks.yml`, `test-e2e.yml` |
## Runtime and Docker Matrix
@ -55,10 +55,12 @@ Notes:
- `pr-auto-response.yml` runs first-interaction and label routes.
3. `pull_request` CI workflows start:
- `ci-run.yml`
- `feature-matrix.yml` (Rust/workflow path scope)
- `sec-audit.yml`
- `sec-codeql.yml` (if Rust/codeql paths changed)
- path-scoped workflows if matching files changed:
- `pub-docker-img.yml` (Docker build-input paths only)
- `docs-deploy.yml` (docs + README markdown paths)
- `workflow-sanity.yml` (workflow files only)
- `pr-label-policy-check.yml` (label-policy files only)
- `ci-change-audit.yml` (CI/security path changes)
@ -133,14 +135,15 @@ Notes:
1. Commit reaches `dev` or `main` (usually from a merged PR), or merge queue creates a `merge_group` validation commit.
2. `ci-run.yml` runs on `push` and `merge_group`.
-3. `sec-audit.yml` runs on `push` and `merge_group`.
-4. `sec-codeql.yml` runs on `push`/`merge_group` when Rust/codeql paths change (path-scoped on push).
-5. `ci-supply-chain-provenance.yml` runs on push when Rust/build provenance paths change.
-6. Path-filtered workflows run only if touched files match their filters.
-7. In `ci-run.yml`, push/merge-group behavior differs from PR behavior:
+3. `feature-matrix.yml` runs on `push` for Rust/workflow paths and on `merge_group`.
+4. `sec-audit.yml` runs on `push` and `merge_group`.
+5. `sec-codeql.yml` runs on `push`/`merge_group` when Rust/codeql paths change (path-scoped on push).
+6. `ci-supply-chain-provenance.yml` runs on push when Rust/build provenance paths change.
+7. Path-filtered workflows run only if touched files match their filters.
+8. In `ci-run.yml`, push/merge-group behavior differs from PR behavior:
 - Rust path: `lint`, `lint-strict-delta`, `test`, `build` are expected.
 - Docs/non-rust paths: fast-path behavior applies.
-8. `CI Required Gate` computes overall push/merge-group result.
+9. `CI Required Gate` computes overall push/merge-group result.
## Docker Publish Logic
@ -182,6 +185,18 @@ Workflow: `.github/workflows/pub-release.yml`
5. In publish mode, workflow generates SBOM (`CycloneDX` + `SPDX`), `SHA256SUMS`, keyless cosign signatures, and verifies GHCR release-tag availability.
6. In publish mode, workflow creates/updates the GitHub Release for the resolved tag and commit-ish.
Pre-release path:
1. Pre-release tags (`vX.Y.Z-alpha.N`, `vX.Y.Z-beta.N`, `vX.Y.Z-rc.N`) trigger `.github/workflows/pub-prerelease.yml`.
2. `scripts/ci/prerelease_guard.py` enforces stage progression, `origin/main` ancestry, and Cargo version/tag alignment.
3. In publish mode, prerelease assets are attached to a GitHub prerelease for the stage tag.
Canary policy lane:
1. `.github/workflows/ci-canary-gate.yml` runs weekly or manually.
2. `scripts/ci/canary_guard.py` evaluates metrics against `.github/release/canary-policy.json`.
3. Decision output is explicit (`promote`, `hold`, `abort`) with auditable artifacts and optional dispatch signal.
Manual Homebrew formula flow:
1. Run `.github/workflows/pub-homebrew-core.yml` with `release_tag=vX.Y.Z`.


@ -31,6 +31,11 @@ Merge-blocking checks should stay small and deterministic. Optional checks are u
- `.github/workflows/pub-docker-img.yml` (`Docker`)
- Purpose: PR Docker smoke check on `dev`/`main` PRs and publish images on tag pushes (`v*`) only
- `.github/workflows/feature-matrix.yml` (`Feature Matrix`)
- Purpose: compile-time matrix validation for `default`, `whatsapp-web`, `browser-native`, and `nightly-all-features` lanes
- Additional behavior: each lane emits machine-readable result artifacts; summary lane aggregates owner routing from `.github/release/nightly-owner-routing.json`
- `.github/workflows/nightly-all-features.yml` (`Nightly All-Features`)
- Purpose: scheduled high-risk matrix execution with per-lane artifacts and summary rollup for overnight signal quality
- `.github/workflows/sec-audit.yml` (`Security Audit`)
- Purpose: dependency advisories (`rustsec/audit-check`, pinned SHA), policy/license checks (`cargo deny`), gitleaks-based secrets governance (allowlist policy metadata + expiry guard), and SBOM snapshot artifacts (`CycloneDX` + `SPDX`)
- `.github/workflows/sec-codeql.yml` (`CodeQL Analysis`)
@ -54,6 +59,12 @@ Merge-blocking checks should stay small and deterministic. Optional checks are u
- Noise control: excludes common test/fixture paths and test file patterns by default (`include_tests=false`)
- `.github/workflows/pub-release.yml` (`Release`)
- Purpose: build release artifacts in verification mode (manual/scheduled) and publish GitHub releases on tag push or manual publish mode
- `.github/workflows/pub-prerelease.yml` (`Pub Pre-release`)
- Purpose: validate alpha/beta/rc stage transitions, enforce tag/version integrity, and optionally publish GitHub prerelease assets
- `.github/workflows/ci-canary-gate.yml` (`CI Canary Gate`)
- Purpose: evaluate canary metrics against policy thresholds (`promote` / `hold` / `abort`) with auditable artifacts and guarded execute mode
- `.github/workflows/docs-deploy.yml` (`Docs Deploy`)
- Purpose: docs quality checks + preview artifacts + GitHub Pages production deployment lane
- `.github/workflows/pub-homebrew-core.yml` (`Pub Homebrew Core`)
- Purpose: manual, bot-owned Homebrew core formula bump PR flow for tagged releases
- Guardrail: release tag must match `Cargo.toml` version
@ -93,7 +104,12 @@ Merge-blocking checks should stay small and deterministic. Optional checks are u
- `CI`: push to `dev` and `main`, PRs to `dev` and `main`, merge queue `merge_group` for `dev`/`main`
- `Docker`: tag push (`v*`) for publish, matching PRs to `dev`/`main` for smoke build, manual dispatch for smoke only
- `Feature Matrix`: PR/push on Rust + workflow paths, merge queue, weekly schedule, manual dispatch
- `Nightly All-Features`: daily schedule and manual dispatch
- `Release`: tag push (`v*`), weekly schedule (verification-only), manual dispatch (verification or publish)
- `Pub Pre-release`: pre-release tag pushes (`v*-alpha.*`, `v*-beta.*`, `v*-rc.*`) and manual dispatch
- `CI Canary Gate`: weekly schedule (policy check) and manual dispatch (`dry-run` / `execute`)
- `Docs Deploy`: docs/README path changes on PR/push + manual dispatch (`preview` / `production`)
- `Connectivity Probes`: manual dispatch only (legacy wrapper)
- `Pub Homebrew Core`: manual dispatch only
- `Security Audit`: push to `dev` and `main`, PRs to `dev` and `main`, merge queue `merge_group` for `dev`/`main`, weekly schedule
@ -132,6 +148,11 @@ Merge-blocking checks should stay small and deterministic. Optional checks are u
15. Docs failures in CI: inspect `docs-quality` job logs in `.github/workflows/ci-run.yml`.
16. Strict delta lint failures in CI: inspect `lint-strict-delta` job logs and compare with `BASE_SHA` diff scope.
17. Suspected flaky tests: inspect `Test Flake Retry Probe` summary and `test-flake-probe` artifact in `.github/workflows/ci-run.yml`.
18. Feature-combo regressions: inspect `.github/workflows/feature-matrix.yml` summary artifact and lane JSON reports.
19. Nightly integration drift: inspect `.github/workflows/nightly-all-features.yml` summary and lane owner mapping.
20. Pre-release stage gate failures: inspect `.github/workflows/pub-prerelease.yml` guard artifact (`prerelease-guard.json`).
21. Canary gate hold/abort decisions: inspect `.github/workflows/ci-canary-gate.yml` guard artifact (`canary-guard.json`).
22. Docs deploy failures: inspect `.github/workflows/docs-deploy.yml` quality lane + preview/deploy artifacts.
## Maintenance Rules
@ -143,6 +164,9 @@ Merge-blocking checks should stay small and deterministic. Optional checks are u
- Keep gitleaks allowlist governance metadata current in `.github/security/gitleaks-allowlist-governance.json` (owner/reason/expiry/ticket enforced by `secrets_governance_guard.py`).
- Keep audit event schema + retention metadata aligned with `docs/audit-event-schema.md` (`emit_audit_event.py` envelope + workflow artifact policy).
- Keep rollback operations guarded and reversible (`ci-rollback.yml` defaults to `dry-run`; `execute` is manual and policy-gated).
- Keep canary policy thresholds and sample-size rules current in `.github/release/canary-policy.json`.
- Keep pre-release stage transition policy and required checks current in `.github/release/prerelease-stage-gates.json`.
- Keep required check naming stable and documented in `docs/operations/required-check-mapping.md` before changing branch protection settings.
- Follow `docs/release-process.md` for verify-before-publish release cadence and tag discipline.
- Keep merge-blocking rust quality policy aligned across `.github/workflows/ci-run.yml`, `dev/ci.sh`, and `.githooks/pre-push` (`./scripts/ci/rust_quality_gate.sh` + `./scripts/ci/rust_strict_delta_gate.sh`).
- Use `./scripts/ci/rust_strict_delta_gate.sh` (or `./dev/ci.sh lint-delta`) as the incremental strict merge gate for changed Rust lines.


@ -0,0 +1,35 @@
# Canary Gate Runbook
Workflow: `.github/workflows/ci-canary-gate.yml`
Policy: `.github/release/canary-policy.json`
## Inputs
- candidate tag + optional SHA
- observed error rate
- observed crash rate
- observed p95 latency
- observed sample size
## Decision Model
- `promote`: all metrics within configured thresholds
- `hold`: soft breach or policy violations (for example insufficient sample)
- `abort`: hard breach (`>1.5x` threshold)
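The decision model above can be sketched in Python. This is only an illustration, not the logic of `scripts/ci/canary_guard.py`: the policy field names (`max_error_rate`, `max_crash_rate`, `max_p95_latency_ms`, `min_sample_size`) are hypothetical, and checking for a hard breach before the sample-size rule is an assumption. The real schema lives in `.github/release/canary-policy.json`.

```python
from dataclasses import dataclass

# From the runbook: a metric above 1.5x its threshold is a hard breach.
HARD_BREACH_FACTOR = 1.5


@dataclass(frozen=True)
class CanaryPolicy:
    # Hypothetical field names; the real policy schema may differ.
    max_error_rate: float
    max_crash_rate: float
    max_p95_latency_ms: float
    min_sample_size: int


def decide(policy, error_rate, crash_rate, p95_latency_ms, sample_size):
    """Return 'promote', 'hold', or 'abort' for one canary candidate."""
    observed = [
        (error_rate, policy.max_error_rate),
        (crash_rate, policy.max_crash_rate),
        (p95_latency_ms, policy.max_p95_latency_ms),
    ]
    # Hard breach: any metric above 1.5x its threshold (checked first by assumption).
    if any(value > limit * HARD_BREACH_FACTOR for value, limit in observed):
        return "abort"
    # Policy violation: not enough samples to trust the metrics.
    if sample_size < policy.min_sample_size:
        return "hold"
    # Soft breach: over a threshold, but not past the hard-breach factor.
    if any(value > limit for value, limit in observed):
        return "hold"
    return "promote"
```

Keeping the function pure (no I/O) makes it easy to replay a past decision from the values recorded in `canary-guard.json`.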
## Execution Modes
- `dry-run`: generate decision + artifacts only
- `execute`: also allow marker-tag creation and the optional repository dispatch
## Artifacts
- `canary-guard.json`
- `canary-guard.md`
- `audit-event-canary-guard.json`
## Operational Guidance
1. Use `dry-run` first for every candidate.
2. Never execute with sample size below policy minimum.
3. For `abort`, include root-cause summary in release issue and keep candidate blocked.


@ -0,0 +1,30 @@
# Docs Deploy Runbook
Workflow: `.github/workflows/docs-deploy.yml`
## Lanes
- `Docs Quality Gate`: markdown quality + added-link checks
- `Docs Preview Artifact`: PR/manual preview package
- `Deploy Docs to GitHub Pages`: production deployment lane
## Triggering
- PR/push when docs or README markdown changes
- manual dispatch for preview or production
## Quality Controls
- `scripts/ci/docs_quality_gate.sh`
- `scripts/ci/collect_changed_links.py` + lychee added-link checks
## Deployment Rules
- preview: upload `docs-preview` artifact only
- production: deploy to GitHub Pages on `main` push or manual production dispatch
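The deployment rules above amount to a small mapping from trigger to lane, sketched here in Python. The event names are GitHub's own (`push`, `pull_request`, `workflow_dispatch`), but the function, the `dispatch_mode` input name, and the handling of non-`main` pushes are assumptions, not taken from `docs-deploy.yml`.

```python
def docs_deploy_action(event, ref="", dispatch_mode=""):
    """Map a workflow trigger to 'production', 'preview', or 'none'."""
    if event == "workflow_dispatch":
        # Manual runs pick the lane explicitly; anything other than
        # "production" is treated as preview here (an assumption).
        return "production" if dispatch_mode == "production" else "preview"
    if event == "push" and ref == "refs/heads/main":
        return "production"
    if event == "pull_request":
        return "preview"
    # Other pushes (for example to `dev`) are assumed to run quality checks only.
    return "none"
```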
## Failure Handling
1. Re-run markdown and link gates locally.
2. Fix broken links / markdown regressions first.
3. Re-dispatch production deploy only after preview artifact checks pass.


@ -0,0 +1,31 @@
# Feature Matrix Runbook
This runbook defines the feature matrix CI lanes used to validate key compile combinations.
Workflow: `.github/workflows/feature-matrix.yml`
## Lanes
- `default`: `cargo check --locked`
- `whatsapp-web`: `cargo check --locked --no-default-features --features whatsapp-web`
- `browser-native`: `cargo check --locked --no-default-features --features browser-native`
- `nightly-all-features`: `cargo check --locked --all-features`
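For local reproduction, the lane list above can be kept as a small lookup table. The commands are copied from the list; the `LANE_COMMANDS` dict and `lane_command` helper are illustrative names, not part of the workflow.

```python
import shlex

# Lane commands copied from the runbook; the names below are illustrative.
LANE_COMMANDS = {
    "default": "cargo check --locked",
    "whatsapp-web": "cargo check --locked --no-default-features --features whatsapp-web",
    "browser-native": "cargo check --locked --no-default-features --features browser-native",
    "nightly-all-features": "cargo check --locked --all-features",
}


def lane_command(lane):
    """Return the argv list for a lane, suitable for subprocess.run()."""
    try:
        return shlex.split(LANE_COMMANDS[lane])
    except KeyError:
        raise ValueError(f"unknown lane: {lane!r}") from None
```

From the repository root, `subprocess.run(lane_command("whatsapp-web"), check=True)` would then run the same check as that lane.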
## Triggering
- PRs and pushes to `dev` / `main` on Rust + workflow paths
- merge queue (`merge_group`)
- weekly schedule
- manual dispatch
## Artifacts
- Per-lane report: `feature-matrix-<lane>`
- Aggregated report: `feature-matrix-summary` (`feature-matrix-summary.json`, `feature-matrix-summary.md`)
## Failure Triage
1. Open `feature-matrix-summary.md` and identify failed lane(s).
2. Download lane artifact (`nightly-result-<lane>.json`) for exact command + exit code.
3. Reproduce locally using the reported command.
4. Attach reproduction output to the corresponding Linear execution issue.


@ -0,0 +1,31 @@
# Nightly All-Features Runbook
This runbook describes the nightly integration matrix execution and reporting flow.
Workflow: `.github/workflows/nightly-all-features.yml`
## Objective
- Continuously validate high-risk feature combinations overnight.
- Produce machine-readable and human-readable reports for rapid triage.
## Lanes
- `default`
- `whatsapp-web`
- `browser-native`
- `nightly-all-features`
Lane owners are configured in `.github/release/nightly-owner-routing.json`.
## Artifacts
- Per-lane: `nightly-lane-<lane>` with `nightly-result-<lane>.json`
- Aggregate: `nightly-all-features-summary` with `nightly-summary.json` and `nightly-summary.md`
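A rough Python sketch of how per-lane results could roll up into the aggregate report with owner routing. The result keys (`lane`, `exit_code`), the flat lane-to-owner mapping, and both function names are assumptions; the actual shapes of `nightly-result-<lane>.json` and `.github/release/nightly-owner-routing.json` are not documented here.

```python
def summarize_nightly(lane_results, owner_routing):
    """Roll per-lane results up into one summary, attaching owners to failures.

    lane_results: list of dicts with hypothetical keys "lane" and "exit_code".
    owner_routing: hypothetical mapping of lane name -> owner handle.
    """
    failures = [
        {"lane": r["lane"], "owner": owner_routing.get(r["lane"], "unassigned")}
        for r in lane_results
        if r["exit_code"] != 0
    ]
    return {
        "total": len(lane_results),
        "passed": len(lane_results) - len(failures),
        "failed": len(failures),
        "failures": failures,
    }


def render_markdown(summary):
    """Render the summary dict as a short markdown report."""
    lines = [
        "# Nightly Summary",
        f"- total: {summary['total']}",
        f"- passed: {summary['passed']}",
        f"- failed: {summary['failed']}",
    ]
    for failure in summary["failures"]:
        lines.append(f"- FAILED `{failure['lane']}` (owner: {failure['owner']})")
    return "\n".join(lines)
```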
## Failure Handling
1. Inspect `nightly-summary.md` for failed lanes and owners.
2. Download the failed lane artifact and rerun the exact command locally.
3. Capture fix PR + test evidence.
4. Link remediation back to release or CI governance issues.


@ -0,0 +1,30 @@
# Pre-release Stage Gates
Workflow: `.github/workflows/pub-prerelease.yml`
Policy: `.github/release/prerelease-stage-gates.json`
## Stage Model
- `alpha`
- `beta`
- `rc`
- `stable`
## Guard Rules
- Tag format: `vX.Y.Z-(alpha|beta|rc).N`
- Stage transition must follow policy (`alpha -> beta -> rc -> stable`)
- No stage regression allowed for the same semantic version
- Tag commit must be reachable from `origin/main`
- `Cargo.toml` version at tag must match tag version
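The tag-format and no-regression rules above can be sketched as follows. This is a minimal illustration rather than the code in `scripts/ci/prerelease_guard.py`: the helper names are hypothetical, and whether stages may be skipped or `N` must increase is not stated in this runbook, so neither is checked.

```python
import re

# Tag format from the guard rules: vX.Y.Z-(alpha|beta|rc).N
TAG_RE = re.compile(r"^v(\d+)\.(\d+)\.(\d+)-(alpha|beta|rc)\.(\d+)$")
STAGE_ORDER = ["alpha", "beta", "rc", "stable"]


def parse_prerelease_tag(tag):
    """Split 'vX.Y.Z-stage.N' into (version, stage, n); raise ValueError otherwise."""
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"not a pre-release tag: {tag!r}")
    major, minor, patch, stage, n = match.groups()
    return f"{major}.{minor}.{patch}", stage, int(n)


def is_stage_regression(new_tag, existing_tags):
    """True if a later stage is already tagged for the same semantic version."""
    version, stage, _ = parse_prerelease_tag(new_tag)
    new_rank = STAGE_ORDER.index(stage)
    for tag in existing_tags:
        try:
            other_version, other_stage, _ = parse_prerelease_tag(tag)
        except ValueError:
            continue  # stable or unrelated tags are ignored in this sketch
        if other_version == version and STAGE_ORDER.index(other_stage) > new_rank:
            return True
    return False
```

For example, `is_stage_regression("v1.2.3-alpha.2", ["v1.2.3-beta.1"])` is true, because `beta` already exists for `1.2.3`.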
## Outputs
- `prerelease-guard.json`
- `prerelease-guard.md`
- `audit-event-prerelease-guard.json`
## Publish Contract
- `dry-run`: guard + build + artifact manifest only
- `publish`: create/update GitHub prerelease and attach built assets


@ -0,0 +1,34 @@
# Required Check Mapping
This document maps merge-critical workflows to expected check names.
## Merge to `dev` / `main`
| Required check name | Source workflow | Scope |
| --- | --- | --- |
| `CI Required Gate` | `.github/workflows/ci-run.yml` | core Rust/doc merge gate |
| `Security Audit` | `.github/workflows/sec-audit.yml` | dependencies, secrets, governance |
| `Feature Matrix Summary` | `.github/workflows/feature-matrix.yml` | feature-combination compile matrix |
| `Workflow Sanity` | `.github/workflows/workflow-sanity.yml` | workflow syntax and lint |
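As a sanity check before touching branch protection, the table above can be compared against the checks actually reported on a commit. A minimal Python sketch: the check names come from the table, `success` is GitHub's conclusion value for a passing check run, and the helper name and input shape are assumptions.

```python
# Required check names for merges to `dev` / `main`, copied from the table.
REQUIRED_CHECKS = (
    "CI Required Gate",
    "Security Audit",
    "Feature Matrix Summary",
    "Workflow Sanity",
)


def unsatisfied_checks(reported):
    """Return required checks that are missing or did not conclude 'success'.

    reported: mapping of check name -> conclusion string (hypothetical shape).
    """
    return [name for name in REQUIRED_CHECKS if reported.get(name) != "success"]
```

A renamed job shows up here as a missing required check, which is the failure mode that stable naming is meant to prevent.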
## Promotion to `main`
| Required check name | Source workflow | Scope |
| --- | --- | --- |
| `Main Promotion Gate` | `.github/workflows/main-promotion-gate.yml` | branch + actor policy |
| `CI Required Gate` | `.github/workflows/ci-run.yml` | baseline quality gate |
| `Security Audit` | `.github/workflows/sec-audit.yml` | security baseline |
## Release / Pre-release
| Required check name | Source workflow | Scope |
| --- | --- | --- |
| `Verify Artifact Set` | `.github/workflows/pub-release.yml` | release completeness |
| `Pre-release Guard` | `.github/workflows/pub-prerelease.yml` | stage progression + tag integrity |
| `Nightly Summary & Routing` | `.github/workflows/nightly-all-features.yml` | overnight integration signal |
## Notes
- Use pinned `uses:` references for all workflow actions.
- Keep check names stable; renaming check jobs can break branch protection rules.
- Update this mapping whenever merge-critical workflows/jobs are added or renamed.


@ -22,6 +22,8 @@ Last verified: **February 21, 2026**.
Release automation lives in:
- `.github/workflows/pub-release.yml`
- `.github/workflows/pub-prerelease.yml`
- `.github/workflows/ci-canary-gate.yml`
- `.github/workflows/pub-homebrew-core.yml` (manual Homebrew formula PR, bot-owned)
Modes:
@ -29,6 +31,8 @@ Modes:
- Tag push `v*`: publish mode.
- Manual dispatch: verification-only or publish mode.
- Weekly schedule: verification-only mode.
- Pre-release tags (`vX.Y.Z-alpha.N`, `vX.Y.Z-beta.N`, `vX.Y.Z-rc.N`): prerelease publish path.
- Canary gate (weekly/manual): promote/hold/abort decision path.
Publish-mode guardrails:
@ -95,6 +99,35 @@ Expected publish outputs:
2. Verify GHCR tags for the released version (`vX.Y.Z`) and release commit SHA tag (`sha-<12>`).
3. Verify install paths that rely on release assets (for example bootstrap binary download).
### 5.1) Canary gate before broad rollout
Run `CI Canary Gate` (`.github/workflows/ci-canary-gate.yml`) in `dry-run` first, then `execute` when metrics are complete.
Required inputs:
- candidate tag/SHA
- observed error rate
- observed crash rate
- observed p95 latency
- observed sample size
Decision output:
- `promote`: thresholds satisfied
- `hold`: insufficient evidence or soft breach
- `abort`: hard threshold breach
### 5.2) Pre-release stage progression (alpha/beta/rc)
For staged release confidence:
1. Cut and push stage tag (`vX.Y.Z-alpha.N`, then beta, then rc).
2. `Pub Pre-release` validates:
- stage progression
- `origin/main` ancestry
- Cargo version/tag alignment
3. Publish prerelease assets only after guard passes.
### 6) Publish Homebrew Core formula (bot-owned)
Run `Pub Homebrew Core` manually:
@ -126,6 +159,12 @@ If tag-push release fails after artifacts are validated:
- `release_ref` is automatically pinned to `release_tag` in publish mode
3. Re-validate released assets.
If prerelease/canary lanes fail:
1. Inspect guard artifacts (`prerelease-guard.json`, `canary-guard.json`).
2. Fix stage-policy or quality regressions.
3. Re-run guard in `dry-run` before any execute/publish action.
## Operational Notes
- Keep release changes small and reversible.