feat(config): add configurable pacing controls for slow/local LLM workloads (#3343)

* feat(config): add configurable pacing controls for slow/local LLM workloads (#2963)

Add a new `[pacing]` config section with four opt-in parameters that let users tune timeout and loop-detection behavior for local LLMs (Ollama, llama.cpp, vLLM) without disabling safety features entirely:

- `step_timeout_secs`: per-step LLM inference timeout independent of the overall message budget, catching hung model responses early.
- `loop_detection_min_elapsed_secs`: time-gated loop detection that only activates after a configurable grace period, avoiding false positives on long-running browser/research workflows.
- `loop_ignore_tools`: per-tool loop-detection exclusions so tools like `browser_screenshot` that structurally resemble loops are not counted toward identical-output detection.
- `message_timeout_scale_max`: overrides the hardcoded 4x ceiling in the channel message timeout scaling formula.

All parameters are strictly optional with no effect when absent, preserving full backwards compatibility.

Closes #2963

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(config): add missing pacing fields in tests and call sites

* fix(config): add pacing arg to remaining cost-tracking test call sites

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>
Merge pull request #4145 from zeroclaw-labs/feat/gateway-path-prefix
2026-03-21 08:54:08 -04:00 · 2026-03-21 08:48:56 -04:00 · 2026-03-21 08:14:28 -04:00 · 2026-03-21 08:14:28 -04:00
18 changed files with 549 additions and 48 deletions
@@ -122,6 +122,34 @@ tools = ["mcp_browser_*"]
 keywords = ["browse", "navigate", "open url", "screenshot"]
 ```

+## `[pacing]`
+
+Pacing controls for slow/local LLM workloads (Ollama, llama.cpp, vLLM). All keys are optional; when absent, existing behavior is preserved.
+
+| Key | Default | Purpose |
+|---|---|---|
+| `step_timeout_secs` | _none_ | Per-step timeout: maximum seconds for a single LLM inference turn. Catches a hung model response early instead of waiting for the overall message budget to expire |
+| `loop_detection_min_elapsed_secs` | _none_ | Minimum elapsed seconds before identical-output loop detection activates. Until then, and whenever the key is unset, only `max_tool_iterations` and the overall timeout bound the loop |
+| `loop_ignore_tools` | `[]` | Tool names excluded from identical-output loop detection. Useful for browser workflows where `browser_screenshot` structurally resembles a loop |
+| `message_timeout_scale_max` | `4` | Override for the hardcoded timeout scaling cap. The channel message timeout budget is `message_timeout_secs * min(max_tool_iterations, message_timeout_scale_max)` |
+
+Notes:
+
+- These settings are intended for local/slow LLM deployments. Cloud-provider users typically do not need them.
+- `step_timeout_secs` operates independently of the total channel message timeout budget. When a step times out, the tool loop stops with an error instead of waiting for the overall budget to expire.
+- `loop_detection_min_elapsed_secs` delays identical-output loop detection, not the task itself. Before the threshold elapses, and whenever the key is unset (the default), only `max_tool_iterations` and the overall timeout bound the loop.
+- `loop_ignore_tools` only suppresses tool-output-based loop detection for the listed tools. Other safety features (max iterations, overall timeout) remain active.
+- `message_timeout_scale_max` must be >= 1. Setting it higher than `max_tool_iterations` has no additional effect (the formula uses `min()`).
+- Example configuration for a slow local Ollama deployment:
+
+```toml
+[pacing]
+step_timeout_secs = 120
+loop_detection_min_elapsed_secs = 60
+loop_ignore_tools = ["browser_screenshot", "browser_navigate"]
+message_timeout_scale_max = 8
+```
+
 ## `[security.otp]`

 | Key | Default | Purpose |
@@ -425,6 +453,12 @@ Notes:
 | `port` | `42617` | gateway listen port |
 | `require_pairing` | `true` | require pairing before bearer auth |
 | `allow_public_bind` | `false` | block accidental public exposure |
+| `path_prefix` | _(none)_ | URL path prefix for reverse-proxy deployments (e.g. `"/zeroclaw"`) |
+
+When deploying behind a reverse proxy that maps ZeroClaw to a sub-path,
+set `path_prefix` to that sub-path (e.g. `"/zeroclaw"`). All gateway
+routes will be served under this prefix. The value must start with `/`
+and must not end with `/`.
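
The two `path_prefix` rules above can be checked with a few lines of Rust. This is a minimal sketch, not code from the ZeroClaw gateway: the function name `validate_path_prefix` and its error strings are hypothetical, and the real gateway may validate or normalize the value differently.

```rust
/// Hypothetical validator for the documented `path_prefix` rules.
/// Assumption: the name, signature, and messages are illustrative only.
fn validate_path_prefix(prefix: &str) -> Result<(), String> {
    // Rule 1: the value must start with `/`.
    if !prefix.starts_with('/') {
        return Err(format!("path_prefix {:?} must start with '/'", prefix));
    }
    // Rule 2: the value must not end with `/` (this also rejects a bare "/").
    if prefix.ends_with('/') {
        return Err(format!("path_prefix {:?} must not end with '/'", prefix));
    }
    Ok(())
}
```

Under these rules `"/zeroclaw"` is accepted, while `"zeroclaw"`, `"/zeroclaw/"`, and `"/"` are rejected.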

 ## `[autonomy]`

@@ -597,7 +631,7 @@ Top-level channel options are configured under `channels_config`.

 | Key | Default | Purpose |
 |---|---|---|
-| `message_timeout_secs` | `300` | Base timeout in seconds for channel message processing; runtime scales this with tool-loop depth (up to 4x) |
+| `message_timeout_secs` | `300` | Base timeout in seconds for channel message processing; runtime scales this with tool-loop depth (up to 4x, overridable via `[pacing].message_timeout_scale_max`) |

 Examples:

@@ -612,7 +646,7 @@ Examples:
 Notes:

 - Default `300s` is optimized for on-device LLMs (Ollama) which are slower than cloud APIs.
-- Runtime timeout budget is `message_timeout_secs * scale`, where `scale = min(max_tool_iterations, 4)` and a minimum of `1`.
+- Runtime timeout budget is `message_timeout_secs * scale`, where `scale = min(max_tool_iterations, cap)` and a minimum of `1`. The default cap is `4`; override with `[pacing].message_timeout_scale_max`.
 - This scaling avoids false timeouts when the first LLM turn is slow/retried but later tool-loop turns still need to complete.
 - If using cloud APIs (OpenAI, Anthropic, etc.), you can reduce this to `60` or lower.
 - Values below `30` are clamped to `30` to avoid immediate timeout churn.
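
As a worked example of the notes above, the sketch below combines the documented clamp (values below `30` become `30`) with the documented scaling rule. It is an illustration under assumptions: the helper name `timeout_budget_secs` and the two constants are hypothetical, and the runtime may apply the clamp and the scaling in separate functions rather than in one place.

```rust
/// Default cap on the scale factor, as stated in the notes.
const DEFAULT_SCALE_CAP: u64 = 4;
/// Documented floor for `message_timeout_secs`.
const MIN_MESSAGE_TIMEOUT_SECS: u64 = 30;

/// Hypothetical helper combining the documented clamp and scaling rules.
/// `scale_max` stands in for `[pacing].message_timeout_scale_max`.
fn timeout_budget_secs(
    message_timeout_secs: u64,
    max_tool_iterations: usize,
    scale_max: Option<u64>,
) -> u64 {
    // Values below 30 are clamped to 30.
    let base = message_timeout_secs.max(MIN_MESSAGE_TIMEOUT_SECS);
    // The cap defaults to 4 unless overridden.
    let cap = scale_max.unwrap_or(DEFAULT_SCALE_CAP);
    // scale = min(max_tool_iterations, cap), with a minimum of 1.
    let scale = (max_tool_iterations as u64).min(cap).max(1);
    base.saturating_mul(scale)
}
```

With `message_timeout_secs = 300` and `max_tool_iterations = 10`, the default cap gives a budget of 1200 seconds, and a cap of `8` gives 2400 seconds. With only 2 iterations, a cap of `8` still gives 600 seconds, because `min()` picks the smaller value.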
@@ -2331,6 +2331,7 @@ pub(crate) async fn agent_turn(
        dedup_exempt_tools,
        activated_tools,
        model_switch_callback,
+        &crate::config::PacingConfig::default(),
    )
    .await
 }
@@ -2640,6 +2641,7 @@ pub(crate) async fn run_tool_call_loop(
    dedup_exempt_tools: &[String],
    activated_tools: Option<&std::sync::Arc<std::sync::Mutex<crate::tools::ActivatedToolSet>>>,
    model_switch_callback: Option<ModelSwitchCallback>,
+    pacing: &crate::config::PacingConfig,
 ) -> Result<String> {
    let max_iterations = if max_tool_iterations == 0 {
        DEFAULT_MAX_TOOL_ITERATIONS
@@ -2648,6 +2650,14 @@ pub(crate) async fn run_tool_call_loop(
    };

    let turn_id = Uuid::new_v4().to_string();
+    let loop_started_at = Instant::now();
+    let loop_ignore_tools: HashSet<&str> = pacing
+        .loop_ignore_tools
+        .iter()
+        .map(String::as_str)
+        .collect();
+    let mut consecutive_identical_outputs: usize = 0;
+    let mut last_tool_output_hash: Option<u64> = None;

    for iteration in 0..max_iterations {
        let mut seen_tool_signatures: HashSet<(String, String)> = HashSet::new();
@@ -2777,13 +2787,43 @@ pub(crate) async fn run_tool_call_loop(
            temperature,
        );

-        let chat_result = if let Some(token) = cancellation_token.as_ref() {
-            tokio::select! {
-                () = token.cancelled() => return Err(ToolLoopCancelled.into()),
-                result = chat_future => result,
+        // Wrap the LLM call with an optional per-step timeout from pacing config.
+        // This catches a hung model response early by failing the turn, rather
+        // than waiting for the per-message budget (handled separately) to expire.
+        let chat_result = match pacing.step_timeout_secs {
+            Some(step_secs) if step_secs > 0 => {
+                let step_timeout = Duration::from_secs(step_secs);
+                if let Some(token) = cancellation_token.as_ref() {
+                    tokio::select! {
+                        () = token.cancelled() => return Err(ToolLoopCancelled.into()),
+                        result = tokio::time::timeout(step_timeout, chat_future) => {
+                            match result {
+                                Ok(inner) => inner,
+                                Err(_) => anyhow::bail!(
+                                    "LLM inference step timed out after {step_secs}s (step_timeout_secs)"
+                                ),
+                            }
+                        },
+                    }
+                } else {
+                    match tokio::time::timeout(step_timeout, chat_future).await {
+                        Ok(inner) => inner,
+                        Err(_) => anyhow::bail!(
+                            "LLM inference step timed out after {step_secs}s (step_timeout_secs)"
+                        ),
+                    }
+                }
+            }
+            _ => {
+                if let Some(token) = cancellation_token.as_ref() {
+                    tokio::select! {
+                        () = token.cancelled() => return Err(ToolLoopCancelled.into()),
+                        result = chat_future => result,
+                    }
+                } else {
+                    chat_future.await
+                }
            }
-        } else {
-            chat_future.await
        };

        let (response_text, parsed_text, tool_calls, assistant_history_content, native_tool_calls) =
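
The hunk above wraps the LLM call in `tokio::time::timeout`. For readers who want to try the same idea without an async runtime, here is a std-only analogue built on a worker thread and `Receiver::recv_timeout`. It is a sketch under assumptions, not the project's code: `run_step_with_timeout` is a hypothetical name, and a timeout of zero is treated as "no limit" to mirror the `step_secs > 0` guard in the diff.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Hypothetical helper: runs `work` on a worker thread and waits at most
/// `step_timeout` for its result. `None` or a zero duration means no limit.
fn run_step_with_timeout<T, F>(step_timeout: Option<Duration>, work: F) -> Result<T, String>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // The receiver may already be gone if the step timed out; ignore that.
        let _ = tx.send(work());
    });
    match step_timeout {
        Some(limit) if !limit.is_zero() => rx.recv_timeout(limit).map_err(|err| match err {
            mpsc::RecvTimeoutError::Timeout => format!("step timed out after {:?}", limit),
            mpsc::RecvTimeoutError::Disconnected => "worker exited without a result".to_string(),
        }),
        // No per-step limit configured: block until the worker finishes.
        _ => rx.recv().map_err(|_| "worker exited without a result".to_string()),
    }
}
```

One difference worth noting: when `tokio::time::timeout` expires, the wrapped future is dropped, whereas the worker thread in this sketch keeps running in the background after the caller gives up.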
@@ -3282,7 +3322,13 @@ pub(crate) async fn run_tool_call_loop(
            ordered_results[*idx] = Some((call.name.clone(), call.tool_call_id.clone(), outcome));
        }

+        // Collect tool results and build per-tool output for loop detection.
+        // Only non-ignored tool outputs contribute to the identical-output hash.
+        let mut detection_relevant_output = String::new();
        for (tool_name, tool_call_id, outcome) in ordered_results.into_iter().flatten() {
+            if !loop_ignore_tools.contains(tool_name.as_str()) {
+                detection_relevant_output.push_str(&outcome.output);
+            }
            individual_results.push((tool_call_id, outcome.output.clone()));
            let _ = writeln!(
                tool_results,
@@ -3291,6 +3337,53 @@ pub(crate) async fn run_tool_call_loop(
            );
        }

+        // ── Time-gated loop detection ──────────────────────────
+        // When pacing.loop_detection_min_elapsed_secs is set, identical-output
+        // loop detection activates after the task has been running that long.
+        // This avoids false-positive aborts on long-running browser/research
+        // workflows while keeping aggressive protection for quick tasks.
+        // When not configured, identical-output detection is disabled (preserving
+        // existing behavior where only max_iterations prevents runaway loops).
+        let loop_detection_active = match pacing.loop_detection_min_elapsed_secs {
+            Some(min_secs) => loop_started_at.elapsed() >= Duration::from_secs(min_secs),
+            None => false, // disabled when not configured (backwards compatible)
+        };
+
+        if loop_detection_active && !detection_relevant_output.is_empty() {
+            use std::hash::{Hash, Hasher};
+            let mut hasher = std::collections::hash_map::DefaultHasher::new();
+            detection_relevant_output.hash(&mut hasher);
+            let current_hash = hasher.finish();
+
+            if last_tool_output_hash == Some(current_hash) {
+                consecutive_identical_outputs += 1;
+            } else {
+                consecutive_identical_outputs = 0;
+                last_tool_output_hash = Some(current_hash);
+            }
+
+            // Bail once the same output has repeated 3 times in a row, i.e. on
+            // the 4th identical output (clear runaway).
+            if consecutive_identical_outputs >= 3 {
+                runtime_trace::record_event(
+                    "tool_loop_identical_output_abort",
+                    Some(channel_name),
+                    Some(provider_name),
+                    Some(model),
+                    Some(&turn_id),
+                    Some(false),
+                    Some("identical tool output detected 3 consecutive times"),
+                    serde_json::json!({
+                        "iteration": iteration + 1,
+                        "consecutive_identical": consecutive_identical_outputs,
+                    }),
+                );
+                anyhow::bail!(
+                    "Agent loop aborted: identical tool output detected {} consecutive times",
+                    consecutive_identical_outputs
+                );
+            }
+        }
+
        // Add assistant message with tool calls + tool results to history.
        // Native mode: use JSON-structured messages so convert_messages() can
        // reconstruct proper OpenAI-format tool_calls and tool result messages.
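
To make the counting semantics of the hunk above easier to test in isolation, the sketch below pulls the time gate and the identical-output counter into standalone, std-only pieces. The names `loop_detection_active` (as a free function) and `IdenticalOutputDetector` are hypothetical and do not appear in the diff. The logic follows the diff as shown: ignored tools are skipped, an empty relevant output leaves the counters untouched, and the first occurrence of an output counts as zero repeats.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashSet;
use std::hash::{Hash, Hasher};
use std::time::Duration;

/// Mirrors the time gate: detection stays off when no threshold is configured.
fn loop_detection_active(elapsed: Duration, min_elapsed_secs: Option<u64>) -> bool {
    match min_elapsed_secs {
        Some(min_secs) => elapsed >= Duration::from_secs(min_secs),
        None => false,
    }
}

/// Hypothetical wrapper around the two counters used in the diff.
struct IdenticalOutputDetector {
    ignore_tools: HashSet<String>,
    last_hash: Option<u64>,
    repeats: usize,
}

impl IdenticalOutputDetector {
    fn new(ignore_tools: &[&str]) -> Self {
        Self {
            ignore_tools: ignore_tools.iter().map(|name| name.to_string()).collect(),
            last_hash: None,
            repeats: 0,
        }
    }

    /// Feeds one iteration's `(tool_name, output)` pairs and returns `true`
    /// when the loop should abort.
    fn observe(&mut self, results: &[(&str, &str)]) -> bool {
        // Only outputs from non-ignored tools contribute to the hash.
        let mut relevant = String::new();
        for (tool, output) in results {
            if !self.ignore_tools.contains(*tool) {
                relevant.push_str(output);
            }
        }
        if relevant.is_empty() {
            return false; // nothing to compare; counters are left untouched
        }
        let mut hasher = DefaultHasher::new();
        relevant.hash(&mut hasher);
        let current = hasher.finish();
        if self.last_hash == Some(current) {
            self.repeats += 1;
        } else {
            self.repeats = 0;
            self.last_hash = Some(current);
        }
        // Three repeats of the previous output means four identical outputs in a row.
        self.repeats >= 3
    }
}
```

A caller would invoke `observe` only while `loop_detection_active` returns `true`. Fed the same non-ignored output on every iteration, the detector trips on the fourth call. An iteration that only uses ignored tools, such as `browser_screenshot`, never trips it.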
@@ -3840,6 +3933,7 @@ pub async fn run(
                &config.agent.tool_call_dedup_exempt,
                activated_handle.as_ref(),
                Some(model_switch_callback.clone()),
+                &config.pacing,
            )
            .await
            {
@@ -4067,6 +4161,7 @@ pub async fn run(
                    &config.agent.tool_call_dedup_exempt,
                    activated_handle.as_ref(),
                    Some(model_switch_callback.clone()),
+                    &config.pacing,
                )
                .await
                {
@@ -4964,6 +5059,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect_err("provider without vision support should fail");
@@ -5014,6 +5110,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect_err("oversized payload must fail");
@@ -5058,6 +5155,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("valid multimodal payload should pass");
@@ -5188,6 +5286,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("parallel execution should complete");
@@ -5258,6 +5357,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("cron_add delivery defaults should be injected");
@@ -5320,6 +5420,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("explicit delivery mode should be preserved");
@@ -5377,6 +5478,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("loop should finish after deduplicating repeated calls");
@@ -5446,6 +5548,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("non-interactive shell should succeed for low-risk command");
@@ -5506,6 +5609,7 @@ mod tests {
            &exempt,
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("loop should finish with exempt tool executing twice");
@@ -5586,6 +5690,7 @@ mod tests {
            &exempt,
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("loop should complete");
@@ -5643,6 +5748,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("native fallback id flow should complete");
@@ -5724,6 +5830,7 @@ mod tests {
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("native tool-call text should be relayed through on_delta");
@@ -7709,6 +7816,7 @@ Let me check the result."#;
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("tool loop should complete");
@@ -7856,6 +7964,7 @@ Let me check the result."#;
                    &[],
                    None,
                    None,
+                    &crate::config::PacingConfig::default(),
                ),
            )
            .await
@@ -7934,6 +8043,7 @@ Let me check the result."#;
                    &[],
                    None,
                    None,
+                    &crate::config::PacingConfig::default(),
                ),
            )
            .await
@@ -7988,6 +8098,7 @@ Let me check the result."#;
            &[],
            None,
            None,
+            &crate::config::PacingConfig::default(),
        )
        .await
        .expect("should succeed without cost scope");
@@ -222,9 +222,21 @@ fn effective_channel_message_timeout_secs(configured: u64) -> u64 {
 fn channel_message_timeout_budget_secs(
    message_timeout_secs: u64,
    max_tool_iterations: usize,
+) -> u64 {
+    channel_message_timeout_budget_secs_with_cap(
+        message_timeout_secs,
+        max_tool_iterations,
+        CHANNEL_MESSAGE_TIMEOUT_SCALE_CAP,
+    )
+}
+
+fn channel_message_timeout_budget_secs_with_cap(
+    message_timeout_secs: u64,
+    max_tool_iterations: usize,
+    scale_cap: u64,
 ) -> u64 {
    let iterations = max_tool_iterations.max(1) as u64;
-    let scale = iterations.min(CHANNEL_MESSAGE_TIMEOUT_SCALE_CAP);
+    let scale = iterations.min(scale_cap);
    message_timeout_secs.saturating_mul(scale)
 }

@@ -362,6 +374,7 @@ struct ChannelRuntimeContext {
    approval_manager: Arc<ApprovalManager>,
    activated_tools: Option<std::sync::Arc<std::sync::Mutex<crate::tools::ActivatedToolSet>>>,
    cost_tracking: Option<ChannelCostTrackingState>,
+    pacing: crate::config::PacingConfig,
 }

 #[derive(Clone)]
@@ -2402,8 +2415,15 @@ async fn process_channel_message(
    }

    let model_switch_callback = get_model_switch_state();
-    let timeout_budget_secs =
-        channel_message_timeout_budget_secs(ctx.message_timeout_secs, ctx.max_tool_iterations);
+    let scale_cap = ctx
+        .pacing
+        .message_timeout_scale_max
+        .unwrap_or(CHANNEL_MESSAGE_TIMEOUT_SCALE_CAP);
+    let timeout_budget_secs = channel_message_timeout_budget_secs_with_cap(
+        ctx.message_timeout_secs,
+        ctx.max_tool_iterations,
+        scale_cap,
+    );
    let cost_tracking_context = ctx.cost_tracking.clone().map(|state| {
        crate::agent::loop_::ToolLoopCostTrackingContext::new(state.tracker, state.prices)
    });
@@ -2445,6 +2465,7 @@ async fn process_channel_message(
                    ctx.tool_call_dedup_exempt.as_ref(),
                    ctx.activated_tools.as_ref(),
                    Some(model_switch_callback.clone()),
+                    &ctx.pacing,
                ),
                ),
            ) => LlmExecutionResult::Completed(result),
@@ -4641,6 +4662,7 @@ pub async fn start_channels(config: Config) -> Result<()> {
            tracker,
            prices: Arc::new(config.cost.prices.clone()),
        }),
+        pacing: config.pacing.clone(),
    });

    // Hydrate in-memory conversation histories from persisted JSONL session files.
@@ -4737,6 +4759,49 @@ mod tests {
        );
    }

+    #[test]
+    fn channel_message_timeout_budget_with_custom_scale_cap() {
+        assert_eq!(
+            channel_message_timeout_budget_secs_with_cap(300, 8, 8),
+            300 * 8
+        );
+        assert_eq!(
+            channel_message_timeout_budget_secs_with_cap(300, 20, 8),
+            300 * 8
+        );
+        assert_eq!(
+            channel_message_timeout_budget_secs_with_cap(300, 10, 1),
+            300
+        );
+    }
+
+    #[test]
+    fn pacing_config_defaults_preserve_existing_behavior() {
+        let pacing = crate::config::PacingConfig::default();
+        assert!(pacing.step_timeout_secs.is_none());
+        assert!(pacing.loop_detection_min_elapsed_secs.is_none());
+        assert!(pacing.loop_ignore_tools.is_empty());
+        assert!(pacing.message_timeout_scale_max.is_none());
+    }
+
+    #[test]
+    fn pacing_message_timeout_scale_max_overrides_default_cap() {
+        // Custom cap of 8 scales budget proportionally
+        assert_eq!(
+            channel_message_timeout_budget_secs_with_cap(300, 10, 8),
+            300 * 8
+        );
+        // Default cap produces the standard behavior
+        assert_eq!(
+            channel_message_timeout_budget_secs_with_cap(
+                300,
+                10,
+                CHANNEL_MESSAGE_TIMEOUT_SCALE_CAP
+            ),
+            300 * CHANNEL_MESSAGE_TIMEOUT_SCALE_CAP
+        );
+    }
+
    #[test]
    fn context_window_overflow_error_detector_matches_known_messages() {
        let overflow_err = anyhow::anyhow!(
@@ -4941,6 +5006,7 @@ mod tests {
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        };

        assert!(compact_sender_history(&ctx, &sender));
@@ -5057,6 +5123,7 @@ mod tests {
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        };

        append_sender_turn(&ctx, &sender, ChatMessage::user("hello"));
@@ -5129,6 +5196,7 @@ mod tests {
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        };

        assert!(rollback_orphan_user_turn(&ctx, &sender, "pending"));
@@ -5220,6 +5288,7 @@ mod tests {
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        };

        assert!(rollback_orphan_user_turn(
@@ -5761,6 +5830,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -5842,6 +5912,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -5937,6 +6008,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -6017,6 +6089,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -6107,6 +6180,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -6218,6 +6292,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -6310,6 +6385,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -6417,6 +6493,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -6509,6 +6586,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -6591,6 +6669,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -6788,6 +6867,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        let (tx, rx) = tokio::sync::mpsc::channel::<traits::ChannelMessage>(4);
@@ -6890,6 +6970,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        let (tx, rx) = tokio::sync::mpsc::channel::<traits::ChannelMessage>(8);
@@ -7007,6 +7088,7 @@ BTC is currently around $65,000 based on latest tool output."#
            activated_tools: None,
            cost_tracking: None,
            query_classification: crate::config::QueryClassificationConfig::default(),
+            pacing: crate::config::PacingConfig::default(),
        });

        let (tx, rx) = tokio::sync::mpsc::channel::<traits::ChannelMessage>(8);
@@ -7121,6 +7203,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        let (tx, rx) = tokio::sync::mpsc::channel::<traits::ChannelMessage>(8);
@@ -7217,6 +7300,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -7297,6 +7381,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -8063,6 +8148,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -8194,6 +8280,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -8365,6 +8452,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -8473,6 +8561,7 @@ BTC is currently around $65,000 based on latest tool output."#
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -9045,6 +9134,7 @@ This is an example JSON object for profile settings."#;
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        // Simulate a photo attachment message with [IMAGE:] marker.
@@ -9132,6 +9222,7 @@ This is an example JSON object for profile settings."#;
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -9294,6 +9385,7 @@ This is an example JSON object for profile settings."#;
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -9405,6 +9497,7 @@ This is an example JSON object for profile settings."#;
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -9508,6 +9601,7 @@ This is an example JSON object for profile settings."#;
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -9631,6 +9725,7 @@ This is an example JSON object for profile settings."#;
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        process_channel_message(
@@ -9892,6 +9987,7 @@ This is an example JSON object for profile settings."#;
            )),
            activated_tools: None,
            cost_tracking: None,
+            pacing: crate::config::PacingConfig::default(),
        });

        let (tx, rx) = tokio::sync::mpsc::channel::<traits::ChannelMessage>(8);
@@ -21,7 +21,7 @@ pub use schema::{
    MatrixConfig, McpConfig, McpServerConfig, McpTransport, MemoryConfig, Microsoft365Config,
    ModelRouteConfig, MultimodalConfig, NextcloudTalkConfig, NodeTransportConfig, NodesConfig,
    NotionConfig, ObservabilityConfig, OpenAiSttConfig, OpenAiTtsConfig, OpenVpnTunnelConfig,
-    OtpConfig, OtpMethod, PeripheralBoardConfig, PeripheralsConfig, PluginsConfig,
+    OtpConfig, OtpMethod, PacingConfig, PeripheralBoardConfig, PeripheralsConfig, PluginsConfig,
    ProjectIntelConfig, ProxyConfig, ProxyScope, QdrantConfig, QueryClassificationConfig,
    ReliabilityConfig, ResourceLimitsConfig, RuntimeConfig, SandboxBackend, SandboxConfig,
    SchedulerConfig, SecretsConfig, SecurityConfig, SecurityOpsConfig, SkillCreationConfig,
@@ -165,6 +165,10 @@ pub struct Config {
    #[serde(default)]
    pub agent: AgentConfig,

+    /// Pacing controls for slow/local LLM workloads (`[pacing]`).
+    #[serde(default)]
+    pub pacing: PacingConfig,
+
    /// Skills loading and community repository behavior (`[skills]`).
    #[serde(default)]
    pub skills: SkillsConfig,
@@ -1277,6 +1281,43 @@ impl Default for AgentConfig {
    }
 }

+// ── Pacing ────────────────────────────────────────────────────────
+
+/// Pacing controls for slow/local LLM workloads (`[pacing]` section).
+///
+/// All fields are optional and default to values that preserve existing
+/// behavior. When set, they extend — not replace — the existing timeout
+/// and loop-detection subsystems.
+#[derive(Debug, Clone, Default, Serialize, Deserialize, JsonSchema)]
+pub struct PacingConfig {
+    /// Per-step timeout in seconds: the maximum time allowed for a single
+    /// LLM inference turn, independent of the total message budget.
+    /// `None` means no per-step timeout (existing behavior).
+    #[serde(default)]
+    pub step_timeout_secs: Option<u64>,
+
+    /// Minimum elapsed seconds before identical-output loop detection
+    /// activates. Until the loop has run this long, the detector does not
+    /// count repeated outputs. `None` disables identical-output detection,
+    /// leaving only the iteration cap (existing behavior).
+    #[serde(default)]
+    pub loop_detection_min_elapsed_secs: Option<u64>,
+
+    /// Tool names excluded from identical-output / alternating-pattern loop
+    /// detection. Useful for browser workflows where `browser_screenshot`
+    /// structurally resembles a loop even when making progress.
+    #[serde(default)]
+    pub loop_ignore_tools: Vec<String>,
+
+    /// Override for the hardcoded timeout scaling cap (default: 4).
+    /// The channel message timeout budget is computed as:
+    ///   `message_timeout_secs * min(max_tool_iterations, message_timeout_scale_max)`
+    /// Raising this value lets long multi-step tasks with slow local models
+    /// receive a proportionally larger budget without inflating the base timeout.
+    #[serde(default)]
+    pub message_timeout_scale_max: Option<u64>,
+}
+
 /// Skills loading configuration (`[skills]` section).
 #[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize, JsonSchema, Default)]
 #[serde(rename_all = "snake_case")]
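The scaling formula in the `message_timeout_scale_max` doc comment can be worked through with a small helper. This is a minimal sketch, not the project's implementation: `message_budget_secs` and `DEFAULT_SCALE_MAX` are hypothetical names, and the fallback of 4 is taken from the "default: 4" note in the comment above.

```rust
/// Hypothetical helper: channel message budget in seconds, following the
/// formula quoted in the `message_timeout_scale_max` doc comment.
fn message_budget_secs(
    message_timeout_secs: u64,
    max_tool_iterations: u64,
    message_timeout_scale_max: Option<u64>,
) -> u64 {
    // Assumed fallback when the key is absent (the previously hardcoded cap).
    const DEFAULT_SCALE_MAX: u64 = 4;

    let cap = message_timeout_scale_max.unwrap_or(DEFAULT_SCALE_MAX);
    let scale = max_tool_iterations.min(cap);

    // Saturate rather than overflow on very large configured values.
    message_timeout_secs.saturating_mul(scale)
}
```

With a 300-second base timeout and 10 tool iterations, the budget under this sketch is 1200 seconds by default and 2400 seconds when `message_timeout_scale_max = 8`.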
@@ -6727,6 +6768,7 @@ impl Default for Config {
            reliability: ReliabilityConfig::default(),
            scheduler: SchedulerConfig::default(),
            agent: AgentConfig::default(),
+            pacing: PacingConfig::default(),
            skills: SkillsConfig::default(),
            model_routes: Vec::new(),
            embedding_routes: Vec::new(),
@@ -9673,6 +9715,7 @@ default_temperature = 0.7
            google_workspace: GoogleWorkspaceConfig::default(),
            proxy: ProxyConfig::default(),
            agent: AgentConfig::default(),
+            pacing: PacingConfig::default(),
            identity: IdentityConfig::default(),
            cost: CostConfig::default(),
            peripherals: PeripheralsConfig::default(),
@@ -9944,6 +9987,47 @@ tool_dispatcher = "xml"
        assert_eq!(parsed.agent.tool_dispatcher, "xml");
    }

+    #[test]
+    async fn pacing_config_defaults_are_all_none_or_empty() {
+        let cfg = PacingConfig::default();
+        assert!(cfg.step_timeout_secs.is_none());
+        assert!(cfg.loop_detection_min_elapsed_secs.is_none());
+        assert!(cfg.loop_ignore_tools.is_empty());
+        assert!(cfg.message_timeout_scale_max.is_none());
+    }
+
+    #[test]
+    async fn pacing_config_deserializes_from_toml() {
+        let raw = r#"
+default_temperature = 0.7
+[pacing]
+step_timeout_secs = 120
+loop_detection_min_elapsed_secs = 60
+loop_ignore_tools = ["browser_screenshot", "browser_navigate"]
+message_timeout_scale_max = 8
+"#;
+        let parsed: Config = toml::from_str(raw).unwrap();
+        assert_eq!(parsed.pacing.step_timeout_secs, Some(120));
+        assert_eq!(parsed.pacing.loop_detection_min_elapsed_secs, Some(60));
+        assert_eq!(
+            parsed.pacing.loop_ignore_tools,
+            vec!["browser_screenshot", "browser_navigate"]
+        );
+        assert_eq!(parsed.pacing.message_timeout_scale_max, Some(8));
+    }
+
+    #[test]
+    async fn pacing_config_absent_preserves_defaults() {
+        let raw = r#"
+default_temperature = 0.7
+"#;
+        let parsed: Config = toml::from_str(raw).unwrap();
+        assert!(parsed.pacing.step_timeout_secs.is_none());
+        assert!(parsed.pacing.loop_detection_min_elapsed_secs.is_none());
+        assert!(parsed.pacing.loop_ignore_tools.is_empty());
+        assert!(parsed.pacing.message_timeout_scale_max.is_none());
+    }
+
    #[tokio::test]
    async fn sync_directory_handles_existing_directory() {
        let dir = std::env::temp_dir().join(format!(
@@ -10012,6 +10096,7 @@ tool_dispatcher = "xml"
            google_workspace: GoogleWorkspaceConfig::default(),
            proxy: ProxyConfig::default(),
            agent: AgentConfig::default(),
+            pacing: PacingConfig::default(),
            identity: IdentityConfig::default(),
            cost: CostConfig::default(),
            peripherals: PeripheralsConfig::default(),
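The tests above only check that the `[pacing]` keys parse. How `loop_detection_min_elapsed_secs` and `loop_ignore_tools` might gate a loop detector can be sketched as follows. This is one reading of the doc comments, not code from this change: `LoopPacing` and `counts_toward_loop_detection` are hypothetical, and the sketch assumes detection is skipped entirely until the grace period has elapsed.

```rust
use std::time::Duration;

/// Illustrative mirror of two `[pacing]` fields; not the project's `PacingConfig`.
struct LoopPacing {
    loop_detection_min_elapsed_secs: Option<u64>,
    loop_ignore_tools: Vec<String>,
}

impl LoopPacing {
    /// Returns true when a tool call should be fed to the loop detector.
    fn counts_toward_loop_detection(&self, elapsed: Duration, tool_name: &str) -> bool {
        // Grace period: when a threshold is set, skip detection until it has passed.
        // `None` keeps detection always active.
        if let Some(min_secs) = self.loop_detection_min_elapsed_secs {
            if elapsed < Duration::from_secs(min_secs) {
                return false;
            }
        }
        // Tools on the ignore list never count toward loop detection.
        !self.loop_ignore_tools.iter().any(|t| t == tool_name)
    }
}
```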
@@ -1438,6 +1438,7 @@ mod tests {
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
+            path_prefix: String::new(),
        }
    }

@@ -348,6 +348,8 @@ pub struct AppState {
    pub shutdown_tx: tokio::sync::watch::Sender<bool>,
    /// Registry of dynamically connected nodes
    pub node_registry: Arc<nodes::NodeRegistry>,
+    /// Path prefix for reverse-proxy deployments (empty string = no prefix)
+    pub path_prefix: String,
    /// Session backend for persisting gateway WS chat sessions
    pub session_backend: Option<Arc<dyn SessionBackend>>,
    /// Device registry for paired device management
@@ -673,6 +675,13 @@ pub async fn run_gateway(host: &str, port: u16, config: Config) -> Result<()> {
        idempotency_max_keys,
    ));

+    // Resolve optional path prefix for reverse-proxy deployments.
+    let path_prefix: Option<&str> = config
+        .gateway
+        .path_prefix
+        .as_deref()
+        .filter(|p| !p.is_empty());
+
    // ── Tunnel ────────────────────────────────────────────────
    let tunnel = crate::tunnel::create_tunnel(&config.tunnel)?;
    let mut tunnel_url: Option<String> = None;
@@ -691,18 +700,19 @@ pub async fn run_gateway(host: &str, port: u16, config: Config) -> Result<()> {
        }
    }

-    println!("🦀 ZeroClaw Gateway listening on http://{display_addr}");
+    let pfx = path_prefix.unwrap_or("");
+    println!("🦀 ZeroClaw Gateway listening on http://{display_addr}{pfx}");
    if let Some(ref url) = tunnel_url {
        println!("  🌐 Public URL: {url}");
    }
-    println!("  🌐 Web Dashboard: http://{display_addr}/");
+    println!("  🌐 Web Dashboard: http://{display_addr}{pfx}/");
    if let Some(code) = pairing.pairing_code() {
        println!();
        println!("  🔐 PAIRING REQUIRED — use this one-time code:");
        println!("     ┌──────────────┐");
        println!("     │  {code}  │");
        println!("     └──────────────┘");
-        println!();
+        println!("     Send: POST {pfx}/pair with header X-Pairing-Code: {code}");
    } else if pairing.require_pairing() {
        println!("  🔒 Pairing: ACTIVE (bearer token required)");
        println!("     To pair a new device: zeroclaw gateway get-paircode --new");
@@ -711,29 +721,29 @@ pub async fn run_gateway(host: &str, port: u16, config: Config) -> Result<()> {
        println!("  ⚠️  Pairing: DISABLED (all requests accepted)");
        println!();
    }
-    println!("  POST /pair      — pair a new client (X-Pairing-Code header)");
-    println!("  POST /webhook   — {{\"message\": \"your prompt\"}}");
+    println!("  POST {pfx}/pair      — pair a new client (X-Pairing-Code header)");
+    println!("  POST {pfx}/webhook   — {{\"message\": \"your prompt\"}}");
    if whatsapp_channel.is_some() {
-        println!("  GET  /whatsapp  — Meta webhook verification");
-        println!("  POST /whatsapp  — WhatsApp message webhook");
+        println!("  GET  {pfx}/whatsapp  — Meta webhook verification");
+        println!("  POST {pfx}/whatsapp  — WhatsApp message webhook");
    }
    if linq_channel.is_some() {
-        println!("  POST /linq      — Linq message webhook (iMessage/RCS/SMS)");
+        println!("  POST {pfx}/linq      — Linq message webhook (iMessage/RCS/SMS)");
    }
    if wati_channel.is_some() {
-        println!("  GET  /wati      — WATI webhook verification");
-        println!("  POST /wati      — WATI message webhook");
+        println!("  GET  {pfx}/wati      — WATI webhook verification");
+        println!("  POST {pfx}/wati      — WATI message webhook");
    }
    if nextcloud_talk_channel.is_some() {
-        println!("  POST /nextcloud-talk — Nextcloud Talk bot webhook");
+        println!("  POST {pfx}/nextcloud-talk — Nextcloud Talk bot webhook");
    }
-    println!("  GET  /api/*     — REST API (bearer token required)");
-    println!("  GET  /ws/chat   — WebSocket agent chat");
+    println!("  GET  {pfx}/api/*     — REST API (bearer token required)");
+    println!("  GET  {pfx}/ws/chat   — WebSocket agent chat");
    if config.nodes.enabled {
-        println!("  GET  /ws/nodes  — WebSocket node discovery");
+        println!("  GET  {pfx}/ws/nodes  — WebSocket node discovery");
    }
-    println!("  GET  /health    — health check");
-    println!("  GET  /metrics   — Prometheus metrics");
+    println!("  GET  {pfx}/health    — health check");
+    println!("  GET  {pfx}/metrics   — Prometheus metrics");
    println!("  Press Ctrl+C to stop.\n");

    crate::health::mark_component_ok("gateway");
@@ -799,6 +809,7 @@ pub async fn run_gateway(host: &str, port: u16, config: Config) -> Result<()> {
        session_backend,
        device_registry,
        pending_pairings,
+        path_prefix: path_prefix.unwrap_or("").to_string(),
    };

    // Config PUT needs larger body limit (1MB)
@@ -807,7 +818,7 @@ pub async fn run_gateway(host: &str, port: u16, config: Config) -> Result<()> {
        .layer(RequestBodyLimitLayer::new(1_048_576));

    // Build router with middleware
-    let app = Router::new()
+    let inner = Router::new()
        // ── Admin routes (for CLI management) ──
        .route("/admin/shutdown", post(handle_admin_shutdown))
        .route("/admin/paircode", get(handle_admin_paircode))
@@ -867,12 +878,12 @@ pub async fn run_gateway(host: &str, port: u16, config: Config) -> Result<()> {

    // ── Plugin management API (requires plugins-wasm feature) ──
    #[cfg(feature = "plugins-wasm")]
-    let app = app.route(
+    let inner = inner.route(
        "/api/plugins",
        get(api_plugins::plugin_routes::list_plugins),
    );

-    let app = app
+    let inner = inner
        // ── SSE event stream ──
        .route("/api/events", get(sse::handle_sse_events))
        // ── WebSocket agent chat ──
@@ -883,14 +894,27 @@ pub async fn run_gateway(host: &str, port: u16, config: Config) -> Result<()> {
        .route("/_app/{*path}", get(static_files::handle_static))
        // ── Config PUT with larger body limit ──
        .merge(config_put_router)
+        // ── SPA fallback: non-API GET requests serve index.html ──
+        .fallback(get(static_files::handle_spa_fallback))
        .with_state(state)
        .layer(RequestBodyLimitLayer::new(MAX_BODY_SIZE))
        .layer(TimeoutLayer::with_status_code(
            StatusCode::REQUEST_TIMEOUT,
            Duration::from_secs(gateway_request_timeout_secs()),
-        ))
-        // ── SPA fallback: non-API GET requests serve index.html ──
-        .fallback(get(static_files::handle_spa_fallback));
+        ));
+
+    // Nest under the path prefix when configured (axum strips the prefix before routing).
+    // nest() at "/prefix" handles both "/prefix" and "/prefix/*" but not "/prefix/"
+    // with a trailing slash, so we add an explicit redirect route for that case.
+    let app = if let Some(prefix) = path_prefix {
+        let redirect_target = prefix.to_string();
+        Router::new().nest(prefix, inner).route(
+            &format!("{prefix}/"),
+            get(|| async move { axum::response::Redirect::permanent(&redirect_target) }),
+        )
+    } else {
+        inner
+    };

    // Run the server with graceful shutdown
    axum::serve(
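The path mapping described in the nesting comment above can be modelled without axum. The sketch below is a standard-library illustration of the behaviour that comment describes, not axum's routing code: `Routed` and `route_under_prefix` are hypothetical names.

```rust
/// Outcome of matching a request path against an optional prefix (illustrative).
#[derive(Debug, PartialEq)]
enum Routed {
    /// Path handed to the inner router, with the prefix stripped.
    Inner(String),
    /// Permanent redirect target for the trailing-slash case.
    Redirect(String),
    /// Path is outside the prefix.
    NotFound,
}

fn route_under_prefix(prefix: Option<&str>, path: &str) -> Routed {
    // No prefix (or an empty one): the inner router sees the path unchanged.
    let Some(prefix) = prefix.filter(|p| !p.is_empty()) else {
        return Routed::Inner(path.to_string());
    };
    match path.strip_prefix(prefix) {
        // "/prefix" maps to the inner root.
        Some("") => Routed::Inner("/".to_string()),
        // "/prefix/" is redirected to "/prefix".
        Some("/") => Routed::Redirect(prefix.to_string()),
        // "/prefix/rest" maps to "/rest".
        Some(rest) if rest.starts_with('/') => Routed::Inner(rest.to_string()),
        // "/prefixother" and unrelated paths do not match.
        _ => Routed::NotFound,
    }
}
```

Under this model, a gateway configured with `/zeroclaw` serves its health check at `/zeroclaw/health`, while a bare `/health` no longer reaches the inner router.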
@@ -1982,6 +2006,7 @@ mod tests {
            event_tx: tokio::sync::broadcast::channel(16).0,
            shutdown_tx: tokio::sync::watch::channel(false).0,
            node_registry: Arc::new(nodes::NodeRegistry::new(16)),
+            path_prefix: String::new(),
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
@@ -2037,6 +2062,7 @@ mod tests {
            event_tx: tokio::sync::broadcast::channel(16).0,
            shutdown_tx: tokio::sync::watch::channel(false).0,
            node_registry: Arc::new(nodes::NodeRegistry::new(16)),
+            path_prefix: String::new(),
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
@@ -2421,6 +2447,7 @@ mod tests {
            event_tx: tokio::sync::broadcast::channel(16).0,
            shutdown_tx: tokio::sync::watch::channel(false).0,
            node_registry: Arc::new(nodes::NodeRegistry::new(16)),
+            path_prefix: String::new(),
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
@@ -2490,6 +2517,7 @@ mod tests {
            event_tx: tokio::sync::broadcast::channel(16).0,
            shutdown_tx: tokio::sync::watch::channel(false).0,
            node_registry: Arc::new(nodes::NodeRegistry::new(16)),
+            path_prefix: String::new(),
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
@@ -2571,6 +2599,7 @@ mod tests {
            event_tx: tokio::sync::broadcast::channel(16).0,
            shutdown_tx: tokio::sync::watch::channel(false).0,
            node_registry: Arc::new(nodes::NodeRegistry::new(16)),
+            path_prefix: String::new(),
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
@@ -2624,6 +2653,7 @@ mod tests {
            event_tx: tokio::sync::broadcast::channel(16).0,
            shutdown_tx: tokio::sync::watch::channel(false).0,
            node_registry: Arc::new(nodes::NodeRegistry::new(16)),
+            path_prefix: String::new(),
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
@@ -2682,6 +2712,7 @@ mod tests {
            event_tx: tokio::sync::broadcast::channel(16).0,
            shutdown_tx: tokio::sync::watch::channel(false).0,
            node_registry: Arc::new(nodes::NodeRegistry::new(16)),
+            path_prefix: String::new(),
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
@@ -2745,6 +2776,7 @@ mod tests {
            event_tx: tokio::sync::broadcast::channel(16).0,
            shutdown_tx: tokio::sync::watch::channel(false).0,
            node_registry: Arc::new(nodes::NodeRegistry::new(16)),
+            path_prefix: String::new(),
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
@@ -2804,6 +2836,7 @@ mod tests {
            event_tx: tokio::sync::broadcast::channel(16).0,
            shutdown_tx: tokio::sync::watch::channel(false).0,
            node_registry: Arc::new(nodes::NodeRegistry::new(16)),
+            path_prefix: String::new(),
            session_backend: None,
            device_registry: None,
            pending_pairings: None,
@@ -3,11 +3,14 @@
 //! Uses `rust-embed` to bundle the `web/dist/` directory into the binary at compile time.

 use axum::{
+    extract::State,
    http::{header, StatusCode, Uri},
    response::{IntoResponse, Response},
 };
 use rust_embed::Embed;

+use super::AppState;
+
 #[derive(Embed)]
 #[folder = "web/dist/"]
 struct WebAssets;
@@ -23,16 +26,41 @@ pub async fn handle_static(uri: Uri) -> Response {
    serve_embedded_file(path)
 }

-/// SPA fallback: serve index.html for any non-API, non-static GET request
-pub async fn handle_spa_fallback() -> Response {
-    if WebAssets::get("index.html").is_none() {
+/// SPA fallback: serve index.html for any non-API, non-static GET request.
+/// Injects `window.__ZEROCLAW_BASE__` so the frontend knows the path prefix.
+pub async fn handle_spa_fallback(State(state): State<AppState>) -> Response {
+    let Some(content) = WebAssets::get("index.html") else {
        return (
            StatusCode::SERVICE_UNAVAILABLE,
            "Web dashboard not available. Build it with: cd web && npm ci && npm run build",
        )
            .into_response();
-    }
-    serve_embedded_file("index.html")
+    };
+
+    let html = String::from_utf8_lossy(&content.data);
+
+    // Inject path prefix for the SPA and rewrite asset paths in the HTML
+    let html = if state.path_prefix.is_empty() {
+        html.into_owned()
+    } else {
+        let pfx = &state.path_prefix;
+        // JSON-encode the prefix to safely embed in a <script> block
+        let json_pfx = serde_json::to_string(pfx).unwrap_or_else(|_| "\"\"".to_string());
+        let script = format!("<script>window.__ZEROCLAW_BASE__={json_pfx};</script>");
+        // Rewrite absolute /_app/ references so the browser requests {prefix}/_app/...
+        html.replace("/_app/", &format!("{pfx}/_app/"))
+            .replace("<head>", &format!("<head>{script}"))
+    };
+
+    (
+        StatusCode::OK,
+        [
+            (header::CONTENT_TYPE, "text/html; charset=utf-8".to_string()),
+            (header::CACHE_CONTROL, "no-cache".to_string()),
+        ],
+        html,
+    )
+        .into_response()
 }

 fn serve_embedded_file(path: &str) -> Response {
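The HTML rewrite in `handle_spa_fallback` can be isolated as a pure function for illustration. This is a simplified sketch: `inject_base_path` is a hypothetical name, and plain quoting stands in for the `serde_json::to_string` call in the hunk, on the assumption that the prefix has already passed validation and contains no characters that need escaping.

```rust
/// Hypothetical stand-alone version of the index.html rewrite.
fn inject_base_path(html: &str, prefix: &str) -> String {
    if prefix.is_empty() {
        // Served at root: the document is returned unchanged.
        return html.to_string();
    }
    // Assumption: `prefix` is validated, so quoting it directly is safe here.
    let script = format!("<script>window.__ZEROCLAW_BASE__=\"{prefix}\";</script>");

    // Rewrite asset URLs first, then insert the script right after <head>.
    html.replace("/_app/", &format!("{prefix}/_app/"))
        .replace("<head>", &format!("<head>{script}"))
}
```

Doing the asset rewrite before the script insertion keeps the injected snippet itself out of the `/_app/` replacement.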
@@ -154,6 +154,7 @@ pub async fn run_wizard(force: bool) -> Result<Config> {
        reliability: crate::config::ReliabilityConfig::default(),
        scheduler: crate::config::schema::SchedulerConfig::default(),
        agent: crate::config::schema::AgentConfig::default(),
+        pacing: crate::config::PacingConfig::default(),
        skills: crate::config::SkillsConfig::default(),
        model_routes: Vec::new(),
        embedding_routes: Vec::new(),
@@ -576,6 +577,7 @@ async fn run_quick_setup_with_home(
        reliability: crate::config::ReliabilityConfig::default(),
        scheduler: crate::config::schema::SchedulerConfig::default(),
        agent: crate::config::schema::AgentConfig::default(),
+        pacing: crate::config::PacingConfig::default(),
        skills: crate::config::SkillsConfig::default(),
        model_routes: Vec::new(),
        embedding_routes: Vec::new(),
@@ -530,6 +530,7 @@ impl DelegateTool {
                &[],
                None,
                None,
+                &crate::config::PacingConfig::default(),
            ),
        )
        .await;
@@ -100,6 +100,10 @@ fn gateway_config_defaults_are_secure() {
        !gw.trust_forwarded_headers,
        "forwarded headers should be untrusted by default"
    );
+    assert!(
+        gw.path_prefix.is_none(),
+        "path_prefix should default to None"
+    );
 }

 #[test]
@@ -124,6 +128,7 @@ fn gateway_config_toml_roundtrip() {
        host: "0.0.0.0".into(),
        require_pairing: false,
        pair_rate_limit_per_minute: 5,
+        path_prefix: Some("/zeroclaw".into()),
        ..Default::default()
    };

@@ -134,6 +139,7 @@ fn gateway_config_toml_roundtrip() {
    assert_eq!(parsed.host, "0.0.0.0");
    assert!(!parsed.require_pairing);
    assert_eq!(parsed.pair_rate_limit_per_minute, 5);
+    assert_eq!(parsed.path_prefix.as_deref(), Some("/zeroclaw"));
 }

 #[test]
@@ -163,6 +169,93 @@ port = 9090
    assert_eq!(parsed.gateway.pair_rate_limit_per_minute, 10);
 }

+// ─────────────────────────────────────────────────────────────────────────────
+// GatewayConfig path_prefix validation
+// ─────────────────────────────────────────────────────────────────────────────
+
+#[test]
+fn gateway_path_prefix_rejects_missing_leading_slash() {
+    let mut config = Config::default();
+    config.gateway.path_prefix = Some("zeroclaw".into());
+    let err = config.validate().unwrap_err();
+    assert!(
+        err.to_string().contains("must start with '/'"),
+        "expected leading-slash error, got: {err}"
+    );
+}
+
+#[test]
+fn gateway_path_prefix_rejects_trailing_slash() {
+    let mut config = Config::default();
+    config.gateway.path_prefix = Some("/zeroclaw/".into());
+    let err = config.validate().unwrap_err();
+    assert!(
+        err.to_string().contains("must not end with '/'"),
+        "expected trailing-slash error, got: {err}"
+    );
+}
+
+#[test]
+fn gateway_path_prefix_rejects_bare_slash() {
+    let mut config = Config::default();
+    config.gateway.path_prefix = Some("/".into());
+    let err = config.validate().unwrap_err();
+    assert!(
+        err.to_string().contains("must not end with '/'"),
+        "expected bare-slash error, got: {err}"
+    );
+}
+
+#[test]
+fn gateway_path_prefix_accepts_valid_prefixes() {
+    for prefix in ["/zeroclaw", "/apps/zeroclaw", "/api/hassio_ingress/abc123"] {
+        let mut config = Config::default();
+        config.gateway.path_prefix = Some(prefix.into());
+        config
+            .validate()
+            .unwrap_or_else(|e| panic!("prefix {prefix:?} should be valid, got: {e}"));
+    }
+}
+
+#[test]
+fn gateway_path_prefix_rejects_unsafe_characters() {
+    for prefix in [
+        "/zero claw",
+        "/zero<claw",
+        "/zero>claw",
+        "/zero\"claw",
+        "/zero?query",
+        "/zero#frag",
+    ] {
+        let mut config = Config::default();
+        config.gateway.path_prefix = Some(prefix.into());
+        let err = config.validate().unwrap_err();
+        assert!(
+            err.to_string().contains("invalid character"),
+            "prefix {prefix:?} should be rejected, got: {err}"
+        );
+    }
+    // Leading/trailing whitespace is rejected by the starts_with('/') or
+    // invalid-character check — either way it must not pass validation.
+    for prefix in [" /zeroclaw ", " /zeroclaw"] {
+        let mut config = Config::default();
+        config.gateway.path_prefix = Some(prefix.into());
+        assert!(
+            config.validate().is_err(),
+            "whitespace-padded prefix {prefix:?} should be rejected"
+        );
+    }
+}
+
+#[test]
+fn gateway_path_prefix_accepts_none() {
+    let config = Config::default();
+    assert!(config.gateway.path_prefix.is_none());
+    config
+        .validate()
+        .expect("absent path_prefix should be valid");
+}
+
 // ─────────────────────────────────────────────────────────────────────────────
 // SecurityConfig boundary tests
 // ─────────────────────────────────────────────────────────────────────────────
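The validation rules exercised by the `path_prefix` tests above can be summarised in a small function. This is a sketch consistent with those tests, not the `Config::validate` implementation, which is not part of this diff: `validate_path_prefix` and `is_allowed_prefix_char` are hypothetical, and the allowed character set is an assumption chosen so that the accepted and rejected examples from the tests behave as asserted.

```rust
/// Hypothetical allow-list; the real rule set is not shown in this diff.
fn is_allowed_prefix_char(c: char) -> bool {
    c.is_ascii_alphanumeric() || matches!(c, '/' | '-' | '_' | '.' | '~')
}

/// Hypothetical validator mirroring the error messages the tests look for.
fn validate_path_prefix(prefix: &str) -> Result<(), String> {
    if !prefix.starts_with('/') {
        return Err(format!("gateway.path_prefix must start with '/': {prefix:?}"));
    }
    // A bare "/" is caught here as well, matching the bare-slash test.
    if prefix.ends_with('/') {
        return Err(format!("gateway.path_prefix must not end with '/': {prefix:?}"));
    }
    if let Some(bad) = prefix.chars().find(|c| !is_allowed_prefix_char(*c)) {
        return Err(format!("gateway.path_prefix contains invalid character {bad:?}"));
    }
    Ok(())
}
```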
@@ -16,6 +16,7 @@ import Pairing from './pages/Pairing';
 import { AuthProvider, useAuth } from './hooks/useAuth';
 import { DraftContext, useDraftStore } from './hooks/useDraft';
 import { setLocale, type Locale } from './lib/i18n';
+import { basePath } from './lib/basePath';
 import { getAdminPairCode } from './lib/api';

 // Locale context
@@ -131,7 +132,7 @@ function PairingDialog({ onPair }: { onPair: (code: string) => Promise<void> })

        <div className="text-center mb-8">
          <img
-            src="/_app/zeroclaw-trans.png"
+            src={`${basePath}/_app/zeroclaw-trans.png`}
            alt="ZeroClaw"
            className="h-20 w-20 rounded-2xl object-cover mx-auto mb-4 animate-float"
            onError={(e) => { e.currentTarget.style.display = 'none'; }}
@@ -1,4 +1,5 @@
 import { NavLink } from 'react-router-dom';
+import { basePath } from '../../lib/basePath';
 import {
  LayoutDashboard,
  MessageSquare,
@@ -34,7 +35,7 @@ export default function Sidebar() {
        <div className="relative shrink-0">
          <div className="absolute -inset-1.5 rounded-xl" style={{ background: 'linear-gradient(135deg, rgba(var(--pc-accent-rgb), 0.15), rgba(var(--pc-accent-rgb), 0.05))' }} />
          <img
-            src="/_app/zeroclaw-trans.png"
+            src={`${basePath}/_app/zeroclaw-trans.png`}
            alt="ZeroClaw"
            className="relative h-9 w-9 rounded-xl object-cover"
            onError={(e) => {
@@ -11,6 +11,7 @@ import type {
  HealthSnapshot,
 } from '../types/api';
 import { clearToken, getToken, setToken } from './auth';
+import { basePath } from './basePath';

 // ---------------------------------------------------------------------------
 // Base fetch wrapper
@@ -42,7 +43,7 @@ export async function apiFetch<T = unknown>(
    headers.set('Content-Type', 'application/json');
  }

-  const response = await fetch(path, { ...options, headers });
+  const response = await fetch(`${basePath}${path}`, { ...options, headers });

  if (response.status === 401) {
    clearToken();
@@ -78,7 +79,7 @@ function unwrapField<T>(value: T | Record<string, T>, key: string): T {
 // ---------------------------------------------------------------------------

 export async function pair(code: string): Promise<{ token: string }> {
-  const response = await fetch('/pair', {
+  const response = await fetch(`${basePath}/pair`, {
    method: 'POST',
    headers: { 'X-Pairing-Code': code },
  });
@@ -106,7 +107,7 @@ export async function getAdminPairCode(): Promise<{ pairing_code: string | null;
 // ---------------------------------------------------------------------------

 export async function getPublicHealth(): Promise<{ require_pairing: boolean; paired: boolean }> {
-  const response = await fetch('/health');
+  const response = await fetch(`${basePath}/health`);
  if (!response.ok) {
    throw new Error(`Health check failed (${response.status})`);
  }
@@ -0,0 +1,11 @@
+// Runtime base path injected by the Rust gateway into index.html.
+// Allows the SPA to work under a reverse-proxy path prefix.
+
+declare global {
+  interface Window {
+    __ZEROCLAW_BASE__?: string;
+  }
+}
+
+/** Gateway path prefix (e.g. "/zeroclaw"), or empty string when served at root. */
+export const basePath: string = (window.__ZEROCLAW_BASE__ ?? '').replace(/\/+$/, '');
@@ -1,5 +1,6 @@
 import type { SSEEvent } from '../types/api';
 import { getToken } from './auth';
+import { basePath } from './basePath';

 export type SSEEventHandler = (event: SSEEvent) => void;
 export type SSEErrorHandler = (error: Event | Error) => void;
@@ -41,7 +42,7 @@ export class SSEClient {
  private readonly autoReconnect: boolean;

  constructor(options: SSEClientOptions = {}) {
-    this.path = options.path ?? '/api/events';
+    this.path = options.path ?? `${basePath}/api/events`;
    this.reconnectDelay = options.reconnectDelay ?? DEFAULT_RECONNECT_DELAY;
    this.maxReconnectDelay = options.maxReconnectDelay ?? MAX_RECONNECT_DELAY;
    this.autoReconnect = options.autoReconnect ?? true;
@@ -1,5 +1,6 @@
 import type { WsMessage } from '../types/api';
 import { getToken } from './auth';
+import { basePath } from './basePath';
 import { generateUUID } from './uuid';

 export type WsMessageHandler = (msg: WsMessage) => void;
@@ -69,7 +70,7 @@ export class WebSocketClient {
    const params = new URLSearchParams();
    if (token) params.set('token', token);
    params.set('session_id', sessionId);
-    const url = `${this.baseUrl}/ws/chat?${params.toString()}`;
+    const url = `${this.baseUrl}${basePath}/ws/chat?${params.toString()}`;

    const protocols: string[] = ['zeroclaw.v1'];
    if (token) protocols.push(`bearer.${token}`);
@@ -2,12 +2,13 @@ import React from 'react';
 import ReactDOM from 'react-dom/client';
 import { BrowserRouter } from 'react-router-dom';
 import App from './App';
+import { basePath } from './lib/basePath';
 import './index.css';

 ReactDOM.createRoot(document.getElementById('root')!).render(
  <React.StrictMode>
-    {/* Vite base '/_app/' scopes static asset URLs only; app routes stay rooted at '/' for SPA fallback. */}
-    <BrowserRouter basename="/">
+    {/* basePath is injected by the Rust gateway at serve time for reverse-proxy prefix support. */}
+    <BrowserRouter basename={basePath || '/'}>
      <App />
    </BrowserRouter>
  </React.StrictMode>
Author	SHA1	Message	Date
SimianAstronaut7	87b5bca449	feat(config): add configurable pacing controls for slow/local LLM workloads (#3343)	2026-03-21 08:54:08 -04:00

    * feat(config): add configurable pacing controls for slow/local LLM workloads (#2963)

      Add a new `[pacing]` config section with four opt-in parameters that let
      users tune timeout and loop-detection behavior for local LLMs (Ollama,
      llama.cpp, vLLM) without disabling safety features entirely:

      - `step_timeout_secs`: per-step LLM inference timeout independent of the
        overall message budget, catching hung model responses early.
      - `loop_detection_min_elapsed_secs`: time-gated loop detection that only
        activates after a configurable grace period, avoiding false positives
        on long-running browser/research workflows.
      - `loop_ignore_tools`: per-tool loop-detection exclusions so tools like
        `browser_screenshot` that structurally resemble loops are not counted
        toward identical-output detection.
      - `message_timeout_scale_max`: overrides the hardcoded 4x ceiling in the
        channel message timeout scaling formula.

      All parameters are strictly optional with no effect when absent,
      preserving full backwards compatibility.

      Closes #2963

      Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

    * fix(config): add missing pacing fields in tests and call sites

    * fix(config): add pacing arg to remaining cost-tracking test call sites

    ---------

    Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
    Co-authored-by: argenis de la rosa <theonlyhennygod@gmail.com>

Argenis	be40c0c5a5	Merge pull request #4145 from zeroclaw-labs/feat/gateway-path-prefix	2026-03-21 08:48:56 -04:00

    feat(gateway): add path_prefix for reverse-proxy deployments

argenis de la rosa	6527871928	fix: add path_prefix to test AppState in gateway/api.rs	2026-03-21 08:14:28 -04:00
argenis de la rosa	0bda80de9c	feat(gateway): add path_prefix for reverse-proxy deployments	2026-03-21 08:14:28 -04:00

    Adopted from #3709 by @slayer with minor cleanup.

    Supersedes #3709