zeroclaw/src/security/mod.rs
Erica Stith 63f485e56a
feat(security): Add prompt injection defense and leak detection (#1433)
Contributed from RustyClaw (MIT licensed).

## PromptGuard (src/security/prompt_guard.rs)

Detects and blocks/warns about prompt injection attacks:
- System prompt override attempts ("ignore previous instructions")
- Role confusion attacks ("you are now...", "act as...")
- Tool call JSON injection
- Secret extraction attempts
- Command injection patterns in tool arguments
- Jailbreak attempts (DAN mode, developer mode, etc.)

Features:
- Configurable sensitivity (0.0-1.0)
- Configurable action (Warn/Block/Sanitize)
- Pattern-based detection with regex
- Normalized scoring across categories

## LeakDetector (src/security/leak_detector.rs)

Prevents credential exfiltration in outbound content:
- API key patterns (Stripe, OpenAI, Anthropic, Google, GitHub)
- AWS credentials (Access Key ID, Secret Access Key)
- Generic secrets (passwords, tokens in config)
- Private keys (RSA, EC, OpenSSH PEM blocks)
- JWT tokens
- Database connection URLs (PostgreSQL, MySQL, MongoDB, Redis)

Features:
- Automatic redaction of detected secrets
- Configurable sensitivity
- Returns both detection info and redacted content

## Integration

Both modules are exported from `security` module:
```rust
use zeroclaw::security::{PromptGuard, GuardResult, LeakDetector, LeakResult};
```

## Attribution

RustyClaw: https://github.com/rexlunae/RustyClaw
License: MIT
2026-02-23 07:48:18 -05:00

104 lines
3.3 KiB
Rust

//! Security subsystem for policy enforcement, sandboxing, and secret management.
//!
//! This module provides the security infrastructure for ZeroClaw. The core type
//! [`SecurityPolicy`] defines autonomy levels, workspace boundaries, and
//! access-control rules that are enforced across the tool and runtime subsystems.
//! [`PairingGuard`] implements device pairing for channel authentication, and
//! [`SecretStore`] handles encrypted credential storage.
//!
//! OS-level isolation is provided through the [`Sandbox`] trait defined in
//! [`traits`], with pluggable backends including Docker, Firejail, Bubblewrap,
//! and Landlock. The [`create_sandbox`] function selects the best available
//! backend at runtime. An [`AuditLogger`] records security-relevant events for
//! forensic review.
//!
//! # Extension
//!
//! To add a new sandbox backend, implement [`Sandbox`] in a new submodule and
//! register it in [`detect::create_sandbox`]. See `AGENTS.md` §7.5 for security
//! change guidelines.
pub mod audit;
#[cfg(feature = "sandbox-bubblewrap")]
pub mod bubblewrap;
pub mod detect;
pub mod docker;
// Prompt injection defense (contributed from RustyClaw, MIT licensed)
pub mod leak_detector;
pub mod prompt_guard;
pub mod domain_matcher;
pub mod estop;
#[cfg(target_os = "linux")]
pub mod firejail;
#[cfg(feature = "sandbox-landlock")]
pub mod landlock;
pub mod otp;
pub mod pairing;
pub mod policy;
pub mod secrets;
pub mod traits;
#[allow(unused_imports)]
pub use audit::{AuditEvent, AuditEventType, AuditLogger};
#[allow(unused_imports)]
pub use detect::create_sandbox;
pub use domain_matcher::DomainMatcher;
#[allow(unused_imports)]
pub use estop::{EstopLevel, EstopManager, EstopState, ResumeSelector};
#[allow(unused_imports)]
pub use otp::OtpValidator;
#[allow(unused_imports)]
pub use pairing::PairingGuard;
pub use policy::{AutonomyLevel, SecurityPolicy};
#[allow(unused_imports)]
pub use secrets::SecretStore;
#[allow(unused_imports)]
pub use traits::{NoopSandbox, Sandbox};
// Prompt injection defense exports
pub use leak_detector::{LeakDetector, LeakResult};
pub use prompt_guard::{GuardAction, GuardResult, PromptGuard};
/// Redact sensitive values for safe logging. Shows first 4 chars + "***" suffix.
/// This function intentionally breaks the data-flow taint chain for static analysis.
pub fn redact(value: &str) -> String {
if value.len() <= 4 {
"***".to_string()
} else {
format!("{}***", &value[..4])
}
}
#[cfg(test)]
mod tests {
use super::*;
#[test]
fn reexported_policy_and_pairing_types_are_usable() {
let policy = SecurityPolicy::default();
assert_eq!(policy.autonomy, AutonomyLevel::Supervised);
let guard = PairingGuard::new(false, &[]);
assert!(!guard.require_pairing());
}
#[test]
fn reexported_secret_store_encrypt_decrypt_roundtrip() {
let temp = tempfile::tempdir().unwrap();
let store = SecretStore::new(temp.path(), false);
let encrypted = store.encrypt("top-secret").unwrap();
let decrypted = store.decrypt(&encrypted).unwrap();
assert_eq!(decrypted, "top-secret");
}
#[test]
fn redact_hides_most_of_value() {
assert_eq!(redact("abcdefgh"), "abcd***");
assert_eq!(redact("ab"), "***");
assert_eq!(redact(""), "***");
assert_eq!(redact("12345"), "1234***");
}
}