NVIDIA's NIM API (integrate.api.nvidia.com) does not support the
OpenAI Responses API endpoint. When chat completions returns a
non-success status, the fallback to /v1/responses also fails with
404, producing a confusing double-failure error.
Use `new_no_responses_fallback()` for the NVIDIA provider, matching
the approach already used for GLM and other chat-completions-only
providers.
Fixes #1282
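A minimal sketch of the behavior `new_no_responses_fallback()` encodes; struct and method names other than that constructor are hypothetical, not ZeroClaw's real types:

```rust
/// Hypothetical, simplified provider shape; only the fallback gate matters here.
struct CompatProvider {
    base_url: String,
    supports_responses_api: bool,
}

impl CompatProvider {
    fn nvidia() -> Self {
        Self {
            // integrate.api.nvidia.com only serves chat completions.
            base_url: "https://integrate.api.nvidia.com/v1".into(),
            supports_responses_api: false,
        }
    }

    /// Only retry against /v1/responses when the provider actually supports it.
    fn should_try_responses_fallback(&self, chat_status: u16) -> bool {
        let chat_failed = chat_status < 200 || chat_status >= 300;
        self.supports_responses_api && chat_failed
    }
}
```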
Add Osaurus (https://github.com/dinoki-ai/osaurus) as a named provider,
following the established LM Studio / vLLM pattern with
OpenAiCompatibleProvider and Bearer auth.
Osaurus is a unified AI edge runtime for macOS (Apple Silicon) that goes
beyond traditional local inference servers:
- Local MLX inference (Llama, Qwen, Gemma, GLM, Phi, Nemotron, etc.)
- Cloud provider proxying through a single endpoint
- Multi-API: OpenAI, Anthropic, Ollama, and Open Responses simultaneously
- Built-in MCP (Model Context Protocol) support for tool/context servers
Provider wiring:
- Provider ID: "osaurus", default endpoint: http://localhost:1337/v1
- API key defaults to "osaurus" but is fully optional (keyless access)
- Credential env var: OSAURUS_API_KEY
- Registered as local provider in list_providers()
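A sketch of the key defaulting implied by the wiring above; the precedence between env var and configured key is an assumption, not a statement about ZeroClaw internals:

```rust
use std::env;

/// Prefer OSAURUS_API_KEY, then any configured key, then the keyless default.
fn resolve_osaurus_key(config_key: Option<&str>) -> String {
    env::var("OSAURUS_API_KEY")
        .ok()
        .or_else(|| config_key.map(str::to_string))
        .unwrap_or_else(|| "osaurus".to_string())
}
```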
Onboard wizard:
- Added to all 10 wizard functions (auth, models, endpoints, env vars)
- Curated model list: qwen3-30b-a3b, gemma-3n-e4b, phi-4-mini-reasoning
- Tier 4 local provider with interactive endpoint/key prompts
Tests:
- factory_osaurus, factory_osaurus_uses_default_key_when_none
- factory_osaurus_custom_url, resolve_provider_credential_osaurus_env
- resilient_fallback_includes_osaurus
- Added to factory_all_providers_create_successfully array
Documentation:
- providers-reference.md: table row + Osaurus Server Notes section
- README.md: Osaurus Server Endpoint section
Parse "provider:profile" entries (e.g. "openai-codex:second") in the
fallback chain so multiple OAuth profiles of the same provider can be
rotated on 429. The profile override is propagated via
auth_profile_override in ProviderRuntimeOptions.
Entries prefixed with "custom:" or "anthropic-custom:" are left
untouched since the colon is part of the URL scheme.
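A sketch of the entry parsing described above (the real parser and its return type may differ):

```rust
/// Split a fallback entry like "openai-codex:second" into (provider, profile).
/// custom:/anthropic-custom: entries are passed through untouched, since their
/// colon introduces a URL rather than a profile name.
fn parse_fallback_entry(entry: &str) -> (String, Option<String>) {
    if entry.starts_with("custom:") || entry.starts_with("anthropic-custom:") {
        return (entry.to_string(), None);
    }
    match entry.split_once(':') {
        Some((provider, profile)) if !profile.is_empty() => {
            (provider.to_string(), Some(profile.to_string()))
        }
        _ => (entry.to_string(), None),
    }
}
```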
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add native vLLM provider support to ZeroClaw
- First-class `vllm` provider with local endpoint defaults (`http://localhost:8000/v1`)
- Optional `VLLM_API_KEY` support
- Onboarding wizard integration (tier menu, endpoint prompt, model discovery, keyless local usage)
- Updated provider/docs references and command documentation
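For the model discovery mentioned above, any OpenAI-compatible server (vLLM included) exposes GET /v1/models; this is a sketch with reqwest/serde, not ZeroClaw's actual onboarding code:

```rust
use serde::Deserialize;

#[derive(Deserialize)]
struct ModelList {
    data: Vec<ModelEntry>,
}

#[derive(Deserialize)]
struct ModelEntry {
    id: String,
}

/// List model IDs from a vLLM endpoint such as http://localhost:8000/v1.
async fn discover_models(base_url: &str) -> Result<Vec<String>, reqwest::Error> {
    let url = format!("{}/models", base_url.trim_end_matches('/'));
    let list: ModelList = reqwest::get(&url).await?.json().await?;
    Ok(list.data.into_iter().map(|m| m.id).collect())
}
```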
Update the hardcoded synthetic provider base URL from https://api.synthetic.com
to https://api.synthetic.new/openai/v1 to match the actual API endpoint.
The user verified locally that the old URL doesn't work and confirmed the fix
works by using the custom provider syntax as a workaround:
default_provider = "custom:https://api.synthetic.new/openai/v1"
This change makes the synthetic provider work out of the box without requiring
users to use the custom provider workaround.
- Extract hard-coded test vector keys into named constants in bedrock.rs
and linq.rs to resolve rust/hard-coded-cryptographic-value alerts
- Replace derived Debug impls with manual impls that redact sensitive
fields (access_token, refresh_token, credential, api_key) on
QwenOauthCredentials, QwenOauthProviderContext, and
ResolvedEmbeddingConfig to resolve rust/cleartext-logging alerts
- Redact Matrix user_id and device_id hints in tracing::warn! diagnostic
messages via crate::security::redact() to resolve cleartext-logging
alert in matrix.rs
Addresses CodeQL alerts: #77, #95-106
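The redaction pattern behind the Debug change above, sketched on a trimmed-down struct (the real field sets differ):

```rust
use std::fmt;

struct QwenOauthCredentials {
    access_token: String,
    refresh_token: String,
}

// Manual Debug impl so tokens never reach logs through {:?} formatting.
impl fmt::Debug for QwenOauthCredentials {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        f.debug_struct("QwenOauthCredentials")
            .field("access_token", &"<redacted>")
            .field("refresh_token", &"<redacted>")
            .finish()
    }
}
```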
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Each major subsystem mod.rs now includes a //! doc block explaining the
subsystem purpose, trait-driven architecture, factory registration pattern,
and extension guidance. This improves the generated rustdoc experience for
developers navigating ZeroClaw's modular architecture.
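Illustrative shape of such a //! block (wording is hypothetical, not copied from the repo):

```rust
//! Provider subsystem.
//!
//! Each provider implements the shared completion trait and is registered
//! through the factory. To add a provider: implement the trait, add a
//! factory arm, and wire its credential env var into onboarding.
```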
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Replace the non-functional OpenAI-compatible stub with a purpose-built
Bedrock provider that implements AWS SigV4 signing from first principles
using hmac/sha2/hex crates — no AWS SDK dependency.
Key capabilities:
- SigV4 authentication (AKSK + optional session token)
- Converse API with native tool calling support
- Prompt caching via cachePoint heuristics
- Proper URI encoding for model IDs containing colons
- Resilient response parsing with unknown block type fallback
Also updates:
- Factory wiring and credential resolution bypass for AKSK auth
- Onboard wizard with Bedrock-specific model selection and guidance
- Provider reference docs with auth, region, and model ID details
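For reference, the standard SigV4 signing-key derivation that a hand-rolled hmac/sha2 implementation reduces to (the well-known algorithm, not ZeroClaw's actual bedrock.rs code):

```rust
use hmac::{Hmac, Mac};
use sha2::Sha256;

type HmacSha256 = Hmac<Sha256>;

fn hmac_sha256(key: &[u8], data: &[u8]) -> Vec<u8> {
    let mut mac = HmacSha256::new_from_slice(key).expect("HMAC accepts any key length");
    mac.update(data);
    mac.finalize().into_bytes().to_vec()
}

/// kSigning = HMAC(HMAC(HMAC(HMAC("AWS4" + secret, date), region), service), "aws4_request")
fn derive_signing_key(secret: &str, date_yyyymmdd: &str, region: &str, service: &str) -> Vec<u8> {
    let k_date = hmac_sha256(format!("AWS4{secret}").as_bytes(), date_yyyymmdd.as_bytes());
    let k_region = hmac_sha256(&k_date, region.as_bytes());
    let k_service = hmac_sha256(&k_region, service.as_bytes());
    hmac_sha256(&k_service, b"aws4_request")
}
```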
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Fallback providers in create_resilient_provider_with_options() were
created via create_provider_with_options() which passed the primary
provider's api_key as credential_override. This caused
resolve_provider_credential() to short-circuit on the override and
never check the fallback provider's own env var (e.g. DEEPSEEK_API_KEY
for a deepseek fallback), resulting in auth failures (401) when the
primary and fallback use different API services.
Switch to create_provider_with_url(fallback, None, None) so each
fallback resolves its own credential via provider-specific env vars.
This also enables custom: URL prefixes (e.g.
custom:http://host.docker.internal:1234/v1) to work as fallback
entries, which was previously impossible through the options path.
Add three focused tests covering independent credential resolution,
custom URL fallbacks, and mixed fallback chains.
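A sketch of the per-provider lookup that now applies to fallbacks once no credential_override is passed; the env-var table is abbreviated to names appearing elsewhere in this log, and the real mapping lives in resolve_provider_credential():

```rust
use std::env;

fn fallback_env_credential(provider: &str) -> Option<String> {
    let var = match provider {
        "deepseek" => "DEEPSEEK_API_KEY",
        "vllm" => "VLLM_API_KEY",
        "osaurus" => "OSAURUS_API_KEY",
        // custom:<URL> entries and keyless local servers need no key.
        _ => return None,
    };
    env::var(var).ok()
}
```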
MiniMax API rejects role: system in the messages array with error
2013 (invalid message role: system). In channel mode, the history
builder prepends a system message and optionally appends a second
one for delivery instructions, causing 400 errors on every channel
turn.
Additionally, MiniMax reasoning models embed chain-of-thought in
the content field as <think>...</think> blocks rather than using
the separate reasoning_content field, causing raw thinking output
to leak into user-visible responses.
Changes:
- Add merge_system_into_user flag to OpenAiCompatibleProvider;
when set, all system messages are concatenated and prepended to
the first user message before sending to the API
- Add new_merge_system_into_user() constructor used by MiniMax
- Add strip_think_tags() helper that removes <think>...</think>
blocks from response content before returning to the caller
- Apply strip_think_tags in effective_content() and
effective_content_optional() so all non-streaming paths are covered
- Update MiniMax factory registration to use new_merge_system_into_user
- Fix pre-existing rustfmt violation on apply_auth_header call
All other providers continue to use the default path unchanged.
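A minimal sketch of the strip_think_tags() helper described above (the shipped implementation may use a different scanning strategy):

```rust
/// Drop every <think>...</think> block from a response, including an
/// unterminated trailing block, and trim the leftover whitespace.
fn strip_think_tags(content: &str) -> String {
    let mut out = String::with_capacity(content.len());
    let mut rest = content;
    while let Some(start) = rest.find("<think>") {
        out.push_str(&rest[..start]);
        match rest[start..].find("</think>") {
            // Skip past the closing tag and keep scanning.
            Some(end) => rest = &rest[start + end + "</think>".len()..],
            // Unterminated block: discard the remainder.
            None => rest = "",
        }
    }
    out.push_str(rest);
    out.trim().to_string()
}
```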
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Route OVHcloud through OpenAiProvider (with proper tool_call_id
serialization) instead of OpenAiCompatibleProvider, fixing tool-call
round-trips against vLLM-based endpoints.
- Add base_url field and with_base_url() constructor to OpenAiProvider
- Replace all hardcoded api.openai.com URLs with self.base_url
- Pass api_url through for the openai provider arm
- Register ovhcloud/ovh provider with env var OVH_AI_ENDPOINTS_ACCESS_TOKEN
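A condensed sketch of the base_url plumbing described above; the real OpenAiProvider carries more state (HTTP client, model, headers):

```rust
struct OpenAiProvider {
    api_key: String,
    base_url: String,
}

impl OpenAiProvider {
    /// Default constructor keeps targeting api.openai.com.
    fn new(api_key: String) -> Self {
        Self::with_base_url(api_key, "https://api.openai.com/v1")
    }

    /// OVHcloud (or any other compatible endpoint) supplies its own base URL.
    fn with_base_url(api_key: String, base_url: impl Into<String>) -> Self {
        Self { api_key, base_url: base_url.into() }
    }
}
```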
- Add "astrai" to factory_all_providers_create_successfully test
- Add "astrai" => "ASTRAI_API_KEY" in provider_env_var() for onboarding
- Add Astrai to onboarding provider selection list (Gateway tier)
- Add provider_env_var("astrai") assertion in known_providers test
Addresses review comments from @chumyin on #486.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add `zeroclaw providers` CLI command that lists all 28 supported AI providers
- Each entry shows: config ID, display name, local/cloud tag, active marker, and aliases
- Also shows `custom:<URL>` and `anthropic-custom:<URL>` escape hatches at the bottom
Previously users had no way to discover available providers without reading source code. The
unknown-provider error message suggests running `zeroclaw onboard --interactive` but doesn't list
options. This command gives immediate visibility.
Integrate cloud endpoint behavior into the existing ollama provider flow, avoid a separate standalone doc, and keep configuration minimal via api_url/api_key.
Also align reply_target and memory trait call sites as needed for current baseline compatibility.
* fix(providers): add CN/global endpoint variants for Chinese vendors
* fix(onboard): deduplicate provider key-url match arms
* chore(i18n): normalize non-English literals to English
The existing Copilot provider passes a static Bearer token, but the
Copilot API requires short-lived session tokens obtained via GitHub's
OAuth device code flow, plus mandatory editor headers.
This replaces the stub with a dedicated CopilotProvider that:
- Runs the OAuth device code flow on first use (same client ID as VS Code)
- Exchanges the OAuth token for a Copilot API key via
api.github.com/copilot_internal/v2/token
- Sends required Editor-Version/Editor-Plugin-Version headers
- Caches tokens to disk (~/.config/zeroclaw/copilot/) with auto-refresh
- Uses Mutex to prevent concurrent refresh races / duplicate device prompts
- Writes token files with 0600 permissions (owner-only)
- Respects GitHub's polling interval and code expiry from device flow
- Sanitizes error messages to prevent token leakage
- Uses async filesystem I/O (tokio::fs) throughout
- Optionally accepts a pre-supplied GitHub token via config api_key
Fixes: 403 'Access to this endpoint is forbidden'
Fixes: 400 'missing Editor-Version header for IDE auth'
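A sketch of the owner-only token persistence described above (path layout and helper name are illustrative):

```rust
use std::os::unix::fs::PermissionsExt;
use tokio::fs;

/// Write a cached Copilot token readable and writable by the owner only (0600).
async fn write_token_file(path: &std::path::Path, token: &str) -> std::io::Result<()> {
    if let Some(parent) = path.parent() {
        fs::create_dir_all(parent).await?;
    }
    fs::write(path, token).await?;
    // A hardened version would create the file with 0600 from the start
    // instead of tightening permissions after the write.
    fs::set_permissions(path, std::fs::Permissions::from_mode(0o600)).await?;
    Ok(())
}
```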
Add Astrai (https://as-trai.com) as a first-class OpenAI-compatible
provider. Astrai is an AI inference router with built-in cost
optimization, PII stripping, and compliance logging.
- Register ASTRAI_API_KEY env var in resolve_api_key
- Add "astrai" entry in provider factory → as-trai.com/v1
- Add factory_astrai unit test
- Add Astrai to compatible provider test list
- Update README provider count (22+ → 23+) and list
Co-authored-by: Maya Walcher <maya.walcher@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
- Add `lmstudio` / `lm-studio` as a built-in provider alias for local LM Studio instances
(`http://localhost:1234/v1`)
- Uses a dummy API key when none is provided, since LM Studio does not require authentication
- Users can connect to remote LM Studio instances via `custom:http://<ip>:1234/v1`