Fixes two related issues with Gemini OAuth:
1. CLI command `zeroclaw auth refresh --provider gemini` failed because the
handler was hardcoded to OpenAI Codex, making manual token refresh
impossible for Gemini profiles. Extended the CLI handler to support both
providers.
2. GeminiProvider.build_generate_content_request() was missing the bearer
token for the ManagedOAuth auth type. The method attached the OAuth bearer
token only for CLI OAuth (GeminiAuth::OAuthToken), not for managed
profiles (GeminiAuth::ManagedOAuth), causing 401 Unauthorized errors even
after a successful token refresh.
Changes:
- src/main.rs: AuthCommands::Refresh now handles both openai-codex and
gemini providers via pattern match
- src/providers/gemini.rs: Extended OAuth bearer token handling to include
the GeminiAuth::ManagedOAuth case (line 837; sketched below)
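A minimal sketch of fix 2, assuming a simplified GeminiAuth (the real enum
and request builder carry more state):

```rust
enum GeminiAuth {
    ApiKey(String),
    OAuthToken(String),
    ManagedOAuth(String),
}

fn bearer_token(auth: &GeminiAuth) -> Option<&str> {
    match auth {
        // Previously only OAuthToken matched here; ManagedOAuth fell through
        // and the request went out without an Authorization header.
        GeminiAuth::OAuthToken(t) | GeminiAuth::ManagedOAuth(t) => Some(t),
        GeminiAuth::ApiKey(_) => None,
    }
}
```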
Verification:
- Manual test: zeroclaw auth refresh --provider gemini --profile second
- E2E test: echo "hello" | zeroclaw agent --provider gemini --model gemini-2.5-pro
- Unit tests: cargo test providers::gemini (38 passed)
Risk: Low (isolated auth flow changes, no API contract changes)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
* fix(gateway): switch default port to 42617 across runtime and docs
* docs(changelog): record 42617 default port migration
* chore(release): bump crate version to 0.1.1
* fix(build): sync Cargo.lock with v0.1.1 manifest
Add Osaurus (https://github.com/dinoki-ai/osaurus) as a named provider,
following the established LM Studio / vLLM pattern with
OpenAiCompatibleProvider and Bearer auth.
Osaurus is a unified AI edge runtime for macOS (Apple Silicon) that goes
beyond traditional local inference servers:
- Local MLX inference (Llama, Qwen, Gemma, GLM, Phi, Nemotron, etc.)
- Cloud provider proxying through a single endpoint
- Multi-API: OpenAI, Anthropic, Ollama, and Open Responses simultaneously
- Built-in MCP (Model Context Protocol) support for tool/context servers
Provider wiring:
- Provider ID: "osaurus", default endpoint: http://localhost:1337/v1
- API key defaults to "osaurus" but is fully optional (keyless access)
- Credential env var: OSAURUS_API_KEY (resolution sketched below)
- Registered as local provider in list_providers()
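A hedged sketch of the credential defaults; the env-var-first resolution
order is an assumption:

```rust
fn osaurus_credentials(explicit_key: Option<String>) -> (String, String) {
    let endpoint = "http://localhost:1337/v1".to_string();
    let key = std::env::var("OSAURUS_API_KEY")
        .ok()
        .or(explicit_key)
        // Keyless access works; "osaurus" is only a placeholder default.
        .unwrap_or_else(|| "osaurus".to_string());
    (endpoint, key)
}
```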
Onboard wizard:
- Added to all 10 wizard functions (auth, models, endpoints, env vars)
- Curated model list: qwen3-30b-a3b, gemma-3n-e4b, phi-4-mini-reasoning
- Tier 4 local provider with interactive endpoint/key prompts
Tests:
- factory_osaurus, factory_osaurus_uses_default_key_when_none
- factory_osaurus_custom_url, resolve_provider_credential_osaurus_env
- resilient_fallback_includes_osaurus
- Added to factory_all_providers_create_successfully array
Documentation:
- providers-reference.md: table row + Osaurus Server Notes section
- README.md: Osaurus Server Endpoint section
Parse "provider:profile" entries (e.g. "openai-codex:second") in the
fallback chain so multiple OAuth profiles of the same provider can be
rotated on 429. The profile override is propagated via
auth_profile_override in ProviderRuntimeOptions.
Entries prefixed with "custom:" or "anthropic-custom:" are left
untouched since the colon is part of the URL scheme.
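A minimal sketch of the splitting rule as a free function (the real code
threads the profile into ProviderRuntimeOptions):

```rust
fn split_profile(entry: &str) -> (&str, Option<&str>) {
    // "custom:" / "anthropic-custom:" carry a URL after the colon,
    // so those entries pass through untouched.
    if entry.starts_with("custom:") || entry.starts_with("anthropic-custom:") {
        return (entry, None);
    }
    match entry.split_once(':') {
        Some((provider, profile)) => (provider, Some(profile)),
        None => (entry, None),
    }
}
```

e.g. `split_profile("openai-codex:second")` yields
`("openai-codex", Some("second"))`.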
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Address clippy lints (redundant continue, as-cast, match arms, elided
lifetimes, format vs write!) and reformat long cfg attributes and assert
macros to pass `cargo fmt --check` and `cargo clippy -D warnings`.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add native vLLM provider support to ZeroClaw
- First-class `vllm` provider with local endpoint defaults (`http://localhost:8000/v1`)
- Optional `VLLM_API_KEY` support
- Onboarding wizard integration (tier menu, endpoint prompt, model discovery, keyless local usage)
- Updated provider/docs references and command documentation
Parse provider-specific usage fields from API responses:
- Anthropic: input_tokens/output_tokens from usage object
- Gemini: promptTokenCount/candidatesTokenCount from usageMetadata
- Ollama: prompt_eval_count/eval_count from response root
- Bedrock: inputTokens/outputTokens from camelCase usage object
Gemini required refactoring send_generate_content to return a
(String, Option<TokenUsage>) tuple, plus a chat() override to thread
usage into ChatResponse (mapping sketched below).
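For the Gemini case, a hedged sketch of the usageMetadata mapping: the
serde names match the public response format, while TokenUsage is the
struct introduced in the commit below:

```rust
#[derive(serde::Deserialize)]
#[serde(rename_all = "camelCase")]
struct UsageMetadata {
    prompt_token_count: Option<u64>,     // promptTokenCount
    candidates_token_count: Option<u64>, // candidatesTokenCount
}

impl UsageMetadata {
    fn to_token_usage(&self) -> Option<TokenUsage> {
        Some(TokenUsage {
            input_tokens: self.prompt_token_count?,
            output_tokens: self.candidates_token_count?,
        })
    }
}
```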
Add UsageInfo deserialization structs and wire usage data from API
responses through to ChatResponse for OpenRouter, OpenAI, Compatible,
and Copilot providers. All four share the OpenAI response format with
prompt_tokens/completion_tokens fields.
Add a lightweight TokenUsage struct to providers::traits with
input_tokens and output_tokens fields. Add usage: Option<TokenUsage>
to ChatResponse and update all construction sites across providers
and agent modules with usage: None.
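Roughly, per the description above (derives and integer width are
assumptions):

```rust
// providers::traits: the new struct.
#[derive(Debug, Clone, Copy)]
pub struct TokenUsage {
    pub input_tokens: u64,
    pub output_tokens: u64,
}
```

with `usage: Option<TokenUsage>` added to ChatResponse.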
This is the first step toward capturing token usage data from LLM
API responses. Currently all sites set usage: None — subsequent
commits will parse actual usage from each provider's response format.
- Remove duplicate `chat` method in reliable.rs (E0201)
- Fix `futures` → `futures_util` imports in agent.rs and loop_.rs (E0433)
- Gate PostgresMemory behind `memory-postgres` feature in cli.rs (E0433)
- Fix regex backreference in XML tool parser (unsupported by regex crate)
- Add missing `skills_prompt_mode` argument in test
- Apply rustfmt to files with formatting issues on main
Previously, BedrockProvider only read credentials from environment
variables (AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY). When running
on EC2 with an IAM instance role, the env vars are not set, causing
all Bedrock calls to fail with 'credentials not set'.
Changes:
- Add AwsCredentials::from_imds(): fetches temporary credentials from
EC2 IMDSv2 (PUT token → get role name → get credentials → get region;
handshake sketched below)
- Add AwsCredentials::resolve(): tries env vars first, falls back to IMDS
- Add BedrockProvider::resolve_credentials(): async method called per
request, so expired instance role tokens are automatically refreshed
- chat() and chat_with_system() now call resolve_credentials() instead
of require_credentials(), enabling seamless EC2 instance role auth
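The IMDSv2 handshake in from_imds() follows the standard metadata-service
flow; a hedged reqwest sketch with error handling trimmed:

```rust
async fn imds_fetch(client: &reqwest::Client) -> reqwest::Result<String> {
    // Step 1: PUT a session token (IMDSv2 requires it for all reads).
    let token = client
        .put("http://169.254.169.254/latest/api/token")
        .header("X-aws-ec2-metadata-token-ttl-seconds", "21600")
        .send().await?
        .text().await?;
    // Step 2: discover the attached role name.
    let base = "http://169.254.169.254/latest/meta-data/iam/security-credentials/";
    let role = client
        .get(base)
        .header("X-aws-ec2-metadata-token", token.as_str())
        .send().await?
        .text().await?;
    // Step 3: fetch the temporary credentials JSON for that role.
    client
        .get(format!("{base}{role}"))
        .header("X-aws-ec2-metadata-token", token.as_str())
        .send().await?
        .text().await
}
```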
Gemini thinking models (e.g. gemini-3-pro-preview) return response parts
with `thought: true` for internal reasoning and `thoughtSignature` for
opaque signatures. The previous extraction logic blindly took the first
part, which was the thinking part, returning reasoning text instead of the
actual answer.
- Add `thought` field to `ResponsePart` to detect reasoning parts
- Add `effective_text()` on `CandidateContent` to skip thinking/signature
parts and extract only the real answer, falling back to thinking text
when no non-thinking content is available (sketched below)
- Make `Candidate.content` optional to guard against candidates with no
content (e.g. safety-filtered responses)
- Add 7 focused tests covering thinking, non-thinking, fallback, empty,
multi-part, signature-only, and internal API responses
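A minimal sketch of the selection rule, assuming simplified part structs:

```rust
struct ResponsePart {
    text: Option<String>,
    thought: Option<bool>, // true marks internal reasoning parts
}

fn effective_text(parts: &[ResponsePart]) -> Option<String> {
    let answer: String = parts
        .iter()
        .filter(|p| !p.thought.unwrap_or(false)) // skip thinking parts
        .filter_map(|p| p.text.as_deref())
        .collect();
    if !answer.is_empty() {
        return Some(answer);
    }
    // Fallback: surface thinking text rather than returning nothing.
    parts.iter().filter_map(|p| p.text.clone()).next()
}
```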
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Remove duplicate chat method in ReliableProvider impl (E0201)
The second chat fn (lines 662-769) was an exact duplicate of the
first (lines 540-647) in the same impl block.
- Gate PostgresMemory usage in memory CLI behind memory-postgres feature (E0433)
super::PostgresMemory is only exported when the feature is enabled;
the Postgres match arm now compiles to an explicit bail when the
feature is off (illustrated below).
- Replace futures::future::join_all with futures_util::future::join_all (E0433)
The crate depends on futures-util, not futures. Fixed in both
agent.rs and loop_.rs.
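Illustration of the feature gate (Memory and PostgresMemory stand in for
the crate's real types):

```rust
#[cfg(feature = "memory-postgres")]
fn open_postgres(url: &str) -> anyhow::Result<Box<dyn Memory>> {
    Ok(Box::new(PostgresMemory::connect(url)?))
}

#[cfg(not(feature = "memory-postgres"))]
fn open_postgres(_url: &str) -> anyhow::Result<Box<dyn Memory>> {
    anyhow::bail!("zeroclaw was built without the `memory-postgres` feature")
}
```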
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- channels/telegram.rs: support photo messages in parse_update_message;
add resolve_photo_data_uri() to download images via the Telegram getFile
API, resize them to 512px, and base64-encode them (resize sketched below)
- providers/bedrock.rs: add parse_user_content_blocks() to extract
[IMAGE:data:...] markers and build proper Bedrock image content blocks;
apply to both chat() and chat_with_system() paths; set vision: true
in provider capabilities
- Cargo.toml: add image crate v0.25 (jpeg/png) for server-side resize
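The 512px downscale, roughly (image v0.25 API; the JPEG re-encode is an
assumption):

```rust
use image::imageops::FilterType;

fn resize_to_512(bytes: &[u8]) -> image::ImageResult<Vec<u8>> {
    let img = image::load_from_memory(bytes)?;
    // resize() preserves aspect ratio within the 512x512 bounding box.
    let img = img.resize(512, 512, FilterType::Lanczos3);
    let mut out = std::io::Cursor::new(Vec::new());
    img.write_to(&mut out, image::ImageFormat::Jpeg)?;
    Ok(out.into_inner())
}
```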
Update the hardcoded synthetic provider base URL from https://api.synthetic.com
to https://api.synthetic.new/openai/v1 to match the actual API endpoint.
The user verified locally that the old URL doesn't work and confirmed the fix
works by using the custom provider syntax as a workaround:
default_provider = "custom:https://api.synthetic.new/openai/v1"
This change makes the synthetic provider work out of the box without requiring
users to use the custom provider workaround.
ReliableProvider was missing a chat() override, causing it to fall through
to the default Provider::chat() trait implementation. The default
implementation delegates to chat_with_history() which returns a plain
String and wraps it in ChatResponse with tool_calls: Vec::new() — so
native tool calling was completely broken through the retry/failover
wrapper even though the underlying provider properly supports it.
Changes:
- Add chat() with full retry/backoff/failover logic matching the existing
chat_with_system(), chat_with_history(), and chat_with_tools() overrides
(control flow sketched below)
- Include context_window_exceeded early-exit matching other method patterns
- Add 7 focused tests: delegation with tool calls, retry recovery,
supports_native_tools propagation, aggregated error reporting,
model failover, non-retryable error skip, and system prompt zero-XML
verification
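Control-flow skeleton of the added override (names and signatures are
illustrative; backoff timing and error aggregation are elided):

```rust
async fn chat_with_failover(
    providers: &[Box<dyn Provider>],
    req: &ChatRequest,
) -> Result<ChatResponse, Vec<ProviderError>> {
    let mut errors = Vec::new();
    for p in providers {
        match p.chat(req).await {
            Ok(resp) => return Ok(resp), // tool_calls survive the wrapper
            // A context-window overflow cannot succeed on retry: exit early.
            Err(e) if e.is_context_window_exceeded() => return Err(vec![e]),
            Err(e) => errors.push(e), // record, then fail over
        }
    }
    Err(errors)
}
```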
rotate_key() selects the next key in the round-robin but never applies
it to the underlying provider (Provider trait has no set_api_key
method). The previous info-level log implied rotation was working.
Change to warn-level and explicitly state the key is not applied,
making the limitation visible to operators instead of silently
pretending rotation works.
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Three issues prevented the Gemini OAuth path from working end-to-end:
1. Missing `project` field — the internal API returns 500 without it.
Added project field to InternalGenerateContentRequest and
resolve_oauth_project() to fetch it via loadCodeAssist endpoint.
2. No token refresh — stale access_token was read at construction time
and never refreshed. Google OAuth tokens expire after ~1 hour,
breaking long-lived daemon processes. Added runtime token refresh
with OAuthTokenState (Arc<Mutex>) that checks expiry before each
request and refreshes proactively (60s buffer; sketched below).
3. Wrong response format — internal API nests candidates under a
`response` field. Added InternalGenerateContentResponse wrapper
and conditional deserialization in send_generate_content().
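A minimal sketch of the expiry check from item 2 (OAuthTokenState's exact
fields are assumptions):

```rust
use std::time::{Duration, Instant};

struct OAuthTokenState {
    access_token: String,
    expires_at: Option<Instant>,
}

impl OAuthTokenState {
    fn needs_refresh(&self) -> bool {
        // Refresh proactively, 60s before the recorded expiry; an unknown
        // expiry is treated as already stale.
        self.expires_at
            .map(|t| t <= Instant::now() + Duration::from_secs(60))
            .unwrap_or(true)
    }
}
```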
Also fixes OAuth warmup to call resolve_oauth_project() instead of
listing models on the public endpoint (which rejects OAuth tokens).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
AnthropicProvider declared supports_native_tools() = true but did not
override chat_with_tools(). The default trait implementation drops all
conversation history (sends only system + last user message), breaking
multi-turn conversations on Telegram and other channels.
Changes:
- Override chat_with_tools() in AnthropicProvider: converts OpenAI-format
tool JSON to ToolSpec and delegates to chat() which preserves full
message history (sketched below)
- Skip build_tool_instructions() XML protocol when provider supports
native tools (saves ~12k chars in system prompt)
- Remove duplicate Tool Use Protocol section from build_system_prompt()
for native-tool providers
- Update Your Task section to encourage conversational follow-ups
instead of XML tool_call tags when using native tools
- Add tracing::warn for malformed tool definitions in chat_with_tools
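Shape of the override, hedged (Message/ToolSpec/ChatResponse are the
crate's types; the signatures here are assumptions):

```rust
async fn chat_with_tools(
    &self,
    messages: Vec<Message>,
    tools: &[serde_json::Value],
) -> anyhow::Result<ChatResponse> {
    let specs: Vec<ToolSpec> = tools
        .iter()
        .filter_map(|t| match serde_json::from_value(t.clone()) {
            Ok(spec) => Some(spec),
            Err(e) => {
                // Malformed definitions are logged and skipped, not fatal.
                tracing::warn!("skipping malformed tool definition: {e}");
                None
            }
        })
        .collect();
    // Delegating to chat() keeps the full multi-turn history intact.
    self.chat(messages, specs).await
}
```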
- Extract hard-coded test vector keys into named constants in bedrock.rs
and linq.rs to resolve rust/hard-coded-cryptographic-value alerts
- Replace derived Debug impls with manual impls that redact sensitive
fields (access_token, refresh_token, credential, api_key) on
QwenOauthCredentials, QwenOauthProviderContext, and
ResolvedEmbeddingConfig to resolve rust/cleartext-logging alerts
(redaction pattern sketched below)
- Redact Matrix user_id and device_id hints in tracing::warn! diagnostic
messages via crate::security::redact() to resolve cleartext-logging
alert in matrix.rs
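The redaction pattern, sketched on one of the listed types (the field set
is illustrative):

```rust
struct QwenOauthCredentials {
    access_token: String,
    refresh_token: String,
}

impl std::fmt::Debug for QwenOauthCredentials {
    fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result {
        f.debug_struct("QwenOauthCredentials")
            .field("access_token", &"<redacted>")
            .field("refresh_token", &"<redacted>")
            .finish()
    }
}
```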
Addresses CodeQL alerts: #77, #95-106
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>