* feat(config): support ${ENV_VAR} substitution in config.yaml
* fix: extend env var expansion to CLI and gateway config loaders
The original PR (#2680) only wired _expand_env_vars into load_config(),
which is used by 'hermes tools' and 'hermes setup'. The two primary
config paths were missed:
- load_cli_config() in cli.py (interactive CLI)
- Module-level _cfg in gateway/run.py (gateway — bridges api_keys to env vars)
Also:
- Remove redundant 'import re' (already imported at module level)
- Add missing blank lines between top-level functions (PEP 8)
- Add tests for load_cli_config() expansion
---------
Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>
Cherry-picked from PR #2576 by ereid7, plus read-side fix from 173a5c62.
Both fixes were originally landed in 173a5c62 but were inadvertently
reverted by commit 34be3f8b (a squash-merge that bundled unrelated
tools_config.py changes).
Save side (_save_platform_tools): exclude platform default toolset
names (hermes-cli, hermes-telegram) from preserved entries so they
don't silently re-enable everything.
Read side (_get_platform_tools): when the saved list contains explicit
configurable keys, use direct membership instead of subset inference.
The subset approach is broken when composite toolsets like hermes-cli
resolve to ALL tools.
Based on PR #2454 by @kshitijk4poor (reimplemented lean — 127 lines
vs original 715).
Type @ in the CLI input to get autocomplete suggestions for context
references:
- Static: @diff, @staged, @file:, @folder:, @git:, @url:
- @file:path and @folder:path browse the filesystem
- Bare @ or @partial shows matching files/folders from cwd
Dropped from original: .hermesignore walking, custom shell tokenizer,
PathToken dataclass, fuzzy matching, token estimates. Kept: all
user-facing functionality.
Reads auxiliary.vision.timeout from config.yaml (default: 30s) and
passes it to async_call_llm. Useful for slow local vision models
that need more than 30 seconds.
Setting is in config.yaml (not .env) since it's not a secret:
auxiliary:
vision:
timeout: 120
Based on PR #2306.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
* fix: respect DashScope v1 runtime mode for alibaba
Remove the hardcoded Alibaba branch from resolve_runtime_provider()
that forced api_mode='anthropic_messages' regardless of the base URL.
Alibaba now goes through the generic API-key provider path, which
auto-detects the protocol from the URL:
- /apps/anthropic → anthropic_messages (via endswith check)
- /v1 → chat_completions (default)
This fixes Alibaba setup with OpenAI-compatible DashScope endpoints
(e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because
runtime always forced Anthropic mode even when setup saved a /v1 URL.
Based on PR #2024 by @kshitijk4poor.
* docs(skill): add split, merge, search examples to ocr-and-documents skill
Adds pymupdf examples for PDF splitting, merging, and text search
to the existing ocr-and-documents skill. No new dependencies — pymupdf
already covers all three operations natively.
* fix: replace all production print() calls with logger in rl_training_tool
Replace all bare print() calls in production code paths with proper logger calls.
- Add `import logging` and module-level `logger = logging.getLogger(__name__)`
- Replace print() in _start_training_run() with logger.info()
- Replace print() in _stop_training_run() with logger.info()
- Replace print(Warning/Note) calls with logger.warning() and logger.info()
Using the logging framework allows log level filtering, proper formatting,
and log routing instead of always printing to stdout.
* fix(gateway): process /queue'd messages after agent completion
/queue stored messages in adapter._pending_messages but never consumed
them after normal (non-interrupted) completion. The consumption path
at line 5219 only checked pending messages when result.get('interrupted')
was True — since /queue deliberately doesn't interrupt, queued messages
were silently dropped.
Now checks adapter._pending_messages after both interrupted AND normal
completion. For queued messages (non-interrupt), the first response is
delivered before recursing to process the queued follow-up. Skips the
direct send when streaming already delivered the response.
Reported by GhostMode on Discord.
* chore: add minimax/minimax-m2.7 to OpenRouter and MiniMax model catalogs
---------
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>
Reverts the sanitizer addition from PR #2466 (originally #2129).
We already have _empty_content_retries handling for reasoning-only
responses. The trailing strip risks silently eating valid messages
and is redundant with existing empty-content handling.
Add hermes mcp add/remove/list/test/configure CLI for managing MCP
server connections interactively. Discovery-first 'add' flow connects,
discovers tools, and lets users select which to enable via curses checklist.
Add OAuth 2.1 PKCE authentication for MCP HTTP servers (RFC 7636).
Supports browser-based and manual (headless) authorization, token
caching with 0600 permissions, automatic refresh. Zero external deps.
Add ${ENV_VAR} interpolation in MCP server config values, resolved
from os.environ + ~/.hermes/.env at load time.
Core OAuth module from PR #2021 by @imnotdev25. CLI and mcp_tool
wiring rewritten against current main. Closes#497, #690.
* docs: add Gemini OAuth provider implementation plan
Planning doc for a standard-route Gemini provider using Google OAuth
(Authorization Code + PKCE) with the OpenAI-compatible endpoint at
generativelanguage.googleapis.com. Covers OAuth flow, token lifecycle,
file list, and estimated scope (~700 lines).
Replaces the Node.js bridge approach from PR #2042.
* chore: update OpenRouter model list
- Add xiaomi/mimo-v2-pro
- Add nvidia/nemotron-3-super-120b-a12b (paid, higher rate limits)
- Remove openrouter/hunter-alpha and openrouter/healer-alpha (discontinued)
Closes#2453
The DEFAULT_CONFIG was hardcoding google/gemini-3-flash-preview as the
summary_model for context compression. This caused unexpected OpenRouter
charges for users who configured a different provider/model, because the
compression task would silently fall back to gemini via OpenRouter even
when the user's main model was on a different provider.
Fix: change summary_model default to empty string. When empty,
call_llm() resolves the model through the standard auto-detection chain
(auxiliary.compression config -> env vars -> main provider), which
correctly uses the user's configured provider and model.
Users who want a dedicated cheap model for compression can still
explicitly set compression.summary_model in their config.yaml.
Remove the hardcoded Alibaba branch from resolve_runtime_provider()
that forced api_mode='anthropic_messages' regardless of the base URL.
Alibaba now goes through the generic API-key provider path, which
auto-detects the protocol from the URL:
- /apps/anthropic → anthropic_messages (via endswith check)
- /v1 → chat_completions (default)
This fixes Alibaba setup with OpenAI-compatible DashScope endpoints
(e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because
runtime always forced Anthropic mode even when setup saved a /v1 URL.
Based on PR #2024 by @kshitijk4poor.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Cherry-picked from PR #2122 by @AtlasMeridia.
1. do_inspect bytes crash: bundle.files returns bytes for official
skills, .split() expected str. Added decode guard.
2. GitHub redirects: three httpx.get calls missing follow_redirects=True,
causing silent 301 failures on renamed orgs.
3. Skill discovery fallback: scan repo root directories when standard
paths (skills/, .agents/skills/, .claude/skills/) miss.
4. tap list KeyError: t['repo'] crashes for local taps. Use safe .get().
auth_type was "***" instead of "api_key" and api_key_env_vars was
("OPEN...",) instead of ("OPENCODE_GO_API_KEY",). This was introduced
in 35d948b6 when a secret redaction tool masked these values during
the Kilo Code provider commit. OpenCode Go provider was completely
broken as a result.
When 'hermes update' stashes local changes and the restore hits
conflicts, the previous behavior silently ran 'git reset --hard HEAD'
to clean up. This could surprise users who didn't realize their
working tree was being nuked.
Now the conflict handler:
- Lists the specific conflicted files
- Reassures the user their stash is preserved
- Asks before resetting (interactive mode)
- Auto-resets in non-interactive mode (prompt_user=False)
- If declined, leaves the working tree as-is with guidance
When `hermes update` stashes local changes and the subsequent
`git stash apply` fails or leaves unmerged files, the conflict markers
(<<<<<<< etc.) were left in the working tree, making Hermes unrunnable
until manually cleaned up.
Now the update command runs `git reset --hard HEAD` to restore a clean
working tree before exiting, and also detects unmerged files even when
git stash apply reports success.
Closes#2348
Only honor config.model.base_url for Anthropic resolution when
config.model.provider is actually "anthropic". This prevents a Codex
(or other provider) base_url from leaking into Anthropic runtime and
auxiliary client paths, which would send requests to the wrong
endpoint.
Closes#2384
Add has_usable_secret() to reject empty, short (<4 char), and common
placeholder API key values (changeme, your_api_key, placeholder, etc.)
throughout the auth/runtime resolution chain.
Update list_available_providers() to use provider-specific auth status
via get_auth_status() instead of resolve_runtime_provider(), preventing
cross-provider key fallback from making providers appear available when
they aren't actually configured.
Preserve keyless custom endpoint support by checking via base URL.
Cherry-picked from PR #2121 by aashizpoudel.
- Add resolve_config_path(): checks $HERMES_HOME/honcho.json first,
falls back to ~/.honcho/config.json. Enables isolated Hermes instances
with independent Honcho credentials and settings.
- Update CLI and doctor to use resolved path instead of hardcoded global.
- Change default session_strategy from per-session to per-directory.
Part 1 of #1962 by @erosika.
Cherry-picked from PR #2319 by @itenev.
When the gateway fails to connect (e.g. PrivilegedIntentsRequired,
missing token), systemd's default RestartSec=10 with no start rate
limit causes rapid reconnect storms flooding logs and triggering
platform-side rate limits.
- StartLimitIntervalSec=600 + StartLimitBurst=5 in [Unit] (max 5
restarts per 10 min)
- RestartSec: 10 → 30
- Applied to both templates in gateway.py and scripts/hermes-gateway
Same bug as opencode-zen/go — alibaba fell through to the OpenRouter
model list instead of using _setup_provider_model_selection() which
probes the provider's own /models endpoint.
All user-selectable providers now have correct model selection routing.
After selecting OpenCode Zen or Go as provider in hermes setup, the
model selection page showed OpenRouter models because these providers
weren't in the list that routes to _setup_provider_model_selection().
They fell through to the else branch which shows the OpenRouter catalog.
Users ended up with an OpenCode API key but an OpenRouter model name,
causing 'Provider resolver returned an empty API key' on first use.
Fix: add opencode-zen and opencode-go to the provider list that uses
_setup_provider_model_selection() for live /models detection.
Fresh installs without pull.rebase configured hit a git error when
running hermes update because git doesn't know how to reconcile
divergent branches. --ff-only is the right strategy: it works for the
normal case (local branch is behind remote) and fails cleanly if the
user somehow has local commits, rather than silently rebasing them.
The top-level 'toolsets' key in config.yaml was never read at runtime.
Tool selection uses platform_toolsets (per-platform) or the --toolsets
CLI flag. The key existed in load_cli_config() defaults and the example
config as 'toolsets: [all]', misleading users into thinking it
controlled tool availability.
- Remove from load_cli_config() hardcoded defaults
- Remove from hermes config show output
- Replace in cli-config.yaml.example with deprecation note pointing
to platform_toolsets and hermes tools
Two bugs in the save/load roundtrip for platform_toolsets:
1. _save_platform_tools preserved composite toolset entries (hermes-cli,
hermes-telegram, etc.) because they weren't in configurable_keys.
These composites include ALL _HERMES_CORE_TOOLS, so having hermes-cli
in the saved list alongside individual keys negated any disables —
the subset check always found the disabled toolset's tools via the
composite entry.
Fix: also filter out known TOOLSETS keys from preserved entries. Only
truly unknown entries (MCP server names, custom entries) are kept.
2. _get_platform_tools used reverse subset inference to determine which
configurable toolsets were enabled. This is inherently broken when
tools appear in multiple toolsets (e.g. HA tools in both the
homeassistant toolset and _HERMES_CORE_TOOLS).
Fix: when the saved list contains explicit configurable keys (meaning
the user has configured this platform), use direct membership instead
of subset inference. The fallback path still handles legacy configs
that only have a composite entry like hermes-cli.
- Changed the ANSI escape code for gold color in cli.py and banner.py to use true-color format (#FFD700) for better visual consistency.
- Enhanced the _on_tool_progress method in HermesCLI to update the TUI spinner with tool execution status, improving user feedback during operations.
These changes improve the visual representation and user experience in the command-line interface.
Co-authored-by: Test <test@test.com>
Adds /queue <prompt> (alias /q) that queues a message for the next
turn while the agent is busy, without interrupting the current run.
- CLI: /queue <prompt> puts it in _pending_input for the next turn
- Gateway: /queue <prompt> creates a pending MessageEvent on the
adapter, picked up after the current agent run finishes
- Enter still interrupts as usual (no behavior change)
- /queue with no prompt shows usage
- /queue when agent is idle tells user to just type normally
Co-authored-by: Test <test@test.com>
Replace the fragile hardcoded context length system with a multi-source
resolution chain that correctly identifies context windows per provider.
Key changes:
- New agent/models_dev.py: Fetches and caches the models.dev registry
(3800+ models across 100+ providers with per-provider context windows).
In-memory cache (1hr TTL) + disk cache for cold starts.
- Rewritten get_model_context_length() resolution chain:
0. Config override (model.context_length)
1. Custom providers per-model context_length
2. Persistent disk cache
3. Endpoint /models (local servers)
4. Anthropic /v1/models API (max_input_tokens, API-key only)
5. OpenRouter live API (existing, unchanged)
6. Nous suffix-match via OpenRouter (dot/dash normalization)
7. models.dev registry lookup (provider-aware)
8. Thin hardcoded defaults (broad family patterns)
9. 128K fallback (was 2M)
- Provider-aware context: same model now correctly resolves to different
context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic,
128K on GitHub Copilot). Provider name flows through ContextCompressor.
- DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns.
models.dev replaces the per-model hardcoding.
- CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K]
to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M.
- hermes model: prompts for context_length when configuring custom
endpoints. Supports shorthand (32k, 128K). Saved to custom_providers
per-model config.
- custom_providers schema extended with optional models dict for
per-model context_length (backward compatible).
- Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against
OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash
normalization. Handles all 15 current Nous models.
- Anthropic direct: queries /v1/models for max_input_tokens. Only works
with regular API keys (sk-ant-api*), not OAuth tokens. Falls through
to models.dev for OAuth users.
Tests: 5574 passed (18 new tests for models_dev + updated probe tiers)
Docs: Updated configuration.md context length section, AGENTS.md
Co-authored-by: Test <test@test.com>
Based on PR #1859 by @magi-morph (too stale to cherry-pick, reimplemented).
GPT-5.x models reject tool calls + reasoning_effort on
/v1/chat/completions with a 400 error directing to /v1/responses.
This auto-detects api.openai.com in the base URL and switches to
codex_responses mode in three places:
- AIAgent.__init__: upgrades chat_completions → codex_responses
- _try_activate_fallback(): same routing for fallback model
- runtime_provider.py: _detect_api_mode_for_url() for both custom
provider and openrouter runtime resolution paths
Also extracts _is_direct_openai_url() helper to replace the inline
check in _max_tokens_param().
Cherry-picked from PR #2120 by @unclebumpy.
- from_env() now reads HONCHO_BASE_URL and enables Honcho when base_url
is set, even without an API key
- from_global_config() reads baseUrl from config root with
HONCHO_BASE_URL env var as fallback
- get_honcho_client() guard relaxed to allow base_url without api_key
for no-auth local instances
- Added HONCHO_BASE_URL to OPTIONAL_ENV_VARS registry
Result: Setting HONCHO_BASE_URL=http://localhost:8000 in ~/.hermes/.env
now correctly routes the Honcho client to a local instance.
Show complete session IDs in 'hermes sessions list' instead of
truncating to 20 characters. Widens title column from 20→30 chars
and adjusts header widths accordingly.
Fixes#2068. Based on PR #2085 by @Nebula037 with a correction
to preserve the no-titles layout (the original PR accidentally
replaced the Preview/Src header with a duplicate Title/Preview header).
MiniMax's default base URL was /v1 which caused runtime_provider to
default to chat_completions mode (OpenAI-style Authorization: Bearer
header). MiniMax rejects this with a 401 because they require the
Anthropic-style x-api-key header.
Changes:
- auth.py: Change default inference_base_url for minimax and minimax-cn
from /v1 to /anthropic
- runtime_provider.py: Auto-correct stale /v1 URLs from existing .env
files to /anthropic, and always default minimax/minimax-cn providers
to anthropic_messages mode
- Update tests to reflect new defaults, add tests for stale URL
auto-correction and explicit api_mode override
Based on PR #2100 by @devorun. Fixes#2094.
Co-authored-by: Test <test@test.com>
Two issues with /model preventing proper provider switching:
1. Bare provider names not detected: typing '/model nous' treated 'nous'
as a model name instead of triggering a provider switch. Fixed by adding
step 0 in detect_provider_for_model() that checks if the input matches
a known provider name/alias (excluding 'custom'/'openrouter' which need
explicit model names) and returns that provider's default model.
2. Custom endpoint details hidden: /model (no args) showed '[custom]' with
just a usage hint but no endpoint URL or model name. Now displays the
configured base_url for custom providers in both CLI and gateway.
Note: config base_url and OPENAI_BASE_URL are intentionally NOT cleared on
provider switch — dedicated provider paths (nous, anthropic, codex) have
their own credential resolution that ignores these, and clearing them would
destroy the user's custom endpoint config, preventing switching back.
Co-authored-by: Test <test@test.com>
* fix: detect context length for custom model endpoints via fuzzy matching + config override
Custom model endpoints (non-OpenRouter, non-known-provider) were silently
falling back to 2M tokens when the model name didn't exactly match what the
endpoint's /v1/models reported. This happened because:
1. Endpoint metadata lookup used exact match only — model name mismatches
(e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss
2. Single-model servers (common for local inference) required exact name
match even though only one model was loaded
3. No user escape hatch to manually set context length
Changes:
- Add fuzzy matching for endpoint model metadata: single-model servers
use the only available model regardless of name; multi-model servers
try substring matching in both directions
- Add model.context_length config override (highest priority) so users
can explicitly set their model's context length in config.yaml
- Log an informative message when falling back to 2M probe, telling
users about the config override option
- Thread config_context_length through ContextCompressor and AIAgent init
Tests: 6 new tests covering fuzzy match, single-model fallback, config
override (including zero/None edge cases).
* fix: auto-detect local model name and context length for local servers
Cherry-picked from PR #2043 by sudoingX.
- Auto-detect model name from local server's /v1/models when only one
model is loaded (no manual model name config needed)
- Add n_ctx_train and n_ctx to context length detection keys for llama.cpp
- Query llama.cpp /props endpoint for actual allocated context (not just
training context from GGUF metadata)
- Strip .gguf suffix from display in banner and status bar
- _auto_detect_local_model() in runtime_provider.py for CLI init
Co-authored-by: sudo <sudoingx@users.noreply.github.com>
* fix: revert accidental summary_target_tokens change + add docs for context_length config
- Revert summary_target_tokens from 2500 back to 500 (accidental change
during patching)
- Add 'Context Length Detection' section to Custom & Self-Hosted docs
explaining model.context_length config override
---------
Co-authored-by: Test <test@test.com>
Co-authored-by: sudo <sudoingx@users.noreply.github.com>
The gateway approval system previously intercepted bare 'yes'/'no' text
from the user's next message to approve/deny dangerous commands. This was
fragile and dangerous — if the agent asked a clarify question and the user
said 'yes' to answer it, the gateway would execute the pending dangerous
command instead. (Fixes#1888)
Changes:
- Remove bare text matching ('yes', 'y', 'approve', 'ok', etc.) from
_handle_message approval check
- Add /approve and /deny as gateway-only slash commands in the command
registry
- /approve supports scoping: /approve (one-time), /approve session,
/approve always (permanent)
- Add 5-minute timeout for stale approvals
- Gateway appends structured instructions to the agent response when a
dangerous command is pending, telling the user exactly how to respond
- 9 tests covering approve, deny, timeout, scoping, and verification
that bare 'yes' no longer triggers execution
Credit to @solo386 and @FlyByNight69420 for identifying and reporting
this security issue in PR #1971 and issue #1888.
Co-authored-by: Test <test@test.com>
After #1675 removed ANTHROPIC_BASE_URL env var support, the Anthropic
provider base URL was hardcoded to https://api.anthropic.com. Now reads
model.base_url from config.yaml as an override, falling back to the
default when not set. Also applies to the auxiliary client.
Cherry-picked from PR #1949 by @rivercrab26.
Co-authored-by: rivercrab26 <rivercrab26@users.noreply.github.com>
Three bugs prevented providers like MiniMax from using their
Anthropic-compatible endpoints (e.g. api.minimax.io/anthropic):
1. _VALID_API_MODES was missing 'anthropic_messages', so explicit
api_mode config was silently rejected and defaulted to
chat_completions.
2. API-key provider resolution hardcoded api_mode to 'chat_completions'
without checking model config or detecting Anthropic-compatible URLs.
3. run_agent.py auto-detection only recognized api.anthropic.com, not
third-party endpoints using the /anthropic URL convention.
Fixes:
- Add 'anthropic_messages' to _VALID_API_MODES
- API-key providers now check model config api_mode and auto-detect
URLs ending in /anthropic
- run_agent.py and fallback logic detect /anthropic URL convention
- 5 new tests covering all scenarios
Users can now either:
- Set MINIMAX_BASE_URL=https://api.minimax.io/anthropic (auto-detected)
- Set api_mode: anthropic_messages in model config (explicit)
- Use custom_providers with api_mode: anthropic_messages
Co-authored-by: Test <test@test.com>
When provider: custom is set in config.yaml with base_url and api_key,
those values are now used instead of falling back to OPENAI_BASE_URL and
OPENAI_API_KEY env vars. Also reads the 'api' field as an alternative to
'api_key' for config compatibility.
Cherry-picked from PR #1762 by crazywriter1.
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
The previous copilot_model_api_mode() checked the catalog's
supported_endpoints first and picked /chat/completions when a model
supported both endpoints. This is wrong — GPT-5+ models should use
the Responses API even when the catalog lists both.
Replicate opencode's shouldUseCopilotResponsesApi() logic:
- GPT-5+ models (gpt-5.4, gpt-5.3-codex, etc.) → Responses API
- gpt-5-mini → Chat Completions (explicit exception)
- Everything else (gpt-4o, claude, gemini, etc.) → Chat Completions
- Model ID pattern is the primary signal, catalog is secondary
The catalog fallback now only matters for non-GPT-5 models that might
exclusively support /v1/messages (e.g. Claude via Copilot).
Models are auto-detected from the live catalog at
api.githubcopilot.com/models — no hardcoded list required for
supported models, only a static fallback for when the API is
unreachable.
Adds /statusbar (alias /sb) to show/hide the bottom status bar that
displays model name, context usage, and session duration.
Uses ConditionalContainer so the bar takes zero space when hidden
rather than leaving a blank line.
- Add anthropic/claude-haiku-4.5
- Move gpt-5.4-pro and gpt-5.4-nano to bottom
- Fix minimax/minimax-m2.7 → minimax-m2.5 (m2.7 not on OpenRouter)
- Tag hunter-alpha and healer-alpha as free
- Place hunter/healer-alpha right below gpt-5.4-mini
Builds on PR #1879's Copilot integration with critical auth improvements
modeled after opencode's implementation:
- Add hermes_cli/copilot_auth.py with:
- OAuth device code flow (copilot_device_code_login) using the same
client_id (Ov23li8tweQw6odWQebz) as opencode and Copilot CLI
- Token type validation: reject classic PATs (ghp_*) with a clear
error message explaining supported token types
- Proper env var priority: COPILOT_GITHUB_TOKEN > GH_TOKEN > GITHUB_TOKEN
(matching Copilot CLI documentation)
- copilot_request_headers() with Openai-Intent, x-initiator, and
Copilot-Vision-Request headers (matching opencode)
- Update auth.py:
- PROVIDER_REGISTRY copilot entry uses correct env var order
- _resolve_api_key_provider_secret delegates to copilot_auth for
the copilot provider with proper token validation
- Update models.py:
- copilot_default_headers() now includes Openai-Intent and x-initiator
- Update main.py:
- _model_flow_copilot offers OAuth device code login when no token
is found, with manual token entry as fallback
- Shows supported vs unsupported token types
- 22 new tests covering token validation, env var priority, header
generation, and integration with existing auth infrastructure
- Strip '_tools' suffix from internal toolset identifiers in the banner
(e.g. 'web_tools' -> 'web', 'homeassistant_tools' -> 'homeassistant')
- Stop appending '_tools' to unavailable toolset names
- Replace 6 hardcoded hex colors (#B8860B, #FFBF00, #FFF8DC) in toolset
rows, overflow line, and MCP server rows with the skin variables
(dim, accent, text) already resolved at the top of the function
Inspired by PR #1871 by @kshitijk4poor.
Adds 4 tests.
* fix: banner skill count now respects disabled skills and platform filtering
The banner's get_available_skills() was doing a raw rglob scan of
~/.hermes/skills/ without checking:
- Whether skills are disabled (skills.disabled config)
- Whether skills match the current platform (platforms: frontmatter)
This caused the banner to show inflated skill counts (e.g. '100 skills'
when many are disabled) and list macOS-only skills on Linux.
Fix: delegate to _find_all_skills() from tools/skills_tool which already
handles both platform gating and disabled-skill filtering.
* fix: system prompt and slash commands now respect disabled skills
Two more places where disabled skills were still surfaced:
1. build_skills_system_prompt() in prompt_builder.py — disabled skills
appeared in the <available_skills> system prompt section, causing
the agent to suggest/load them despite being disabled.
2. scan_skill_commands() in skill_commands.py — disabled skills still
registered as /skill-name slash commands in CLI help and could be
invoked.
Both now load _get_disabled_skill_names() and filter accordingly.
* fix: skill_view blocks disabled skills
skill_view() checked platform compatibility but not disabled state,
so the agent could still load and read disabled skills directly.
Now returns a clear error when a disabled skill is requested, telling
the user to enable it via hermes skills or inspect the files manually.
---------
Co-authored-by: Test <test@test.com>
Recognize hermes_cli/main.py gateway command lines in gateway
process detection and PID validation so --replace reliably finds
existing gateway instances.
Adds a regression test covering script-style cmdline detection.
Closes#1830
Add _wait_for_gateway_exit() that polls get_running_pid() to confirm
the old gateway process has actually exited before starting a new one.
If the process doesn't exit within 5s, sends SIGKILL to the specific
PID. Uses the saved PID from gateway.pid (not launchd labels) so it
works correctly with multiple gateway instances under separate
HERMES_HOME directories.
Applied to both launchd_restart() and the manual restart path (replaces
the blind time.sleep(2)).
Inspired by PR #1881 by @AzothZephyr (race condition diagnosis).
Adds 4 tests.
MiniMax: Add M2.7 and M2.7-highspeed as new defaults across provider
model lists, auxiliary client, metadata, setup wizard, RL training tool,
fallback tests, and docs. Retain M2.5/M2.1 as alternatives.
OpenRouter: Add grok-4.20-beta, nemotron-3-super-120b-a12b:free,
trinity-large-preview:free, glm-5-turbo, and hunter-alpha to the
model catalog.
MiniMax changes based on PR #1882 by @octo-patch (applied manually
due to stale conflicts in refactored pricing module).
Add first-class GitHub Copilot and Copilot ACP provider support across
model selection, runtime provider resolution, CLI sessions, delegated
subagents, cron jobs, and the Telegram gateway.
This also normalizes Copilot model catalogs and API modes, introduces a
Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by
resolving Homebrew-installed gh binaries under launchd.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The /browser command handler existed in cli.py but was never added to
COMMAND_REGISTRY after the centralized command registry refactor. This
meant:
- /browser didn't appear in /help
- No tab-completion or subcommand suggestions
- Dispatch used _base_word fallback instead of canonical resolution
Added CommandDef with connect/disconnect/status subcommands and
switched dispatch to use canonical instead of _base_word.
* feat: OpenAI-compatible API server platform adapter
Salvaged from PR #956, updated for current main.
Adds an HTTP API server as a gateway platform adapter that exposes
hermes-agent via the OpenAI Chat Completions and Responses APIs.
Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat,
AnythingLLM, NextChat, ChatBox, etc.) can connect by pointing at
http://localhost:8642/v1.
Endpoints:
- POST /v1/chat/completions — stateless Chat Completions API
- POST /v1/responses — stateful Responses API with chaining
- GET /v1/responses/{id} — retrieve stored response
- DELETE /v1/responses/{id} — delete stored response
- GET /v1/models — list hermes-agent as available model
- GET /health — health check
Features:
- Real SSE streaming via stream_delta_callback (uses main's streaming)
- In-memory LRU response store for Responses API conversation chaining
- Named conversations via 'conversation' parameter
- Bearer token auth (optional, via API_SERVER_KEY)
- CORS support for browser-based frontends
- System prompt layering (frontend system messages on top of core)
- Real token usage tracking in responses
Integration points:
- Platform.API_SERVER in gateway/config.py
- _create_adapter() branch in gateway/run.py
- API_SERVER_* env vars in hermes_cli/config.py
- Env var overrides in gateway/config.py _apply_env_overrides()
Changes vs original PR #956:
- Removed streaming infrastructure (already on main via stream_consumer.py)
- Removed Telegram reply_to_mode (separate feature, not included)
- Updated _resolve_model() -> _resolve_gateway_model()
- Updated stream_callback -> stream_delta_callback
- Updated connect()/disconnect() to use _mark_connected()/_mark_disconnected()
- Adapted to current Platform enum (includes MATTERMOST, MATRIX, DINGTALK)
Tests: 72 new tests, all passing
Docs: API server guide, Open WebUI integration guide, env var reference
* feat(whatsapp): make reply prefix configurable via config.yaml
Reworked from PR #1764 (ifrederico) to use config.yaml instead of .env.
The WhatsApp bridge prepends a header to every outgoing message.
This was hardcoded to '⚕ *Hermes Agent*'. Users can now customize
or disable it via config.yaml:
whatsapp:
reply_prefix: '' # disable header
reply_prefix: '🤖 *My Bot*\n───\n' # custom prefix
How it works:
- load_gateway_config() reads whatsapp.reply_prefix from config.yaml
and stores it in PlatformConfig.extra['reply_prefix']
- WhatsAppAdapter reads it from config.extra at init
- When spawning bridge.js, the adapter passes it as
WHATSAPP_REPLY_PREFIX in the subprocess environment
- bridge.js handles undefined (default), empty (no header),
or custom values with \\n escape support
- Self-chat echo suppression uses the configured prefix
Also fixes _config_version: was 9 but ENV_VARS_BY_VERSION had a
key 10 (TAVILY_API_KEY), so existing users at v9 would never be
prompted for Tavily. Bumped to 10 to close the gap. Added a
regression test to prevent this from happening again.
Credit: ifrederico (PR #1764) for the bridge.js implementation
and the config version gap discovery.
---------
Co-authored-by: Test <test@test.com>
The OpenCode Zen and OpenCode Go setup sections used prompt_text()
which is undefined. All other providers correctly use the local
prompt() function defined in setup.py. Fixes crash during
'hermes setup' when selecting either OpenCode provider.
- Add summary_base_url config option to compression block for custom
OpenAI-compatible endpoints (e.g. zai, DeepSeek, Ollama)
- Remove compression env var bridges from cli.py and gateway/run.py
(CONTEXT_COMPRESSION_* env vars no longer set from config)
- Switch run_agent.py to read compression config directly from
config.yaml instead of env vars
- Fix backwards-compat block in _resolve_task_provider_model to also
fire when auxiliary.compression.provider is 'auto' (DEFAULT_CONFIG
sets this, which was silently preventing the compression section's
summary_* keys from being read)
- Add test for summary_base_url config-to-client flow
- Update docs to show compression as config.yaml-only
Closes#1591
Based on PR #1702 by @uzaylisak
Four small fixes:
1. model_tools.py: Tool import failures logged at WARNING instead of
DEBUG. If a tool module fails to import (syntax error, missing dep),
the user now sees a warning instead of the tool silently vanishing.
2. hermes_cli/config.py: Remove duplicate 'import sys' (lines 19, 21).
3. agent/model_metadata.py: Remove 6 duplicate entries in
DEFAULT_CONTEXT_LENGTHS dict. Python keeps the last value, so no
functional change, but removes maintenance confusion.
4. hermes_state.py: Add missing self._lock to the LIKE query in
resolve_session_id(). The exact-match path used get_session()
(which locks internally), but the prefix fallback queried _conn
without the lock.
Salvage of PR #1707 by @kshitijk4poor (cherry-picked with authorship preserved).
Adds Tavily as a third web backend alongside Firecrawl and Parallel, using the Tavily REST API via httpx.
- Backend selection via hermes tools → saved as web.backend in config.yaml
- All three tools supported: search, extract, crawl
- TAVILY_API_KEY in config registry, doctor, status, setup wizard
- 15 new Tavily tests + 9 backend selection tests + 5 config tests
- Backward compatible
Closes#1707
* feat(web): add Parallel as alternative web search/extract backend
Adds Parallel (parallel.ai) as a drop-in alternative to Firecrawl for
web_search and web_extract tools using the official parallel-web SDK.
- Backend selection via WEB_SEARCH_BACKEND env var (auto/parallel/firecrawl)
- Auto mode prefers Firecrawl when both keys present; Parallel when sole backend
- web_crawl remains Firecrawl-only with clear error when unavailable
- Lazy SDK imports, interrupt support, singleton clients
- 16 new unit tests for backend selection and client config
Co-authored-by: s-jag <s-jag@users.noreply.github.com>
* fix: add PARALLEL_API_KEY to config registry and fix web_crawl policy tests
Follow-up for Parallel backend integration:
- Add PARALLEL_API_KEY to OPTIONAL_ENV_VARS (hermes doctor, env blocklist)
- Add to set_config_value api_keys list (hermes config set)
- Add to doctor keys display
- Fix 2 web_crawl policy tests that didn't set FIRECRAWL_API_KEY
(needed now that web_crawl has a Firecrawl availability guard)
* refactor: explicit backend selection via hermes tools, not auto-detect
Replace the auto-detect backend selection with explicit user choice:
- hermes tools saves WEB_SEARCH_BACKEND to .env when user picks a provider
- _get_backend() reads the explicit choice first
- Fallback only for manual/legacy config (uses whichever key is present)
- _is_provider_active() shows [active] for the selected web backend
- Updated tests, docs, and .env.example to remove 'auto' mode language
* refactor: use config.yaml for web backend, not env var
Match the TTS/browser pattern — web.backend is stored in config.yaml
(set by hermes tools), not as a WEB_SEARCH_BACKEND env var.
- _load_web_config() reads web: section from config.yaml
- _get_backend() reads web.backend from config, falls back to key detection
- _configure_provider() saves to config dict (saved to config.yaml)
- _is_provider_active() reads from config dict
- Removed WEB_SEARCH_BACKEND from .env.example, set_config_value, docs
- Updated all tests to mock _load_web_config instead of env vars
---------
Co-authored-by: s-jag <s-jag@users.noreply.github.com>
The function uses subprocess.run() and subprocess.CalledProcessError but
never imported the module. This caused a NameError crash during setup
when users selected NeuTTS as their TTS provider.
Fixes#1698
Add the ability to selectively enable/disable individual MCP server
tools through the interactive 'hermes tools' TUI.
Changes:
- tools/mcp_tool.py: Add probe_mcp_server_tools() — lightweight function
that temporarily connects to configured MCP servers, discovers their
tools (names + descriptions), and disconnects. No registry side effects.
- hermes_cli/tools_config.py: Add 'Configure MCP tools' option to the
interactive menu. When selected:
1. Probes all enabled MCP servers for their available tools
2. Shows a per-server curses checklist with tool descriptions
3. Pre-selects tools based on existing include/exclude config
4. Writes changes back as tools.exclude entries in config.yaml
5. Reports which servers failed to connect
The existing CLI commands (hermes tools enable/disable server:tool)
continue to work unchanged. This adds the interactive TUI counterpart
so users can browse and toggle MCP tools visually.
Tests: 22 new tests covering probe function edge cases and interactive
flow (pre-selection, exclude/include modes, description truncation,
multi-server handling, error paths).
fetch_nous_models() uses keyword-only parameters (the * separator in
its signature), but models.py called it with positional args and in
the wrong order (api_key first, base_url second). This always raised
TypeError, silently caught by except Exception: pass.
Result: Nous provider model list was completely broken — /model
autocomplete and provider_model_ids('nous') always fell back to the
static model catalog instead of fetching live models.
Adds both platforms to the config system so hermes setup, hermes doctor,
and hermes config properly discover and manage their env vars.
- MATTERMOST_URL, MATTERMOST_TOKEN, MATTERMOST_ALLOWED_USERS
- MATRIX_HOMESERVER, MATRIX_ACCESS_TOKEN, MATRIX_USER_ID, MATRIX_ALLOWED_USERS
- Extra env keys for .env sanitizer: MATTERMOST_HOME_CHANNEL,
MATTERMOST_REPLY_MODE, MATRIX_PASSWORD, MATRIX_ENCRYPTION, MATRIX_HOME_ROOM
Add support for Mattermost (self-hosted Slack alternative) and Matrix
(federated messaging protocol) as messaging platforms.
Mattermost adapter:
- REST API v4 client for posts, files, channels, typing indicators
- WebSocket listener for real-time 'posted' events with reconnect backoff
- Thread support via root_id
- File upload/download with auth-aware caching
- Dedup cache (5min TTL, 2000 entries)
- Full self-hosted instance support
Matrix adapter:
- matrix-nio AsyncClient with sync loop
- Dual auth: access token or user_id + password
- Optional E2EE via matrix-nio[e2e] (libolm)
- Thread support via m.thread (MSC3440)
- Reply support via m.in_reply_to with fallback stripping
- Media upload/download via mxc:// URLs (authenticated v1.11+ endpoint)
- Auto-join on room invite
- DM detection via m.direct account data with sync fallback
- Markdown to HTML conversion
Fixes applied over original PR #1225 by @cyb0rgk1tty:
- Mattermost: add timeout to file downloads, wrap API helpers in
try/except for network errors, download incoming files immediately
with auth headers instead of passing auth-required URLs
- Matrix: use authenticated media endpoint (/_matrix/client/v1/media/),
robust m.direct cache with sync fallback, prefer aiohttp over httpx
Install Matrix support: pip install 'hermes-agent[matrix]'
Mattermost needs no extra deps (uses aiohttp).
Salvaged from PR #1225 by @cyb0rgk1tty with fixes.
* feat(gateway): add DingTalk platform adapter
Add DingTalk as a messaging platform using the dingtalk-stream SDK
for real-time message reception via Stream Mode (no webhook needed).
Replies are sent via session webhook using markdown format.
Features:
- Stream Mode connection (long-lived WebSocket, no public URL needed)
- Text and rich text message support
- DM and group chat support
- Message deduplication with 5-minute window
- Auto-reconnection with exponential backoff
- Session webhook caching for reply routing
Configuration:
export DINGTALK_CLIENT_ID=your-app-key
export DINGTALK_CLIENT_SECRET=your-app-secret
# or in config.yaml:
platforms:
dingtalk:
enabled: true
extra:
client_id: your-app-key
client_secret: your-app-secret
Files:
- gateway/platforms/dingtalk.py (340 lines) — adapter implementation
- gateway/config.py — add DINGTALK to Platform enum
- gateway/run.py — add DingTalk to _create_adapter
- hermes_cli/config.py — add env vars to _EXTRA_ENV_KEYS
- hermes_cli/tools_config.py — add dingtalk to PLATFORMS
- tests/gateway/test_dingtalk.py — 21 tests
* docs: add Alibaba Cloud and DingTalk to setup wizard and docs
Wire Alibaba Cloud (DashScope) into hermes setup and hermes model
provider selection flows. Add DingTalk env vars to documentation.
Changes:
- setup.py: Add Alibaba Cloud as provider choice (index 11) with
DASHSCOPE_API_KEY prompt and model studio link
- main.py: Add alibaba to provider_labels, providers list, and
model flow dispatch
- environment-variables.md: Add DASHSCOPE_API_KEY, DINGTALK_CLIENT_ID,
DINGTALK_CLIENT_SECRET, and alibaba to HERMES_INFERENCE_PROVIDER
- Default enabled: false (zero overhead when not configured)
- Fast path: cached disabled state skips all work immediately
- TTL cache (30s) for parsed policy — avoids re-reading config.yaml
on every URL check
- Missing shared files warn + skip instead of crashing all web tools
- Lazy yaml import — missing PyYAML doesn't break browser toolset
- Guarded browser_tool import — fail-open lambda fallback
- check_website_access never raises for default path (fail-open with
warning log); only raises with explicit config_path (test mode)
- Simplified enforcement code in web_tools/browser_tool — no more
try/except wrappers since errors are handled internally
Add DingTalk as a messaging platform using the dingtalk-stream SDK
for real-time message reception via Stream Mode (no webhook needed).
Replies are sent via session webhook using markdown format.
Features:
- Stream Mode connection (long-lived WebSocket, no public URL needed)
- Text and rich text message support
- DM and group chat support
- Message deduplication with 5-minute window
- Auto-reconnection with exponential backoff
- Session webhook caching for reply routing
Configuration:
export DINGTALK_CLIENT_ID=your-app-key
export DINGTALK_CLIENT_SECRET=your-app-secret
# or in config.yaml:
platforms:
dingtalk:
enabled: true
extra:
client_id: your-app-key
client_secret: your-app-secret
Files:
- gateway/platforms/dingtalk.py (340 lines) — adapter implementation
- gateway/config.py — add DINGTALK to Platform enum
- gateway/run.py — add DingTalk to _create_adapter
- hermes_cli/config.py — add env vars to _EXTRA_ENV_KEYS
- hermes_cli/tools_config.py — add dingtalk to PLATFORMS
- tests/gateway/test_dingtalk.py — 21 tests
ANTHROPIC_BASE_URL collides with Claude Code and other Anthropic
tooling. Remove it from the Anthropic provider — base URL overrides
should go through config.yaml model.base_url instead.
The Alibaba/DashScope provider has its own dedicated base URL and
API key env vars which don't collide with anything.
Add display.theme_mode setting (auto/light/dark) that makes the CLI
readable on light terminal backgrounds.
- Auto-detect terminal background via COLORFGBG, OSC 11, and macOS
appearance (fallback chain in hermes_cli/colors.py)
- Add colors_light overrides to all 7 built-in skins with dark/readable
colors for light backgrounds
- SkinConfig.get_color() now returns light overrides when theme is light
- get_prompt_toolkit_style_overrides() uses light bg colors for
completion menus in light mode
- init_skin_from_config() reads display.theme_mode from config
- 7 new tests covering theme mode resolution, detection fallbacks,
and light-mode skin overrides
Salvaged from PR #1187 by @peteromallet. Core design preserved;
adapted to current main (kept all existing helpers, tool_emojis,
convenience functions that were added after the PR branched).
Co-authored-by: Peter O'Mallet <peteromallet@users.noreply.github.com>
Add Alibaba Cloud (DashScope) as a first-class inference provider
using the Anthropic-compatible endpoint. This gives access to Qwen
models (qwen3.5-plus, qwen3-max, qwen3-coder-plus, etc.) through
the same api_mode as native Anthropic.
Also add ANTHROPIC_BASE_URL env var support so users can point the
Anthropic provider at any compatible endpoint.
Changes:
- auth.py: Add alibaba ProviderConfig + ANTHROPIC_BASE_URL on anthropic
- models.py: Add alibaba to catalog, labels, aliases (dashscope/aliyun/qwen), provider order
- runtime_provider.py: Add alibaba resolution (anthropic_messages api_mode) + ANTHROPIC_BASE_URL
- model_metadata.py: Add Qwen model context lengths (128K)
- config.py: Add DASHSCOPE_API_KEY, DASHSCOPE_BASE_URL, ANTHROPIC_BASE_URL env vars
Usage:
hermes --provider alibaba --model qwen3.5-plus
# or via aliases:
hermes --provider qwen --model qwen3-max
Add Kilo Gateway (kilo.ai) as an API-key provider with OpenAI-compatible
endpoint at https://api.kilo.ai/api/gateway. Supports 500+ models from
Anthropic, OpenAI, Google, xAI, Mistral, MiniMax via a single API key.
- Register kilocode in PROVIDER_REGISTRY with aliases (kilo, kilo-code,
kilo-gateway) and KILOCODE_API_KEY / KILOCODE_BASE_URL env vars
- Add to model catalog, CLI provider menu, setup wizard, doctor checks
- Add google/gemini-3-flash-preview as default aux model
- 12 new tests covering registration, aliases, credential resolution,
runtime config
- Documentation updates (env vars, config, fallback providers)
- Fix setup test index shift from provider insertion
Inspired by PR #1473 by @amanning3390.
Co-authored-by: amanning3390 <amanning3390@users.noreply.github.com>
Docker terminal sessions are secret-dark by default. This adds
terminal.docker_forward_env as an explicit allowlist for env vars
that may be forwarded into Docker containers.
Values resolve from the current shell first, then fall back to
~/.hermes/.env. Only variables the user explicitly lists are
forwarded — nothing is auto-exposed.
Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto
current main.
Fixes#1436
Supersedes #1439
Remove the optional skill (redundant now that NeuTTS is a built-in TTS
provider). Replace neutts_cli dependency with a standalone synthesis
helper (tools/neutts_synth.py) that calls the neutts Python API directly
in a subprocess.
Add TTS provider selection to hermes setup:
- 'hermes setup' now prompts for TTS provider after model selection
- 'hermes setup tts' available as standalone section
- Selecting NeuTTS checks for deps and offers to install:
espeak-ng (system) + neutts[all] (pip)
- ElevenLabs/OpenAI selections prompt for API keys
- Tool status display shows NeuTTS install state
Changes:
- Remove optional-skills/mlops/models/neutts/ (skill + CLI scaffold)
- Add tools/neutts_synth.py (standalone synthesis subprocess helper)
- Move jo.wav/jo.txt to tools/neutts_samples/ (bundled default voice)
- Refactor _generate_neutts() — uses neutts API via subprocess, no
neutts_cli dependency, config-driven ref_audio/ref_text/model/device
- Add TTS setup to hermes_cli/setup.py (SETUP_SECTIONS, tool status)
- Update config.py defaults (ref_audio, ref_text, model, device)
* fix: prevent infinite 400 failure loop on context overflow (#1630)
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message. This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error. Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.
Three-layer fix:
1. run_agent.py — Fallback heuristic: when a 400 error has a very short
generic message AND the session is large (>40% of context or >80
messages), treat it as a probable context overflow and trigger
compression instead of aborting.
2. run_agent.py + gateway/run.py — Don't persist failed messages:
when the agent returns failed=True before generating any response,
skip writing the user's message to the transcript/DB. This prevents
the session from growing on each failure.
3. gateway/run.py — Smarter error messages: detect context-overflow
failures and suggest /compact or /reset specifically, instead of a
generic 'try again' that will fail identically.
* fix(skills): detect prompt injection patterns and block cache file reads
Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):
1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
(index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
was the original injection vector — untrusted skill descriptions
in the catalog contained adversarial text that the model executed.
2. skill_view: warns when skills are loaded from outside the trusted
~/.hermes/skills/ directory, and detects common injection patterns
in skill content ("ignore previous instructions", "<system>", etc.).
Cherry-picked from PR #1562 by ygd58.
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.
- Apply truncate_message() chunking in _send_to_platform() before
dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement
Cherry-picked from PR #1557 by llbn.
* fix(approval): show full command in dangerous command approval (#1553)
Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:
- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests
Cherry-picked from PR #1566 by crazywriter1.
* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)
The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.
Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.
* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)
When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.
Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.
Cherry-picked from PR #1593 by ygd58.
* fix(security): block sandbox backend creds from subprocess env (#1264)
Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.
Cherry-picked from PR #1571 by ygd58.
* fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816)
When a user sends multiple messages while the agent keeps failing,
_run_agent() calls itself recursively with no depth limit. This can
exhaust stack/memory if the agent is in a failure loop.
Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is
logged and the current result is returned instead of recursing deeper.
The log handler duplication bug described in #816 was already fixed
separately (AIAgent.__init__ deduplicates handlers).
* fix(gateway): /model shows active fallback model instead of config default (#1615)
When the agent falls back to a different model (e.g. due to rate
limiting), /model still showed the config default. Now tracks the
effective model/provider after each agent run and displays it.
Cleared when the primary model succeeds again or the user explicitly
switches via /model.
Cherry-picked from PR #1616 by MaxKerkula. Added hasattr guard for
test compatibility.
* feat(gateway): inject reply-to message context for out-of-session replies (#1594)
When a user replies to a Telegram message, check if the quoted text
exists in the current session transcript. If missing (from cron jobs,
background tasks, or old sessions), prepend [Replying to: "..."] to
the message so the agent has context about what's being referenced.
- Add reply_to_text field to MessageEvent (base.py)
- Populate from Telegram's reply_to_message (text or caption)
- Inject context in _handle_message when not found in history
Based on PR #1596 by anpicasso (cherry-picked reply-to feature only,
excluded unrelated /server command and background delegation changes).
* fix: recognize Claude Code OAuth credentials in startup gate (#1455)
The _has_any_provider_configured() startup check didn't look for
Claude Code OAuth credentials (~/.claude/.credentials.json). Users
with only Claude Code auth got the setup wizard instead of starting.
Cherry-picked from PR #1455 by kshitijk4poor.
---------
Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
Co-authored-by: Max K <MaxKerkula@users.noreply.github.com>
Co-authored-by: Angello Picasso <angello.picasso@devsu.com>
Co-authored-by: kshitij <kshitijk4poor@users.noreply.github.com>
* fix(security): harden terminal safety and sandbox file writes
Two security improvements:
1. Dangerous command detection: expand shell -c pattern to catch
combined flags (bash -lc, bash -ic, ksh -c) that were previously
undetected. Pattern changed from matching only 'bash -c' to
matching any shell invocation with -c anywhere in the flags.
2. File write sandboxing: add HERMES_WRITE_SAFE_ROOT env var that
constrains all write_file/patch operations to a configured directory
tree. Opt-in — when unset, behavior is unchanged. Useful for
gateway/messaging deployments that should only touch a workspace.
Based on PR #1085 by ismoilh.
* fix: correct "POSIDEON" typo to "POSEIDON" in banner ASCII art
The poseidon skin's banner_logo had the E and I letters swapped,
spelling "POSIDEON-AGENT" instead of "POSEIDON-AGENT".
---------
Co-authored-by: ismoilh <ismoilh@users.noreply.github.com>
Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
* feat(skills): add bundled neutts optional skill
Add NeuTTS optional skill with CLI scaffold, bootstrap helper, and
sample voice profile. Also fixes skills_hub.py to handle binary
assets (WAV files) during skill installation.
Changes:
- optional-skills/mlops/models/neutts/ — skill + CLI scaffold
- tools/skills_hub.py — binary asset support (read_bytes, write_bytes)
- tests/tools/test_skills_hub.py — regression tests for binary assets
* feat(tts): add NeuTTS as local TTS provider backend
Add NeuTTS as a fourth TTS provider option alongside Edge, ElevenLabs,
and OpenAI. NeuTTS runs fully on-device via neutts_cli — no API key
needed.
Provider behavior:
- Explicit: set tts.provider to 'neutts' in config.yaml
- Fallback: when Edge TTS is unavailable and neutts_cli is installed,
automatically falls back to NeuTTS instead of failing
- check_tts_requirements() now includes NeuTTS in availability checks
NeuTTS outputs WAV natively. For Telegram voice bubbles, ffmpeg
converts to Opus (same pattern as Edge TTS).
Changes:
- tools/tts_tool.py — _generate_neutts(), _check_neutts_available(),
provider dispatch, fallback logic, Opus conversion
- hermes_cli/config.py — tts.neutts config defaults
---------
Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
Remove HERMES_API_MODE env var. api_mode is now configured where the
endpoint is defined:
- model.api_mode in config.yaml (for the active model config)
- custom_providers[].api_mode (for named custom providers)
Replace _get_configured_api_mode() with _parse_api_mode() which just
validates a value against the whitelist without reading env vars.
Both paths (model config and named custom providers) now read api_mode
from their respective config entries rather than a global override.
* fix: prevent infinite 400 failure loop on context overflow (#1630)
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message. This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error. Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.
Three-layer fix:
1. run_agent.py — Fallback heuristic: when a 400 error has a very short
generic message AND the session is large (>40% of context or >80
messages), treat it as a probable context overflow and trigger
compression instead of aborting.
2. run_agent.py + gateway/run.py — Don't persist failed messages:
when the agent returns failed=True before generating any response,
skip writing the user's message to the transcript/DB. This prevents
the session from growing on each failure.
3. gateway/run.py — Smarter error messages: detect context-overflow
failures and suggest /compact or /reset specifically, instead of a
generic 'try again' that will fail identically.
* fix(skills): detect prompt injection patterns and block cache file reads
Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):
1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
(index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
was the original injection vector — untrusted skill descriptions
in the catalog contained adversarial text that the model executed.
2. skill_view: warns when skills are loaded from outside the trusted
~/.hermes/skills/ directory, and detects common injection patterns
in skill content ("ignore previous instructions", "<system>", etc.).
Cherry-picked from PR #1562 by ygd58.
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.
- Apply truncate_message() chunking in _send_to_platform() before
dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement
Cherry-picked from PR #1557 by llbn.
* fix(approval): show full command in dangerous command approval (#1553)
Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:
- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests
Cherry-picked from PR #1566 by crazywriter1.
* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)
The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.
Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.
* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)
When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.
Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.
Cherry-picked from PR #1593 by ygd58.
---------
Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
Add in-session tool management via /tools disable/enable/list, plus
hermes tools list/disable/enable CLI subcommands. Supports both
built-in toolsets (web, memory) and MCP tools (github:create_issue).
To preserve prompt caching, /tools disable/enable in a chat session
saves the change to config and resets the session cleanly — the user
is asked to confirm before the reset happens.
Also improves prefix matching: /qui now dispatches to /quit instead
of showing ambiguous when longer skill commands like /quint-pipeline
are installed.
Based on PR #1520 by @YanSte.
Co-authored-by: Yannick Stephan <YanSte@users.noreply.github.com>
Add HERMES_API_MODE env var and model.api_mode config field to let
custom OpenAI-compatible endpoints opt into codex_responses mode
without requiring the OpenAI Codex OAuth provider path.
- _get_configured_api_mode() reads HERMES_API_MODE env (precedence)
then model.api_mode from config.yaml; validates against whitelist
- Applied in both _resolve_openrouter_runtime() and
_resolve_named_custom_runtime() (original PR only covered openrouter)
- Fix _dump_api_request_debug() to show /responses URL when in
codex_responses mode instead of always showing /chat/completions
- Tests for config override, env override, invalid values, named
custom providers, and debug dump URL for both API modes
Inspired by PR #1041 by @mxyhi.
Co-authored-by: mxyhi <mxyhi@users.noreply.github.com>
Add support for OpenCode Zen (pay-as-you-go, 35+ curated models) and
OpenCode Go ($10/month subscription, open models) as first-class providers.
Both are OpenAI-compatible endpoints resolved via the generic api_key
provider flow — no custom adapter needed.
Files changed:
- hermes_cli/auth.py — ProviderConfig entries + aliases
- hermes_cli/config.py — OPENCODE_ZEN/GO API key env vars
- hermes_cli/models.py — model catalogs, labels, aliases, provider order
- hermes_cli/main.py — provider labels, menu entries, model flow dispatch
- hermes_cli/setup.py — setup wizard branches (idx 10, 11)
- agent/model_metadata.py — context lengths for all OpenCode models
- agent/auxiliary_client.py — default aux models
- .env.example — documentation
Co-authored-by: DevAgarwal2 <DevAgarwal2@users.noreply.github.com>
Fixes hanging when using /skills install or /skills uninstall from the
TUI — bare input() calls hang inside prompt_toolkit's event loop.
Changes:
- Add skip_confirm parameter to do_install() and do_uninstall()
- Separate --yes/-y (confirmation bypass) from --force (scan override)
in both argparse and slash command handlers
- Update usage hint for /skills uninstall to show [--yes]
The original PR (#1595) accidentally deleted the install_from_quarantine()
call, which would have broken all installs. That bug is not present here.
Based on PR #1595 by 333Alden333.
Co-authored-by: 333Alden333 <333Alden333@users.noreply.github.com>
Add 'custom' to the provider order so custom OpenAI-compatible
endpoints appear in /model list. Probes the endpoint's /models API
to dynamically discover available models.
Changes:
- Add 'custom' to _PROVIDER_ORDER in list_available_providers()
- Add _get_custom_base_url() helper to read model.base_url from config
- Add custom branch in provider_model_ids() using fetch_api_models()
- Custom endpoint detection via base_url presence for has_creds check
Based on PR #1612 by @aashizpoudel.
Co-authored-by: Aashish Poudel <aashizpoudel@users.noreply.github.com>
* feat(cli): two-stage /model autocomplete with ghost text suggestions
- SlashCommandCompleter: Tab-complete providers first (anthropic:, openrouter:, etc.)
then models within the selected provider
- SlashCommandAutoSuggest: inline ghost text for slash commands, subcommands,
and /model provider:model two-stage suggestions
- Custom Tab key binding: accepts provider completion and immediately
re-triggers completions to show that provider's models
- COMMANDS_BY_CATEGORY: structured format with explicit subcommands for
tab completion and ghost text (prompt, reasoning, voice, skills, cron, browser)
- SUBCOMMANDS dict auto-extracted from command definitions
- Model/provider info cached 60s for responsive completions
* fix: repair test regression and restore gold color from PR #1622
- Fix test_unknown_command_still_shows_error: patch _cprint instead of
console.print to match the _cprint switch in process_command()
- Restore gold color on 'Type /help' hint using _DIM + _GOLD constants
instead of bare \033[2m (was losing the #B8860B gold)
- Use _GOLD constant for ambiguous command message for consistency
- Add clarifying comment on SUBCOMMANDS regex fallback
---------
Co-authored-by: Lars van der Zande <lmvanderzande@gmail.com>
- Bump _config_version 8 → 9
- Move stale ANTHROPIC_TOKEN clearing into 'if current_ver < 9' block
so it only runs once during the upgrade, not on every migrate_config()
- ANTHROPIC_TOKEN is still a valid auth path (OAuth flow), so we don't
want to clear it repeatedly — only during the one-time migration from
old setups that left it stale
- Add test_skips_on_version_9_or_later to verify one-time behavior
- All tests set config version 8 to trigger migration
- Remove *** placeholder detection from _sanitize_env_lines (was based on
confusing terminal redaction with literal file content)
- Add migrate_config() logic to clear stale ANTHROPIC_TOKEN when better
credentials exist (ANTHROPIC_API_KEY or Claude Code auto-discovery)
- Old ANTHROPIC_TOKEN values shadow Claude Code credential fallthrough,
breaking auth for users who updated without re-running setup
- Preserves ANTHROPIC_TOKEN when it's the only auth method available
- 3 new migration tests, updated existing tests
Fixes two corruption patterns that break API keys during updates:
1. Concatenated KEY=VALUE pairs on a single line due to missing newlines
(e.g. ANTHROPIC_API_KEY=sk-...OPENAI_BASE_URL=https://...). Uses a
known-keys set to safely detect and split concatenated entries without
false-splitting values that contain uppercase text.
2. Stale KEY=*** placeholder entries left by incomplete setup runs that
never get updated and shadow real credentials.
Changes:
- Add _sanitize_env_lines() that splits concatenated known keys and drops
*** placeholders
- Add sanitize_env_file() public API for explicit repair
- Call sanitization in save_env_value() on every read (self-healing)
- Call sanitize_env_file() at the start of migrate_config() so existing
corrupted files are repaired on update
- 12 new tests covering splits, placeholders, edge cases, and integration
Introduce a cloud browser provider abstraction so users can switch
between Local Browser, Browserbase, and Browser Use (or future providers)
via hermes tools / hermes setup.
Cloud browser providers are behind an ABC (tools/browser_providers/base.py)
so adding a new provider is a single-file addition with no changes to
browser_tool.py internals.
Changes:
- tools/browser_providers/ package with ABC, Browserbase extraction,
and Browser Use provider
- browser_tool.py refactored to use _PROVIDER_REGISTRY + _get_cloud_provider()
(cached) instead of hardcoded _is_local_mode() / _create_browserbase_session()
- tools_config.py: generic _is_provider_active() / _detect_active_provider_index()
replace TTS-only logic; Browser Use added as third browser option
- config.py: BROWSER_USE_API_KEY added to OPTIONAL_ENV_VARS + show_config + allowlist
- subprocess pipe hang fix: agent-browser daemon inherits pipe fds,
communicate() blocks. Replaced with Popen + temp files.
Original PR: #1208
Co-authored-by: ShawnPana <shawnpana@users.noreply.github.com>
* feat: add Vercel AI Gateway as a first-class provider
Adds AI Gateway (ai-gateway.vercel.sh) as a new inference provider
with AI_GATEWAY_API_KEY authentication, live model discovery, and
reasoning support via extra_body.reasoning.
Based on PR #1492 by jerilynzheng.
* feat: add AI Gateway to setup wizard, doctor, and fallback providers
* test: add AI Gateway to api_key_providers test suite
* feat: add AI Gateway to hermes model CLI and model metadata
Wire AI Gateway into the interactive model selection menu and add
context lengths for AI Gateway model IDs in model_metadata.py.
* feat: use claude-haiku-4.5 as AI Gateway auxiliary model
* revert: use gemini-3-flash as AI Gateway auxiliary model
* fix: move AI Gateway below established providers in selection order
---------
Co-authored-by: jerilynzheng <jerilynzheng@users.noreply.github.com>
Co-authored-by: jerilynzheng <zheng.jerilyn@gmail.com>
* refactor: centralize slash command registry
Replace 7+ scattered command definition sites with a single
CommandDef registry in hermes_cli/commands.py. All downstream
consumers now derive from this registry:
- CLI process_command() resolves aliases via resolve_command()
- Gateway _known_commands uses GATEWAY_KNOWN_COMMANDS frozenset
- Gateway help text generated by gateway_help_lines()
- Telegram BotCommands generated by telegram_bot_commands()
- Slack subcommand map generated by slack_subcommand_map()
Adding a command or alias is now a one-line change to
COMMAND_REGISTRY instead of touching 6+ files.
Bugfixes included:
- Telegram now registers /rollback, /background (were missing)
- Slack now has /voice, /update, /reload-mcp (were missing)
- Gateway duplicate 'reasoning' dispatch (dead code) removed
- Gateway help text can no longer drift from CLI help
Backwards-compatible: COMMANDS and COMMANDS_BY_CATEGORY dicts are
rebuilt from the registry, so existing imports work unchanged.
* docs: update developer docs for centralized command registry
Update AGENTS.md with full 'Slash Command Registry' and 'Adding a
Slash Command' sections covering CommandDef fields, registry helpers,
and the one-line alias workflow.
Also update:
- CONTRIBUTING.md: commands.py description
- website/docs/reference/slash-commands.md: reference central registry
- docs/plans/centralize-command-registry.md: mark COMPLETED
- plans/checkpoint-rollback.md: reference new pattern
- hermes-agent-dev skill: architecture table
* chore: remove stale plan docs
Repair stale launchd/systemd definitions during install and
teach launchd start to reload unloaded jobs before retrying.
Stop masking service restart failures by falling back to a
foreground gateway when a configured service manager is still
broken.
Refs: #1613
* fix: Anthropic OAuth compatibility — Claude Code identity fingerprinting
Anthropic routes OAuth/subscription requests based on Claude Code's
identity markers. Without them, requests get intermittent 500 errors
(~25% failure rate observed). This matches what pi-ai (clawdbot) and
OpenCode both implement for OAuth compatibility.
Changes (OAuth tokens only — API key users unaffected):
1. Headers: user-agent 'claude-cli/2.1.2 (external, cli)' + x-app 'cli'
2. System prompt: prepend 'You are Claude Code, Anthropic's official CLI'
3. System prompt sanitization: replace Hermes/Nous references
4. Tool names: prefix with 'mcp_' (Claude Code convention for non-native tools)
5. Tool name stripping: remove 'mcp_' prefix from response tool calls
Before: 9/12 OK, 1 hard fail, 4 needed retries (~25% error rate)
After: 16/16 OK, 0 failures, 0 retries (0% error rate)
* fix: auto-detect DBUS_SESSION_BUS_ADDRESS for systemctl --user on headless servers
On SSH sessions to headless servers, DBUS_SESSION_BUS_ADDRESS and
XDG_RUNTIME_DIR may not be set even when the user's systemd instance
is running via linger. This causes 'systemctl --user' to fail with
'Failed to connect to bus: No medium found', breaking gateway
restart/start/stop as a service and falling back to foreground mode.
Add _ensure_user_systemd_env() that detects the standard D-Bus socket
at /run/user/<UID>/bus and sets the env vars before any systemctl --user
call. Called from _systemctl_cmd() so all existing call sites benefit
automatically with zero changes.
Fixes: gateway restart falling back to foreground on headless servers
* fix: show linger guidance when gateway restart fails during update and gateway restart
When systemctl --user restart fails during 'hermes update' or
'hermes gateway restart', check linger status and tell the user
exactly what to run (sudo -S -p '' loginctl enable-linger) instead of
silently falling back to foreground mode.
Also applies _ensure_user_systemd_env() to the raw systemctl calls
in cmd_update so they work properly on SSH sessions where D-Bus
env vars are missing.
* feat: add optional smart model routing
Add a conservative cheap-vs-strong routing option that can send very short/simple turns to a cheaper model across providers while keeping the primary model for complex work. Wire it through CLI, gateway, and cron, and document the config.yaml workflow.
* fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s
* fix(gateway): avoid recursive ExecStop in user systemd unit
* fix: extend ExecStop removal and TimeoutStopSec=60 to system unit
The cherry-picked PR #1448 fix only covered the user systemd unit.
The system unit had the same TimeoutStopSec=15 and could benefit
from the same 60s timeout for clean shutdown. Also adds a regression
test for the system unit.
---------
Co-authored-by: Ninja <ninja@local>
* feat(skills): add blender-mcp optional skill for 3D modeling
Control a running Blender instance from Hermes via socket connection
to the blender-mcp addon (port 9876). Supports creating 3D objects,
materials, animations, and running arbitrary bpy code.
Placed in optional-skills/ since it requires Blender 4.3+ desktop
with a third-party addon manually started each session.
* feat(acp): support slash commands in ACP adapter (#1532)
Adds /help, /model, /tools, /context, /reset, /compact, /version
to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled
directly in the server without instantiating the TUI — each command
queries agent/session state and returns plain text.
Unrecognized /commands fall through to the LLM as normal messages.
/model uses detect_provider_for_model() for auto-detection when
switching models, matching the CLI and gateway behavior.
Fixes#1402
* fix(logging): improve error logging in session search tool (#1533)
* fix(gateway): restart on retryable startup failures (#1517)
* feat(email): add skip_attachments option via config.yaml
* feat(email): add skip_attachments option via config.yaml
Adds a config.yaml-driven option to skip email attachments in the
gateway email adapter. Useful for malware protection and bandwidth
savings.
Configure in config.yaml:
platforms:
email:
skip_attachments: true
Based on PR #1521 by @an420eth, changed from env var to config.yaml
(via PlatformConfig.extra) to match the project's config-first pattern.
* docs: document skip_attachments option for email adapter
* fix(telegram): retry on transient TLS failures during connect and send
Add exponential-backoff retry (3 attempts) around initialize() to
handle transient TLS resets during gateway startup. Also catches
TimedOut and OSError in addition to NetworkError.
Add exponential-backoff retry (3 attempts) around send_message() for
NetworkError during message delivery, wrapping the existing Markdown
fallback logic.
Both imports are guarded with try/except ImportError for test
environments where telegram is mocked.
Based on PR #1527 by cmd8. Closes#1526.
* feat: permissive block_anchor thresholds and unicode normalization (#1539)
Salvaged from PR #1528 by an420eth. Closes#517.
Improves _strategy_block_anchor in fuzzy_match.py:
- Add unicode normalization (smart quotes, em/en-dashes, ellipsis,
non-breaking spaces → ASCII) so LLM-produced unicode artifacts
don't break anchor line matching
- Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for
multiple candidates — if first/last lines match exactly, the
block is almost certainly correct
- Use original (non-normalized) content for offset calculation to
preserve correct character positions
Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space
anchors, very-low-similarity unique matches), zero regressions on
all 9 existing fuzzy match tests.
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
* feat(cli): add file path autocomplete in the input prompt (#1545)
When typing a path-like token (./ ../ ~/ / or containing /),
the CLI now shows filesystem completions in the dropdown menu.
Directories show a trailing slash and 'dir' label; files show
their size. Completions are case-insensitive and capped at 30
entries.
Triggered by tokens like:
edit ./src/ma → shows ./src/main.py, ./src/manifest.json, ...
check ~/doc → shows ~/docs/, ~/documents/, ...
read /etc/hos → shows /etc/hosts, /etc/hostname, ...
open tools/reg → shows tools/registry.py
Slash command autocomplete (/help, /model, etc.) is unaffected —
it still triggers when the input starts with /.
Inspired by OpenCode PR #145 (file path completion menu).
Implementation:
- hermes_cli/commands.py: _extract_path_word() detects path-like
tokens, _path_completions() yields filesystem Completions with
size labels, get_completions() routes to paths vs slash commands
- tests/hermes_cli/test_path_completion.py: 26 tests covering
path extraction, prefix filtering, directory markers, home
expansion, case-insensitivity, integration with slash commands
* feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled
Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:
- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)
Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.
Inspired by OpenClaw PR #47959.
* fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs)
Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.
Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.
* feat: smart approvals + /stop command (inspired by OpenAI Codex)
* feat: smart approvals — LLM-based risk assessment for dangerous commands
Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.
Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).
Config (config.yaml):
approvals:
mode: manual # manual (default), smart, off
Modes:
- manual — current behavior, always prompt the user
- smart — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
or ESCALATE (fall through to manual prompt)
- off — skip all approval prompts (equivalent to --yolo)
When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.
The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.
* feat: make smart approval model configurable via config.yaml
Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).
Config:
auxiliary:
approval:
provider: auto
model: '' # fast/cheap model recommended
base_url: ''
api_key: ''
Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.
* feat: add /stop command to kill all background processes
Adds a /stop slash command that kills all running background processes
at once. Currently users have to process(list) then process(kill) for
each one individually.
Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.
Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.
* feat: first-class plugin architecture + hide status bar cost by default (#1544)
The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:
display:
show_cost: true
in config.yaml, or: hermes config set display.show_cost true
The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.
Status bar without cost:
⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m
Status bar with show_cost: true:
⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m
* feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex)
* feat: improve memory prioritization — user preferences over procedural knowledge
Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493)
which focus memory writes on user preferences and recurring patterns
rather than procedural task details.
Key insight: 'Optimize for reducing future user steering — the most
valuable memory prevents the user from having to repeat themselves.'
Changes:
- MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy
and the core principle about reducing user steering
- MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put
corrections first, added explicit PRIORITY guidance
- Memory nudge (run_agent.py): now asks specifically about preferences,
corrections, and workflow patterns instead of generic 'anything'
- Memory flush (run_agent.py): now instructs to prioritize user
preferences and corrections over task-specific details
* feat: more aggressive skill creation and update prompting
Press harder on skill updates — the agent should proactively patch
skills when it encounters issues during use, not wait to be asked.
Changes:
- SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction
to patch skills immediately when found outdated/wrong
- Skills header: added instruction to update loaded skills before finishing
if they had missing steps or wrong commands
- Skill nudge: more assertive ('save the approach' not 'consider saving'),
now also prompts for updating existing skills used in the task
- Skill nudge interval: lowered default from 15 to 10 iterations
- skill_manage schema: added 'patch it immediately' to update triggers
* feat: first-class plugin architecture (#1555)
Plugin system for extending Hermes with custom tools, hooks, and
integrations — no source code changes required.
Core system (hermes_cli/plugins.py):
- Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and
pip entry_points (hermes_agent.plugins group)
- PluginContext with register_tool() and register_hook()
- 6 lifecycle hooks: pre/post tool_call, pre/post llm_call,
on_session_start/end
- Namespace package handling for relative imports in plugins
- Graceful error isolation — broken plugins never crash the agent
Integration (model_tools.py):
- Plugin discovery runs after built-in + MCP tools
- Plugin tools bypass toolset filter via get_plugin_tool_names()
- Pre/post tool call hooks fire in handle_function_call()
CLI:
- /plugins command shows loaded plugins, tool counts, status
- Added to COMMANDS dict for autocomplete
Docs:
- Getting started guide (build-a-hermes-plugin.md) — full tutorial
building a calculator plugin step by step
- Reference page (features/plugins.md) — quick overview + tables
- Covers: file structure, schemas, handlers, hooks, data files,
bundled skills, env var gating, pip distribution, common mistakes
Tests: 16 tests covering discovery, loading, hooks, tool visibility.
* feat: add /bg as alias for /background slash command
Adds /bg alias across CLI, gateway, and Slack platform adapter.
Updates help text, autocomplete, known_commands set, and dispatch
logic. Includes tests for the new alias.
* docs: add plan for centralized slash command registry
Scopes a refactor to replace 7+ scattered command definition sites
with a single CommandDef registry in hermes_cli/commands.py. Includes
derived helper functions for gateway help text, Telegram BotCommands,
Slack subcommand maps, and alias resolution.
Documents current drift (Telegram missing /rollback + /background,
Slack missing /voice + /update, gateway dead code) that the refactor
fixes for free.
---------
Co-authored-by: Ninja <ninja@local>
Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com>
Co-authored-by: JP Lew <polydegen@protonmail.com>
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
* feat: add optional smart model routing
Add a conservative cheap-vs-strong routing option that can send very short/simple turns to a cheaper model across providers while keeping the primary model for complex work. Wire it through CLI, gateway, and cron, and document the config.yaml workflow.
* fix(gateway): remove recursive ExecStop from systemd units, extend TimeoutStopSec to 60s
* fix(gateway): avoid recursive ExecStop in user systemd unit
* fix: extend ExecStop removal and TimeoutStopSec=60 to system unit
The cherry-picked PR #1448 fix only covered the user systemd unit.
The system unit had the same TimeoutStopSec=15 and could benefit
from the same 60s timeout for clean shutdown. Also adds a regression
test for the system unit.
---------
Co-authored-by: Ninja <ninja@local>
* feat(skills): add blender-mcp optional skill for 3D modeling
Control a running Blender instance from Hermes via socket connection
to the blender-mcp addon (port 9876). Supports creating 3D objects,
materials, animations, and running arbitrary bpy code.
Placed in optional-skills/ since it requires Blender 4.3+ desktop
with a third-party addon manually started each session.
* feat(acp): support slash commands in ACP adapter (#1532)
Adds /help, /model, /tools, /context, /reset, /compact, /version
to the ACP adapter (VS Code, Zed, JetBrains). Commands are handled
directly in the server without instantiating the TUI — each command
queries agent/session state and returns plain text.
Unrecognized /commands fall through to the LLM as normal messages.
/model uses detect_provider_for_model() for auto-detection when
switching models, matching the CLI and gateway behavior.
Fixes#1402
* fix(logging): improve error logging in session search tool (#1533)
* fix(gateway): restart on retryable startup failures (#1517)
* feat(email): add skip_attachments option via config.yaml
* feat(email): add skip_attachments option via config.yaml
Adds a config.yaml-driven option to skip email attachments in the
gateway email adapter. Useful for malware protection and bandwidth
savings.
Configure in config.yaml:
platforms:
email:
skip_attachments: true
Based on PR #1521 by @an420eth, changed from env var to config.yaml
(via PlatformConfig.extra) to match the project's config-first pattern.
* docs: document skip_attachments option for email adapter
* fix(telegram): retry on transient TLS failures during connect and send
Add exponential-backoff retry (3 attempts) around initialize() to
handle transient TLS resets during gateway startup. Also catches
TimedOut and OSError in addition to NetworkError.
Add exponential-backoff retry (3 attempts) around send_message() for
NetworkError during message delivery, wrapping the existing Markdown
fallback logic.
Both imports are guarded with try/except ImportError for test
environments where telegram is mocked.
Based on PR #1527 by cmd8. Closes#1526.
* feat: permissive block_anchor thresholds and unicode normalization (#1539)
Salvaged from PR #1528 by an420eth. Closes#517.
Improves _strategy_block_anchor in fuzzy_match.py:
- Add unicode normalization (smart quotes, em/en-dashes, ellipsis,
non-breaking spaces → ASCII) so LLM-produced unicode artifacts
don't break anchor line matching
- Lower thresholds: 0.10 for unique matches (was 0.70), 0.30 for
multiple candidates — if first/last lines match exactly, the
block is almost certainly correct
- Use original (non-normalized) content for offset calculation to
preserve correct character positions
Tested: 3 new scenarios fixed (em-dash anchors, non-breaking space
anchors, very-low-similarity unique matches), zero regressions on
all 9 existing fuzzy match tests.
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
* feat(cli): add file path autocomplete in the input prompt (#1545)
When typing a path-like token (./ ../ ~/ / or containing /),
the CLI now shows filesystem completions in the dropdown menu.
Directories show a trailing slash and 'dir' label; files show
their size. Completions are case-insensitive and capped at 30
entries.
Triggered by tokens like:
edit ./src/ma → shows ./src/main.py, ./src/manifest.json, ...
check ~/doc → shows ~/docs/, ~/documents/, ...
read /etc/hos → shows /etc/hosts, /etc/hostname, ...
open tools/reg → shows tools/registry.py
Slash command autocomplete (/help, /model, etc.) is unaffected —
it still triggers when the input starts with /.
Inspired by OpenCode PR #145 (file path completion menu).
Implementation:
- hermes_cli/commands.py: _extract_path_word() detects path-like
tokens, _path_completions() yields filesystem Completions with
size labels, get_completions() routes to paths vs slash commands
- tests/hermes_cli/test_path_completion.py: 26 tests covering
path extraction, prefix filtering, directory markers, home
expansion, case-insensitivity, integration with slash commands
* feat(privacy): redact PII from LLM context when privacy.redact_pii is enabled
Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:
- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)
Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.
Inspired by OpenClaw PR #47959.
* fix(privacy): skip PII redaction on Discord/Slack (mentions need real IDs)
Discord uses <@user_id> for mentions and Slack uses <@U12345> — the LLM
needs the real ID to tag users. Redaction now only applies to WhatsApp,
Signal, and Telegram where IDs are pure routing metadata.
Add 4 platform-specific tests covering Discord, WhatsApp, Signal, Slack.
* feat: smart approvals + /stop command (inspired by OpenAI Codex)
* feat: smart approvals — LLM-based risk assessment for dangerous commands
Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.
Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).
Config (config.yaml):
approvals:
mode: manual # manual (default), smart, off
Modes:
- manual — current behavior, always prompt the user
- smart — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
or ESCALATE (fall through to manual prompt)
- off — skip all approval prompts (equivalent to --yolo)
When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.
The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.
* feat: make smart approval model configurable via config.yaml
Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).
Config:
auxiliary:
approval:
provider: auto
model: '' # fast/cheap model recommended
base_url: ''
api_key: ''
Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.
* feat: add /stop command to kill all background processes
Adds a /stop slash command that kills all running background processes
at once. Currently users have to process(list) then process(kill) for
each one individually.
Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.
Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.
* feat: first-class plugin architecture + hide status bar cost by default (#1544)
The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:
display:
show_cost: true
in config.yaml, or: hermes config set display.show_cost true
The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.
Status bar without cost:
⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m
Status bar with show_cost: true:
⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m
* feat: improve memory prioritization + aggressive skill updates (inspired by OpenAI Codex)
* feat: improve memory prioritization — user preferences over procedural knowledge
Inspired by OpenAI Codex's memory prompt improvements (openai/codex#14493)
which focus memory writes on user preferences and recurring patterns
rather than procedural task details.
Key insight: 'Optimize for reducing future user steering — the most
valuable memory prevents the user from having to repeat themselves.'
Changes:
- MEMORY_GUIDANCE (prompt_builder.py): added prioritization hierarchy
and the core principle about reducing user steering
- MEMORY_SCHEMA (memory_tool.py): reordered WHEN TO SAVE list to put
corrections first, added explicit PRIORITY guidance
- Memory nudge (run_agent.py): now asks specifically about preferences,
corrections, and workflow patterns instead of generic 'anything'
- Memory flush (run_agent.py): now instructs to prioritize user
preferences and corrections over task-specific details
* feat: more aggressive skill creation and update prompting
Press harder on skill updates — the agent should proactively patch
skills when it encounters issues during use, not wait to be asked.
Changes:
- SKILLS_GUIDANCE: 'consider saving' → 'save'; added explicit instruction
to patch skills immediately when found outdated/wrong
- Skills header: added instruction to update loaded skills before finishing
if they had missing steps or wrong commands
- Skill nudge: more assertive ('save the approach' not 'consider saving'),
now also prompts for updating existing skills used in the task
- Skill nudge interval: lowered default from 15 to 10 iterations
- skill_manage schema: added 'patch it immediately' to update triggers
* feat: first-class plugin architecture (#1555)
Plugin system for extending Hermes with custom tools, hooks, and
integrations — no source code changes required.
Core system (hermes_cli/plugins.py):
- Plugin discovery from ~/.hermes/plugins/, .hermes/plugins/, and
pip entry_points (hermes_agent.plugins group)
- PluginContext with register_tool() and register_hook()
- 6 lifecycle hooks: pre/post tool_call, pre/post llm_call,
on_session_start/end
- Namespace package handling for relative imports in plugins
- Graceful error isolation — broken plugins never crash the agent
Integration (model_tools.py):
- Plugin discovery runs after built-in + MCP tools
- Plugin tools bypass toolset filter via get_plugin_tool_names()
- Pre/post tool call hooks fire in handle_function_call()
CLI:
- /plugins command shows loaded plugins, tool counts, status
- Added to COMMANDS dict for autocomplete
Docs:
- Getting started guide (build-a-hermes-plugin.md) — full tutorial
building a calculator plugin step by step
- Reference page (features/plugins.md) — quick overview + tables
- Covers: file structure, schemas, handlers, hooks, data files,
bundled skills, env var gating, pip distribution, common mistakes
Tests: 16 tests covering discovery, loading, hooks, tool visibility.
* fix: hermes update causes dual gateways on macOS (launchd)
Three bugs worked together to create the dual-gateway problem:
1. cmd_update only checked systemd for gateway restart, completely
ignoring launchd on macOS. After killing the PID it would print
'Restart it with: hermes gateway run' even when launchd was about
to auto-respawn the process.
2. launchd's KeepAlive.SuccessfulExit=false respawns the gateway
after SIGTERM (non-zero exit), so the user's manual restart
created a second instance.
3. The launchd plist lacked --replace (systemd had it), so the
respawned gateway didn't kill stale instances on startup.
Fixes:
- Add --replace to launchd ProgramArguments (matches systemd)
- Add launchd detection to cmd_update's auto-restart logic
- Print 'auto-restart via launchd' instead of manual restart hint
* fix: add launchd plist auto-refresh + explicit restart in cmd_update
Two integration issues with the initial fix:
1. Existing macOS users with old plist (no --replace) would never
get the fix until manual uninstall/reinstall. Added
refresh_launchd_plist_if_needed() — mirrors the existing
refresh_systemd_unit_if_needed(). Called from launchd_start(),
launchd_restart(), and cmd_update.
2. cmd_update relied on KeepAlive respawn after SIGTERM rather than
explicit launchctl stop/start. This caused races: launchd would
respawn the old process before the PID file was cleaned up.
Now does explicit stop+start (matching how systemd gets an
explicit systemctl restart), with plist refresh first so the
new --replace flag is picked up.
---------
Co-authored-by: Ninja <ninja@local>
Co-authored-by: alireza78a <alireza78a@users.noreply.github.com>
Co-authored-by: Oktay Aydin <113846926+aydnOktay@users.noreply.github.com>
Co-authored-by: JP Lew <polydegen@protonmail.com>
Co-authored-by: an420eth <an420eth@users.noreply.github.com>
Streaming is now off by default for both CLI and gateway. Users opt in:
CLI (config.yaml):
display:
streaming: true
Gateway (config.yaml):
streaming:
enabled: true
This lets early adopters test streaming while existing users see zero
change. Once we have enough field validation, we flip the default to
true in a subsequent release.
The persistent status bar now shows context %, token counts, and
duration but NOT $ cost by default. Cost display is opt-in via:
display:
show_cost: true
in config.yaml, or: hermes config set display.show_cost true
The /usage command still shows full cost breakdown since the user
explicitly asked for it — this only affects the always-visible bar.
Status bar without cost:
⚕ claude-sonnet-4 │ 12K/200K │ 6% │ 15m
Status bar with show_cost: true:
⚕ claude-sonnet-4 │ 12K/200K │ 6% │ $0.06 │ 15m
Add /browser slash command for connecting browser tools to the user's
live Chrome instance via Chrome DevTools Protocol:
/browser connect — connect to Chrome on localhost:9222
/browser connect ws://host:port — custom CDP endpoint
/browser disconnect — revert to default (headless/Browserbase)
/browser status — show current browser mode + connectivity
When connected:
- All browser tools (navigate, snapshot, click, etc.) control the
user's real Chrome — logged-in sessions, cookies, open tabs
- Platform-specific Chrome launch instructions are shown
- Port connectivity is tested immediately
- A context message is injected so the model knows it's controlling
a live browser and should be mindful of user's open tabs
Implementation:
- BROWSER_CDP_URL env var drives the backend selection in browser_tool.py
- New _create_cdp_session() creates sessions using the CDP override
- _get_cdp_override() checked before local/Browserbase selection
- Existing agent-browser --cdp flag handles the actual CDP connection
Inspired by OpenClaw's browser profile system.
* feat: smart approvals — LLM-based risk assessment for dangerous commands
Adds a 'smart' approval mode that uses the auxiliary LLM to assess
whether a flagged command is genuinely dangerous or a false positive,
auto-approving low-risk commands without prompting the user.
Inspired by OpenAI Codex's Smart Approvals guardian subagent
(openai/codex#13860).
Config (config.yaml):
approvals:
mode: manual # manual (default), smart, off
Modes:
- manual — current behavior, always prompt the user
- smart — aux LLM evaluates risk: APPROVE (auto-allow), DENY (block),
or ESCALATE (fall through to manual prompt)
- off — skip all approval prompts (equivalent to --yolo)
When smart mode auto-approves, the pattern gets session-level approval
so subsequent uses of the same pattern don't trigger another LLM call.
When it denies, the command is blocked without user prompt. When
uncertain, it escalates to the normal manual approval flow.
The LLM prompt is carefully scoped: it sees only the command text and
the flagged reason, assesses actual risk vs false positive, and returns
a single-word verdict.
* feat: make smart approval model configurable via config.yaml
Adds auxiliary.approval section to config.yaml with the same
provider/model/base_url/api_key pattern as other aux tasks (vision,
web_extract, compression, etc.).
Config:
auxiliary:
approval:
provider: auto
model: '' # fast/cheap model recommended
base_url: ''
api_key: ''
Bridged to env vars in both CLI and gateway paths so the aux client
picks them up automatically.
* feat: add /stop command to kill all background processes
Adds a /stop slash command that kills all running background processes
at once. Currently users have to process(list) then process(kill) for
each one individually.
Inspired by OpenAI Codex's separation of interrupt (Ctrl+C stops current
turn) from /stop (cleans up background processes). See openai/codex#14602.
Ctrl+C continues to only interrupt the active agent turn — background
dev servers, watchers, etc. are preserved. /stop is the explicit way
to clean them all up.
When typing a path-like token (./ ../ ~/ / or containing /),
the CLI now shows filesystem completions in the dropdown menu.
Directories show a trailing slash and 'dir' label; files show
their size. Completions are case-insensitive and capped at 30
entries.
Triggered by tokens like:
edit ./src/ma → shows ./src/main.py, ./src/manifest.json, ...
check ~/doc → shows ~/docs/, ~/documents/, ...
read /etc/hos → shows /etc/hosts, /etc/hostname, ...
open tools/reg → shows tools/registry.py
Slash command autocomplete (/help, /model, etc.) is unaffected —
it still triggers when the input starts with /.
Inspired by OpenCode PR #145 (file path completion menu).
Implementation:
- hermes_cli/commands.py: _extract_path_word() detects path-like
tokens, _path_completions() yields filesystem Completions with
size labels, get_completions() routes to paths vs slash commands
- tests/hermes_cli/test_path_completion.py: 26 tests covering
path extraction, prefix filtering, directory markers, home
expansion, case-insensitivity, integration with slash commands
Add privacy.redact_pii config option (boolean, default false). When
enabled, the gateway redacts personally identifiable information from
the system prompt before sending it to the LLM provider:
- Phone numbers (user IDs on WhatsApp/Signal) → hashed to user_<sha256>
- User IDs → hashed to user_<sha256>
- Chat IDs → numeric portion hashed, platform prefix preserved
- Home channel IDs → hashed
- Names/usernames → NOT affected (user-chosen, publicly visible)
Hashes are deterministic (same user → same hash) so the model can
still distinguish users in group chats. Routing and delivery use
the original values internally — redaction only affects LLM context.
Inspired by OpenClaw PR #47959.
Keep Docker sandboxes isolated by default. Add an explicit terminal.docker_mount_cwd_to_workspace opt-in, thread it through terminal/file environment creation, and document the security tradeoff and config.yaml workflow clearly.
* fix(gateway): avoid recursive ExecStop in user systemd unit
* fix: extend ExecStop removal and TimeoutStopSec=60 to system unit
The cherry-picked PR #1448 fix only covered the user systemd unit.
The system unit had the same TimeoutStopSec=15 and could benefit
from the same 60s timeout for clean shutdown. Also adds a regression
test for the system unit.
---------
Co-authored-by: Ninja <ninja@local>
Checkpoint & rollback upgrades:
1. Enabled by default — checkpoints are now on for all new sessions.
Zero cost when no file-mutating tools fire. Disable with
checkpoints.enabled: false in config.yaml.
2. Diff preview — /rollback diff <N> shows a git diff between the
checkpoint and current working tree before committing to a restore.
3. File-level restore — /rollback <N> <file> restores a single file
from a checkpoint instead of the entire directory.
4. Conversation undo on rollback — when restoring files, the last
chat turn is automatically undone so the agent's context matches
the restored filesystem state.
5. Terminal command checkpoints — destructive terminal commands (rm,
mv, sed -i, truncate, git reset/clean, output redirects) now
trigger automatic checkpoints before execution. Previously only
write_file and patch were covered.
6. Change summary in listing — /rollback now shows file count and
+insertions/-deletions for each checkpoint.
7. Fixed dead code — removed duplicate _run_git call in
list_checkpoints with nonsensical --all if False condition.
8. Updated help text — /rollback with no args now shows available
subcommands (diff, file-level restore).
Multiple Hermes installations on the same machine now get unique
systemd service names:
- Default ~/.hermes → hermes-gateway (backward compatible)
- Custom HERMES_HOME → hermes-gateway-<8-char-hash>
Changes:
- Add get_service_name() in hermes_cli/gateway.py that derives a
deterministic service name from HERMES_HOME via SHA256
- Replace all hardcoded 'hermes-gateway' systemd references with
get_service_name() across gateway.py, main.py, status.py, uninstall.py
- Add HERMES_HOME env var to both user and system systemd unit templates
so the gateway process uses the correct installation
- Update tests to use get_service_name() in assertions
cmd_update only ran 'systemctl --user restart hermes-gateway', which
left manually-started gateway processes alive, causing duplicates.
Now uses get_running_pid() from gateway/status.py (scoped to
HERMES_HOME) to find and SIGTERM this installation's gateway before
restarting. Safe with multiple Hermes installations since each
HERMES_HOME has its own PID file.
If no systemd service exists, informs the user to restart manually.
Based on PR #1131 by teknium1. Dropped the cli.py Rich from_ansi
changes (already on main).
When typing /model deepseek-chat while on a different provider, the
model name now auto-resolves to the correct provider instead of
silently staying on the wrong one and causing API errors.
Detection priority:
1. Direct provider with credentials (e.g. DEEPSEEK_API_KEY set)
2. OpenRouter catalog match with proper slug remapping
3. Direct provider without creds (clear error beats silent failure)
Also adds DeepSeek as a first-class API-key provider — just set
DEEPSEEK_API_KEY and /model deepseek-chat routes directly.
Bare model names get remapped to proper OpenRouter slugs:
/model gpt-5.4 → openai/gpt-5.4
/model claude-opus-4.6 → anthropic/claude-opus-4.6
Salvages the concept from PR #1177 by @virtaava with credential
awareness and OpenRouter slug mapping added.
Co-authored-by: virtaava <virtaava@users.noreply.github.com>
_update_config_for_provider() was called immediately after provider
selection for zai, kimi-coding, minimax, minimax-cn, and anthropic —
before model selection happened. Since the gateway re-reads config.yaml
per-message, this created a race where the gateway would pick up the
new provider but still use the old (incompatible) model name.
Capture selected_base_url in each provider block, then call
_update_config_for_provider() once, after model selection completes,
right before save_config(). The in-memory _set_model_provider() calls
stay in place so the config object remains consistent during setup.
Closes#1182
- Add 'emoji' field to ToolEntry and 'get_emoji()' to ToolRegistry
- Add emoji= to all 50+ registry.register() calls across tool files
- Add get_tool_emoji() helper in agent/display.py with 3-tier resolution:
skin override → registry default → hardcoded fallback
- Replace hardcoded emoji maps in run_agent.py, delegate_tool.py, and
gateway/run.py with centralized get_tool_emoji() calls
- Add 'tool_emojis' field to SkinConfig so skins can override per-tool
emojis (e.g. ares skin could use swords instead of wrenches)
- Add 11 tests (5 registry emoji, 6 display/skin integration)
- Update AGENTS.md skin docs table
Based on the approach from PR #1061 by ForgingAlex (emoji centralization
in registry). This salvage fixes several issues from the original:
- Does NOT split the cronjob tool (which would crash on missing schemas)
- Does NOT change image_generate toolset/requires_env/is_async
- Does NOT delete existing tests
- Completes the centralization (gateway/run.py was missed)
- Hooks into the skin system for full customizability
SSH persistent shell now defaults to true — non-local backends benefit
most from state persistence across execute() calls. Local backend
remains opt-in via TERMINAL_LOCAL_PERSISTENT env var.
New config.yaml option: terminal.persistent_shell (default: true)
Controls the default for non-local backends. Users can disable with:
hermes config set terminal.persistent_shell false
Precedence: per-backend env var > TERMINAL_PERSISTENT_SHELL > default.
Wired through cli.py, gateway/run.py, and hermes_cli/config.py so the
config.yaml value reaches terminal_tool via env var bridge.
Two changes to align Discord behavior with Slack:
1. Auto-thread on @mention (default: true)
- When someone @mentions the bot in a server channel, a thread is
automatically created from their message and the response goes there.
- Each thread gets its own isolated session (like Slack).
- Configurable via discord.auto_thread in config.yaml (default: true)
or DISCORD_AUTO_THREAD env var (env takes precedence).
- DMs and existing threads are unaffected.
2. Skip @mention in bot-participated threads
- Once the bot has responded in a thread (auto-created or manually
entered), subsequent messages in that thread no longer require
@mention. Users can just type normally.
- Tracked via in-memory set (_bot_participated_threads). After a
gateway restart, users need to @mention once to re-establish.
- Threads the bot hasn't participated in still require @mention.
Config change:
discord:
auto_thread: true # new, added to DEFAULT_CONFIG
Tests: 7 new tests covering auto-thread default, disable, bot thread
participation tracking, and mention skip logic. All 903 gateway tests pass.
Hermes startup entrypoints now load ~/.hermes/.env and project fallback env files with user config taking precedence over stale shell-exported values. This makes model/provider/base URL changes in .env actually take effect after restarting Hermes. Adds a shared env loader plus regression coverage, and reproduces the original bug case where OPENAI_BASE_URL and HERMES_INFERENCE_PROVIDER remained stuck on old shell values before import.
Resolve session IDs by exact match or unique prefix for sessions delete/export/rename so IDs copied from Preview Last Active Src ID
──────────────────────────────────────────────────────────────────────────────────────────
Search for GitHub/GitLab source repositories for 11m ago cli 20260315_034720_8e1f
[SYSTEM: The user has invoked the "minecraft-atm 1m ago cli 20260315_034035_57b6
1h ago cron cron_job-1_20260315_
[SYSTEM: The user has invoked the "hermes-agent- 9m ago cli 20260315_014304_652a
4h ago cron cron_job-1_20260314_
[The user attached an image. Here's what it cont 4h ago cli 20260314_233806_c8f3
[SYSTEM: The user has invoked the "google-worksp 1h ago cli 20260314_233301_b04f
Inspect the opencode codebase for how it sends m 4h ago cli 20260314_232543_0601
Inspect the clawdbot codebase for how it sends m 4h ago cli 20260314_232543_8125
4h ago cron cron_job-1_20260314_
Reply with exactly: smoke-ok 4h ago cli 20260314_231730_aac9
4h ago cron cron_job-1_20260314_
[SYSTEM: The user has invoked the "hermes-agent- 4h ago cli 20260314_231111_3586
[SYSTEM: The user has invoked the "hermes-agent- 4h ago cli 20260314_225551_daff
5h ago cron cron_job-1_20260314_
[SYSTEM: The user has invoked the "google-worksp 4h ago cli 20260314_224629_a9c6
k_sze — 10:34 PM Just ran hermes update and I 5h ago cli 20260314_224243_544e
5h ago cron cron_job-1_20260314_
5h ago cron cron_job-1_20260314_
5h ago cron cron_job-1_20260314_ work even when the table view truncates them. Add SessionDB prefix-resolution coverage and a CLI regression test for deleting by listed prefix.
_save_platform_tools() overwrote the entire platform_toolsets list with
only the toolsets known to CONFIGURABLE_TOOLSETS. This silently dropped
any MCP server toolsets that users had added manually to config.yaml.
Fix: collect any existing toolset keys that are not in CONFIGURABLE_TOOLSETS
and append them back after the wizard's selections are written. This ensures
MCP toolsets survive a hermes tools save.
Fixes#1247
- add stt.enabled to the default user config
- make transcription_tools respect the disabled flag globally
- surface disabled state cleanly in voice mode diagnostics
- add regression coverage for disabled STT provider selection
- mark private-channel scopes/events as optional
- note reinstall requirement after scope/event changes
- correct Slack allowlist messaging to match gateway behavior
- Add background thread mechanism (prefetch_update_check/get_update_result)
so git fetch runs in parallel with skill sync and agent init
- Fix repo path fallback in check_for_updates() for dev installs
- Remove duplicate build_welcome_banner (~180 lines) and
_format_context_length from cli.py — the banner.py version is
now the single source of truth
- Port skin banner_hero/banner_logo support and terminal width check
from cli.py's version into banner.py
- Add update status output to hermes version command
- Add unit tests for update check, prefetch, and version string
Add base_url/api_key overrides for auxiliary tasks and delegation so users can
route those flows straight to a custom OpenAI-compatible endpoint without
having to rely on provider=main or named custom providers.
Also clear gateway session env vars in test isolation so the full suite stays
deterministic when run from a messaging-backed agent session.
Moonshot (legacy key) users were shown kimi-for-coding and
kimi-k2-thinking-turbo which only work on the Coding Plan endpoint
(api.kimi.com/coding/v1). Add a separate "moonshot" model list that
excludes plan-specific models.
`hermes update` crashed with CalledProcessError when run on a local-only
branch (e.g. fix/stoicneko) because `git rev-list HEAD..origin/{branch}`
fails when origin/{branch} doesn't exist. Now verifies the remote branch
exists first and falls back to origin/main.
Fixes#1005
Without linger, user-level systemd services stop when the SSH session
ends — even though systemctl --user status shows active (running).
Changes to systemd_install():
- Try loginctl enable-linger automatically (succeeds when the process
has the required privileges)
- If loginctl fails (no privileges), print a clear, copy-pasteable
warning with the exact command the user must run
New helper: _ensure_linger_enabled()
- Fast path: checks /var/lib/systemd/linger/<user> (no subprocess)
- Auto-enable: loginctl enable-linger <user>
- Fallback: actionable warning with sudo command + restart instructions
Tests: 4 new tests in TestEnsureLingerEnabled, 205 passed total
Keep the argparse CLI aligned with the slash command so --yes and -y
behave the same as --force for hermes skills install.
Add a parser-level regression test.
Salvaged from PR #1007 by stablegenius49.
- let INSTALL_POLICY decide dangerous verdict handling for builtin skills
- allow --force to override blocked dangerous decisions for trusted and community sources
- accept --yes / -y as aliases for --force in /skills install
- update regression tests to match the intended policy precedence
Follow up on salvaged PR #1012.
Prevents raw custom-provider names from intercepting built-in provider ids,
and keeps the regression coverage focused on current-main behavior.
Add regression coverage for the new provider-aware vision setup flow and make the default OpenAI choice write AUXILIARY_VISION_MODEL so auxiliary vision requests don't fall back to the main model slug.
The old flow blindly asked for an OpenRouter API key after ANY non-OR
provider selection, even for Nous Portal and Codex which already
support vision natively. This was confusing and annoying.
New behavior:
- OpenRouter: skip — vision uses Gemini via their OR key
- Nous Portal OAuth: skip — vision uses Gemini via Nous
- OpenAI Codex: skip — gpt-5.3-codex supports vision
- Custom endpoint (api.openai.com): show OpenAI vision model picker
(gpt-4o, gpt-4o-mini, gpt-4.1, etc.), saves AUXILIARY_VISION_MODEL
- Custom (other) / z.ai / kimi / minimax / nous-api:
- First checks if existing OR/Nous creds already cover vision
- If not, offers friendly choice: OpenRouter / OpenAI / Skip
- No more 'enter OpenRouter key' thrown in your face
Also fixes the setup summary to check actual vision availability
across all providers instead of hardcoding 'requires OPENROUTER_API_KEY'.
MoA still correctly requires OpenRouter (calls multiple frontier models).
Update the unknown-subcommand config help output to use placeholder syntax too,
and extend the placeholder regression tests to cover show_config() and that
fallback help path.
Round out the skills hub integration with:
- richer skills.sh metadata and security surfacing during inspect/install
- generic check/update flows for hub-installed skills
- support for well-known Agent Skills endpoints via /.well-known/skills/index.json
Also persist upstream bundle metadata in the lock file and add
regression coverage plus live-compatible path handling for both
skills.sh aliases and well-known endpoints.