Salvage of PR #2173 (hanai) and PR #3432 (timknip).
Injects PATH, VIRTUAL_ENV, and HERMES_HOME into the macOS launchd plist so gateway subprocesses find user-installed tools (node, ffmpeg, etc.). Matches systemd unit parity with venv/bin, node_modules/.bin, and resolved node dir in PATH. Includes 7 new tests and docs updates across 4 pages.
Co-Authored-By: Han <ihanai1991@gmail.com>
Co-Authored-By: timknip <timknip@users.noreply.github.com>
* Fixing mattermost configuration parsing bugs
* fix: add homeassistant to skills_config + platform consistency tests
Follow-up for cherry-picked #3512:
- Add homeassistant to skills_config.py PLATFORMS (was in tools_config
but missing from skills_config)
- Add 3 consistency tests that verify all platforms in tools_config have
matching toolset definitions, gateway includes, and skills_config entries
— prevents this class of bug from recurring
---------
Co-authored-by: DaneelV3 <dannel@v3rtical.tech>
* feat: GPT tool-use steering + strip budget warnings from history
Two changes to improve tool reliability, especially for OpenAI GPT models:
1. GPT tool-use enforcement prompt: Adds GPT_TOOL_USE_GUIDANCE to the
system prompt when the model name contains 'gpt' and tools are loaded.
This addresses a known behavioral pattern where GPT models describe
intended actions ('I will run the tests') instead of actually making
tool calls. Inspired by similar steering in OpenCode (beast.txt) and
Cline (GPT-5.1 variant).
2. Budget warning history stripping: Budget pressure warnings injected by
_get_budget_warning() into tool results are now stripped when
conversation history is replayed via run_conversation(). Previously,
these turn-scoped signals persisted across turns, causing models to
avoid tool calls in all subsequent messages after any turn that hit
the 70-90% iteration threshold.
* fix: replace hardcoded ~/.hermes paths with get_hermes_home() for profile support
Prep for the upcoming profiles feature — each profile is a separate
HERMES_HOME directory, so all paths must respect the env var.
Fixes:
- gateway/platforms/matrix.py: Matrix E2EE store was hardcoded to
~/.hermes/matrix/store, ignoring HERMES_HOME. Now uses
get_hermes_home() so each profile gets its own Matrix state.
- gateway/platforms/telegram.py: Two locations reading config.yaml via
Path.home()/.hermes instead of get_hermes_home(). DM topic thread_id
persistence and hot-reload would read the wrong config in a profile.
- tools/file_tools.py: Security path for hub index blocking was
hardcoded to ~/.hermes, would miss the actual profile's hub cache.
- hermes_cli/gateway.py: Service naming now uses the profile name
(hermes-gateway-coder) instead of a cryptic hash suffix. Extracted
_profile_suffix() helper shared by systemd and launchd.
- hermes_cli/gateway.py: Launchd plist path and Label now scoped per
profile (ai.hermes.gateway-coder.plist). Previously all profiles
would collide on the same plist file on macOS.
- hermes_cli/gateway.py: Launchd plist now includes HERMES_HOME in
EnvironmentVariables — was missing entirely, making custom
HERMES_HOME broken on macOS launchd (pre-existing bug).
- All launchctl commands in gateway.py, main.py, status.py updated
to use get_launchd_label() instead of hardcoded string.
Test fixes: DM topic tests now set HERMES_HOME env var alongside
Path.home() mock. Launchd test uses get_launchd_label() for expected
commands.
The TOOL_USE_ENFORCEMENT_GUIDANCE injection (added in #3528) was
hardcoded to only match gpt/codex model names. This makes it a
config option so users can turn it on for any model family.
New config key: agent.tool_use_enforcement
- "auto" (default): matches gpt/codex (existing behavior)
- true: inject for all models
- false: never inject
- list of strings: custom model-name substrings to match
e.g. ["gpt", "codex", "deepseek", "qwen"]
No version bump needed — deep merge provides the default
automatically for existing installs.
12 new tests covering all config modes.
dict.get(key, default) only returns default when key is ABSENT.
When YAML has 'key:' with no value, it parses as None — .get()
returns None, then .strip() crashes with AttributeError.
Use (x or '') pattern to handle both missing and null cases.
Salvaged from PR #3217.
Co-authored-by: erosika <erosika@users.noreply.github.com>
The minimax-specific auto-correction in runtime_provider.py was
preventing users from overriding to the OpenAI-compatible endpoint
via MINIMAX_BASE_URL. Users in certain regions get nginx 404 on
api.minimax.io/anthropic and need to switch to api.minimax.chat/v1.
The generic URL-suffix detection already handles /anthropic →
anthropic_messages, so the minimax-specific code was redundant for
the default path and harmful for the override path.
Now: default /anthropic URL works via generic detection, user
override to /v1 gets chat_completions mode naturally.
Closes#3546 (different approach — respects user overrides instead
of changing the default endpoint).
Drop the swe-rex dependency for Modal terminal backend and use the
Modal SDK directly (Sandbox.create + Sandbox.exec). This fixes:
- AsyncUsageWarning from synchronous App.lookup() in async context
- DeprecationError from unencrypted_ports / .url on unencrypted tunnels
(deprecated 2026-03-05)
The new implementation:
- Uses modal.App.lookup.aio() for async-safe app creation
- Uses Sandbox.create.aio() with 'sleep infinity' entrypoint
- Uses Sandbox.exec.aio() for direct command execution (no HTTP server
or tunnel needed)
- Keeps all existing features: persistent filesystem snapshots,
configurable resources (CPU/memory/disk), sudo support, interrupt
handling, _AsyncWorker for event loop safety
Consistent with the Docker backend precedent (PR #2804) where we
removed mini-swe-agent in favor of direct docker run.
Files changed:
- tools/environments/modal.py - core rewrite
- tools/terminal_tool.py - health check: modal instead of swerex
- hermes_cli/setup.py - install modal instead of swe-rex[modal]
- pyproject.toml - modal extra: modal>=1.0.0 instead of swe-rex[modal]
- scripts/kill_modal.sh - grep for hermes-agent instead of swe-rex
- tests/ - updated for new implementation
- environments/README.md - updated patches section
- website/docs - updated install command
The plugin system defined six lifecycle hooks but only pre_tool_call and
post_tool_call were invoked. This activates the remaining four so that
external plugins (e.g. memory systems) can hook into the conversation
loop without touching core code.
Hook semantics:
- on_session_start: fires once when a new session is created
- pre_llm_call: fires once per turn before the tool-calling loop;
plugins can return {"context": "..."} to inject into the ephemeral
system prompt (not cached, not persisted)
- post_llm_call: fires once per turn after the loop completes, with
user_message and assistant_response for sync/storage
- on_session_end: fires at the end of every run_conversation call
invoke_hook() now returns a list of non-None callback return values,
enabling pre_llm_call context injection while remaining backward
compatible (existing hooks that return None are unaffected).
Salvaged from PR #2823.
Co-authored-by: Nicolò Boschi <boschi1997@gmail.com>
Add ~/.local/bin, ~/.cargo/bin, ~/go/bin, ~/.npm-global/bin to the
systemd unit PATH so tools installed via uv/pipx/cargo/go are
discoverable by MCP servers and terminal commands.
Uses a _build_user_local_paths() helper that checks exists() before
adding, and correctly resolves home dir for both user and system
service types.
Co-authored-by: Kal Sze <ksze@users.noreply.github.com>
* fix: harden `hermes update` against diverged history, non-main branches, and gateway edge cases
The self-update command (`hermes update` / gateway `/update`) could fail
or silently corrupt state in several scenarios:
1. **Diverged history** — `git pull --ff-only` aborts with a cryptic
subprocess error when upstream has force-pushed or rebased. Now falls
back to `git reset --hard origin/main` since local changes are already
stashed.
2. **User on a feature branch / detached HEAD** — the old code would
either clobber the feature branch HEAD to point at origin/main, or
silently pull against a non-existent remote branch. Now auto-checkouts
main before pulling, with a clear warning.
3. **Fetch failures** — network or auth errors produced raw subprocess
tracebacks. Now shows user-friendly messages ("Network error",
"Authentication failed") with actionable hints.
4. **reset --hard failure** — if the fallback reset itself fails (disk
full, permissions), the old code would still attempt stash restore on
a broken working tree. Now skips restore and tells the user their
changes are safe in stash.
5. **Gateway /update stash conflicts** — non-interactive mode (Telegram
`/update`) called sys.exit(1) when stash restore had conflicts, making
the entire update report as failed even though the code update itself
succeeded. Now treats stash conflicts as non-fatal in non-interactive
mode (returns False instead of exiting).
* fix: restore stash and branch on 'already up to date' early return
The PR moved stash creation before the commit-count check (needed for
the branch-switching feature), but the 'already up to date' early return
didn't restore the stash or switch back to the original branch — leaving
the user stranded on main with changes trapped in a stash.
Now the early-return path restores the stash and checks out the original
branch when applicable.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
- Change default inference_base_url from dashscope-intl Anthropic-compat
endpoint to coding-intl OpenAI-compat /v1 endpoint. The old Anthropic
endpoint 404'd when used with the OpenAI SDK (which appends
/chat/completions to a /apps/anthropic base URL).
- Update curated model list: remove models unavailable on coding-intl
(qwen3-max, qwen-plus-latest, qwen3.5-flash, qwen-vl-max), add
third-party models available on the platform (glm-5, glm-4.7,
kimi-k2.5, MiniMax-M2.5).
- URL-based api_mode auto-detection still works: overriding
DASHSCOPE_BASE_URL to an /apps/anthropic endpoint automatically
switches to anthropic_messages mode.
- Update provider description and env var descriptions to reflect the
coding-intl multi-provider platform.
- Update tests to match new default URL and test the anthropic override
path instead.
Matrix platform was missing from the PLATFORMS config, causing a
KeyError in _get_platform_tools() when handling Matrix messages.
Every other platform (telegram, discord, slack, etc.) was present
but matrix was overlooked.
Co-authored-by: williamtwomey <williamtwomey@users.noreply.github.com>
hermes tools and _get_platform_tools() call get_plugin_toolsets() /
_get_plugin_toolset_keys() without first ensuring plugins have been
discovered. discover_plugins() only runs as a side effect of importing
model_tools.py, which hermes tools never does. This means:
- hermes tools TUI never shows plugin toolsets (invisible to users)
- _get_platform_tools() in standalone processes misses plugin toolsets
Fix: call discover_plugins() (idempotent) in both
_get_plugin_toolset_keys() and _get_effective_configurable_toolsets()
before accessing plugin state. In the gateway/CLI where model_tools.py
is already imported, the call is a no-op (discover_and_load checks
_discovered flag).
Show only agentic models that map to OpenRouter defaults:
Qwen/Qwen3.5-397B-A17B ↔ qwen/qwen3.5-plus
Qwen/Qwen3.5-35B-A3B ↔ qwen/qwen3.5-35b-a3b
deepseek-ai/DeepSeek-V3.2 ↔ deepseek/deepseek-chat
moonshotai/Kimi-K2.5 ↔ moonshotai/kimi-k2.5
MiniMaxAI/MiniMax-M2.5 ↔ minimax/minimax-m2.5
zai-org/GLM-5 ↔ z-ai/glm-5
XiaomiMiMo/MiMo-V2-Flash ↔ xiaomi/mimo-v2-pro
moonshotai/Kimi-K2-Thinking ↔ moonshotai/kimi-k2-thinking
Users can still pick any HF model via Enter custom model name.
Salvage of PR #1747 (original PR #1171 by @davanstrien) onto current main.
Registers Hugging Face Inference Providers (router.huggingface.co/v1) as a named provider:
- hermes chat --provider huggingface (or --provider hf)
- 18 curated open models via hermes model picker
- HF_TOKEN in ~/.hermes/.env
- OpenAI-compatible endpoint with automatic failover (Groq, Together, SambaNova, etc.)
Files: auth.py, models.py, main.py, setup.py, config.py, model_metadata.py, .env.example, 5 docs pages, 17 new tests.
Co-authored-by: Daniel van Strien <davanstrien@gmail.com>
The API server adapter was creating agents without specifying
enabled_toolsets, causing ALL tools to load — including clarify,
send_message, and text_to_speech which don't work without interactive
callbacks or gateway dispatch.
Changes:
- toolsets.py: Add hermes-api-server toolset (core tools minus clarify,
send_message, text_to_speech)
- api_server.py: Resolve toolsets from config.yaml platform_toolsets
via _get_platform_tools() — same path as all other gateway platforms.
Falls back to hermes-api-server default when no override configured.
- tools_config.py: Add api_server to PLATFORMS dict so users can
customize via 'hermes tools' or platform_toolsets.api_server in
config.yaml
- 12 tests covering toolset definition, config resolution, and
user override
Reported by thatwolfieguy on Discord.
Two changes:
1. Fix /queue command: remove the _agent_running guard that rejected
/queue after the agent finished. The prompt was deferred in
_pending_input until the agent completed, then the handler checked
_agent_running (now False) and rejected it. /queue now always queues
regardless of timing.
2. Add display.busy_input_mode config (CLI-only):
- 'interrupt' (default): Enter while busy interrupts the current run
(preserves existing behavior)
- 'queue': Enter while busy queues the message for the next turn,
with a 'Queued for the next turn: ...' confirmation
Ctrl+C always interrupts regardless of this setting.
Salvaged from PR #3037 by StefanoChiodino. Key differences:
- Default is 'interrupt' (preserves existing behavior) not 'queue'
- No config version bump (unnecessary for new key in existing section)
- Simpler normalization (no alias map)
- /queue fix is simpler: just remove the guard instead of intercepting
commands during busy state
Two bugs caused the OpenClaw migration during first-time setup to be
ineffective, forcing users to reconfigure everything manually:
1. The setup wizard created config.yaml with all defaults BEFORE running
the migration, then the migrator ran with overwrite=False. Every config
setting was reported as a 'conflict' against the defaults and skipped.
Fix: use overwrite=True during setup-time migration (safe because only
defaults exist at that point). The hermes claw migrate CLI command
still defaults to overwrite=False for post-setup use.
2. After migration, the full setup wizard ran all 5 sections unconditionally,
forcing the user through model/terminal/agent/messaging/tools configuration
even when those settings were just imported.
Fix: add _get_section_config_summary() and _skip_configured_section()
helpers. After migration, each section checks if it's already configured
(API keys present, non-default values, platform tokens) and offers
'Reconfigure? [y/N]' with default No. Unconfigured sections still run
normally.
Reported by Dev Bredda on social media.
- add managed modal and gateway-backed tool integrations\n- improve CLI setup, auth, and configuration for subscriber flows\n- expand tests and docs for managed tool support
* feat: config-gated /verbose command for messaging gateway
Add gateway_config_gate field to CommandDef, allowing cli_only commands
to be conditionally available in the gateway based on a config value.
- CommandDef gains gateway_config_gate: str | None — a config dotpath
that, when truthy, overrides cli_only for gateway surfaces
- /verbose uses gateway_config_gate='display.tool_progress_command'
- Default is off (cli_only behavior preserved)
- When enabled, /verbose cycles tool_progress mode (off/new/all/verbose)
in the gateway, saving to config.yaml — same cycle as the CLI
- Gateway helpers (help, telegram menus, slack mapping) dynamically
check config to include/exclude config-gated commands
- GATEWAY_KNOWN_COMMANDS always includes config-gated commands so
the gateway recognizes them and can respond appropriately
- Handles YAML 1.1 bool coercion (bare 'off' parses as False)
- 8 new tests for the config gate mechanism + gateway handler
* docs: document gateway_config_gate and /verbose messaging support
- AGENTS.md: add gateway_config_gate to CommandDef fields
- slash-commands.md: note /verbose can be enabled for messaging, update Notes
- configuration.md: add tool_progress_command to display section + usage note
- cli.md: cross-link to config docs for messaging enablement
- messaging/index.md: show tool_progress_command in config snippet
- plugins.md: add gateway_config_gate to register_command parameter table
When third-party tools (Paperclip orchestrator, etc.) spawn hermes chat
as a subprocess, their sessions pollute user session history and search.
- hermes chat --source <tag> (also HERMES_SESSION_SOURCE env var)
- exclude_sources parameter on list_sessions_rich() and search_messages()
- Sessions with source=tool hidden from sessions list/browse/search
- Third-party adapters pass --source tool to isolate agent sessions
Cherry-picked from PR #3208 by HenkDz.
Co-authored-by: Henkey <noonou7@gmail.com>
Nous Portal now passes through OpenRouter model names and routes from
there. Update the static fallback model list and auxiliary client default
to use OpenRouter-format slugs (provider/model) instead of bare names.
- _PROVIDER_MODELS['nous']: full OpenRouter catalog
- _NOUS_MODEL: google/gemini-3-flash-preview (was gemini-3-flash)
- Updated 4 test assertions for the new default model name
Validate each ZIP member's resolved path against the extraction directory
before extracting. A crafted ZIP with paths like ../../etc/passwd would
previously write outside the target directory.
Fixes#3075
Co-authored-by: Hiren <hiren.thakore58@gmail.com>
* fix(skills): reduce skills.sh resolution churn and preserve trust for wrapped identifiers
- Accept common skills.sh prefix typos (skils-sh/, skils.sh/)
- Strip skills-sh/ prefix in _resolve_trust_level() so trusted repos
stay trusted when installed through skills.sh
- Use resolved identifier (from bundle/meta) for scan_skill source
- Prefer tree search before root scan in _discover_identifier()
- Add _resolve_github_meta() consolidation for inspect flow
Cherry-picked from PR #3001 by kshitijk4poor.
* fix: restore candidate loop in SkillsShSource.fetch() for consistency
The cherry-picked PR only tried the first candidate identifier in
fetch() while inspect() (via _resolve_github_meta) tried all four.
This meant skills at repo/skills/path would be found by inspect but
missed by fetch, forcing it through the heavier _discover_identifier
flow. Restore the candidate loop so both paths behave identically.
Updated the test assertion to match.
---------
Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Gateway sessions had their own inline toolset resolution that only read
platform_toolsets from config, which never includes MCP server names.
MCP tools were discovered and registered but invisible to the model.
- Replace duplicated gateway toolset resolution in _run_agent() and
_run_background_task() with calls to the shared _get_platform_tools()
- Extend _get_platform_tools() to include globally enabled MCP servers
at runtime (include_default_mcp_servers=True), while config-editing
flows use include_default_mcp_servers=False to avoid persisting
implicit MCP defaults into platform_toolsets
- Add homeassistant to PLATFORMS dict (was missing, caused KeyError)
- Fix CLI entry point to use _get_platform_tools() as well, so MCP
tools are visible in CLI mode too
- Remove redundant platform_key reassignment in _run_background_task
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
The default SOUL.md seeded for new users should match
DEFAULT_AGENT_IDENTITY — a short, neutral identity paragraph.
The elaborate voice spec (avoid lists, dialogue examples, symbol
conventions) was never intended as the default for all users.
Users who want a custom persona write their own SOUL.md.
Three categories of cleanup, all zero-behavioral-change:
1. F-strings without placeholders (154 fixes across 29 files)
- Converted f'...' to '...' where no {expression} was present
- Heaviest files: run_agent.py (24), cli.py (20), honcho_integration/cli.py (34)
2. Simplify defensive patterns in run_agent.py
- Added explicit self._is_anthropic_oauth = False in __init__ (before
the api_mode branch that conditionally sets it)
- Replaced 7x getattr(self, '_is_anthropic_oauth', False) with direct
self._is_anthropic_oauth (attribute always initialized now)
- Added _is_openrouter_url() and _is_anthropic_url() helper methods
- Replaced 3 inline 'openrouter' in self._base_url_lower checks
3. Remove dead code in small files
- hermes_cli/claw.py: removed unused 'total' computation
- tools/fuzzy_match.py: removed unused strip_indent() function and
pattern_stripped variable
Full test suite: 6184 passed, 0 failures
E2E PTY: banner clean, tool calls work, zero garbled ANSI
sessions delete and prune call input() for confirmation without
catching EOFError. When stdin isn't a TTY (piped input, CI/CD, cron),
input() throws EOFError and the command crashes.
Extract a _confirm_prompt() helper that handles EOFError and
KeyboardInterrupt, defaulting to cancel. Both call sites now use it.
Salvaged from PR #2622 by dieutx (improved from duplicated try/except
to shared helper). Closes#2565.
The update commands called bare 'pip' as fallback when uv wasn't found.
On modern Debian/Ubuntu enforcing PEP 668, this resolves to system pip
which refuses to install in an externally-managed environment.
Use sys.executable -m pip to ensure the venv's pip is used. Fixed in
both cmd_update and _update_via_zip (the PR only caught one instance).
Salvaged from PR #2655 by devorun. Fixes#2648.
Cherry-picked from PR #2542 by ReqX. Adds glm-5-turbo to the direct
zai provider curated model list so /model zai:glm-5-turbo validates
correctly. The model was already in _OPENROUTER_UPSTREAM_MODELS but
missing from the direct provider list.
cmd_update calls input() unconditionally during config migration.
In headless environments (Telegram gateway, systemd), there's no TTY,
so input() throws EOFError and the update crashes.
Guard with sys.stdin.isatty(), default to skipping the migration
prompt when non-interactive.
Salvaged from PR #2850 by devorun. Closes#2848.
The /model command is removed from both the interactive CLI and
messenger gateway (Telegram/Discord/Slack/WhatsApp). Users can
still change models via 'hermes model' CLI subcommand or by
editing config.yaml directly.
Removed:
- CommandDef entry from COMMAND_REGISTRY
- CLI process_command() handler and model autocomplete logic
- Gateway _handle_model_command() and dispatch
- SlashCommandCompleter model_completer_provider parameter
- Two-stage Tab completion and ghost text for /model
- All /model-specific tests
Unaffected:
- /provider command (read-only, shows current model + providers)
- ACP adapter _cmd_model (separate system for VS Code/Zed/JetBrains)
- model_switch.py module (used by ACP)
- 'hermes model' CLI subcommand
Author: Teknium
Centralizes two widely-duplicated patterns into hermes_constants.py:
1. get_hermes_home() — Path resolution for ~/.hermes (HERMES_HOME env var)
- Was copy-pasted inline across 30+ files as:
Path(os.getenv("HERMES_HOME", Path.home() / ".hermes"))
- Now defined once in hermes_constants.py (zero-dependency module)
- hermes_cli/config.py re-exports it for backward compatibility
- Removed local wrapper functions in honcho_integration/client.py,
tools/website_policy.py, tools/tirith_security.py, hermes_cli/uninstall.py
2. parse_reasoning_effort() — Reasoning effort string validation
- Was copy-pasted in cli.py, gateway/run.py, cron/scheduler.py
- Same validation logic: check against (xhigh, high, medium, low, minimal, none)
- Now defined once in hermes_constants.py, called from all 3 locations
- Warning log for unknown values kept at call sites (context-specific)
31 files changed, net +31 lines (125 insertions, 94 deletions)
Full test suite: 6179 passed, 0 failed
* feat: nix flake, uv2nix build, dev shell and home manager
* fixed nix run, updated docs for setup
* feat(nix): NixOS module with persistent container mode, managed guards, checks
- Replace homeModules.nix with nixosModules.nix (two deployment modes)
- Mode A (native): hardened systemd service with ProtectSystem=strict
- Mode B (container): persistent Ubuntu container with /nix/store bind-mount,
identity-hash-based recreation, GC root protection, symlink-based updates
- Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup,
gateway install/uninstall) when running under NixOS module
- Add nix/checks.nix with build-time verification (binary, CLI, managed guard)
- Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime)
- Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers)
- Rewrite docs/nixos-setup.md with full options reference, container
architecture, secrets management, and troubleshooting guide
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* Update config.py
* feat(nix): add CI workflow and enhanced build checks
- GitHub Actions workflow for nix flake check + build on linux/macOS
- Entry point sync check to catch pyproject.toml drift
- Expanded managed-guard check to cover config edit
- Wrap hermes-acp binary in Nix package
- Fix Path type mismatch in is_managed()
* Update MCP server package name; bundled skills support
* fix reading .env. instead have container user a common mounted .env file
* feat(nix): container entrypoint with privilege drop and sudo provisioning
Container was running as non-root via --user, which broke apt/pip installs
and caused crashes when $HOME didn't exist. Replace --user with a Nix-built
entrypoint script that provisions the hermes user, sudo (NOPASSWD), and
/home/hermes inside the container on first boot, then drops privileges via
setpriv. Writable layer persists so setup only runs once.
Also expands MCP server options to support HTTP transport and sampling.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
* fix group and user creation in container mode
* feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode
Container mode now bind-mounts ${stateDir}/home to /home/hermes so the
agent's home directory survives container recreation. Previously it lived
in the writable layer and was lost on image/volume/options changes.
Also passes MESSAGING_CWD to the container so the agent finds its
workspace and documents, matching native mode behavior.
Other changes:
- Extract containerDataDir/containerHomeDir bindings (no more magic strings)
- Fix entrypoint chown to run unconditionally (volume mounts always exist)
- Add schema field to container identity hash for auto-recreation
- Add idempotency test (Scenario G) to config-roundtrip check
* docs: add Nix & NixOS setup guide to docs site
Add comprehensive Nix documentation to the Docusaurus site at
website/docs/getting-started/nix-setup.md, covering nix run/profile
install, NixOS module (native + container modes), declarative settings,
secrets management, MCP servers, managed mode, container architecture,
dev shell, flake checks, and full options reference.
- Register nix-setup in sidebar after installation page
- Add Nix callout tip to installation.md linking to new guide
- Add canonical version pointer in docs/nixos-setup.md
* docs: remove docs/nixos-setup.md, consolidate into website docs
Backfill missing details (restart/restartSec in full example,
gateway.pid, 0750 permissions, docker inspect commands) into
the canonical website/docs/getting-started/nix-setup.md and
delete the old standalone file.
* fix(nix): add compression.protect_last_n and target_ratio to config-keys.json
New keys were added to DEFAULT_CONFIG on main, causing the
config-drift check to fail in CI.
* fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing)
The full Python venv includes onnxruntime (via faster-whisper/STT)
which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all
checks behind stdenv.hostPlatform.isLinux. The package and devShell
still evaluate on macOS.
* fix(nix): skip flake check and build on macOS CI
onnxruntime (transitive dep via faster-whisper) lacks a compatible
uv2nix wheel on aarch64-darwin. Run full checks and build on Linux
only; macOS CI verifies the flake evaluates without building.
* fix(nix): preserve container writable layer across nixos-rebuild
The container identity hash included the entrypoint's Nix store path,
which changes on every nixpkgs update (due to runtimeShell/stdenv
input-addressing). This caused false-positive identity mismatches,
triggering container recreation and losing the persistent writable layer.
- Use stable symlink (current-entrypoint) like current-package already does
- Remove entrypoint from identity hash (only image/volumes/options matter)
- Add GC root for entrypoint so nix-collect-garbage doesn't break it
- Remove global HERMES_HOME env var from addToSystemPackages (conflicted
with interactive CLI use, service already sets its own)
---------
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Each subagent now gets its own IterationBudget instead of sharing the
parent's. The per-subagent cap is controlled by delegation.max_iterations
in config.yaml (default 50). Total iterations across parent + subagents
can exceed the parent's max_iterations, but the user retains control via
the config setting.
Previously, subagents shared the parent's budget, so three parallel
subagents configured for max_iterations=50 racing against a parent that
already used 60 of 90 would each only get ~10 iterations.
Inspired by PR #2928 (Bartok9) which identified the issue (#2873).
- threshold: 0.80 → 0.50 (compress at 50%, not 80%)
- target_ratio: 0.40 → 0.20, now relative to threshold not total context
(20% of 50% = 10% of context as tail budget)
- summary ceiling: 32K → 12K (Gemini can't output more than ~12K)
- Updated DEFAULT_CONFIG, config display, example config, and tests
PR #2554 made these configurable via config.yaml but didn't add them
to DEFAULT_CONFIG or the config display. Users couldn't discover the
new knobs without reading the source.
- threshold: 0.80 (compress at 80% context usage)
- target_ratio: 0.40 (preserve 40% of context as recent tail)
- protect_last_n: 20 (keep last 20 messages uncompressed)
- Updated hermes config display to show all three fields
Move OpenRouter to position 1 in the setup wizard's provider list
to match hermes model ordering. Update default selection index and
fix test expectations for the new ordering.
Setup order: OpenRouter → Nous Portal → Codex → Custom → ...
* feat: env var passthrough for skills and user config
Skills that declare required_environment_variables now have those vars
passed through to sandboxed execution environments (execute_code and
terminal). Previously, execute_code stripped all vars containing KEY,
TOKEN, SECRET, etc. and the terminal blocklist removed Hermes
infrastructure vars — both blocked skill-declared env vars.
Two passthrough sources:
1. Skill-scoped (automatic): when a skill is loaded via skill_view and
declares required_environment_variables, vars that are present in
the environment are registered in a session-scoped passthrough set.
2. Config-based (manual): terminal.env_passthrough in config.yaml lets
users explicitly allowlist vars for non-skill use cases.
Changes:
- New module: tools/env_passthrough.py — shared passthrough registry
- hermes_cli/config.py: add terminal.env_passthrough to DEFAULT_CONFIG
- tools/skills_tool.py: register available skill env vars on load
- tools/code_execution_tool.py: check passthrough before filtering
- tools/environments/local.py: check passthrough in _sanitize_subprocess_env
and _make_run_env
- 19 new tests covering all layers
* docs: add environment variable passthrough documentation
Document the env var passthrough feature across four docs pages:
- security.md: new 'Environment Variable Passthrough' section with
full explanation, comparison table, and security considerations
- code-execution.md: update security section, add passthrough subsection,
fix comparison table
- creating-skills.md: add tip about automatic sandbox passthrough
- skills.md: add note about passthrough after secure setup docs
Live-tested: launched interactive CLI, loaded a skill with
required_environment_variables, verified TEST_SKILL_SECRET_KEY was
accessible inside execute_code sandbox (value: passthrough-test-value-42).
Complete cleanup after dropping the mini-swe-agent submodule (PR #2804):
- Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var
settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py
- Remove mini-swe-agent health check from hermes doctor
- Remove 'minisweagent' from logger suppression lists
- Remove litellm/typer/platformdirs from requirements.txt
- Remove mini-swe-agent install steps from install.ps1 (Windows)
- Remove mini-swe-agent install steps from website docs
- Update all stale comments/docstrings referencing mini-swe-agent
in terminal_tool.py, tools/__init__.py, code_execution_tool.py,
environments/README.md, environments/agent_loop.py
- Remove mini_swe_runner from pyproject.toml py-modules
(still exists as standalone script for RL training use)
- Shrink test_minisweagent_path.py to empty stub
The orphaned mini-swe-agent/ directory on disk needs manual removal:
rm -rf mini-swe-agent/
browser_vision and other browser commands had a hardcoded 30-second
subprocess timeout that couldn't be overridden. Users with slower
machines (local Chromium without GPU) would hit timeouts on screenshot
capture even when setting browser.command_timeout in config.yaml,
because nothing read that value.
Changes:
- Add browser.command_timeout to DEFAULT_CONFIG (default: 30s)
- Add _get_command_timeout() helper that reads config, falls back to 30s
- _run_browser_command() now defaults to config value instead of constant
- browser_vision screenshot no longer hardcodes timeout=30
- browser_navigate uses max(config_timeout, 60) as floor for navigation
Reported by Gamer1988.
Phase 4 of the /model command overhaul.
Both the CLI (cli.py) and gateway (gateway/run.py) /model handlers
had ~50 lines of duplicated core logic: parsing, provider detection,
credential resolution, and model validation. This extracts that
pipeline into hermes_cli/model_switch.py.
New module exports:
- ModelSwitchResult: dataclass with all fields both handlers need
- CustomAutoResult: dataclass for bare '/model custom' results
- switch_model(): core pipeline — parse → detect → resolve → validate
- switch_to_custom_provider(): resolve endpoint + auto-detect model
The shared functions are pure (no I/O side effects). Each caller
handles its own platform-specific concerns:
- CLI: sets self.model/provider/etc, calls save_config_value(), prints
- Gateway: writes config.yaml directly, sets env vars, returns markdown
Net result: -244 lines from handlers, +234 lines in shared module.
The handlers are now ~80 lines each (down from ~150+) and can't drift
apart on core logic.
Fixes#2492.
`generate_systemd_unit()` and `get_python_path()` hardcoded `venv`
as the virtualenv directory name. When the virtualenv is `.venv`
(which `setup-hermes.sh` and `.gitignore` both reference), the
generated systemd unit had incorrect VIRTUAL_ENV and PATH variables.
Introduce `_detect_venv_dir()` which:
1. Checks `sys.prefix` vs `sys.base_prefix` to detect the active venv
2. Falls back to probing `.venv` then `venv` under PROJECT_ROOT
Both `get_python_path()` and `generate_systemd_unit()` now use
this detection instead of hardcoded paths.
Co-authored-by: Hermes <hermes@nousresearch.ai>
* feat(model): persist base_url on /model switch, auto-detect for bare /model custom
Phase 2+3 of the /model command overhaul:
Phase 2 — Persist base_url on model switch:
- CLI: save model.base_url when switching to a non-OpenRouter endpoint;
clear it when switching away from custom to prevent stale URLs
leaking into the new provider's resolution
- Gateway: same logic using direct YAML write
Phase 3 — Better feedback and edge cases:
- Bare '/model custom' now auto-detects the model from the endpoint
using _auto_detect_local_model() and saves all three config values
(model, provider, base_url) atomically
- Shows endpoint URL in success messages when switching to/from
custom providers (both CLI and gateway)
- Clear error messages when no custom endpoint is configured
- Updated test assertions for the additional save_config_value call
Fixes#2562 (Phase 2+3)
* feat(model): support custom:name:model triple syntax for named custom providers
Phase 5 of the /model command overhaul.
Extends parse_model_input() to handle the triple syntax:
/model custom:local-server:qwen → provider='custom:local-server', model='qwen'
/model custom:my-model → provider='custom', model='my-model' (unchanged)
The 'custom:local-server' provider string is already supported by
_get_named_custom_provider() in runtime_provider.py, which matches
it against the custom_providers list in config.yaml. This just wires
the parsing so users can do it from the /model slash command.
Added 4 tests covering single, triple, whitespace, and empty model cases.
resolve_provider('custom') was silently returning 'openrouter', causing
users who set provider: custom in config.yaml to unknowingly route
through OpenRouter instead of their local/custom endpoint. The display
showed 'via openrouter' even when the user explicitly chose custom.
Changes:
- auth.py: Split the conditional so 'custom' returns 'custom' as-is
- runtime_provider.py: _resolve_named_custom_runtime now returns
provider='custom' instead of 'openrouter'
- runtime_provider.py: _resolve_openrouter_runtime returns
provider='custom' when that was explicitly requested
- Add 'no-key-required' placeholder for keyless local servers
- Update existing test + add 5 new tests covering the fix
Fixes#2562
* feat(config): support ${ENV_VAR} substitution in config.yaml
* fix: extend env var expansion to CLI and gateway config loaders
The original PR (#2680) only wired _expand_env_vars into load_config(),
which is used by 'hermes tools' and 'hermes setup'. The two primary
config paths were missed:
- load_cli_config() in cli.py (interactive CLI)
- Module-level _cfg in gateway/run.py (gateway — bridges api_keys to env vars)
Also:
- Remove redundant 'import re' (already imported at module level)
- Add missing blank lines between top-level functions (PEP 8)
- Add tests for load_cli_config() expansion
---------
Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>
Cherry-picked from PR #2576 by ereid7, plus read-side fix from 173a5c62.
Both fixes were originally landed in 173a5c62 but were inadvertently
reverted by commit 34be3f8b (a squash-merge that bundled unrelated
tools_config.py changes).
Save side (_save_platform_tools): exclude platform default toolset
names (hermes-cli, hermes-telegram) from preserved entries so they
don't silently re-enable everything.
Read side (_get_platform_tools): when the saved list contains explicit
configurable keys, use direct membership instead of subset inference.
The subset approach is broken when composite toolsets like hermes-cli
resolve to ALL tools.
Based on PR #2454 by @kshitijk4poor (reimplemented lean — 127 lines
vs original 715).
Type @ in the CLI input to get autocomplete suggestions for context
references:
- Static: @diff, @staged, @file:, @folder:, @git:, @url:
- @file:path and @folder:path browse the filesystem
- Bare @ or @partial shows matching files/folders from cwd
Dropped from original: .hermesignore walking, custom shell tokenizer,
PathToken dataclass, fuzzy matching, token estimates. Kept: all
user-facing functionality.
Reads auxiliary.vision.timeout from config.yaml (default: 30s) and
passes it to async_call_llm. Useful for slow local vision models
that need more than 30 seconds.
Setting is in config.yaml (not .env) since it's not a secret:
auxiliary:
vision:
timeout: 120
Based on PR #2306.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
* fix: respect DashScope v1 runtime mode for alibaba
Remove the hardcoded Alibaba branch from resolve_runtime_provider()
that forced api_mode='anthropic_messages' regardless of the base URL.
Alibaba now goes through the generic API-key provider path, which
auto-detects the protocol from the URL:
- /apps/anthropic → anthropic_messages (via endswith check)
- /v1 → chat_completions (default)
This fixes Alibaba setup with OpenAI-compatible DashScope endpoints
(e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because
runtime always forced Anthropic mode even when setup saved a /v1 URL.
Based on PR #2024 by @kshitijk4poor.
* docs(skill): add split, merge, search examples to ocr-and-documents skill
Adds pymupdf examples for PDF splitting, merging, and text search
to the existing ocr-and-documents skill. No new dependencies — pymupdf
already covers all three operations natively.
* fix: replace all production print() calls with logger in rl_training_tool
Replace all bare print() calls in production code paths with proper logger calls.
- Add `import logging` and module-level `logger = logging.getLogger(__name__)`
- Replace print() in _start_training_run() with logger.info()
- Replace print() in _stop_training_run() with logger.info()
- Replace print(Warning/Note) calls with logger.warning() and logger.info()
Using the logging framework allows log level filtering, proper formatting,
and log routing instead of always printing to stdout.
* fix(gateway): process /queue'd messages after agent completion
/queue stored messages in adapter._pending_messages but never consumed
them after normal (non-interrupted) completion. The consumption path
at line 5219 only checked pending messages when result.get('interrupted')
was True — since /queue deliberately doesn't interrupt, queued messages
were silently dropped.
Now checks adapter._pending_messages after both interrupted AND normal
completion. For queued messages (non-interrupt), the first response is
delivered before recursing to process the queued follow-up. Skips the
direct send when streaming already delivered the response.
Reported by GhostMode on Discord.
* chore: add minimax/minimax-m2.7 to OpenRouter and MiniMax model catalogs
---------
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>
Reverts the sanitizer addition from PR #2466 (originally #2129).
We already have _empty_content_retries handling for reasoning-only
responses. The trailing strip risks silently eating valid messages
and is redundant with existing empty-content handling.
Add hermes mcp add/remove/list/test/configure CLI for managing MCP
server connections interactively. Discovery-first 'add' flow connects,
discovers tools, and lets users select which to enable via curses checklist.
Add OAuth 2.1 PKCE authentication for MCP HTTP servers (RFC 7636).
Supports browser-based and manual (headless) authorization, token
caching with 0600 permissions, automatic refresh. Zero external deps.
Add ${ENV_VAR} interpolation in MCP server config values, resolved
from os.environ + ~/.hermes/.env at load time.
Core OAuth module from PR #2021 by @imnotdev25. CLI and mcp_tool
wiring rewritten against current main. Closes#497, #690.
* docs: add Gemini OAuth provider implementation plan
Planning doc for a standard-route Gemini provider using Google OAuth
(Authorization Code + PKCE) with the OpenAI-compatible endpoint at
generativelanguage.googleapis.com. Covers OAuth flow, token lifecycle,
file list, and estimated scope (~700 lines).
Replaces the Node.js bridge approach from PR #2042.
* chore: update OpenRouter model list
- Add xiaomi/mimo-v2-pro
- Add nvidia/nemotron-3-super-120b-a12b (paid, higher rate limits)
- Remove openrouter/hunter-alpha and openrouter/healer-alpha (discontinued)
Closes#2453
The DEFAULT_CONFIG was hardcoding google/gemini-3-flash-preview as the
summary_model for context compression. This caused unexpected OpenRouter
charges for users who configured a different provider/model, because the
compression task would silently fall back to gemini via OpenRouter even
when the user's main model was on a different provider.
Fix: change summary_model default to empty string. When empty,
call_llm() resolves the model through the standard auto-detection chain
(auxiliary.compression config -> env vars -> main provider), which
correctly uses the user's configured provider and model.
Users who want a dedicated cheap model for compression can still
explicitly set compression.summary_model in their config.yaml.
Remove the hardcoded Alibaba branch from resolve_runtime_provider()
that forced api_mode='anthropic_messages' regardless of the base URL.
Alibaba now goes through the generic API-key provider path, which
auto-detects the protocol from the URL:
- /apps/anthropic → anthropic_messages (via endswith check)
- /v1 → chat_completions (default)
This fixes Alibaba setup with OpenAI-compatible DashScope endpoints
(e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because
runtime always forced Anthropic mode even when setup saved a /v1 URL.
Based on PR #2024 by @kshitijk4poor.
Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>
Cherry-picked from PR #2122 by @AtlasMeridia.
1. do_inspect bytes crash: bundle.files returns bytes for official
skills, .split() expected str. Added decode guard.
2. GitHub redirects: three httpx.get calls missing follow_redirects=True,
causing silent 301 failures on renamed orgs.
3. Skill discovery fallback: scan repo root directories when standard
paths (skills/, .agents/skills/, .claude/skills/) miss.
4. tap list KeyError: t['repo'] crashes for local taps. Use safe .get().
auth_type was "***" instead of "api_key" and api_key_env_vars was
("OPEN...",) instead of ("OPENCODE_GO_API_KEY",). This was introduced
in 35d948b6 when a secret redaction tool masked these values during
the Kilo Code provider commit. OpenCode Go provider was completely
broken as a result.
When 'hermes update' stashes local changes and the restore hits
conflicts, the previous behavior silently ran 'git reset --hard HEAD'
to clean up. This could surprise users who didn't realize their
working tree was being nuked.
Now the conflict handler:
- Lists the specific conflicted files
- Reassures the user their stash is preserved
- Asks before resetting (interactive mode)
- Auto-resets in non-interactive mode (prompt_user=False)
- If declined, leaves the working tree as-is with guidance
When `hermes update` stashes local changes and the subsequent
`git stash apply` fails or leaves unmerged files, the conflict markers
(<<<<<<< etc.) were left in the working tree, making Hermes unrunnable
until manually cleaned up.
Now the update command runs `git reset --hard HEAD` to restore a clean
working tree before exiting, and also detects unmerged files even when
git stash apply reports success.
Closes#2348
Only honor config.model.base_url for Anthropic resolution when
config.model.provider is actually "anthropic". This prevents a Codex
(or other provider) base_url from leaking into Anthropic runtime and
auxiliary client paths, which would send requests to the wrong
endpoint.
Closes#2384
Add has_usable_secret() to reject empty, short (<4 char), and common
placeholder API key values (changeme, your_api_key, placeholder, etc.)
throughout the auth/runtime resolution chain.
Update list_available_providers() to use provider-specific auth status
via get_auth_status() instead of resolve_runtime_provider(), preventing
cross-provider key fallback from making providers appear available when
they aren't actually configured.
Preserve keyless custom endpoint support by checking via base URL.
Cherry-picked from PR #2121 by aashizpoudel.
- Add resolve_config_path(): checks $HERMES_HOME/honcho.json first,
falls back to ~/.honcho/config.json. Enables isolated Hermes instances
with independent Honcho credentials and settings.
- Update CLI and doctor to use resolved path instead of hardcoded global.
- Change default session_strategy from per-session to per-directory.
Part 1 of #1962 by @erosika.
Cherry-picked from PR #2319 by @itenev.
When the gateway fails to connect (e.g. PrivilegedIntentsRequired,
missing token), systemd's default RestartSec=10 with no start rate
limit causes rapid reconnect storms flooding logs and triggering
platform-side rate limits.
- StartLimitIntervalSec=600 + StartLimitBurst=5 in [Unit] (max 5
restarts per 10 min)
- RestartSec: 10 → 30
- Applied to both templates in gateway.py and scripts/hermes-gateway
Same bug as opencode-zen/go — alibaba fell through to the OpenRouter
model list instead of using _setup_provider_model_selection() which
probes the provider's own /models endpoint.
All user-selectable providers now have correct model selection routing.
After selecting OpenCode Zen or Go as provider in hermes setup, the
model selection page showed OpenRouter models because these providers
weren't in the list that routes to _setup_provider_model_selection().
They fell through to the else branch which shows the OpenRouter catalog.
Users ended up with an OpenCode API key but an OpenRouter model name,
causing 'Provider resolver returned an empty API key' on first use.
Fix: add opencode-zen and opencode-go to the provider list that uses
_setup_provider_model_selection() for live /models detection.
Fresh installs without pull.rebase configured hit a git error when
running hermes update because git doesn't know how to reconcile
divergent branches. --ff-only is the right strategy: it works for the
normal case (local branch is behind remote) and fails cleanly if the
user somehow has local commits, rather than silently rebasing them.
The top-level 'toolsets' key in config.yaml was never read at runtime.
Tool selection uses platform_toolsets (per-platform) or the --toolsets
CLI flag. The key existed in load_cli_config() defaults and the example
config as 'toolsets: [all]', misleading users into thinking it
controlled tool availability.
- Remove from load_cli_config() hardcoded defaults
- Remove from hermes config show output
- Replace in cli-config.yaml.example with deprecation note pointing
to platform_toolsets and hermes tools
Two bugs in the save/load roundtrip for platform_toolsets:
1. _save_platform_tools preserved composite toolset entries (hermes-cli,
hermes-telegram, etc.) because they weren't in configurable_keys.
These composites include ALL _HERMES_CORE_TOOLS, so having hermes-cli
in the saved list alongside individual keys negated any disables —
the subset check always found the disabled toolset's tools via the
composite entry.
Fix: also filter out known TOOLSETS keys from preserved entries. Only
truly unknown entries (MCP server names, custom entries) are kept.
2. _get_platform_tools used reverse subset inference to determine which
configurable toolsets were enabled. This is inherently broken when
tools appear in multiple toolsets (e.g. HA tools in both the
homeassistant toolset and _HERMES_CORE_TOOLS).
Fix: when the saved list contains explicit configurable keys (meaning
the user has configured this platform), use direct membership instead
of subset inference. The fallback path still handles legacy configs
that only have a composite entry like hermes-cli.
- Changed the ANSI escape code for gold color in cli.py and banner.py to use true-color format (#FFD700) for better visual consistency.
- Enhanced the _on_tool_progress method in HermesCLI to update the TUI spinner with tool execution status, improving user feedback during operations.
These changes improve the visual representation and user experience in the command-line interface.
Co-authored-by: Test <test@test.com>
Adds /queue <prompt> (alias /q) that queues a message for the next
turn while the agent is busy, without interrupting the current run.
- CLI: /queue <prompt> puts it in _pending_input for the next turn
- Gateway: /queue <prompt> creates a pending MessageEvent on the
adapter, picked up after the current agent run finishes
- Enter still interrupts as usual (no behavior change)
- /queue with no prompt shows usage
- /queue when agent is idle tells user to just type normally
Co-authored-by: Test <test@test.com>
Replace the fragile hardcoded context length system with a multi-source
resolution chain that correctly identifies context windows per provider.
Key changes:
- New agent/models_dev.py: Fetches and caches the models.dev registry
(3800+ models across 100+ providers with per-provider context windows).
In-memory cache (1hr TTL) + disk cache for cold starts.
- Rewritten get_model_context_length() resolution chain:
0. Config override (model.context_length)
1. Custom providers per-model context_length
2. Persistent disk cache
3. Endpoint /models (local servers)
4. Anthropic /v1/models API (max_input_tokens, API-key only)
5. OpenRouter live API (existing, unchanged)
6. Nous suffix-match via OpenRouter (dot/dash normalization)
7. models.dev registry lookup (provider-aware)
8. Thin hardcoded defaults (broad family patterns)
9. 128K fallback (was 2M)
- Provider-aware context: same model now correctly resolves to different
context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic,
128K on GitHub Copilot). Provider name flows through ContextCompressor.
- DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns.
models.dev replaces the per-model hardcoding.
- CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K]
to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M.
- hermes model: prompts for context_length when configuring custom
endpoints. Supports shorthand (32k, 128K). Saved to custom_providers
per-model config.
- custom_providers schema extended with optional models dict for
per-model context_length (backward compatible).
- Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against
OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash
normalization. Handles all 15 current Nous models.
- Anthropic direct: queries /v1/models for max_input_tokens. Only works
with regular API keys (sk-ant-api*), not OAuth tokens. Falls through
to models.dev for OAuth users.
Tests: 5574 passed (18 new tests for models_dev + updated probe tiers)
Docs: Updated configuration.md context length section, AGENTS.md
Co-authored-by: Test <test@test.com>
Based on PR #1859 by @magi-morph (too stale to cherry-pick, reimplemented).
GPT-5.x models reject tool calls + reasoning_effort on
/v1/chat/completions with a 400 error directing to /v1/responses.
This auto-detects api.openai.com in the base URL and switches to
codex_responses mode in three places:
- AIAgent.__init__: upgrades chat_completions → codex_responses
- _try_activate_fallback(): same routing for fallback model
- runtime_provider.py: _detect_api_mode_for_url() for both custom
provider and openrouter runtime resolution paths
Also extracts _is_direct_openai_url() helper to replace the inline
check in _max_tokens_param().
Cherry-picked from PR #2120 by @unclebumpy.
- from_env() now reads HONCHO_BASE_URL and enables Honcho when base_url
is set, even without an API key
- from_global_config() reads baseUrl from config root with
HONCHO_BASE_URL env var as fallback
- get_honcho_client() guard relaxed to allow base_url without api_key
for no-auth local instances
- Added HONCHO_BASE_URL to OPTIONAL_ENV_VARS registry
Result: Setting HONCHO_BASE_URL=http://localhost:8000 in ~/.hermes/.env
now correctly routes the Honcho client to a local instance.
Show complete session IDs in 'hermes sessions list' instead of
truncating to 20 characters. Widens title column from 20→30 chars
and adjusts header widths accordingly.
Fixes#2068. Based on PR #2085 by @Nebula037 with a correction
to preserve the no-titles layout (the original PR accidentally
replaced the Preview/Src header with a duplicate Title/Preview header).
MiniMax's default base URL was /v1 which caused runtime_provider to
default to chat_completions mode (OpenAI-style Authorization: Bearer
header). MiniMax rejects this with a 401 because they require the
Anthropic-style x-api-key header.
Changes:
- auth.py: Change default inference_base_url for minimax and minimax-cn
from /v1 to /anthropic
- runtime_provider.py: Auto-correct stale /v1 URLs from existing .env
files to /anthropic, and always default minimax/minimax-cn providers
to anthropic_messages mode
- Update tests to reflect new defaults, add tests for stale URL
auto-correction and explicit api_mode override
Based on PR #2100 by @devorun. Fixes#2094.
Co-authored-by: Test <test@test.com>
Two issues with /model preventing proper provider switching:
1. Bare provider names not detected: typing '/model nous' treated 'nous'
as a model name instead of triggering a provider switch. Fixed by adding
step 0 in detect_provider_for_model() that checks if the input matches
a known provider name/alias (excluding 'custom'/'openrouter' which need
explicit model names) and returns that provider's default model.
2. Custom endpoint details hidden: /model (no args) showed '[custom]' with
just a usage hint but no endpoint URL or model name. Now displays the
configured base_url for custom providers in both CLI and gateway.
Note: config base_url and OPENAI_BASE_URL are intentionally NOT cleared on
provider switch — dedicated provider paths (nous, anthropic, codex) have
their own credential resolution that ignores these, and clearing them would
destroy the user's custom endpoint config, preventing switching back.
Co-authored-by: Test <test@test.com>
* fix: detect context length for custom model endpoints via fuzzy matching + config override
Custom model endpoints (non-OpenRouter, non-known-provider) were silently
falling back to 2M tokens when the model name didn't exactly match what the
endpoint's /v1/models reported. This happened because:
1. Endpoint metadata lookup used exact match only — model name mismatches
(e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss
2. Single-model servers (common for local inference) required exact name
match even though only one model was loaded
3. No user escape hatch to manually set context length
Changes:
- Add fuzzy matching for endpoint model metadata: single-model servers
use the only available model regardless of name; multi-model servers
try substring matching in both directions
- Add model.context_length config override (highest priority) so users
can explicitly set their model's context length in config.yaml
- Log an informative message when falling back to 2M probe, telling
users about the config override option
- Thread config_context_length through ContextCompressor and AIAgent init
Tests: 6 new tests covering fuzzy match, single-model fallback, config
override (including zero/None edge cases).
* fix: auto-detect local model name and context length for local servers
Cherry-picked from PR #2043 by sudoingX.
- Auto-detect model name from local server's /v1/models when only one
model is loaded (no manual model name config needed)
- Add n_ctx_train and n_ctx to context length detection keys for llama.cpp
- Query llama.cpp /props endpoint for actual allocated context (not just
training context from GGUF metadata)
- Strip .gguf suffix from display in banner and status bar
- _auto_detect_local_model() in runtime_provider.py for CLI init
Co-authored-by: sudo <sudoingx@users.noreply.github.com>
* fix: revert accidental summary_target_tokens change + add docs for context_length config
- Revert summary_target_tokens from 2500 back to 500 (accidental change
during patching)
- Add 'Context Length Detection' section to Custom & Self-Hosted docs
explaining model.context_length config override
---------
Co-authored-by: Test <test@test.com>
Co-authored-by: sudo <sudoingx@users.noreply.github.com>
The gateway approval system previously intercepted bare 'yes'/'no' text
from the user's next message to approve/deny dangerous commands. This was
fragile and dangerous — if the agent asked a clarify question and the user
said 'yes' to answer it, the gateway would execute the pending dangerous
command instead. (Fixes#1888)
Changes:
- Remove bare text matching ('yes', 'y', 'approve', 'ok', etc.) from
_handle_message approval check
- Add /approve and /deny as gateway-only slash commands in the command
registry
- /approve supports scoping: /approve (one-time), /approve session,
/approve always (permanent)
- Add 5-minute timeout for stale approvals
- Gateway appends structured instructions to the agent response when a
dangerous command is pending, telling the user exactly how to respond
- 9 tests covering approve, deny, timeout, scoping, and verification
that bare 'yes' no longer triggers execution
Credit to @solo386 and @FlyByNight69420 for identifying and reporting
this security issue in PR #1971 and issue #1888.
Co-authored-by: Test <test@test.com>
After #1675 removed ANTHROPIC_BASE_URL env var support, the Anthropic
provider base URL was hardcoded to https://api.anthropic.com. Now reads
model.base_url from config.yaml as an override, falling back to the
default when not set. Also applies to the auxiliary client.
Cherry-picked from PR #1949 by @rivercrab26.
Co-authored-by: rivercrab26 <rivercrab26@users.noreply.github.com>
Three bugs prevented providers like MiniMax from using their
Anthropic-compatible endpoints (e.g. api.minimax.io/anthropic):
1. _VALID_API_MODES was missing 'anthropic_messages', so explicit
api_mode config was silently rejected and defaulted to
chat_completions.
2. API-key provider resolution hardcoded api_mode to 'chat_completions'
without checking model config or detecting Anthropic-compatible URLs.
3. run_agent.py auto-detection only recognized api.anthropic.com, not
third-party endpoints using the /anthropic URL convention.
Fixes:
- Add 'anthropic_messages' to _VALID_API_MODES
- API-key providers now check model config api_mode and auto-detect
URLs ending in /anthropic
- run_agent.py and fallback logic detect /anthropic URL convention
- 5 new tests covering all scenarios
Users can now either:
- Set MINIMAX_BASE_URL=https://api.minimax.io/anthropic (auto-detected)
- Set api_mode: anthropic_messages in model config (explicit)
- Use custom_providers with api_mode: anthropic_messages
Co-authored-by: Test <test@test.com>
When provider: custom is set in config.yaml with base_url and api_key,
those values are now used instead of falling back to OPENAI_BASE_URL and
OPENAI_API_KEY env vars. Also reads the 'api' field as an alternative to
'api_key' for config compatibility.
Cherry-picked from PR #1762 by crazywriter1.
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
The previous copilot_model_api_mode() checked the catalog's
supported_endpoints first and picked /chat/completions when a model
supported both endpoints. This is wrong — GPT-5+ models should use
the Responses API even when the catalog lists both.
Replicate opencode's shouldUseCopilotResponsesApi() logic:
- GPT-5+ models (gpt-5.4, gpt-5.3-codex, etc.) → Responses API
- gpt-5-mini → Chat Completions (explicit exception)
- Everything else (gpt-4o, claude, gemini, etc.) → Chat Completions
- Model ID pattern is the primary signal, catalog is secondary
The catalog fallback now only matters for non-GPT-5 models that might
exclusively support /v1/messages (e.g. Claude via Copilot).
Models are auto-detected from the live catalog at
api.githubcopilot.com/models — no hardcoded list required for
supported models, only a static fallback for when the API is
unreachable.
Adds /statusbar (alias /sb) to show/hide the bottom status bar that
displays model name, context usage, and session duration.
Uses ConditionalContainer so the bar takes zero space when hidden
rather than leaving a blank line.
- Add anthropic/claude-haiku-4.5
- Move gpt-5.4-pro and gpt-5.4-nano to bottom
- Fix minimax/minimax-m2.7 → minimax-m2.5 (m2.7 not on OpenRouter)
- Tag hunter-alpha and healer-alpha as free
- Place hunter/healer-alpha right below gpt-5.4-mini
Builds on PR #1879's Copilot integration with critical auth improvements
modeled after opencode's implementation:
- Add hermes_cli/copilot_auth.py with:
- OAuth device code flow (copilot_device_code_login) using the same
client_id (Ov23li8tweQw6odWQebz) as opencode and Copilot CLI
- Token type validation: reject classic PATs (ghp_*) with a clear
error message explaining supported token types
- Proper env var priority: COPILOT_GITHUB_TOKEN > GH_TOKEN > GITHUB_TOKEN
(matching Copilot CLI documentation)
- copilot_request_headers() with Openai-Intent, x-initiator, and
Copilot-Vision-Request headers (matching opencode)
- Update auth.py:
- PROVIDER_REGISTRY copilot entry uses correct env var order
- _resolve_api_key_provider_secret delegates to copilot_auth for
the copilot provider with proper token validation
- Update models.py:
- copilot_default_headers() now includes Openai-Intent and x-initiator
- Update main.py:
- _model_flow_copilot offers OAuth device code login when no token
is found, with manual token entry as fallback
- Shows supported vs unsupported token types
- 22 new tests covering token validation, env var priority, header
generation, and integration with existing auth infrastructure
- Strip '_tools' suffix from internal toolset identifiers in the banner
(e.g. 'web_tools' -> 'web', 'homeassistant_tools' -> 'homeassistant')
- Stop appending '_tools' to unavailable toolset names
- Replace 6 hardcoded hex colors (#B8860B, #FFBF00, #FFF8DC) in toolset
rows, overflow line, and MCP server rows with the skin variables
(dim, accent, text) already resolved at the top of the function
Inspired by PR #1871 by @kshitijk4poor.
Adds 4 tests.
* fix: banner skill count now respects disabled skills and platform filtering
The banner's get_available_skills() was doing a raw rglob scan of
~/.hermes/skills/ without checking:
- Whether skills are disabled (skills.disabled config)
- Whether skills match the current platform (platforms: frontmatter)
This caused the banner to show inflated skill counts (e.g. '100 skills'
when many are disabled) and list macOS-only skills on Linux.
Fix: delegate to _find_all_skills() from tools/skills_tool which already
handles both platform gating and disabled-skill filtering.
* fix: system prompt and slash commands now respect disabled skills
Two more places where disabled skills were still surfaced:
1. build_skills_system_prompt() in prompt_builder.py — disabled skills
appeared in the <available_skills> system prompt section, causing
the agent to suggest/load them despite being disabled.
2. scan_skill_commands() in skill_commands.py — disabled skills still
registered as /skill-name slash commands in CLI help and could be
invoked.
Both now load _get_disabled_skill_names() and filter accordingly.
* fix: skill_view blocks disabled skills
skill_view() checked platform compatibility but not disabled state,
so the agent could still load and read disabled skills directly.
Now returns a clear error when a disabled skill is requested, telling
the user to enable it via hermes skills or inspect the files manually.
---------
Co-authored-by: Test <test@test.com>
Recognize hermes_cli/main.py gateway command lines in gateway
process detection and PID validation so --replace reliably finds
existing gateway instances.
Adds a regression test covering script-style cmdline detection.
Closes#1830
Add _wait_for_gateway_exit() that polls get_running_pid() to confirm
the old gateway process has actually exited before starting a new one.
If the process doesn't exit within 5s, sends SIGKILL to the specific
PID. Uses the saved PID from gateway.pid (not launchd labels) so it
works correctly with multiple gateway instances under separate
HERMES_HOME directories.
Applied to both launchd_restart() and the manual restart path (replaces
the blind time.sleep(2)).
Inspired by PR #1881 by @AzothZephyr (race condition diagnosis).
Adds 4 tests.
MiniMax: Add M2.7 and M2.7-highspeed as new defaults across provider
model lists, auxiliary client, metadata, setup wizard, RL training tool,
fallback tests, and docs. Retain M2.5/M2.1 as alternatives.
OpenRouter: Add grok-4.20-beta, nemotron-3-super-120b-a12b:free,
trinity-large-preview:free, glm-5-turbo, and hunter-alpha to the
model catalog.
MiniMax changes based on PR #1882 by @octo-patch (applied manually
due to stale conflicts in refactored pricing module).
Add first-class GitHub Copilot and Copilot ACP provider support across
model selection, runtime provider resolution, CLI sessions, delegated
subagents, cron jobs, and the Telegram gateway.
This also normalizes Copilot model catalogs and API modes, introduces a
Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by
resolving Homebrew-installed gh binaries under launchd.
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
The /browser command handler existed in cli.py but was never added to
COMMAND_REGISTRY after the centralized command registry refactor. This
meant:
- /browser didn't appear in /help
- No tab-completion or subcommand suggestions
- Dispatch used _base_word fallback instead of canonical resolution
Added CommandDef with connect/disconnect/status subcommands and
switched dispatch to use canonical instead of _base_word.
* feat: OpenAI-compatible API server platform adapter
Salvaged from PR #956, updated for current main.
Adds an HTTP API server as a gateway platform adapter that exposes
hermes-agent via the OpenAI Chat Completions and Responses APIs.
Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat,
AnythingLLM, NextChat, ChatBox, etc.) can connect by pointing at
http://localhost:8642/v1.
Endpoints:
- POST /v1/chat/completions — stateless Chat Completions API
- POST /v1/responses — stateful Responses API with chaining
- GET /v1/responses/{id} — retrieve stored response
- DELETE /v1/responses/{id} — delete stored response
- GET /v1/models — list hermes-agent as available model
- GET /health — health check
Features:
- Real SSE streaming via stream_delta_callback (uses main's streaming)
- In-memory LRU response store for Responses API conversation chaining
- Named conversations via 'conversation' parameter
- Bearer token auth (optional, via API_SERVER_KEY)
- CORS support for browser-based frontends
- System prompt layering (frontend system messages on top of core)
- Real token usage tracking in responses
Integration points:
- Platform.API_SERVER in gateway/config.py
- _create_adapter() branch in gateway/run.py
- API_SERVER_* env vars in hermes_cli/config.py
- Env var overrides in gateway/config.py _apply_env_overrides()
Changes vs original PR #956:
- Removed streaming infrastructure (already on main via stream_consumer.py)
- Removed Telegram reply_to_mode (separate feature, not included)
- Updated _resolve_model() -> _resolve_gateway_model()
- Updated stream_callback -> stream_delta_callback
- Updated connect()/disconnect() to use _mark_connected()/_mark_disconnected()
- Adapted to current Platform enum (includes MATTERMOST, MATRIX, DINGTALK)
Tests: 72 new tests, all passing
Docs: API server guide, Open WebUI integration guide, env var reference
* feat(whatsapp): make reply prefix configurable via config.yaml
Reworked from PR #1764 (ifrederico) to use config.yaml instead of .env.
The WhatsApp bridge prepends a header to every outgoing message.
This was hardcoded to '⚕ *Hermes Agent*'. Users can now customize
or disable it via config.yaml:
whatsapp:
reply_prefix: '' # disable header
reply_prefix: '🤖 *My Bot*\n───\n' # custom prefix
How it works:
- load_gateway_config() reads whatsapp.reply_prefix from config.yaml
and stores it in PlatformConfig.extra['reply_prefix']
- WhatsAppAdapter reads it from config.extra at init
- When spawning bridge.js, the adapter passes it as
WHATSAPP_REPLY_PREFIX in the subprocess environment
- bridge.js handles undefined (default), empty (no header),
or custom values with \\n escape support
- Self-chat echo suppression uses the configured prefix
Also fixes _config_version: was 9 but ENV_VARS_BY_VERSION had a
key 10 (TAVILY_API_KEY), so existing users at v9 would never be
prompted for Tavily. Bumped to 10 to close the gap. Added a
regression test to prevent this from happening again.
Credit: ifrederico (PR #1764) for the bridge.js implementation
and the config version gap discovery.
---------
Co-authored-by: Test <test@test.com>
The OpenCode Zen and OpenCode Go setup sections used prompt_text()
which is undefined. All other providers correctly use the local
prompt() function defined in setup.py. Fixes crash during
'hermes setup' when selecting either OpenCode provider.
- Add summary_base_url config option to compression block for custom
OpenAI-compatible endpoints (e.g. zai, DeepSeek, Ollama)
- Remove compression env var bridges from cli.py and gateway/run.py
(CONTEXT_COMPRESSION_* env vars no longer set from config)
- Switch run_agent.py to read compression config directly from
config.yaml instead of env vars
- Fix backwards-compat block in _resolve_task_provider_model to also
fire when auxiliary.compression.provider is 'auto' (DEFAULT_CONFIG
sets this, which was silently preventing the compression section's
summary_* keys from being read)
- Add test for summary_base_url config-to-client flow
- Update docs to show compression as config.yaml-only
Closes#1591
Based on PR #1702 by @uzaylisak
Four small fixes:
1. model_tools.py: Tool import failures logged at WARNING instead of
DEBUG. If a tool module fails to import (syntax error, missing dep),
the user now sees a warning instead of the tool silently vanishing.
2. hermes_cli/config.py: Remove duplicate 'import sys' (lines 19, 21).
3. agent/model_metadata.py: Remove 6 duplicate entries in
DEFAULT_CONTEXT_LENGTHS dict. Python keeps the last value, so no
functional change, but removes maintenance confusion.
4. hermes_state.py: Add missing self._lock to the LIKE query in
resolve_session_id(). The exact-match path used get_session()
(which locks internally), but the prefix fallback queried _conn
without the lock.
Salvage of PR #1707 by @kshitijk4poor (cherry-picked with authorship preserved).
Adds Tavily as a third web backend alongside Firecrawl and Parallel, using the Tavily REST API via httpx.
- Backend selection via hermes tools → saved as web.backend in config.yaml
- All three tools supported: search, extract, crawl
- TAVILY_API_KEY in config registry, doctor, status, setup wizard
- 15 new Tavily tests + 9 backend selection tests + 5 config tests
- Backward compatible
Closes#1707
* feat(web): add Parallel as alternative web search/extract backend
Adds Parallel (parallel.ai) as a drop-in alternative to Firecrawl for
web_search and web_extract tools using the official parallel-web SDK.
- Backend selection via WEB_SEARCH_BACKEND env var (auto/parallel/firecrawl)
- Auto mode prefers Firecrawl when both keys present; Parallel when sole backend
- web_crawl remains Firecrawl-only with clear error when unavailable
- Lazy SDK imports, interrupt support, singleton clients
- 16 new unit tests for backend selection and client config
Co-authored-by: s-jag <s-jag@users.noreply.github.com>
* fix: add PARALLEL_API_KEY to config registry and fix web_crawl policy tests
Follow-up for Parallel backend integration:
- Add PARALLEL_API_KEY to OPTIONAL_ENV_VARS (hermes doctor, env blocklist)
- Add to set_config_value api_keys list (hermes config set)
- Add to doctor keys display
- Fix 2 web_crawl policy tests that didn't set FIRECRAWL_API_KEY
(needed now that web_crawl has a Firecrawl availability guard)
* refactor: explicit backend selection via hermes tools, not auto-detect
Replace the auto-detect backend selection with explicit user choice:
- hermes tools saves WEB_SEARCH_BACKEND to .env when user picks a provider
- _get_backend() reads the explicit choice first
- Fallback only for manual/legacy config (uses whichever key is present)
- _is_provider_active() shows [active] for the selected web backend
- Updated tests, docs, and .env.example to remove 'auto' mode language
* refactor: use config.yaml for web backend, not env var
Match the TTS/browser pattern — web.backend is stored in config.yaml
(set by hermes tools), not as a WEB_SEARCH_BACKEND env var.
- _load_web_config() reads web: section from config.yaml
- _get_backend() reads web.backend from config, falls back to key detection
- _configure_provider() saves to config dict (saved to config.yaml)
- _is_provider_active() reads from config dict
- Removed WEB_SEARCH_BACKEND from .env.example, set_config_value, docs
- Updated all tests to mock _load_web_config instead of env vars
---------
Co-authored-by: s-jag <s-jag@users.noreply.github.com>
The function uses subprocess.run() and subprocess.CalledProcessError but
never imported the module. This caused a NameError crash during setup
when users selected NeuTTS as their TTS provider.
Fixes#1698
Add the ability to selectively enable/disable individual MCP server
tools through the interactive 'hermes tools' TUI.
Changes:
- tools/mcp_tool.py: Add probe_mcp_server_tools() — lightweight function
that temporarily connects to configured MCP servers, discovers their
tools (names + descriptions), and disconnects. No registry side effects.
- hermes_cli/tools_config.py: Add 'Configure MCP tools' option to the
interactive menu. When selected:
1. Probes all enabled MCP servers for their available tools
2. Shows a per-server curses checklist with tool descriptions
3. Pre-selects tools based on existing include/exclude config
4. Writes changes back as tools.exclude entries in config.yaml
5. Reports which servers failed to connect
The existing CLI commands (hermes tools enable/disable server:tool)
continue to work unchanged. This adds the interactive TUI counterpart
so users can browse and toggle MCP tools visually.
Tests: 22 new tests covering probe function edge cases and interactive
flow (pre-selection, exclude/include modes, description truncation,
multi-server handling, error paths).
fetch_nous_models() uses keyword-only parameters (the * separator in
its signature), but models.py called it with positional args and in
the wrong order (api_key first, base_url second). This always raised
TypeError, silently caught by except Exception: pass.
Result: Nous provider model list was completely broken — /model
autocomplete and provider_model_ids('nous') always fell back to the
static model catalog instead of fetching live models.
Adds both platforms to the config system so hermes setup, hermes doctor,
and hermes config properly discover and manage their env vars.
- MATTERMOST_URL, MATTERMOST_TOKEN, MATTERMOST_ALLOWED_USERS
- MATRIX_HOMESERVER, MATRIX_ACCESS_TOKEN, MATRIX_USER_ID, MATRIX_ALLOWED_USERS
- Extra env keys for .env sanitizer: MATTERMOST_HOME_CHANNEL,
MATTERMOST_REPLY_MODE, MATRIX_PASSWORD, MATRIX_ENCRYPTION, MATRIX_HOME_ROOM
Add support for Mattermost (self-hosted Slack alternative) and Matrix
(federated messaging protocol) as messaging platforms.
Mattermost adapter:
- REST API v4 client for posts, files, channels, typing indicators
- WebSocket listener for real-time 'posted' events with reconnect backoff
- Thread support via root_id
- File upload/download with auth-aware caching
- Dedup cache (5min TTL, 2000 entries)
- Full self-hosted instance support
Matrix adapter:
- matrix-nio AsyncClient with sync loop
- Dual auth: access token or user_id + password
- Optional E2EE via matrix-nio[e2e] (libolm)
- Thread support via m.thread (MSC3440)
- Reply support via m.in_reply_to with fallback stripping
- Media upload/download via mxc:// URLs (authenticated v1.11+ endpoint)
- Auto-join on room invite
- DM detection via m.direct account data with sync fallback
- Markdown to HTML conversion
Fixes applied over original PR #1225 by @cyb0rgk1tty:
- Mattermost: add timeout to file downloads, wrap API helpers in
try/except for network errors, download incoming files immediately
with auth headers instead of passing auth-required URLs
- Matrix: use authenticated media endpoint (/_matrix/client/v1/media/),
robust m.direct cache with sync fallback, prefer aiohttp over httpx
Install Matrix support: pip install 'hermes-agent[matrix]'
Mattermost needs no extra deps (uses aiohttp).
Salvaged from PR #1225 by @cyb0rgk1tty with fixes.
* feat(gateway): add DingTalk platform adapter
Add DingTalk as a messaging platform using the dingtalk-stream SDK
for real-time message reception via Stream Mode (no webhook needed).
Replies are sent via session webhook using markdown format.
Features:
- Stream Mode connection (long-lived WebSocket, no public URL needed)
- Text and rich text message support
- DM and group chat support
- Message deduplication with 5-minute window
- Auto-reconnection with exponential backoff
- Session webhook caching for reply routing
Configuration:
export DINGTALK_CLIENT_ID=your-app-key
export DINGTALK_CLIENT_SECRET=your-app-secret
# or in config.yaml:
platforms:
dingtalk:
enabled: true
extra:
client_id: your-app-key
client_secret: your-app-secret
Files:
- gateway/platforms/dingtalk.py (340 lines) — adapter implementation
- gateway/config.py — add DINGTALK to Platform enum
- gateway/run.py — add DingTalk to _create_adapter
- hermes_cli/config.py — add env vars to _EXTRA_ENV_KEYS
- hermes_cli/tools_config.py — add dingtalk to PLATFORMS
- tests/gateway/test_dingtalk.py — 21 tests
* docs: add Alibaba Cloud and DingTalk to setup wizard and docs
Wire Alibaba Cloud (DashScope) into hermes setup and hermes model
provider selection flows. Add DingTalk env vars to documentation.
Changes:
- setup.py: Add Alibaba Cloud as provider choice (index 11) with
DASHSCOPE_API_KEY prompt and model studio link
- main.py: Add alibaba to provider_labels, providers list, and
model flow dispatch
- environment-variables.md: Add DASHSCOPE_API_KEY, DINGTALK_CLIENT_ID,
DINGTALK_CLIENT_SECRET, and alibaba to HERMES_INFERENCE_PROVIDER
- Default enabled: false (zero overhead when not configured)
- Fast path: cached disabled state skips all work immediately
- TTL cache (30s) for parsed policy — avoids re-reading config.yaml
on every URL check
- Missing shared files warn + skip instead of crashing all web tools
- Lazy yaml import — missing PyYAML doesn't break browser toolset
- Guarded browser_tool import — fail-open lambda fallback
- check_website_access never raises for default path (fail-open with
warning log); only raises with explicit config_path (test mode)
- Simplified enforcement code in web_tools/browser_tool — no more
try/except wrappers since errors are handled internally
Add DingTalk as a messaging platform using the dingtalk-stream SDK
for real-time message reception via Stream Mode (no webhook needed).
Replies are sent via session webhook using markdown format.
Features:
- Stream Mode connection (long-lived WebSocket, no public URL needed)
- Text and rich text message support
- DM and group chat support
- Message deduplication with 5-minute window
- Auto-reconnection with exponential backoff
- Session webhook caching for reply routing
Configuration:
export DINGTALK_CLIENT_ID=your-app-key
export DINGTALK_CLIENT_SECRET=your-app-secret
# or in config.yaml:
platforms:
dingtalk:
enabled: true
extra:
client_id: your-app-key
client_secret: your-app-secret
Files:
- gateway/platforms/dingtalk.py (340 lines) — adapter implementation
- gateway/config.py — add DINGTALK to Platform enum
- gateway/run.py — add DingTalk to _create_adapter
- hermes_cli/config.py — add env vars to _EXTRA_ENV_KEYS
- hermes_cli/tools_config.py — add dingtalk to PLATFORMS
- tests/gateway/test_dingtalk.py — 21 tests
ANTHROPIC_BASE_URL collides with Claude Code and other Anthropic
tooling. Remove it from the Anthropic provider — base URL overrides
should go through config.yaml model.base_url instead.
The Alibaba/DashScope provider has its own dedicated base URL and
API key env vars which don't collide with anything.
Add display.theme_mode setting (auto/light/dark) that makes the CLI
readable on light terminal backgrounds.
- Auto-detect terminal background via COLORFGBG, OSC 11, and macOS
appearance (fallback chain in hermes_cli/colors.py)
- Add colors_light overrides to all 7 built-in skins with dark/readable
colors for light backgrounds
- SkinConfig.get_color() now returns light overrides when theme is light
- get_prompt_toolkit_style_overrides() uses light bg colors for
completion menus in light mode
- init_skin_from_config() reads display.theme_mode from config
- 7 new tests covering theme mode resolution, detection fallbacks,
and light-mode skin overrides
Salvaged from PR #1187 by @peteromallet. Core design preserved;
adapted to current main (kept all existing helpers, tool_emojis,
convenience functions that were added after the PR branched).
Co-authored-by: Peter O'Mallet <peteromallet@users.noreply.github.com>
Add Alibaba Cloud (DashScope) as a first-class inference provider
using the Anthropic-compatible endpoint. This gives access to Qwen
models (qwen3.5-plus, qwen3-max, qwen3-coder-plus, etc.) through
the same api_mode as native Anthropic.
Also add ANTHROPIC_BASE_URL env var support so users can point the
Anthropic provider at any compatible endpoint.
Changes:
- auth.py: Add alibaba ProviderConfig + ANTHROPIC_BASE_URL on anthropic
- models.py: Add alibaba to catalog, labels, aliases (dashscope/aliyun/qwen), provider order
- runtime_provider.py: Add alibaba resolution (anthropic_messages api_mode) + ANTHROPIC_BASE_URL
- model_metadata.py: Add Qwen model context lengths (128K)
- config.py: Add DASHSCOPE_API_KEY, DASHSCOPE_BASE_URL, ANTHROPIC_BASE_URL env vars
Usage:
hermes --provider alibaba --model qwen3.5-plus
# or via aliases:
hermes --provider qwen --model qwen3-max
Add Kilo Gateway (kilo.ai) as an API-key provider with OpenAI-compatible
endpoint at https://api.kilo.ai/api/gateway. Supports 500+ models from
Anthropic, OpenAI, Google, xAI, Mistral, MiniMax via a single API key.
- Register kilocode in PROVIDER_REGISTRY with aliases (kilo, kilo-code,
kilo-gateway) and KILOCODE_API_KEY / KILOCODE_BASE_URL env vars
- Add to model catalog, CLI provider menu, setup wizard, doctor checks
- Add google/gemini-3-flash-preview as default aux model
- 12 new tests covering registration, aliases, credential resolution,
runtime config
- Documentation updates (env vars, config, fallback providers)
- Fix setup test index shift from provider insertion
Inspired by PR #1473 by @amanning3390.
Co-authored-by: amanning3390 <amanning3390@users.noreply.github.com>
Docker terminal sessions are secret-dark by default. This adds
terminal.docker_forward_env as an explicit allowlist for env vars
that may be forwarded into Docker containers.
Values resolve from the current shell first, then fall back to
~/.hermes/.env. Only variables the user explicitly lists are
forwarded — nothing is auto-exposed.
Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto
current main.
Fixes#1436
Supersedes #1439
Remove the optional skill (redundant now that NeuTTS is a built-in TTS
provider). Replace neutts_cli dependency with a standalone synthesis
helper (tools/neutts_synth.py) that calls the neutts Python API directly
in a subprocess.
Add TTS provider selection to hermes setup:
- 'hermes setup' now prompts for TTS provider after model selection
- 'hermes setup tts' available as standalone section
- Selecting NeuTTS checks for deps and offers to install:
espeak-ng (system) + neutts[all] (pip)
- ElevenLabs/OpenAI selections prompt for API keys
- Tool status display shows NeuTTS install state
Changes:
- Remove optional-skills/mlops/models/neutts/ (skill + CLI scaffold)
- Add tools/neutts_synth.py (standalone synthesis subprocess helper)
- Move jo.wav/jo.txt to tools/neutts_samples/ (bundled default voice)
- Refactor _generate_neutts() — uses neutts API via subprocess, no
neutts_cli dependency, config-driven ref_audio/ref_text/model/device
- Add TTS setup to hermes_cli/setup.py (SETUP_SECTIONS, tool status)
- Update config.py defaults (ref_audio, ref_text, model, device)
* fix: prevent infinite 400 failure loop on context overflow (#1630)
When a gateway session exceeds the model's context window, Anthropic may
return a generic 400 invalid_request_error with just 'Error' as the
message. This bypassed the phrase-based context-length detection,
causing the agent to treat it as a non-retryable client error. Worse,
the failed user message was still persisted to the transcript, making
the session even larger on each attempt — creating an infinite loop.
Three-layer fix:
1. run_agent.py — Fallback heuristic: when a 400 error has a very short
generic message AND the session is large (>40% of context or >80
messages), treat it as a probable context overflow and trigger
compression instead of aborting.
2. run_agent.py + gateway/run.py — Don't persist failed messages:
when the agent returns failed=True before generating any response,
skip writing the user's message to the transcript/DB. This prevents
the session from growing on each failure.
3. gateway/run.py — Smarter error messages: detect context-overflow
failures and suggest /compact or /reset specifically, instead of a
generic 'try again' that will fail identically.
* fix(skills): detect prompt injection patterns and block cache file reads
Adds two security layers to prevent prompt injection via skills hub
cache files (#1558):
1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory
(index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json
was the original injection vector — untrusted skill descriptions
in the catalog contained adversarial text that the model executed.
2. skill_view: warns when skills are loaded from outside the trusted
~/.hermes/skills/ directory, and detects common injection patterns
in skill content ("ignore previous instructions", "<system>", etc.).
Cherry-picked from PR #1562 by ygd58.
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552)
Long messages sent via send_message tool or cron delivery silently
failed when exceeding platform limits. Gateway adapters handle this
via truncate_message(), but the standalone senders in send_message_tool
bypassed that entirely.
- Apply truncate_message() chunking in _send_to_platform() before
dispatching to individual platform senders
- Remove naive message[i:i+2000] character split in _send_discord()
in favor of centralized smart splitting
- Attach media files to last chunk only for Telegram
- Add regression tests for chunking and media placement
Cherry-picked from PR #1557 by llbn.
* fix(approval): show full command in dangerous command approval (#1553)
Previously the command was truncated to 80 chars in CLI (with a
[v]iew full option), 500 chars in Discord embeds, and missing entirely
in Telegram/Slack approval messages. Now the full command is always
displayed everywhere:
- CLI: removed 80-char truncation and [v]iew full menu option
- Gateway (TG/Slack): approval_required message includes full command
in a code block
- Discord: embed shows full command up to 4096-char limit
- Windows: skip SIGALRM-based test timeout (Unix-only)
- Updated tests: replaced view-flow tests with direct approval tests
Cherry-picked from PR #1566 by crazywriter1.
* fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624)
The interrupt polling loop in chat() waited on the queue without
invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy
buffer only flushed on input events, causing the CLI to appear frozen
during tool execution until the user typed a key.
Fix: call _invalidate() on each queue timeout (every ~100ms, throttled
to 150ms) to force the renderer to flush buffered agent output.
* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580)
When --migrate-secrets is not passed (the default), API keys like
OPENROUTER_API_KEY are silently skipped with no warning. Users don't
realize their keys weren't migrated until the agent fails to connect.
Add a post-migration warning with actionable instructions: either
re-run with --migrate-secrets or add the key manually via
hermes config set.
Cherry-picked from PR #1593 by ygd58.
* fix(security): block sandbox backend creds from subprocess env (#1264)
Add Modal and Daytona sandbox credentials to the subprocess env
blocklist so they're not leaked to agent terminal sessions via
printenv/env.
Cherry-picked from PR #1571 by ygd58.
* fix(gateway): cap interrupt recursion depth to prevent resource exhaustion (#816)
When a user sends multiple messages while the agent keeps failing,
_run_agent() calls itself recursively with no depth limit. This can
exhaust stack/memory if the agent is in a failure loop.
Add _MAX_INTERRUPT_DEPTH = 3. When exceeded, the pending message is
logged and the current result is returned instead of recursing deeper.
The log handler duplication bug described in #816 was already fixed
separately (AIAgent.__init__ deduplicates handlers).
* fix(gateway): /model shows active fallback model instead of config default (#1615)
When the agent falls back to a different model (e.g. due to rate
limiting), /model still showed the config default. Now tracks the
effective model/provider after each agent run and displays it.
Cleared when the primary model succeeds again or the user explicitly
switches via /model.
Cherry-picked from PR #1616 by MaxKerkula. Added hasattr guard for
test compatibility.
* feat(gateway): inject reply-to message context for out-of-session replies (#1594)
When a user replies to a Telegram message, check if the quoted text
exists in the current session transcript. If missing (from cron jobs,
background tasks, or old sessions), prepend [Replying to: "..."] to
the message so the agent has context about what's being referenced.
- Add reply_to_text field to MessageEvent (base.py)
- Populate from Telegram's reply_to_message (text or caption)
- Inject context in _handle_message when not found in history
Based on PR #1596 by anpicasso (cherry-picked reply-to feature only,
excluded unrelated /server command and background delegation changes).
* fix: recognize Claude Code OAuth credentials in startup gate (#1455)
The _has_any_provider_configured() startup check didn't look for
Claude Code OAuth credentials (~/.claude/.credentials.json). Users
with only Claude Code auth got the setup wizard instead of starting.
Cherry-picked from PR #1455 by kshitijk4poor.
---------
Co-authored-by: buray <ygd58@users.noreply.github.com>
Co-authored-by: lbn <llbn@users.noreply.github.com>
Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>
Co-authored-by: Max K <MaxKerkula@users.noreply.github.com>
Co-authored-by: Angello Picasso <angello.picasso@devsu.com>
Co-authored-by: kshitij <kshitijk4poor@users.noreply.github.com>
* fix(security): harden terminal safety and sandbox file writes
Two security improvements:
1. Dangerous command detection: expand shell -c pattern to catch
combined flags (bash -lc, bash -ic, ksh -c) that were previously
undetected. Pattern changed from matching only 'bash -c' to
matching any shell invocation with -c anywhere in the flags.
2. File write sandboxing: add HERMES_WRITE_SAFE_ROOT env var that
constrains all write_file/patch operations to a configured directory
tree. Opt-in — when unset, behavior is unchanged. Useful for
gateway/messaging deployments that should only touch a workspace.
Based on PR #1085 by ismoilh.
* fix: correct "POSIDEON" typo to "POSEIDON" in banner ASCII art
The poseidon skin's banner_logo had the E and I letters swapped,
spelling "POSIDEON-AGENT" instead of "POSEIDON-AGENT".
---------
Co-authored-by: ismoilh <ismoilh@users.noreply.github.com>
Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
* feat(skills): add bundled neutts optional skill
Add NeuTTS optional skill with CLI scaffold, bootstrap helper, and
sample voice profile. Also fixes skills_hub.py to handle binary
assets (WAV files) during skill installation.
Changes:
- optional-skills/mlops/models/neutts/ — skill + CLI scaffold
- tools/skills_hub.py — binary asset support (read_bytes, write_bytes)
- tests/tools/test_skills_hub.py — regression tests for binary assets
* feat(tts): add NeuTTS as local TTS provider backend
Add NeuTTS as a fourth TTS provider option alongside Edge, ElevenLabs,
and OpenAI. NeuTTS runs fully on-device via neutts_cli — no API key
needed.
Provider behavior:
- Explicit: set tts.provider to 'neutts' in config.yaml
- Fallback: when Edge TTS is unavailable and neutts_cli is installed,
automatically falls back to NeuTTS instead of failing
- check_tts_requirements() now includes NeuTTS in availability checks
NeuTTS outputs WAV natively. For Telegram voice bubbles, ffmpeg
converts to Opus (same pattern as Edge TTS).
Changes:
- tools/tts_tool.py — _generate_neutts(), _check_neutts_available(),
provider dispatch, fallback logic, Opus conversion
- hermes_cli/config.py — tts.neutts config defaults
---------
Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>
Remove HERMES_API_MODE env var. api_mode is now configured where the
endpoint is defined:
- model.api_mode in config.yaml (for the active model config)
- custom_providers[].api_mode (for named custom providers)
Replace _get_configured_api_mode() with _parse_api_mode() which just
validates a value against the whitelist without reading env vars.
Both paths (model config and named custom providers) now read api_mode
from their respective config entries rather than a global override.