hermes-agent-features

Author	SHA1	Message	Date
Teknium	72250b5f62	feat: config-gated /verbose command for messaging gateway (#3262) * feat: config-gated /verbose command for messaging gateway Add gateway_config_gate field to CommandDef, allowing cli_only commands to be conditionally available in the gateway based on a config value. - CommandDef gains gateway_config_gate: str | None (a config dotpath that, when truthy, overrides cli_only for gateway surfaces) - /verbose uses gateway_config_gate='display.tool_progress_command' - Default is off (cli_only behavior preserved) - When enabled, /verbose cycles tool_progress mode (off/new/all/verbose) in the gateway, saving to config.yaml, the same cycle as the CLI - Gateway helpers (help, Telegram menus, Slack mapping) dynamically check config to include/exclude config-gated commands - GATEWAY_KNOWN_COMMANDS always includes config-gated commands so the gateway recognizes them and can respond appropriately - Handles YAML 1.1 bool coercion (bare 'off' parses as False) - 8 new tests for the config gate mechanism + gateway handler * docs: document gateway_config_gate and /verbose messaging support - AGENTS.md: add gateway_config_gate to CommandDef fields - slash-commands.md: note /verbose can be enabled for messaging, update Notes - configuration.md: add tool_progress_command to display section + usage note - cli.md: cross-link to config docs for messaging enablement - messaging/index.md: show tool_progress_command in config snippet - plugins.md: add gateway_config_gate to register_command parameter table	2026-03-26 14:41:04 -07:00
Teknium	db241ae6ce	feat(sessions): add --source flag for third-party session isolation (#3255) When third-party tools (Paperclip orchestrator, etc.) spawn hermes chat as a subprocess, their sessions pollute user session history and search. - hermes chat --source <tag> (also HERMES_SESSION_SOURCE env var) - exclude_sources parameter on list_sessions_rich() and search_messages() - Sessions with source=tool hidden from sessions list/browse/search - Third-party adapters pass --source tool to isolate agent sessions Cherry-picked from PR #3208 by HenkDz. Co-authored-by: Henkey <noonou7@gmail.com>	2026-03-26 14:35:31 -07:00
Teknium	41ee207a5e	fix: catch KeyboardInterrupt in exit cleanup handlers (#3257) except Exception does not catch KeyboardInterrupt (which inherits from BaseException). A second Ctrl+C during exit cleanup aborts pending writes: Honcho observations dropped, SQLite sessions left unclosed, cron job sessions never marked ended. Changed to except (Exception, KeyboardInterrupt) at all five sites: - cli.py: honcho.shutdown() and end_session() in finally exit block - run_agent.py: _flush_honcho_on_exit atexit handler - cron/scheduler.py: end_session() and close() in job finally block Tests exercise the actual production code paths and confirm KeyboardInterrupt propagates without the fix. Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-26 14:34:31 -07:00
Teknium	e9e7fb0683	fix(gateway): track background task references in GatewayRunner (#3254 ) Asyncio tasks created with create_task() but never stored can be garbage collected mid-execution. Add self._background_tasks set to hold references, with add_done_callback cleanup. Tracks: - /background command task - session-reset memory flush task - session-resume memory flush task Cancel all pending tasks in stop(). Update test fixtures that construct GatewayRunner via object.__new__() to include the new _background_tasks attribute. Cherry-picked from PR #3167 by memosr. The original PR also deleted the DM topic auto-skill loading code — that deletion was excluded from this salvage as it removes a shipped feature (#2598). Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>	2026-03-26 14:33:48 -07:00
Teknium	76ed15dd4d	fix(security): normalize input before dangerous command detection (#3260 ) detect_dangerous_command() ran regex patterns against raw command strings without normalization, allowing bypass via Unicode fullwidth chars, ANSI escape codes, null bytes, and 8-bit C1 controls. Adds _normalize_command_for_detection() that: - Strips ANSI escapes using the full ECMA-48 strip_ansi() from tools/ansi_strip (CSI, OSC, DCS, 8-bit C1, nF sequences) - Removes null bytes - Normalizes Unicode via NFKC (fullwidth Latin → ASCII, etc.) Includes 12 regression tests covering fullwidth, ANSI, C1, null byte, and combined obfuscation bypasses. Salvaged from PR #3089 by thakoreh — improved ANSI stripping to use existing comprehensive strip_ansi() instead of a weaker hand-rolled regex, and added test coverage. Co-authored-by: Hiren <hiren.thakore58@gmail.com>	2026-03-26 14:33:18 -07:00
Teknium	a8e02c7d49	fix: align Nous Portal model slugs with OpenRouter naming (#3253) Nous Portal now passes through OpenRouter model names and routes from there. Update the static fallback model list and auxiliary client default to use OpenRouter-format slugs (provider/model) instead of bare names. - _PROVIDER_MODELS['nous']: full OpenRouter catalog - _NOUS_MODEL: google/gemini-3-flash-preview (was gemini-3-flash) - Updated 4 test assertions for the new default model name	2026-03-26 13:49:43 -07:00
Teknium	b81d49dc45	fix(state): SQLite concurrency hardening + session transcript integrity (#3249) * fix(session-db): survive CLI/gateway concurrent write contention Closes #3139 Four layered fixes for the scenario where CLI and gateway write to state.db concurrently, causing create_session() to fail with 'database is locked' and permanently disabling session_search on the gateway side. 1. Increase SQLite connection timeout: 10s -> 30s hermes_state.py: longer window for the WAL writer to finish a batch flush before the other process gives up entirely. 2. INSERT OR IGNORE in create_session hermes_state.py: prevents IntegrityError on duplicate session IDs (e.g. gateway restarts while CLI session is still alive). 3. Don't null out _session_db on create_session failure (main fix) run_agent.py: a transient lock at agent startup must not permanently disable session_search for the lifetime of that agent instance. _session_db now stays alive so subsequent flushes and searches work once the lock clears. 4. New ensure_session() helper + call it during flush hermes_state.py: INSERT OR IGNORE for a minimal session row. run_agent.py _flush_messages_to_session_db: calls ensure_session() before appending messages, so the FK constraint is satisfied even when create_session() failed at startup. No-op when the row exists. * fix(state): release lock between context queries in search_messages The context-window queries (one per FTS5 match) were running inside the same lock acquisition as the primary FTS5 query, holding the lock for O(N) sequential SQLite round-trips. Move per-match context fetches outside the outer lock block so each acquires the lock independently, keeping critical sections short and allowing other threads to interleave. * fix(session): prefer longer source in load_transcript to prevent legacy truncation When a long-lived session pre-dates SQLite storage (e.g. sessions created before the DB layer was introduced, or after a clean deployment that reset the DB), _flush_messages_to_session_db only writes the new messages from the current turn to SQLite; it skips messages already present in conversation_history, assuming they are already persisted. That assumption fails for legacy JSONL-only sessions: Turn N (first after DB migration): load_transcript(id) → SQLite: 0 → falls back to JSONL: 994 ✓ _flush_messages_to_session_db: skip first 994, write 2 new → SQLite: 2 Turn N+1: load_transcript(id) → SQLite: 2 → returns immediately ✗ Agent sees 2 messages of history instead of 996 The same pattern causes the reported symptom: session JSON truncated to 4 messages (_save_session_log writes agent.messages which only has 2 history + 2 new = 4). Fix: always load both sources and return whichever is longer. For a fully-migrated session SQLite will always be ≥ JSONL, so there is no regression. For a legacy session that hasn't been bootstrapped yet, JSONL wins and the full history is restored. Closes #3212 * test: add load_transcript source preference tests for #3212 Covers: JSONL longer returns JSONL, SQLite longer returns SQLite, SQLite empty falls back to JSONL, both empty returns empty, equal length prefers SQLite (richer reasoning fields). --------- Co-authored-by: Mibayy <mibayy@hermes.ai> Co-authored-by: kewe63 <kewe.3217@gmail.com> Co-authored-by: Mibayy <mibayy@users.noreply.github.com>	2026-03-26 13:47:14 -07:00
Teknium	b7b3294c4a	fix(skills): preserve trust for skills-sh identifiers + reduce resolution churn (#3251 ) * fix(skills): reduce skills.sh resolution churn and preserve trust for wrapped identifiers - Accept common skills.sh prefix typos (skils-sh/, skils.sh/) - Strip skills-sh/ prefix in _resolve_trust_level() so trusted repos stay trusted when installed through skills.sh - Use resolved identifier (from bundle/meta) for scan_skill source - Prefer tree search before root scan in _discover_identifier() - Add _resolve_github_meta() consolidation for inspect flow Cherry-picked from PR #3001 by kshitijk4poor. * fix: restore candidate loop in SkillsShSource.fetch() for consistency The cherry-picked PR only tried the first candidate identifier in fetch() while inspect() (via _resolve_github_meta) tried all four. This meant skills at repo/skills/path would be found by inspect but missed by fetch, forcing it through the heavier _discover_identifier flow. Restore the candidate loop so both paths behave identically. Updated the test assertion to match. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-26 13:40:21 -07:00
Teknium	62f8aa9b03	fix: MCP toolset resolution for runtime and config (#3252 ) Gateway sessions had their own inline toolset resolution that only read platform_toolsets from config, which never includes MCP server names. MCP tools were discovered and registered but invisible to the model. - Replace duplicated gateway toolset resolution in _run_agent() and _run_background_task() with calls to the shared _get_platform_tools() - Extend _get_platform_tools() to include globally enabled MCP servers at runtime (include_default_mcp_servers=True), while config-editing flows use include_default_mcp_servers=False to avoid persisting implicit MCP defaults into platform_toolsets - Add homeassistant to PLATFORMS dict (was missing, caused KeyError) - Fix CLI entry point to use _get_platform_tools() as well, so MCP tools are visible in CLI mode too - Remove redundant platform_key reassignment in _run_background_task Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-26 13:39:41 -07:00
Teknium	c6fe75e99b	fix(gateway): fingerprint full auth token in agent cache signature (#3247) Previously _agent_config_signature() used only the first 8 characters of the API key, which caused false cache hits for JWT/OAuth tokens that share a common prefix (e.g. 'eyJhbGci'). This led to cross-account cache collisions when switching OAuth accounts in multi-user gateway deployments. Replace the 8-char prefix with a SHA-256 hash of the full key so the signature is unique per credential while keeping secrets out of the cache key. Salvaged from PR #3117 by EmpireOperating. Co-authored-by: EmpireOperating <EmpireOperating@users.noreply.github.com>	2026-03-26 13:19:43 -07:00
Teknium	36af1f3baf	feat(telegram): Private Chat Topics with functional skill binding (#2598 ) Salvages PR #3005 by web3blind. Cherry-picked onto current main with functional skill binding and docs added. - DM topic creation via createForumTopic (Bot API 9.4, Feb 2026) - Config-driven topics with thread_id persistence across restarts - Session isolation via existing build_session_key thread_id support - auto_skill field on MessageEvent for topic-skill bindings - Gateway auto-loads bound skill on new sessions (same as /skill commands) - Docs: full Private Chat Topics section in Telegram messaging guide - 20 tests (17 original + 3 for auto_skill) Closes #2598 Co-authored-by: web3blind <web3blind@users.noreply.github.com>	2026-03-26 02:04:11 -07:00
Teknium	0426bb745f	fix: reset default SOUL.md to baseline identity text (#3159) The default SOUL.md seeded for new users should match DEFAULT_AGENT_IDENTITY, a short, neutral identity paragraph. The elaborate voice spec (avoid lists, dialogue examples, symbol conventions) was never intended as the default for all users. Users who want a custom persona write their own SOUL.md.	2026-03-26 01:34:27 -07:00
Teknium	c511e087e0	fix(agent): always prefer streaming for API calls to prevent hung subagents (#3120 ) The non-streaming API call path (_interruptible_api_call) had no wall-clock timeout. When providers keep connections alive with SSE keep-alive pings but never deliver a response, httpx's inactivity timeout never fires and the call hangs indefinitely. Subagents always used the non-streaming path because they have no stream consumers (quiet_mode=True). This caused delegate_task to hang for 40+ minutes in production. The streaming path has two layers of protection: - httpx read timeout (60s, HERMES_STREAM_READ_TIMEOUT) - Stale stream detection (90s, HERMES_STREAM_STALE_TIMEOUT) Both work because streaming sends chunks continuously — a 90-second gap between chunks genuinely means the connection is broken, even for reasoning models that take minutes to complete. Now run_conversation() always prefers the streaming path. The streaming method falls back to non-streaming automatically if the provider doesn't support it. Stream delta callbacks are no-ops when no consumers are registered, so there's no overhead for subagents.	2026-03-26 01:22:31 -07:00
Teknium	08d3be0412	fix: graceful return on max retries instead of crashing thread run_conversation raised the raw exception after exhausting retries, which crashed the background thread in cli.py (unhandled exception in Thread). Now returns a proper error result dict with failed=True and persists the session, matching the pattern used by other error paths (invalid responses, empty content, etc.). Also wraps cli.py's run_agent thread function in try/except as a safety net against any future unhandled exceptions from run_conversation. Made-with: Cursor	2026-03-25 19:00:39 -07:00
Teknium	59575d6a91	fix(gateway): recover from hung agents — /stop force-unlocks session (#3104 ) When an agent thread hangs (truly blocked, never checks _interrupt_requested), /stop now force-cleans _running_agents to unlock the session immediately. Two changes: - Early /stop intercept in the running-agent guard: bypasses normal command dispatch to force-interrupt and unlock the session. Follows the same pattern as the existing /new intercept. - Sentinel /stop: force-cleans the sentinel instead of returning 'nothing to stop yet', so /stop during slow startup actually unlocks the session. Follow-up improvements over original PR: - Consolidated duplicate resolve_command imports into single early resolution - Updated _handle_stop_command to also force-clean for consistency - Removed 10-minute hard timeout on the executor (would kill legitimate long-running agent tasks; the /stop force-clean handles recovery) Cherry-picked from Mibayy's PR #2498. Co-authored-by: Mibayy <Mibayy@users.noreply.github.com>	2026-03-25 18:46:50 -07:00
Teknium	7258311710	fix: stop recursive AGENTS.md walk, load top-level only (#3110) The recursive os.walk for AGENTS.md in subdirectories was undesired. Only load AGENTS.md from the working directory root, matching the behavior of CLAUDE.md and .cursorrules.	2026-03-25 18:30:45 -07:00
Teknium	910ec7eb38	chore: remove unused Hermes-native PKCE OAuth flow (#3107 ) Remove run_hermes_oauth_login(), refresh_hermes_oauth_token(), read_hermes_oauth_credentials(), _save_hermes_oauth_credentials(), _generate_pkce(), and associated constants/credential file path. This code was added in `63e88326` but never wired into any user-facing flow (setup wizard, hermes model, or any CLI command). Neither clawdbot/OpenClaw nor opencode implement PKCE for Anthropic — both use setup-token or API keys. Dead code that was never tested in production. Also removes the credential resolution step that checked ~/.hermes/.anthropic_oauth.json (step 3 in resolve_anthropic_token), renumbering remaining steps.	2026-03-25 18:29:47 -07:00
Teknium	b374f52063	fix(session): clear compressor summary and turn counter on /clear and /new (#3102) reset_session_state() was missing two fields added after it was written: - _user_turn_count: kept accumulating across sessions, affecting flush_min_turns guard behavior - context_compressor._previous_summary: old session's compression summary leaked into new session's iterative compression Cherry-picked from PR #2640 by dusterbloom. Closes #2635.	2026-03-25 18:22:21 -07:00
Teknium	bd43a43f07	fix(cli): handle EOFError in sessions delete/prune confirmation prompts (#3101) sessions delete and prune call input() for confirmation without catching EOFError. When stdin isn't a TTY (piped input, CI/CD, cron), input() throws EOFError and the command crashes. Extract a _confirm_prompt() helper that handles EOFError and KeyboardInterrupt, defaulting to cancel. Both call sites now use it. Salvaged from PR #2622 by dieutx (improved from duplicated try/except to shared helper). Closes #2565.	2026-03-25 18:06:04 -07:00
ctlst	281100e2df	fix(agent): prevent AsyncOpenAI/httpx cross-loop deadlock in gateway mode (#2701 ) In gateway mode, async tools (vision_analyze, web_extract, session_search) deadlock because _run_async() spawns a thread with asyncio.run(), creating a new event loop, but _get_cached_client() returns an AsyncOpenAI client bound to a different loop. httpx.AsyncClient cannot work across event loop boundaries, causing await client.chat.completions.create() to hang forever. Fix: include the event loop identity in the async client cache key so each loop gets its own AsyncOpenAI instance. Also fix session_search_tool.py which had its own broken asyncio.run()-in-thread pattern — now uses the centralized _run_async() bridge.	2026-03-25 17:31:56 -07:00
Teknium	9783c9d5c1	refactor: remove /model slash command from CLI and gateway (#3080 ) The /model command is removed from both the interactive CLI and messenger gateway (Telegram/Discord/Slack/WhatsApp). Users can still change models via 'hermes model' CLI subcommand or by editing config.yaml directly. Removed: - CommandDef entry from COMMAND_REGISTRY - CLI process_command() handler and model autocomplete logic - Gateway _handle_model_command() and dispatch - SlashCommandCompleter model_completer_provider parameter - Two-stage Tab completion and ghost text for /model - All /model-specific tests Unaffected: - /provider command (read-only, shows current model + providers) - ACP adapter _cmd_model (separate system for VS Code/Zed/JetBrains) - model_switch.py module (used by ACP) - 'hermes model' CLI subcommand Author: Teknium	2026-03-25 17:03:05 -07:00
Teknium	0cfc1f88a3	fix: add MCP tool name collision protection (#3077 ) - Registry now warns when a tool name is overwritten by a different toolset (silent dict overwrite was the previous behavior) - MCP tool registration checks for collisions with non-MCP (built-in) tools before registering. If an MCP tool's prefixed name matches an existing built-in, the MCP tool is skipped and a warning is logged. MCP-to-MCP collisions are allowed (last server wins). - Both regular MCP tools and utility tools (resources/prompts) are guarded. - Adds 5 tests covering: registry overwrite warning, same-toolset re-registration silence, built-in collision skip, normal registration, and MCP-to-MCP collision pass-through. Reported by k_sze (KONG) — MiniMax MCP server's web_search tool could theoretically shadow Hermes's built-in web_search if prefixing failed.	2026-03-25 16:52:04 -07:00
Teknium	37cabc47d3	test(skills): add regression tests for null metadata frontmatter Covers the case where a SKILL.md has `metadata:` (null) or `metadata.hermes:` (null), which caused an AttributeError before the fix in `d218cf91`. Made-with: Cursor	2026-03-25 16:09:27 -07:00
Teknium	e0cfc089da	fix(gateway/slack): send progress messages to correct thread (#3063) Co-authored-by: Jneeee <jneeee@outlook.com>	2026-03-25 15:51:15 -07:00
Teknium	ab548a9b5e	fix(security): add SSRF protection to browser_navigate (#3058 ) * fix(security): add SSRF protection to browser_navigate browser_navigate() only checked the website blocklist policy but did not call is_safe_url() to block private/internal addresses. This allowed the agent to navigate to localhost, cloud metadata endpoints (169.254.169.254), and private network IPs via the browser. web_tools and vision_tools already had this check. Added the same is_safe_url() pre-flight validation before the blocklist check in browser_navigate(). * fix: move SSRF import to module level, fix policy test mock Move is_safe_url import to module level so it can be monkeypatched in tests. Update test_browser_navigate_returns_policy_block to mock _is_safe_url so the SSRF check passes and the policy check is reached. * fix(security): harden browser SSRF protection Follow-up to cherry-picked PR #3041: 1. Fail-closed fallback: if url_safety module can't import, block all URLs instead of allowing all. Security guards should never fail-open. 2. Post-redirect SSRF check: after navigation, verify the final URL isn't a private/internal address. If a public URL redirected to 169.254.169.254 or localhost, navigate to about:blank and return an error — prevents the model from reading internal content via subsequent browser_snapshot calls. --------- Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-25 15:16:57 -07:00
Teknium	861624d4e9	fix(cli): refresh TUI before background task output to prevent status bar overlap (#3048 ) When a background task (/bg command) prints its output while the main agent is processing with the thinking spinner visible, the status bar could render on the same row as the spinner, causing visual overlap. This fix adds an explicit app.invalidate() call with a brief pause before printing background task output, ensuring the TUI layout is in a consistent state before the output is written. Changes: - Add TUI refresh before success output in _handle_background_command - Add TUI refresh before error output in the exception handler - Add tests for the refresh behavior Closes #2718 Co-authored-by: Bartok9 <bartokmagic@proton.me>	2026-03-25 15:00:33 -07:00
Teknium	94e3d9adbf	fix(agent): restore safe non-streaming fallback after stream failures (#3020) After streaming retries are exhausted on transient errors, fall back to non-streaming instead of propagating the error. Also fall back for any other pre-delivery stream error (not just 'streaming not supported'). Added user-facing message when streaming is not supported by a model/provider, directing users to set display.streaming: false in config.yaml to avoid the fallback delay. Cherry-picked from PR #3008 by kshitijk4poor. Added UX message for streaming-not-supported detection. Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-25 12:46:04 -07:00
Siddharth Balyan	b6461903ff	feat: nix flake, uv2nix build, NixOS module, persistent container mode (#20) * feat: nix flake, uv2nix build, dev shell and home manager * fixed nix run, updated docs for setup * feat(nix): NixOS module with persistent container mode, managed guards, checks - Replace homeModules.nix with nixosModules.nix (two deployment modes) - Mode A (native): hardened systemd service with ProtectSystem=strict - Mode B (container): persistent Ubuntu container with /nix/store bind-mount, identity-hash-based recreation, GC root protection, symlink-based updates - Add HERMES_MANAGED guards blocking CLI config mutation (config set, setup, gateway install/uninstall) when running under NixOS module - Add nix/checks.nix with build-time verification (binary, CLI, managed guard) - Remove container.nix (no Nix-built OCI image; pulls ubuntu:24.04 at runtime) - Simplify packages.nix (drop fetchFromGitHub submodules, PYTHONPATH wrappers) - Rewrite docs/nixos-setup.md with full options reference, container architecture, secrets management, and troubleshooting guide Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * Update config.py * feat(nix): add CI workflow and enhanced build checks - GitHub Actions workflow for nix flake check + build on Linux/macOS - Entry point sync check to catch pyproject.toml drift - Expanded managed-guard check to cover config edit - Wrap hermes-acp binary in Nix package - Fix Path type mismatch in is_managed() * Update MCP server package name; bundled skills support * fix reading .env; instead have the container use a common mounted .env file * feat(nix): container entrypoint with privilege drop and sudo provisioning Container was running as non-root via --user, which broke apt/pip installs and caused crashes when $HOME didn't exist. Replace --user with a Nix-built entrypoint script that provisions the hermes user, sudo (NOPASSWD), and /home/hermes inside the container on first boot, then drops privileges via setpriv. 
Writable layer persists so setup only runs once. Also expands MCP server options to support HTTP transport and sampling. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> * fix group and user creation in container mode * feat(nix): persistent /home/hermes and MESSAGING_CWD in container mode Container mode now bind-mounts ${stateDir}/home to /home/hermes so the agent's home directory survives container recreation. Previously it lived in the writable layer and was lost on image/volume/options changes. Also passes MESSAGING_CWD to the container so the agent finds its workspace and documents, matching native mode behavior. Other changes: - Extract containerDataDir/containerHomeDir bindings (no more magic strings) - Fix entrypoint chown to run unconditionally (volume mounts always exist) - Add schema field to container identity hash for auto-recreation - Add idempotency test (Scenario G) to config-roundtrip check * docs: add Nix & NixOS setup guide to docs site Add comprehensive Nix documentation to the Docusaurus site at website/docs/getting-started/nix-setup.md, covering nix run/profile install, NixOS module (native + container modes), declarative settings, secrets management, MCP servers, managed mode, container architecture, dev shell, flake checks, and full options reference. - Register nix-setup in sidebar after installation page - Add Nix callout tip to installation.md linking to new guide - Add canonical version pointer in docs/nixos-setup.md * docs: remove docs/nixos-setup.md, consolidate into website docs Backfill missing details (restart/restartSec in full example, gateway.pid, 0750 permissions, docker inspect commands) into the canonical website/docs/getting-started/nix-setup.md and delete the old standalone file. * fix(nix): add compression.protect_last_n and target_ratio to config-keys.json New keys were added to DEFAULT_CONFIG on main, causing the config-drift check to fail in CI. 
* fix(nix): skip checks on aarch64-darwin (onnxruntime wheel missing) The full Python venv includes onnxruntime (via faster-whisper/STT) which lacks a compatible uv2nix wheel on aarch64-darwin. Gate all checks behind stdenv.hostPlatform.isLinux. The package and devShell still evaluate on macOS. * fix(nix): skip flake check and build on macOS CI onnxruntime (transitive dep via faster-whisper) lacks a compatible uv2nix wheel on aarch64-darwin. Run full checks and build on Linux only; macOS CI verifies the flake evaluates without building. * fix(nix): preserve container writable layer across nixos-rebuild The container identity hash included the entrypoint's Nix store path, which changes on every nixpkgs update (due to runtimeShell/stdenv input-addressing). This caused false-positive identity mismatches, triggering container recreation and losing the persistent writable layer. - Use stable symlink (current-entrypoint) like current-package already does - Remove entrypoint from identity hash (only image/volumes/options matter) - Add GC root for entrypoint so nix-collect-garbage doesn't break it - Remove global HERMES_HOME env var from addToSystemPackages (conflicted with interactive CLI use, service already sets its own) --------- Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-26 01:08:02 +05:30
Teknium	8f6ef042c1	fix(cli): buffer reasoning preview chunks and fix duplicate display (#3013 ) Three improvements to reasoning/thinking display in the CLI: 1. Buffer tiny reasoning chunks: providers like DeepSeek stream reasoning one word at a time, producing a separate [thinking] line per token. Add a buffer that coalesces chunks and flushes at natural boundaries (newlines, sentence endings, terminal width). 2. Fix duplicate reasoning display: centralize callback selection into _current_reasoning_callback() — one place instead of 4 scattered inline ternaries. Prevents both the streaming box AND the preview callback from firing simultaneously. 3. Fix post-response reasoning box guard: change the check from 'not self._stream_started' to 'not self._reasoning_stream_started' so the final reasoning box is only suppressed when reasoning was actually streamed live, not when any text was streamed. Cherry-picked from PR #2781 by juanfradb.	2026-03-25 12:16:39 -07:00
Teknium	099dfca6db	fix: GLM reasoning-only and max-length handling (#3010 ) - Add 'prompt exceeds max length' to context overflow detection for Z.AI/GLM 400 errors - Extract inline reasoning blocks from assistant content as fallback when no structured reasoning fields are present - Guard inline extraction so structured API reasoning takes priority - Update test for reasoning-only response salvage behavior Cherry-picked from PR #2993 by kshitijk4poor. Added priority guard to fix test_structured_reasoning_takes_priority failure. Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-25 12:05:37 -07:00
Teknium	650b400c98	fix(cron): mark session as ended after job completes (#2998) Cron was the only execution path that never called end_session(), leaving ended_at = NULL permanently. This made cron sessions invisible to hermes prune --older-than and indistinguishable from active sessions. Captures session_id in a local variable before agent construction so it's available in the finally block even if AIAgent() fails, then calls end_session(session_id, 'cron_complete') before close(). Cherry-picked from PR #2979 by ygd58. Fixed bug: original PR called end_session() with zero arguments (TypeError: method requires session_id and end_reason). Fixes #2972. Co-authored-by: ygd58 <ygd58@users.noreply.github.com>	2026-03-25 11:13:21 -07:00
Teknium	61949f0af7	Fix (#2997) Co-authored-by: Jack <jvand@DESKTOP-JACK.localdomain>	2026-03-25 11:12:11 -07:00
Teknium	fba73a60e3	fix(skills): use Git Trees API to prevent silent subdirectory loss during install (#2995 ) * fix(skills): use Git Trees API to prevent silent subdirectory loss during install Refactors _download_directory() to use the Git Trees API (single call for the entire repo tree) as the primary path, falling back to the recursive Contents API when the tree endpoint is unavailable or truncated. Prevents silent subdirectory loss caused by per-directory rate limiting or transient failures. Cherry-picked from PR #2981 by tugrulguner. Fixes #2940. * fix: simplify tree API — use branch name directly as tree-ish Eliminates an extra git/ref/heads API call by passing the branch name directly to git/trees/{branch}?recursive=1, matching the pattern already used by _find_skill_in_repo_tree. --------- Co-authored-by: tugrulguner <tugrulguner@users.noreply.github.com>	2026-03-25 10:48:18 -07:00
Teknium	b2a6b012fe	fix(api_server): streaming breaks when agent makes tool calls (#2985 ) * fix(run_agent): ensure _fire_first_delta() is called for tool generation events Added calls to _fire_first_delta() in the AIAgent class to improve the handling of tool generation events, ensuring timely notifications during the processing of function calls and tool usage. * fix(run_agent): improve timeout handling for chat completions Enhanced the timeout configuration for chat completions in the AIAgent class by introducing customizable connection, read, and write timeouts using environment variables. This ensures more robust handling of API requests during streaming operations. * fix(run_agent): reduce default stream read timeout for chat completions Updated the default stream read timeout from 120 seconds to 60 seconds in the AIAgent class, enhancing the timeout configuration for chat completions. This change aims to improve responsiveness during streaming operations. * fix(run_agent): enhance streaming error handling and retry logic Improved the error handling and retry mechanism for streaming requests in the AIAgent class. Introduced a configurable maximum number of stream retries and refined the handling of transient network errors, allowing for retries with fresh connections. Non-transient errors now trigger a fallback to non-streaming only when appropriate, ensuring better resilience during API interactions. * fix(api_server): streaming breaks when agent makes tool calls The agent fires stream_delta_callback(None) to signal the CLI display to close its response box before tool execution begins. The API server's _on_delta callback was forwarding this None directly into the SSE queue, where the SSE writer treats it as end-of-stream and terminates the HTTP response prematurely. 
After tool calls complete, the agent streams the final answer through the same callback, but the SSE response was already closed — so Open WebUI (and similar frontends) never received the actual answer. Fix: filter out None in _on_delta so the SSE stream stays open. The SSE loop already detects completion via agent_task.done(), which handles stream termination correctly without needing the None sentinel. Reported by Rohit Paul on X.	2026-03-25 09:56:20 -07:00
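The `None` filtering described in this commit can be sketched as follows. This is a minimal illustration, not the api_server code: `make_on_delta` is a hypothetical factory, and a plain `queue.Queue` stands in for the SSE queue.

```python
import queue


def make_on_delta(sse_queue):
    """Build a delta callback that never forwards the None sentinel.

    Hypothetical helper for illustration; the real callback lives in the
    API server and is not shown in the commit log.
    """
    def _on_delta(delta):
        # The agent sends None so the CLI can close its response box before
        # tool execution. Forwarding it would make an SSE writer that treats
        # None as end-of-stream close the HTTP response too early.
        if delta is None:
            return
        sse_queue.put(delta)

    return _on_delta


sse_queue = queue.Queue()
on_delta = make_on_delta(sse_queue)

# Text before the tool call, the CLI-only sentinel, then the final answer.
for chunk in ["Looking that up", None, "Final answer"]:
    on_delta(chunk)

received = []
while not sse_queue.empty():
    received.append(sse_queue.get())
```

With the sentinel dropped, both text chunks reach the queue and stream termination is left to a separate completion check.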
Teknium	42fec19151	feat: persist reasoning across gateway session turns (schema v6) (#2974) Tested against OpenAI Codex (direct), Anthropic (direct + OAI-compat), and OpenRouter → 6 backends. All reasoning field types (reasoning, reasoning_details, codex_reasoning_items) round-trip through the DB correctly.	2026-03-25 09:47:28 -07:00
Teknium	5dbe2d9d73	fix: skills-sh install fails for deeply nested repo structures (#2980 ) * fix(run_agent): ensure _fire_first_delta() is called for tool generation events Added calls to _fire_first_delta() in the AIAgent class to improve the handling of tool generation events, ensuring timely notifications during the processing of function calls and tool usage. * fix(run_agent): improve timeout handling for chat completions Enhanced the timeout configuration for chat completions in the AIAgent class by introducing customizable connection, read, and write timeouts using environment variables. This ensures more robust handling of API requests during streaming operations. * fix(run_agent): reduce default stream read timeout for chat completions Updated the default stream read timeout from 120 seconds to 60 seconds in the AIAgent class, enhancing the timeout configuration for chat completions. This change aims to improve responsiveness during streaming operations. * fix(run_agent): enhance streaming error handling and retry logic Improved the error handling and retry mechanism for streaming requests in the AIAgent class. Introduced a configurable maximum number of stream retries and refined the handling of transient network errors, allowing for retries with fresh connections. Non-transient errors now trigger a fallback to non-streaming only when appropriate, ensuring better resilience during API interactions. * fix: skills-sh install fails for deeply nested repo structures Skills in repos with deep directory nesting (e.g. cli-tool/components/skills/development/senior-backend/) could not be installed because the candidate path generation and shallow root-dir scan never reached them. Added GitHubSource._find_skill_in_repo_tree() which uses the GitHub Trees API to recursively search the entire repo tree in a single API call. This is used as a final fallback in SkillsShSource._discover_identifier() when the standard candidate paths and shallow scan both fail. 
Fixes installation of skills from repos like davila7/claude-code-templates where skills are nested 4+ levels deep. Reported by user Samuraixheart.	2026-03-25 09:31:05 -07:00
Teknium	e5691eed38	feat(gateway): configurable Telegram reply threading mode (#2907 ) Add reply_to_mode setting (off/first/all) to control whether Telegram replies quote/thread to the user's original message. - 'off': Never thread replies (no quote bubble) - 'first': Only first chunk threads to user's message (default, preserves existing behavior) - 'all': All chunks in multi-part replies thread to user's message Configurable via: - reply_to_mode in platform config (gateway config YAML) - TELEGRAM_REPLY_TO_MODE env var Based on PR #855 by raulvidis.	2026-03-24 19:56:00 -07:00
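The three `reply_to_mode` values can be read as a per-chunk decision. A minimal sketch, assuming a hypothetical `reply_target_for_chunk` helper and zero-based chunk indices (neither appears in the commit):

```python
def reply_target_for_chunk(mode, chunk_index, user_message_id):
    """Return the message id to thread a reply chunk to, or None for no threading.

    Hypothetical helper; mode is one of 'off', 'first', 'all'.
    """
    if mode == "all":
        return user_message_id
    if mode == "first" and chunk_index == 0:
        return user_message_id
    # 'off', or any later chunk in 'first' mode: send without a quote bubble.
    return None


# A three-part reply to user message 42 under each mode.
targets = {
    mode: [reply_target_for_chunk(mode, i, 42) for i in range(3)]
    for mode in ("off", "first", "all")
}
```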
Teknium	ab4ba8163a	feat(migration): comprehensive OpenClaw migration v2 — 17 new modules, terminal recap (#2906 ) * feat(migration): comprehensive OpenClaw -> Hermes migration v2 Extends the existing migration script from ~15% to ~95% coverage of OpenClaw's configuration surface. Adds 17 new migration modules: Direct migrations (written to config.yaml/.env): - MCP servers: full server definitions with transport, tools, sampling - Agent defaults: reasoning_effort, compression, human_delay, timezone - Session config: reset triggers (daily/idle) -> session_reset - Full model providers: custom_providers with base_url/api_mode - Deep channel config: Matrix, Mattermost, IRC, Discord deep settings - Browser config: timeout settings - Tools config: exec timeout -> terminal.timeout - Approvals: mode mapping (smart/manual/auto -> Hermes equivalents) Archived for manual review (no direct Hermes equivalent): - Plugins config + installed extensions - Cron jobs (with note to use 'hermes cron') - Hooks/webhooks config - Multi-agent list + routing bindings - Gateway config (port, auth, TLS) - Memory backend config (QMD, vector search) - Skills registry per-entry config - UI/identity settings - Logging/diagnostics preferences Also adds: - MIGRATION_NOTES.md generation with PM2 reassurance message - _set_env_var helper for consistent env file management - Updated presets to include all new options - Comprehensive mock test passing (12 migrated, 12 archived) * feat(migration): add terminal recap with visual summary Replaces raw JSON dump with a formatted box showing migrated/archived/ skipped/conflict/error counts, detailed item lists with labels, PM2 reassurance message, and actionable next steps. JSON output available via MIGRATION_JSON_OUTPUT=1 env var. * fix(test): allowlist python_os_environ as known false-positive in skills guard test MIGRATION_JSON_OUTPUT env var is a legitimate CLI feature flag that enables JSON output mode, not an env dump. 
Add it alongside agent_config_mod as an accepted finding in test_skill_installs_cleanly_under_skills_guard. * fix(test): add hermes_config_mod to known false-positives in skills guard test The scanner flags two print statements that tell the user to review ~/.hermes/config.yaml in the post-migration summary. The script never writes to that file — those are informational strings, not config mutations. --------- Co-authored-by: Hermes <hermes@nousresearch.ai>	2026-03-24 19:44:02 -07:00
Teknium	7ca22ea11b	fix(compression): restore sane defaults and cap summary at 12K tokens - threshold: 0.80 → 0.50 (compress at 50%, not 80%) - target_ratio: 0.40 → 0.20, now relative to threshold not total context (20% of 50% = 10% of context as tail budget) - summary ceiling: 32K → 12K (Gemini can't output more than ~12K) - Updated DEFAULT_CONFIG, config display, example config, and tests	2026-03-24 18:48:47 -07:00
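The "20% of 50% = 10% of context" arithmetic in this commit can be checked with a short sketch. The function name and return shape are assumptions for illustration; only the default values come from the commit message.

```python
def compression_budget(context_length, threshold=0.50, target_ratio=0.20):
    """Compute when compression fires and how large the tail budget is.

    target_ratio is applied to the threshold, not to the full context,
    as the commit message describes.
    """
    trigger_tokens = round(context_length * threshold)
    tail_budget = round(trigger_tokens * target_ratio)
    return {"trigger_tokens": trigger_tokens, "tail_budget": tail_budget}


budget = compression_budget(200_000)
# Tail budget as a share of the whole context window.
tail_share = budget["tail_budget"] / 200_000
```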
Teknium	9231a335d4	fix(compression): replace dead summary_target_tokens with ratio-based scaling (#2554 ) The summary_target_tokens parameter was accepted in the constructor, stored on the instance, and never used — the summary budget was always computed from hardcoded module constants (_SUMMARY_RATIO=0.20, _MAX_SUMMARY_TOKENS=8000). This caused two compounding problems: 1. The config value was silently ignored, giving users no control over post-compression size. 2. Fixed budgets (20K tail, 8K summary cap) didn't scale with context window size. Switching from a 1M-context model to a 200K model would trigger compression that nuked 350K tokens of conversation history down to ~30K. Changes: - Replace summary_target_tokens with summary_target_ratio (default 0.40) which sets the post-compression target as a fraction of context_length. Tail token budget and summary cap now scale proportionally: MiniMax 200K → ~80K post-compression GPT-5 1M → ~400K post-compression - Change threshold_percent default: 0.50 → 0.80 (don't fire until 80% of context is consumed) - Change protect_last_n default: 4 → 20 (preserve ~10 full turns) - Summary token cap scales to 5% of context (was fixed 8K), capped at 32K ceiling - Read target_ratio and protect_last_n from config.yaml compression section (both are now configurable) - Remove hardcoded summary_target_tokens=500 from run_agent.py - Add 5 new tests for ratio scaling, clamping, and new defaults	2026-03-24 17:45:49 -07:00
Teknium	618f15dda9	fix: reorder setup wizard providers — OpenRouter first Move OpenRouter to position 1 in the setup wizard's provider list to match hermes model ordering. Update default selection index and fix test expectations for the new ordering. Setup order: OpenRouter → Nous Portal → Codex → Custom → ...	2026-03-24 12:50:24 -07:00
Teknium	745859babb	feat: env var passthrough for skills and user config (#2807 ) * feat: env var passthrough for skills and user config Skills that declare required_environment_variables now have those vars passed through to sandboxed execution environments (execute_code and terminal). Previously, execute_code stripped all vars containing KEY, TOKEN, SECRET, etc. and the terminal blocklist removed Hermes infrastructure vars — both blocked skill-declared env vars. Two passthrough sources: 1. Skill-scoped (automatic): when a skill is loaded via skill_view and declares required_environment_variables, vars that are present in the environment are registered in a session-scoped passthrough set. 2. Config-based (manual): terminal.env_passthrough in config.yaml lets users explicitly allowlist vars for non-skill use cases. Changes: - New module: tools/env_passthrough.py — shared passthrough registry - hermes_cli/config.py: add terminal.env_passthrough to DEFAULT_CONFIG - tools/skills_tool.py: register available skill env vars on load - tools/code_execution_tool.py: check passthrough before filtering - tools/environments/local.py: check passthrough in _sanitize_subprocess_env and _make_run_env - 19 new tests covering all layers * docs: add environment variable passthrough documentation Document the env var passthrough feature across four docs pages: - security.md: new 'Environment Variable Passthrough' section with full explanation, comparison table, and security considerations - code-execution.md: update security section, add passthrough subsection, fix comparison table - creating-skills.md: add tip about automatic sandbox passthrough - skills.md: add note about passthrough after secure setup docs Live-tested: launched interactive CLI, loaded a skill with required_environment_variables, verified TEST_SKILL_SECRET_KEY was accessible inside execute_code sandbox (value: passthrough-test-value-42).	2026-03-24 08:19:34 -07:00
Teknium	ad1bf16f28	chore: remove all remaining mini-swe-agent references Complete cleanup after dropping the mini-swe-agent submodule (PR #2804): - Remove MSWEA_SILENT_STARTUP and MSWEA_GLOBAL_CONFIG_DIR env var settings from cli.py, run_agent.py, hermes_cli/main.py, doctor.py - Remove mini-swe-agent health check from hermes doctor - Remove 'minisweagent' from logger suppression lists - Remove litellm/typer/platformdirs from requirements.txt - Remove mini-swe-agent install steps from install.ps1 (Windows) - Remove mini-swe-agent install steps from website docs - Update all stale comments/docstrings referencing mini-swe-agent in terminal_tool.py, tools/__init__.py, code_execution_tool.py, environments/README.md, environments/agent_loop.py - Remove mini_swe_runner from pyproject.toml py-modules (still exists as standalone script for RL training use) - Shrink test_minisweagent_path.py to empty stub The orphaned mini-swe-agent/ directory on disk needs manual removal: rm -rf mini-swe-agent/	2026-03-24 08:19:23 -07:00
Teknium	02b38b93cb	refactor: remove mini-swe-agent dependency — inline Docker/Modal backends (#2804 ) Drop the mini-swe-agent git submodule. All terminal backends now use hermes-agent's own environment implementations directly. Docker backend: - Inline the `docker run -d` container startup (was 15 lines in minisweagent's DockerEnvironment). Our wrapper already handled execute(), cleanup(), security hardening, volumes, and resource limits. Modal backend: - Import swe-rex's ModalDeployment directly instead of going through minisweagent's 90-line passthrough wrapper. - Bake the _AsyncWorker pattern (from environments/patches.py) directly into ModalEnvironment for Atropos compatibility without monkey-patching. Cleanup: - Remove minisweagent_path.py (submodule path resolution helper) - Remove submodule init/install from install.sh and setup-hermes.sh - Remove mini-swe-agent from .gitmodules - environments/patches.py is now a no-op (kept for backward compat) - terminal_tool.py no longer does sys.path hacking for minisweagent - mini_swe_runner.py guards imports (optional, for RL training only) - Update all affected tests to mock the new direct subprocess calls - Update README.md, CONTRIBUTING.md No functionality change — all Docker, Modal, local, SSH, Singularity, and Daytona backends behave identically. 6093 tests pass.	2026-03-24 07:30:25 -07:00
Teknium	ce39f9cc44	fix(gateway): detect virtualenv path instead of hardcoding venv/ (#2797 ) Fixes #2492. `generate_systemd_unit()` and `get_python_path()` hardcoded `venv` as the virtualenv directory name. When the virtualenv is `.venv` (which `setup-hermes.sh` and `.gitignore` both reference), the generated systemd unit had incorrect VIRTUAL_ENV and PATH variables. Introduce `_detect_venv_dir()` which: 1. Checks `sys.prefix` vs `sys.base_prefix` to detect the active venv 2. Falls back to probing `.venv` then `venv` under PROJECT_ROOT Both `get_python_path()` and `generate_systemd_unit()` now use this detection instead of hardcoded paths. Co-authored-by: Hermes <hermes@nousresearch.ai>	2026-03-24 07:05:57 -07:00
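The two-step detection described for `_detect_venv_dir()` might look roughly like this. It is a sketch under assumptions: the function takes the project root and the two prefixes as parameters so it can be exercised without an active virtualenv, which the real helper may not do.

```python
import sys
from pathlib import Path


def detect_venv_dir(project_root, prefix=None, base_prefix=None):
    """Return the virtualenv directory, or None if none is found."""
    prefix = sys.prefix if prefix is None else prefix
    base_prefix = sys.base_prefix if base_prefix is None else base_prefix

    # Step 1: inside an active venv, sys.prefix differs from sys.base_prefix.
    if prefix != base_prefix:
        return Path(prefix)

    # Step 2: otherwise probe the conventional names, .venv before venv.
    for name in (".venv", "venv"):
        candidate = Path(project_root) / name
        if candidate.is_dir():
            return candidate
    return None
```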
Teknium	b641ee88f4	feat(model): /model command overhaul — Phases 2, 3, 5 * feat(model): persist base_url on /model switch, auto-detect for bare /model custom Phase 2+3 of the /model command overhaul: Phase 2 — Persist base_url on model switch: - CLI: save model.base_url when switching to a non-OpenRouter endpoint; clear it when switching away from custom to prevent stale URLs leaking into the new provider's resolution - Gateway: same logic using direct YAML write Phase 3 — Better feedback and edge cases: - Bare '/model custom' now auto-detects the model from the endpoint using _auto_detect_local_model() and saves all three config values (model, provider, base_url) atomically - Shows endpoint URL in success messages when switching to/from custom providers (both CLI and gateway) - Clear error messages when no custom endpoint is configured - Updated test assertions for the additional save_config_value call Fixes #2562 (Phase 2+3) * feat(model): support custom:name:model triple syntax for named custom providers Phase 5 of the /model command overhaul. Extends parse_model_input() to handle the triple syntax: /model custom:local-server:qwen → provider='custom:local-server', model='qwen' /model custom:my-model → provider='custom', model='my-model' (unchanged) The 'custom:local-server' provider string is already supported by _get_named_custom_provider() in runtime_provider.py, which matches it against the custom_providers list in config.yaml. This just wires the parsing so users can do it from the /model slash command. Added 4 tests covering single, triple, whitespace, and empty model cases.	2026-03-24 06:58:04 -07:00
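The `custom:name:model` parsing described in Phase 5 can be sketched as below. `parse_custom_model` is a hypothetical stand-in for the `custom` branch of `parse_model_input()`; returning an empty model for a bare `custom` is an assumption, since the commit does not spell out that case.

```python
def parse_custom_model(text):
    """Split '/model' input for the custom provider into (provider, model).

    Returns None when the input does not target the custom provider.
    """
    # Split at most twice so a model name may itself contain ':'.
    parts = [part.strip() for part in text.strip().split(":", 2)]
    if parts[0] != "custom":
        return None
    if len(parts) == 3:
        # custom:<provider-name>:<model> selects a named custom provider.
        return f"custom:{parts[1]}", parts[2]
    if len(parts) == 2:
        # custom:<model> keeps the plain 'custom' provider.
        return "custom", parts[1]
    # Bare 'custom': no model given (assumed behavior for this sketch).
    return "custom", ""
```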
Teknium	2f1c4fb01f	fix(auth): preserve 'custom' provider instead of silently remapping to 'openrouter' resolve_provider('custom') was silently returning 'openrouter', causing users who set provider: custom in config.yaml to unknowingly route through OpenRouter instead of their local/custom endpoint. The display showed 'via openrouter' even when the user explicitly chose custom. Changes: - auth.py: Split the conditional so 'custom' returns 'custom' as-is - runtime_provider.py: _resolve_named_custom_runtime now returns provider='custom' instead of 'openrouter' - runtime_provider.py: _resolve_openrouter_runtime returns provider='custom' when that was explicitly requested - Add 'no-key-required' placeholder for keyless local servers - Update existing test + add 5 new tests covering the fix Fixes #2562	2026-03-24 06:41:11 -07:00
Teknium	1345e93393	fix: add macOS Homebrew paths to browser and terminal PATH resolution On macOS with Homebrew (Apple Silicon), Node.js and agent-browser binaries live under /opt/homebrew/bin/ which is not included in the _SANE_PATH fallback used by browser_tool.py and environments/local.py. When Hermes runs with a filtered PATH (e.g. as a systemd service), these binaries are invisible, causing 'env: node: No such file or directory' errors when using browser tools. Changes: - Add /opt/homebrew/bin and /opt/homebrew/sbin to _SANE_PATH in both browser_tool.py and environments/local.py - Add _discover_homebrew_node_dirs() to find versioned Node installs (e.g. brew install node@24) that aren't linked into /opt/homebrew/bin - Extend _find_agent_browser() to search Homebrew and Hermes-managed dirs when agent-browser isn't on the current PATH - Include discovered Homebrew node dirs in subprocess PATH when launching agent-browser - Add 11 new tests covering all Homebrew path discovery logic	2026-03-23 22:45:55 -07:00
Teknium	48b5bc6038	fix(gateway): prevent stale memory overwrites by flush agent (#2670 ) The gateway memory flush agent reviews old conversation history on session reset/expiry and writes to memory. It had no awareness of memory changes made after that conversation ended (by the live agent, cron jobs, or other sessions), causing silent overwrites of newer entries. Two fixes: 1. Skip memory flush entirely for cron sessions (session IDs starting with 'cron_'). Cron sessions are headless with no meaningful user conversation to extract memories from. 2. Inject the current live memory state (MEMORY.md + USER.md) directly into the flush prompt. The flush agent can now see what's already saved and make informed decisions — only adding genuinely new information rather than blindly overwriting entries that may have been updated since the conversation ended. Addresses the root cause identified in #2670: the flush agent was making memory decisions blind to the current state of memory, causing stale context to overwrite newer entries on gateway restarts and session resets. Co-authored-by: devorun <devorun@users.noreply.github.com> Co-authored-by: dlkakbs <dlkakbs@users.noreply.github.com>	2026-03-23 16:08:38 -07:00
Teknium	4ff73fb32c	feat(config): support ${ENV_VAR} substitution in config.yaml (#2684 ) * feat(config): support ${ENV_VAR} substitution in config.yaml * fix: extend env var expansion to CLI and gateway config loaders The original PR (#2680) only wired _expand_env_vars into load_config(), which is used by 'hermes tools' and 'hermes setup'. The two primary config paths were missed: - load_cli_config() in cli.py (interactive CLI) - Module-level _cfg in gateway/run.py (gateway — bridges api_keys to env vars) Also: - Remove redundant 'import re' (already imported at module level) - Add missing blank lines between top-level functions (PEP 8) - Add tests for load_cli_config() expansion --------- Co-authored-by: teyrebaz33 <hakanerten02@hotmail.com>	2026-03-23 16:02:06 -07:00
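A `${ENV_VAR}` expansion pass over a loaded config can be sketched as follows. This is not the project's `_expand_env_vars`: the function below is a hypothetical equivalent, and leaving references to unset variables untouched is an assumption of the sketch.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_ENV_REF = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env_vars(value, env=None):
    """Recursively replace ${NAME} references in strings, dicts, and lists."""
    env = os.environ if env is None else env
    if isinstance(value, str):
        # Unset variables keep their literal ${NAME} text (assumption).
        return _ENV_REF.sub(lambda m: env.get(m.group(1), m.group(0)), value)
    if isinstance(value, dict):
        return {key: expand_env_vars(item, env) for key, item in value.items()}
    if isinstance(value, list):
        return [expand_env_vars(item, env) for item in value]
    return value


config = {
    "api_keys": {"openrouter": "${OPENROUTER_KEY}"},
    "urls": ["http://${HOST}/v1"],
    "port": 8080,
}
expanded = expand_env_vars(config, env={"OPENROUTER_KEY": "sk-test"})
```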
Teknium	0791efe2c3	fix(security): add SSRF protection to vision_tools and web_tools (hardened) * fix(security): add SSRF protection to vision_tools and web_tools Both vision_analyze and web_extract/web_crawl accept arbitrary URLs without checking if they target private/internal network addresses. A prompt-injected or malicious skill could use this to access cloud metadata endpoints (169.254.169.254), localhost services, or private network hosts. Adds a shared url_safety.is_safe_url() that resolves hostnames and blocks private, loopback, link-local, and reserved IP ranges. Also blocks known internal hostnames (metadata.google.internal). Integrated at the URL validation layer in vision_tools and before each website_policy check in web_tools (extract, crawl). * test(vision): update localhost test to reflect SSRF protection The existing test_valid_url_with_port asserted localhost URLs pass validation. With SSRF protection, localhost is now correctly blocked. Update the test to verify the block, and add a separate test for valid URLs with ports using a public hostname. * fix(security): harden SSRF protection — fail-closed, CGNAT, multicast, redirect guard Follow-up hardening on top of dieutx's SSRF protection (PR #2630): - Change fail-open to fail-closed: DNS errors and unexpected exceptions now block the request instead of allowing it (OWASP best practice) - Block CGNAT range (100.64.0.0/10): Python's ipaddress.is_private does NOT cover this range (returns False for both is_private and is_global). Used by Tailscale/WireGuard and carrier infrastructure. 
- Add is_multicast and is_unspecified checks: multicast (224.0.0.0/4) and unspecified (0.0.0.0) addresses were not caught by the original four-check chain - Add redirect guard for vision_tools: httpx event hook re-validates each redirect target against SSRF checks, preventing the classic redirect-based SSRF bypass (302 to internal IP) - Move SSRF filtering before backend dispatch in web_extract: now covers Parallel and Tavily backends, not just Firecrawl - Extract _is_blocked_ip() helper for cleaner IP range checking - Add 24 new tests (CGNAT, multicast, IPv4-mapped IPv6, fail-closed behavior, parametrized blocked/allowed IP lists) - Fix existing tests to mock DNS resolution for test hostnames --------- Co-authored-by: dieutx <dangtc94@gmail.com>	2026-03-23 15:40:42 -07:00
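The IP range checks listed in this commit can be approximated with the standard `ipaddress` module. This is a sketch only: `is_blocked_ip` is a hypothetical name modeled on `_is_blocked_ip()`, and it omits DNS resolution and the redirect guard.

```python
import ipaddress

# Carrier-grade NAT range, which is_private does not cover.
_CGNAT = ipaddress.ip_network("100.64.0.0/10")


def is_blocked_ip(ip_text):
    """Return True if the address must not be fetched (fail-closed)."""
    try:
        ip = ipaddress.ip_address(ip_text)
    except ValueError:
        # Fail closed: anything that cannot be parsed is blocked.
        return True

    # Unwrap IPv4-mapped IPv6 (::ffff:a.b.c.d) so the IPv4 checks apply.
    mapped = getattr(ip, "ipv4_mapped", None)
    if mapped is not None:
        ip = mapped

    return (
        ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
        or (ip.version == 4 and ip in _CGNAT)
    )
```

A caller would resolve the hostname first and block the request if any resolved address is blocked or if resolution fails.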
Teknium	934fbe3c06	fix: strip ANSI at the source — clean terminal output before it reaches the model Root cause: terminal_tool, execute_code, and process_registry returned raw subprocess output with ANSI escape sequences intact. The model saw these in tool results and copied them into file writes. Previous fix (PR #2532) stripped ANSI at the write point in file_tools.py, but this was a band-aid — regex on file content risks corrupting legitimate content, and doesn't prevent ANSI from wasting tokens in the model context. Source-level fix: - New tools/ansi_strip.py with comprehensive ECMA-48 regex covering CSI (incl. private-mode, colon-separated, intermediate bytes), OSC (both terminators), DCS/SOS/PM/APC strings, Fp/Fe/Fs/nF escapes, 8-bit C1 - terminal_tool.py: strip output before returning to model - code_execution_tool.py: strip stdout/stderr before returning - process_registry.py: strip output in poll/read_log/wait - file_tools.py: remove _strip_ansi band-aid (no longer needed) Verified: `ls --color=always` output returned as clean text to model, file written from that output contains zero ESC bytes.	2026-03-23 07:43:12 -07:00
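Stripping escape sequences at the source can be illustrated with a much smaller regex than the one the commit describes. The sketch below covers only CSI, OSC, and two-byte Fe escapes; it is a simplified assumption, not the contents of `tools/ansi_strip.py`.

```python
import re

_ANSI = re.compile(
    r"\x1b\[[0-?]*[ -/]*[@-~]"               # CSI: parameters, intermediates, final byte
    r"|\x1b\][^\x07\x1b]*(?:\x07|\x1b\\)"    # OSC, terminated by BEL or ESC backslash
    r"|\x1b[@-Z\\-_]"                        # remaining two-byte Fe escapes
)


def strip_ansi(text):
    """Remove common ANSI escape sequences from subprocess output."""
    return _ANSI.sub("", text)


# Colored `ls`-style output and a terminal-title OSC sequence.
colored = strip_ansi("\x1b[01;34mdir\x1b[0m file.txt")
titled = strip_ansi("\x1b]0;title\x07ok")
```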
Teknium	868b3c07e3	fix: platform default toolsets silently override tool deselection in hermes tools (#2624 ) Cherry-picked from PR #2576 by ereid7, plus read-side fix from `173a5c62`. Both fixes were originally landed in `173a5c62` but were inadvertently reverted by commit `34be3f8b` (a squash-merge that bundled unrelated tools_config.py changes). Save side (_save_platform_tools): exclude platform default toolset names (hermes-cli, hermes-telegram) from preserved entries so they don't silently re-enable everything. Read side (_get_platform_tools): when the saved list contains explicit configurable keys, use direct membership instead of subset inference. The subset approach is broken when composite toolsets like hermes-cli resolve to ALL tools.	2026-03-23 07:06:51 -07:00
Teknium	7da0822456	fix(approval): honor bare YAML approvals.mode: off (#2620) Cherry-picked from PR #2563 by tumf. YAML 1.1 parses unquoted 'off' as boolean False. Added _normalize_approval_mode() to map False -> 'off', True -> 'manual', and normalize string values. Includes regression tests.	2026-03-23 06:56:09 -07:00
Teknium	d35df0db71	fix(discord): ignore system messages in on_message handler (#2618) Cherry-picked from PR #2575 by ticketclosed-wontfix. Filters out Discord system messages (thread renames, pins, member joins, boosts) that were being treated as regular user messages. Follow-up fix: also allow MessageType.reply (value 19) — the original filter only allowed MessageType.default, which would silently drop all reply-based interactions. Added pytest.importorskip for discord dependency in tests.	2026-03-23 06:50:09 -07:00
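The boolean coercion handled by `_normalize_approval_mode()` can be sketched as below. The mapping for `False`, `True`, and strings follows the commit message; the fallback for any other type is an assumption of this sketch.

```python
def normalize_approval_mode(value, default="manual"):
    """Map a YAML-loaded approvals.mode value to a mode string."""
    # A YAML 1.1 loader turns bare `off` into False and bare `on` into True.
    if value is False:
        return "off"
    if value is True:
        return "manual"
    if isinstance(value, str):
        return value.strip().lower()
    # Missing or unexpected types fall back to a default (assumption).
    return default


modes = [normalize_approval_mode(v) for v in (False, True, " Smart ", None)]
```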
Teknium	93dc5dee6f	fix: prevent agents from starting gateway outside systemd management (#2617 ) An agent session killed the systemd-managed gateway (PID 1605) and restarted it with '&disown', taking it outside systemd's Restart= management. When the orphaned process later received SIGTERM, nothing restarted it. Add dangerous command patterns to detect: - 'gateway run' with & (background), disown, nohup, or setsid - These should use 'systemctl --user restart hermes-gateway' instead Also applied directly to main repo and fixed the systemd service: - Changed Restart=on-failure to Restart=always (clean SIGTERM = exit 0 = not a 'failure', so on-failure never triggered) - RestartSec=10 for reasonable restart delay	2026-03-23 06:45:17 -07:00
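A dangerous-command pattern of the kind this commit adds might be sketched as follows. The regex and the `is_unmanaged_gateway_start` name are illustrative assumptions; the actual patterns are not shown in the log and may differ.

```python
import re

# 'gateway run' launched via nohup/setsid, or followed by a single '&'
# (backgrounding, not '&&') or by 'disown'.
_UNMANAGED_START = re.compile(
    r"\b(?:nohup|setsid)\b.*\bgateway\s+run\b"
    r"|\bgateway\s+run\b.*(?:(?<!&)&(?!&)|\bdisown\b)"
)


def is_unmanaged_gateway_start(command):
    """Return True if the command would detach the gateway from systemd."""
    return _UNMANAGED_START.search(command) is not None
```

Commands that match would be flagged so the agent is steered toward `systemctl --user restart hermes-gateway` instead.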
Guts	2d8fad8230	fix(context): restrict @ references to safe workspace paths (#2601) Block @ references from reading secrets outside the workspace. Defaults allowed_root to cwd and adds a sensitive-file blocklist.	2026-03-23 06:40:05 -07:00
Mibay	ca2958ff98	fix: normalize repeat<=0 to None to prevent cron jobs deleting after first run (#2612) Cron jobs were deleted after their first run when the LLM passed repeat=-1; repeat<=0 is now normalized to None.	2026-03-23 06:35:43 -07:00
Teknium	f60ebc7bf2	fix: move activated skills line below welcome text Previously 'Activated skills: xxx' was printed above the banner in show_banner(). Now it prints directly after the 'Welcome to Hermes Agent!' line in run(), which is a more natural placement.	2026-03-23 06:20:19 -07:00
Teknium	b072737193	fix: expand tilde (~) in vision_analyze local file paths (#2585) Path('~/.hermes/image.png').is_file() returns False because Path doesn't expand tilde. This caused the tool to fall through to URL validation, which also failed, producing a confusing error: 'Invalid image source. Provide an HTTP/HTTPS URL or a valid local file path.' Fix: use os.path.expanduser() before constructing the Path object. Added two tests for tilde expansion (success and nonexistent file).	2026-03-22 23:48:32 -07:00
Teknium	3b509da571	feat: auto-reconnect failed gateway platforms with exponential backoff (#2584 ) When a messaging platform fails to connect at startup (e.g. transient DNS failure) or disconnects at runtime with a retryable error, the gateway now queues it for background reconnection instead of giving up permanently. - New _platform_reconnect_watcher background task runs alongside the existing session expiry watcher - Exponential backoff: 30s, 60s, 120s, 240s, 300s cap - Max 20 retry attempts before giving up on a platform - Non-retryable errors (bad auth token, etc.) are not retried - Runtime disconnections via _handle_adapter_fatal_error now queue retryable failures instead of triggering gateway shutdown - On successful reconnect, adapter is wired up and channel directory is rebuilt automatically Fixes the case where a DNS blip during gateway startup caused Telegram and Discord to be permanently unavailable until manual restart.	2026-03-22 23:48:24 -07:00
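The backoff schedule in this commit (30s, 60s, 120s, 240s, then a 300s cap) follows from doubling with a ceiling. A minimal sketch, with `reconnect_delay` and `MAX_ATTEMPTS` as hypothetical names:

```python
MAX_ATTEMPTS = 20  # retries before giving up on a platform, per the commit


def reconnect_delay(attempt, base=30, cap=300):
    """Seconds to wait before the given 1-based reconnect attempt."""
    return min(base * 2 ** (attempt - 1), cap)


schedule = [reconnect_delay(n) for n in range(1, 7)]
```

The fifth attempt would be 480s uncapped, so every attempt from the fifth onward waits the 300s cap.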
Teknium	b799bca7a3	refactor(gateway): remove broken 1.4x hygiene multiplier entirely The previous commit capped the 1.4x at 95% of context, but the multiplier itself is unnecessary and confusing: 85% threshold × 1.4 = 119% of context → never fires 95% warn × 1.4 = 133% of context → never warns The 85% hygiene threshold already provides ample headroom over the agent's own 50% compressor. Even if rough estimates overestimate by 50%, hygiene would fire at ~57% actual usage — safe and harmless. Remove the multiplier entirely. Both actual and estimated token paths now use the same 85% / 95% thresholds. Update tests and comments.	2026-03-22 15:21:18 -07:00
Teknium	b2b4a9ee7d	fix(gateway): hygiene compression ignores config context_length and 1.4x exceeds model limit Three bugs in gateway session hygiene pre-compression caused 'Session too large' errors for ~200K context models like GLM-5-turbo on z.ai: 1. Gateway hygiene called get_model_context_length(model) without passing config_context_length, provider, or base_url — so user overrides like model.context_length: 180000 were ignored, and provider-aware detection (models.dev, z.ai endpoint) couldn't fire. The agent's own compressor correctly passed all three (run_agent.py line 1038). 2. The 1.4x safety factor on rough token estimates pushed the compression threshold above the model's actual context limit: 200K * 0.85 * 1.4 = 238K > 200K (model limit) So hygiene never compressed, sessions grew past the limit, and the API rejected the request. 3. Same issue for the warn threshold: 200K * 0.95 * 1.4 = 266K. Fix: - Read model.context_length, provider, and base_url from config.yaml (same as run_agent.py does) and pass them to get_model_context_length() - Resolve provider/base_url from runtime when not in config - Cap the 1.4x-adjusted compress threshold at 95% of context_length - Cap the 1.4x-adjusted warn threshold at context_length Affects: z.ai GLM-5/GLM-5-turbo, any ~200K or smaller context model where the 1.4x factor would push 85% above 100%. Ref: Discord report from Ddox — glm-5-turbo on z.ai coding plan	2026-03-22 15:15:37 -07:00
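The threshold arithmetic behind bugs 2 and 3 can be reproduced directly. The function names below are illustrative assumptions; the 0.85 and 0.95 fractions, the 1.4x factor, and the caps come from the commit message.

```python
def hygiene_thresholds(context_length, multiplier=1.4):
    """Uncapped (compress, warn) thresholds with the rough-estimate factor."""
    compress = round(context_length * 0.85 * multiplier)
    warn = round(context_length * 0.95 * multiplier)
    return compress, warn


def capped_thresholds(context_length, multiplier=1.4):
    """Thresholds after this commit's caps: 95% for compress, 100% for warn."""
    compress, warn = hygiene_thresholds(context_length, multiplier)
    return min(compress, round(context_length * 0.95)), min(warn, context_length)


uncapped = hygiene_thresholds(200_000)
capped = capped_thresholds(200_000)
```

For a 200K model the uncapped thresholds both exceed the context limit, so hygiene could never fire before the API rejected the request; the caps bring them back inside the window.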
Teknium	ed805f57ff	fix(mcp-oauth): port mismatch, path traversal, and shared handler state (salvage #2521 ) (#2552 ) * fix(mcp-oauth): port mismatch, path traversal, and shared state in OAuth flow Three bugs in the new MCP OAuth 2.1 PKCE implementation: 1. CRITICAL: OAuth redirect port mismatch — build_oauth_auth() calls _find_free_port() to register the redirect_uri, but _wait_for_callback() calls _find_free_port() again getting a DIFFERENT port. Browser redirects to port A, server listens on port B — callback never arrives, 120s timeout. Fix: share the port via module-level _oauth_port variable. 2. MEDIUM: Path traversal via unsanitized server_name — HermesTokenStorage uses server_name directly in filenames. A name like "../../.ssh/config" writes token files outside ~/.hermes/mcp-tokens/. Fix: sanitize server_name with the same regex pattern used elsewhere. 3. MEDIUM: Class-level auth_code/state on _CallbackHandler causes data races if concurrent OAuth flows run. Second callback overwrites first. Fix: factory function _make_callback_handler() returns a handler class with a closure-scoped result dict, isolating each flow. * test: add tests for MCP OAuth path traversal, handler isolation, and port sharing 7 new tests covering: - Path traversal blocked (../../.ssh/config stays in mcp-tokens/) - Dots/slashes sanitized and resolved within base dir - Normal server names preserved - Special characters sanitized (@, :, /) - Concurrent handler result dicts are independent - Handler writes to its own result dict, not class-level - build_oauth_auth stores port in module-level _oauth_port --------- Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-22 15:02:26 -07:00
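Fixes 2 and 3 above can be sketched together. The sanitizing regex, `token_path`, and `make_result_holder` are assumptions for illustration: the commit says only that the "same regex pattern used elsewhere" is applied and that the handler factory closes over a per-flow result dict.

```python
import re
from pathlib import Path


def token_path(base_dir, server_name):
    """Build a token file path that cannot escape base_dir."""
    # Replace anything outside a conservative allowlist (assumed pattern).
    safe = re.sub(r"[^A-Za-z0-9_-]", "_", server_name)
    base = Path(base_dir).resolve()
    path = (base / f"{safe}.json").resolve()
    # Defense in depth: the resolved path must sit directly under base.
    if path.parent != base:
        raise ValueError(f"unsafe server name: {server_name!r}")
    return path


def make_result_holder():
    """Return (result, record) where record writes only to this flow's dict."""
    result = {"auth_code": None, "state": None}

    def record(auth_code, state):
        # Closure-scoped state: concurrent flows never share this dict.
        result["auth_code"] = auth_code
        result["state"] = state

    return result, record
```

Each OAuth flow would call the factory once, so a second callback cannot overwrite the first flow's code and state.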
Teknium	cd2280d1a3	feat(gateway): notify users when session auto-resets (#2519 ) When a session expires (daily schedule or idle timeout) and is automatically reset, send a notification to the user explaining what happened: ◐ Session automatically reset (inactive for 24h). Conversation history cleared. Use /resume to browse and restore a previous session. Adjust reset timing in config.yaml under session_reset. Notifications are suppressed when: - The expired session had no activity (no tokens used) - The platform is excluded (api_server, webhook by default) - notify: false in config Changes: - session.py: _should_reset() returns reason string ('idle'/'daily') instead of bool; SessionEntry gains auto_reset_reason and reset_had_activity fields; old entry's total_tokens checked - config.py: SessionResetPolicy gains notify (bool, default: true) and notify_exclude_platforms (default: api_server, webhook) - run.py: sends notification via adapter.send() before processing the user's message, with activity + platform checks - 13 new tests Config (config.yaml): session_reset: notify: true notify_exclude_platforms: [api_server, webhook]	2026-03-22 09:33:39 -07:00
Teknium	afe2f0abe1	feat(discord): add document caching and text-file injection (#2503 ) - Download and cache .pdf, .docx, .xlsx, .pptx attachments locally instead of passing expiring CDN URLs to the agent - Inject .txt and .md content (≤100 KB) into event.text so the agent sees file content without needing to fetch the URL - Add 20 MB size guard and SUPPORTED_DOCUMENT_TYPES allowlist - Fix: unsupported types (.zip etc.) no longer get MessageType.DOCUMENT - Add 9 unit tests in test_discord_document_handling.py Mirrors the Slack implementation from PR #784. Discord CDN URLs are publicly accessible so no auth header is needed (unlike Slack). Co-authored-by: Dilee <uzmpsk.dilekakbas@gmail.com>	2026-03-22 07:38:14 -07:00
Teknium	be3eb62047	fix(tests): resolve all consistently failing tests - test_plugins.py: remove tests for unimplemented plugin command API (get_plugin_command_handler, register_command never existed) - test_redact.py: add autouse fixture to clear HERMES_REDACT_SECRETS env var leaked by cli.py import in other tests - test_signal.py: same HERMES_REDACT_SECRETS fix for phone redaction - test_mattermost.py: add @bot_user_id to test messages after the mention-only filter was added in #2443 - test_context_token_tracking.py: mock resolve_provider_client for openai-codex provider that requires real OAuth credentials Full suite: 5893 passed, 0 failed.	2026-03-22 05:58:26 -07:00
Teknium	c275aa4732	Merge pull request #2465 from NousResearch/hermes/hermes-31d7db3b feat(cli): MCP server management CLI + OAuth 2.1 PKCE auth	2026-03-22 04:56:48 -07:00
Teknium	ff071fc74c	fix(gateway): process /queue'd messages after agent completion (#2469 ) * fix: respect DashScope v1 runtime mode for alibaba Remove the hardcoded Alibaba branch from resolve_runtime_provider() that forced api_mode='anthropic_messages' regardless of the base URL. Alibaba now goes through the generic API-key provider path, which auto-detects the protocol from the URL: - /apps/anthropic → anthropic_messages (via endswith check) - /v1 → chat_completions (default) This fixes Alibaba setup with OpenAI-compatible DashScope endpoints (e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because runtime always forced Anthropic mode even when setup saved a /v1 URL. Based on PR #2024 by @kshitijk4poor. * docs(skill): add split, merge, search examples to ocr-and-documents skill Adds pymupdf examples for PDF splitting, merging, and text search to the existing ocr-and-documents skill. No new dependencies — pymupdf already covers all three operations natively. * fix: replace all production print() calls with logger in rl_training_tool Replace all bare print() calls in production code paths with proper logger calls. - Add `import logging` and module-level `logger = logging.getLogger(__name__)` - Replace print() in _start_training_run() with logger.info() - Replace print() in _stop_training_run() with logger.info() - Replace print(Warning/Note) calls with logger.warning() and logger.info() Using the logging framework allows log level filtering, proper formatting, and log routing instead of always printing to stdout. * fix(gateway): process /queue'd messages after agent completion /queue stored messages in adapter._pending_messages but never consumed them after normal (non-interrupted) completion. The consumption path at line 5219 only checked pending messages when result.get('interrupted') was True — since /queue deliberately doesn't interrupt, queued messages were silently dropped. 
Now checks adapter._pending_messages after both interrupted AND normal completion. For queued messages (non-interrupt), the first response is delivered before recursing to process the queued follow-up. Skips the direct send when streaming already delivered the response. Reported by GhostMode on Discord. --------- Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com> Co-authored-by: memosr.eth <96793918+memosr@users.noreply.github.com>	2026-03-22 04:56:13 -07:00
Teknium	8d528e0045	fix(api_server): persist ResponseStore to SQLite across restarts (#2472 ) The /v1/responses endpoint used an in-memory OrderedDict that lost all conversation state on gateway restart. Replace with SQLite-backed storage at ~/.hermes/response_store.db. - Responses and conversation name mappings survive restarts - Same LRU eviction behavior (configurable max_size) - WAL mode for concurrent read performance - Falls back to in-memory SQLite if disk path unavailable - Conversation name→response_id mapping moved into the store	2026-03-22 04:56:06 -07:00
Teknium	34be3f8be6	revert: remove trailing empty assistant message stripping Reverts the sanitizer addition from PR #2466 (originally #2129). We already have _empty_content_retries handling for reasoning-only responses. The trailing strip risks silently eating valid messages and is redundant with existing empty-content handling.	2026-03-22 04:55:34 -07:00
Teknium	b7091f93b1	feat(cli): MCP server management CLI + OAuth 2.1 PKCE auth Add hermes mcp add/remove/list/test/configure CLI for managing MCP server connections interactively. Discovery-first 'add' flow connects, discovers tools, and lets users select which to enable via curses checklist. Add OAuth 2.1 PKCE authentication for MCP HTTP servers (RFC 7636). Supports browser-based and manual (headless) authorization, token caching with 0600 permissions, automatic refresh. Zero external deps. Add ${ENV_VAR} interpolation in MCP server config values, resolved from os.environ + ~/.hermes/.env at load time. Core OAuth module from PR #2021 by @imnotdev25. CLI and mcp_tool wiring rewritten against current main. Closes #497, #690.	2026-03-22 04:52:52 -07:00
Teknium	0e64a48743	Merge pull request #2460 from NousResearch/hermes/hermes-5d6932ba fix(discord): properly route slash event handling in threads	2026-03-22 04:28:53 -07:00
Teknium	ffa8b562e9	fix(discord): properly route slash event handling in threads Cherry-picked from PR #2017 by @simpolism. Fixes #2011. Discord slash commands in threads were missing thread_id in the SessionSource, causing them to route to the parent channel session. Commands like /usage and /reset returned wrong data or affected the wrong session. Detects discord.Thread channels in _build_slash_event and sets chat_type='thread' with thread_id. Two tests added.	2026-03-22 04:25:19 -07:00
Teknium	56b0104154	fix: respect DashScope v1 runtime mode for alibaba (#2459 ) Remove the hardcoded Alibaba branch from resolve_runtime_provider() that forced api_mode='anthropic_messages' regardless of the base URL. Alibaba now goes through the generic API-key provider path, which auto-detects the protocol from the URL: - /apps/anthropic → anthropic_messages (via endswith check) - /v1 → chat_completions (default) This fixes Alibaba setup with OpenAI-compatible DashScope endpoints (e.g. coding-intl.dashscope.aliyuncs.com/v1) that were broken because runtime always forced Anthropic mode even when setup saved a /v1 URL. Based on PR #2024 by @kshitijk4poor. Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-22 04:24:43 -07:00
Teknium	c0c13e4ed4	fix(api-server): harden jobs API — input limits, field whitelist, startup check, tests (#2456)	2026-03-22 04:18:45 -07:00
Teknium	89befcaf33	fix(cron): support Telegram topic delivery via platform:chat_id:thread_id format (#2455 ) Parse thread_id from explicit deliver target (e.g. telegram:-1003724596514:17) and forward it to _send_to_platform and mirror_to_session. Previously _resolve_delivery_target() always set thread_id=None when parsing the platform:chat_id format, breaking cron job delivery to specific Telegram topics. Added tests: - test_explicit_telegram_topic_target_with_thread_id - test_explicit_telegram_chat_id_without_thread_id Also updated CRONJOB_SCHEMA deliver description to document the platform:chat_id:thread_id format. Co-authored-by: Alex Ferrari <alex@thealexferrari.com>	2026-03-22 04:18:28 -07:00
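A minimal sketch of parsing the `platform:chat_id:thread_id` deliver format described in 89befcaf33. The function name `parse_deliver_target` is hypothetical and is not the project's `_resolve_delivery_target()`; only the target format and the example value come from the commit message.

```python
from typing import Optional, Tuple


def parse_deliver_target(target: str) -> Tuple[str, str, Optional[str]]:
    """Split 'platform:chat_id[:thread_id]' into its parts.

    thread_id is None when the third field is missing or empty.
    """
    parts = target.split(":", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError(f"expected platform:chat_id[:thread_id], got {target!r}")
    platform, chat_id = parts[0], parts[1]
    thread_id = parts[2] if len(parts) == 3 and parts[2] else None
    return platform, chat_id, thread_id


print(parse_deliver_target("telegram:-1003724596514:17"))
print(parse_deliver_target("telegram:-1003724596514"))
```

Splitting at most twice keeps the leading minus sign of Telegram supergroup IDs intact, because the separator is the colon only.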
Teknium	0f1c970179	fix(api-server): harden jobs API — input limits, field whitelist, startup check, tests Five improvements to the /api/jobs endpoints: 1. Startup availability check — cron module imported once at class load, endpoints return 501 if unavailable (not 500 per-request import error) 2. Input limits — name ≤ 200 chars, prompt ≤ 5000 chars, repeat must be positive int 3. Update field whitelist — only name/schedule/prompt/deliver/skills/ repeat/enabled pass through to cron.jobs.update_job, preventing arbitrary key injection 4. Deduplicated validation — _check_job_id and _check_jobs_available helpers replace repeated boilerplate 5. 32 new tests covering all endpoints, validation, auth, and cron-unavailable cases	2026-03-22 04:18:18 -07:00
Teknium	e109a8b502	fix(security): block untrusted browser access to api server (#2451) Co-authored-by: ifrederico <fr@tecompanytea.com>	2026-03-22 04:08:48 -07:00
Teknium	2c2334d4db	Merge pull request #2449 from NousResearch/hermes/hermes-31d7db3b fix(cron): scale missed-job grace window with schedule frequency	2026-03-22 04:04:42 -07:00
Teknium	21ffadc2a6	fix: dynamic grace window for missed cron job catch-up Replace hardcoded 120-second grace period with a dynamic window that scales with the job's scheduling frequency (half the period, clamped to [120s, 2h]). Daily jobs now catch up if missed by up to 2 hours instead of being silently skipped after just 2 minutes.	2026-03-22 04:04:24 -07:00
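The grace-window rule in 21ffadc2a6 (half the schedule period, clamped to [120s, 2h]) works out as in this sketch. The function and constant names are illustrative assumptions; only the formula and the bounds come from the commit message.

```python
MIN_GRACE_SECONDS = 120          # lower bound stated in the commit message
MAX_GRACE_SECONDS = 2 * 60 * 60  # upper bound stated in the commit message (2h)


def grace_window_seconds(period_seconds: float) -> float:
    """Return half the schedule period, clamped to [120s, 2h]."""
    return max(MIN_GRACE_SECONDS, min(period_seconds / 2, MAX_GRACE_SECONDS))


print(grace_window_seconds(60))     # every-minute job: clamped up to the floor
print(grace_window_seconds(3600))   # hourly job: half the period
print(grace_window_seconds(86400))  # daily job: clamped down to the ceiling
```

The daily case shows the behavior change described above: a job missed by up to two hours is still caught up.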
Teknium	0b370f2dd9	fix(skills_guard): agent-created dangerous skills ask instead of block Changes the policy for agent-created skills with critical security findings from 'block' (silently rejected) to 'ask' (allowed with warning logged). The agent created the skill, so blocking it entirely is too aggressive — let it through but log the findings. - Policy: agent-created dangerous changed from block to ask - should_allow_install returns None for 'ask' (vs True/False) - format_scan_report shows 'NEEDS CONFIRMATION' for ask - skill_manager_tool.py caller handles None (allows with warning) - force=True still overrides as before Based on PR #2271 by redhelix (closed — 3200 lines of unrelated Mission Control code excluded).	2026-03-22 03:56:02 -07:00
Teknium	887e8a8d84	Merge pull request #2444 from NousResearch/hermes/hermes-31d7db3b fix(tests): replace FakePath with monkeypatch for Python 3.12 compat	2026-03-22 03:52:56 -07:00
Teknium	189214a69d	fix(tests): replace FakePath subclass with monkeypatch for Python 3.12 compat Python 3.12 changed PosixPath.__new__ to ignore the redirected path argument, breaking the FakePath subclass pattern. Use monkeypatch on Path.exists instead. Based on PR #2261 by @dieutx, fixed NameError (bare Path not imported).	2026-03-22 03:52:39 -07:00
Teknium	c01cfe4f9a	fix(cron): silent jobs return empty response for delivery skip (#2442 ) Fixes #2234 The placeholder '(No response generated)' was overwriting the actual final_response, causing it to be delivered to Discord even when the agent completed work silently via tools. Changes: - Separate logged_response for output template display - Keep final_response clean (empty when agent has no text) - Delivery logic now correctly skips when final_response is empty Test added to verify empty response stays empty for delivery. Co-authored-by: Bartok9 <bartokmagic@proton.me>	2026-03-22 03:50:27 -07:00
0xbyt4	dbc25a386e	fix: auxiliary client skips expired Codex JWT and propagates Anthropic OAuth flag Two bugs in the auxiliary provider auto-detection chain: 1. Expired Codex JWT blocks the auto chain: _read_codex_access_token() returned any stored token without checking expiry, preventing fallback to working providers. Now decodes JWT exp claim and returns None for expired tokens. 2. Auxiliary Anthropic client missing OAuth identity transforms: _AnthropicCompletionsAdapter always called build_anthropic_kwargs with is_oauth=False, causing 400 errors for OAuth tokens. Now detects OAuth tokens via _is_oauth_token() and propagates the flag through the adapter chain. Cherry-picked from PR #2378 by 0xbyt4. Fixed test_api_key_no_oauth_flag to mock resolve_anthropic_token directly (env var alone was insufficient).	2026-03-21 17:36:25 -07:00
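The expiry check in dbc25a386e can be approximated by decoding the JWT payload and comparing its `exp` claim to the current time. This is a sketch with a hypothetical helper name, not the project's `_read_codex_access_token()`. It does not verify the signature, and treating an undecodable token as "not expired" is an assumption made here for illustration.

```python
import base64
import json
import time
from typing import Optional


def jwt_is_expired(token: str, now: Optional[float] = None) -> bool:
    """Return True only when the token carries a numeric exp claim in the past."""
    now = time.time() if now is None else now
    try:
        payload_b64 = token.split(".")[1]
        # JWT segments are base64url without padding, so restore the padding first.
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        claims = json.loads(base64.urlsafe_b64decode(padded))
        exp = claims.get("exp")
    except (IndexError, ValueError, AttributeError):
        return False  # assumption: an undecodable token is left for the server to reject
    return isinstance(exp, (int, float)) and exp <= now
```

A caller in an auto-detection chain could return `None` for an expired token so the next provider gets tried, which is the fallback the commit describes.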
Teknium	0ea7d0ec80	fix(terminal): log disk warning check failures at debug level (salvage #2372 ) (#2394 ) * fix(terminal): log disk warning check failures at debug level * fix(terminal): guard _check_disk_usage_warning by moving scratch_dir into try --------- Co-authored-by: aydnOktay <xaydinoktay@gmail.com>	2026-03-21 17:10:17 -07:00
Teknium	1d28b4699b	fix(redact): safely handle non-string inputs (salvage #2369)	2026-03-21 17:10:14 -07:00
aydnOktay	40c9a13476	fix(redact): safely handle non-string inputs redact_sensitive_text() now returns early for None and coerces other non-string values to str before applying regex-based redaction, preventing TypeErrors in logging/tool-output paths. Cherry-picked from PR #2369 by aydnOktay.	2026-03-21 16:55:02 -07:00
teyrebaz33	bd49bce278	fix(prompt-caching): skip top-level cache_control on role:tool for OpenRouter On the native Anthropic Messages API path, convert_messages_to_anthropic() moves top-level cache_control on role:tool messages inside the tool_result block. On OpenRouter (chat_completions), no such conversion happens — the unexpected top-level field causes a silent hang on the second tool call. Add native_anthropic parameter to _apply_cache_marker() and apply_anthropic_cache_control(). When False (OpenRouter), role:tool messages are skipped entirely. When True (native Anthropic), existing behaviour is preserved. Fixes #2362	2026-03-21 16:54:43 -07:00
Teknium	52dd479214	Merge pull request #2361 from NousResearch/hermes/hermes-5d6932ba feat(gateway): cache AIAgent per session for prompt caching	2026-03-21 16:53:21 -07:00
Teknium	c57d5cbdde	fix(update): prompt before resetting working tree on stash conflicts (#2390 ) When 'hermes update' stashes local changes and the restore hits conflicts, the previous behavior silently ran 'git reset --hard HEAD' to clean up. This could surprise users who didn't realize their working tree was being nuked. Now the conflict handler: - Lists the specific conflicted files - Reassures the user their stash is preserved - Asks before resetting (interactive mode) - Auto-resets in non-interactive mode (prompt_user=False) - If declined, leaves the working tree as-is with guidance	2026-03-21 16:49:19 -07:00
Teknium	525caadd8c	fix: prevent Anthropic token leaking to third-party anthropic_messages providers (salvage #2383 ) (#2389 ) * fix: prevent Anthropic token fallback leaking to third-party anthropic_messages providers When provider is minimax/alibaba/etc and MINIMAX_API_KEY is not set, the code fell back to resolve_anthropic_token() sending Anthropic OAuth credentials to third-party endpoints, causing 401 errors. Now only provider=="anthropic" triggers the fallback. Generalizes the Alibaba-specific guard from #1739 to all non-Anthropic providers. * fix: set provider='anthropic' in credential refresh tests Follow-up for cherry-picked PR #2383 — existing tests didn't set agent.provider, which the new guard requires to allow Anthropic token refresh. --------- Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-21 16:42:46 -07:00
Teknium	342096b4bd	feat(gateway): cache AIAgent per session for prompt caching The gateway created a fresh AIAgent per message, rebuilding the system prompt (including memory, skills, context files) every turn. This broke prompt prefix caching — providers like Anthropic charge ~10x more for uncached prefixes. Now caches AIAgent instances per session_key with a config signature. The cached agent is reused across messages in the same session, preserving the frozen system prompt and tool schemas. Cache is invalidated when: - Config changes (model, provider, toolsets, reasoning, ephemeral prompt) — detected via signature mismatch - /new, /reset, /clear — explicit session reset - /model — global model change clears all cached agents - /reasoning — global reasoning change clears all cached agents Per-message state (callbacks, stream consumers, progress queues) is set on the agent instance before each run_conversation() call. This matches CLI behavior where a single AIAgent lives across all turns in a session, with _cached_system_prompt built once and reused.	2026-03-21 16:21:06 -07:00
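The per-session cache with a config signature from 342096b4bd might look roughly like this. `AgentCache`, `config_signature`, and the factory callable are hypothetical names, and hashing a JSON dump is just one plausible way to build the signature.

```python
import hashlib
import json
from typing import Any, Callable, Dict, Tuple


def config_signature(config: Dict[str, Any]) -> str:
    """Stable hash of the settings that should invalidate a cached agent."""
    encoded = json.dumps(config, sort_keys=True, default=str).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


class AgentCache:
    """Cache one agent per session key; rebuild it when the signature changes."""

    def __init__(self, factory: Callable[[Dict[str, Any]], Any]) -> None:
        self._factory = factory
        self._entries: Dict[str, Tuple[str, Any]] = {}

    def get(self, session_key: str, config: Dict[str, Any]) -> Any:
        signature = config_signature(config)
        entry = self._entries.get(session_key)
        if entry is None or entry[0] != signature:
            entry = (signature, self._factory(config))
            self._entries[session_key] = entry
        return entry[1]

    def invalidate(self, session_key: str) -> None:
        """Explicit reset for one session (for example /new or /reset)."""
        self._entries.pop(session_key, None)

    def clear(self) -> None:
        """Global change (for example /model) drops every cached agent."""
        self._entries.clear()
```

Reusing the same instance across turns is what keeps the system prompt byte-identical, which prefix caching depends on.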
Teknium	55510cbad2	Merge pull request #2388 from NousResearch/hermes/hermes-31d7db3b fix(provider): prevent Anthropic fallback from inheriting non-Anthropic base_url + fix(update): reset on stash conflict	2026-03-21 16:20:08 -07:00
Teknium	3ab50376b0	fix(update): reset working tree when stash restore leaves conflict markers When `hermes update` stashes local changes and the subsequent `git stash apply` fails or leaves unmerged files, the conflict markers (<<<<<<< etc.) were left in the working tree, making Hermes unrunnable until manually cleaned up. Now the update command runs `git reset --hard HEAD` to restore a clean working tree before exiting, and also detects unmerged files even when git stash apply reports success. Closes #2348	2026-03-21 16:16:35 -07:00
Teknium	2a5f86ed6d	Merge pull request #2343 from NousResearch/hermes/hermes-31d7db3b feat: @ context references + Honcho config fixes	2026-03-21 16:10:19 -07:00
Teknium	8da410ed95	feat(plugins): add slash command registration for plugins (#2359 ) Plugins can now register slash commands via ctx.register_command() in their register() function. Commands automatically appear in: - /help and COMMANDS_BY_CATEGORY (under 'Plugins' category) - Tab autocomplete in CLI - Telegram bot menu - Slack subcommand mapping - Gateway dispatch Handler signature: handler(args: str) -> str \| None Async handlers are supported in gateway context. Changes: - commands.py: add register_plugin_command() and rebuild_lookups() - plugins.py: add register_command() to PluginContext, track in PluginManager._plugin_commands and LoadedPlugin.commands_registered - cli.py: dispatch plugin commands in process_command() - gateway/run.py: dispatch plugin commands before skill commands - tests: 5 new tests for registration, help, tracking, handler, gateway - docs: update plugins feature page and build guide	2026-03-21 16:00:30 -07:00
Teknium	da44c196b6	feat: @ context references — inline file, folder, diff, git, and URL injection Add @file:path, @folder:dir, @diff, @staged, @git:N, and @url: references that expand inline before the message reaches the LLM. Supports line ranges (@file:main.py:10-50), token budget enforcement (soft warn at 25%, hard block at 50%), and path sandboxing for gateway. Core module from PR #2090 by @kshitijk4poor. CLI and gateway wiring rewritten against current main. Fixed asyncio.run() crash when called from inside a running event loop (gateway). Closes #682.	2026-03-21 15:57:13 -07:00
Gutslabs	0b9526b476	fix(acp): preserve session provider when switching models	2026-03-21 15:54:10 -07:00
Teknium	b73d221324	fix: Alibaba/DashScope: preserve model dots, fix 401 auth, fix dead provider check (salvage #1748 + fix #2314)	2026-03-21 09:51:40 -07:00
Teknium	cc51ffdb57	Merge pull request #2340 from NousResearch/feat/streaming-default feat: enable streaming by default in CLI	2026-03-21 09:50:54 -07:00
unmodeled-tyler	fb48b8f0c5	fix(gateway): pass message_thread_id in send_image_file, send_document, send_video Fixes #1803. send_image_file, send_document, and send_video were missing message_thread_id forwarding, causing them to fail in Telegram forum/supergroups where thread_id is required. send_voice already handled this correctly. Adds metadata parameter + message_thread_id to all three methods, and adds tests covering the thread_id forwarding path.	2026-03-21 09:49:33 -07:00
Angello Picasso	5a9ab09bc3	feat(cli): add hermes plugins install/remove/list command Plugin management via git repos: - hermes plugins install <git-url\|owner/repo> - hermes plugins update <name> - hermes plugins remove <name> (aliases: rm, uninstall) - hermes plugins list (alias: ls) Security: path traversal protection, no shell injection, manifest version guard, insecure URL warnings. 42 tests covering security, dispatch, helpers, and commands. Based on work by Angello Picasso in PR #1785. Closes #1789.	2026-03-21 09:47:33 -07:00
Teknium	d70e07fc45	refactor(cli): add protected TUI extension hooks for wrapper CLIs Based on PR #1749 by @erosika (reimplemented on current main). Extracts three protected methods from run() so wrapper CLIs can extend the TUI without overriding the entire method: - _get_extra_tui_widgets(): inject widgets between spacer and status bar - _register_extra_tui_keybindings(kb, input_area): add keybindings - _build_tui_layout_children(**widgets): full control over ordering Default implementations reproduce existing layout exactly. The inline HSplit in run() now delegates to _build_tui_layout_children(). 5 tests covering defaults, widget insertion position, and keybinding registration.	2026-03-21 09:42:07 -07:00
Himess	5663980015	fix(mistral-parser): handle nested JSON in fallback extraction	2026-03-21 09:41:17 -07:00
Teknium	8304a7716d	fix(gateway): restart on whatsapp bridge child exit (#2334) Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>	2026-03-21 09:38:52 -07:00
crazywriter1	523d8c38f9	fix: Alibaba/DashScope: preserve model dots (qwen3.5-plus) and fix 401 auth When using Alibaba (DashScope) with an anthropic-compatible endpoint, model names like qwen3.5-plus were being normalized to qwen3-5-plus. Alibaba's API expects the dot. Added preserve_dots parameter to normalize_model_name() and build_anthropic_kwargs(). Also fixed 401 auth: when provider is alibaba or base_url contains dashscope/aliyuncs, use only the resolved API key (DASHSCOPE_API_KEY). Never fall back to resolve_anthropic_token(), and skip Anthropic credential refresh for DashScope endpoints. Cherry-picked from PR #1748 by crazywriter1. Fixes #1739.	2026-03-21 09:38:04 -07:00
Teknium	e183744cb5	feat(honcho): instance-local config via HERMES_HOME, default session strategy to per-directory - Add resolve_config_path(): checks $HERMES_HOME/honcho.json first, falls back to ~/.honcho/config.json. Enables isolated Hermes instances with independent Honcho credentials and settings. - Update CLI and doctor to use resolved path instead of hardcoded global. - Change default session_strategy from per-session to per-directory. Part 1 of #1962 by @erosika.	2026-03-21 09:34:00 -07:00
Himess	bc15f6cca3	fix(mattermost): use MIME types for media attachments Bare strings like "image", "audio", "document" were appended to media_types, but downstream run.py checks mtype.startswith("image/") and mtype.startswith("audio/"), which never matched. This caused all Mattermost file attachments to be silently dropped from vision/STT processing. Use the actual MIME type from file_info instead.	2026-03-21 09:31:15 -07:00
Teknium	28bb0e770f	fix(voice): enable TTS voice reply when streaming is active (#2322 ) When streaming is enabled, the base adapter receives None from _handle_message (already_sent=True) and cannot run auto-TTS for voice input. The runner was unconditionally skipping voice input TTS assuming the base adapter would handle it. Now the runner takes over TTS responsibility when streaming has already delivered the text response, so voice channel playback works with both streaming on and off. Streaming off behavior is unchanged (default already_sent=False preserves the original code path exactly). Co-authored-by: 0xbyt4 <35742124+0xbyt4@users.noreply.github.com>	2026-03-21 08:08:37 -07:00
Teknium	453f4c5175	Merge pull request #2312 from NousResearch/hermes/hermes-31d7db3b fix(gateway): retry Telegram 409 polling conflicts before giving up	2026-03-21 07:19:43 -07:00
Teknium	37a9979459	fix(cron): stop injecting cron outputs into gateway session history (#2313 ) Cron deliveries were mirrored into the target gateway session as assistant-role messages, causing consecutive assistant messages that violate message alternation (issue #2221). Instead of fixing the role, remove the mirror injection entirely. Cron outputs already live in their own cron session and don't belong in the interactive conversation history. Delivered messages are now wrapped with a header (task name) and a footer noting the agent cannot see or respond to the message, so users have clear context about what they're reading. Closes #2221	2026-03-21 07:18:36 -07:00
Teknium	488a30e879	fix(gateway): retry Telegram 409 polling conflicts before giving up A single Telegram 409 Conflict from getUpdates permanently killed Telegram polling with no recovery possible (retryable=False on first occurrence). This is too aggressive for production use with process supervisors. Transient 409s are expected during: - --replace handoffs where the old long-poll session lingers on Telegram servers for a few seconds after SIGTERM - systemd Restart=on-failure respawns that overlap with the dying instance cleanup Now _handle_polling_conflict() retries up to 3 times with a 10-second delay between attempts. The 30-second total retry window lets stale server-side sessions expire. If all retries fail, the error is still marked as permanently fatal — preserving the original protection against genuine dual-instance conflicts. Tests updated: split the single conflict test into two — one verifying retry on transient conflict, one verifying fatal after exhausted retries. Closes #2296	2026-03-21 07:11:06 -07:00
Teknium	58b52dfb2f	Merge pull request #2303 from NousResearch/hermes/hermes-31d7db3b fix: remove synthetic error message injection, fix session resume after repeated failures	2026-03-21 07:03:54 -07:00
Teknium	2da79b13df	feat: priority-based context file selection + CLAUDE.md support (#2301 ) Previously, all project context files (AGENTS.md, .cursorrules, .hermes.md) were loaded and concatenated into the system prompt. This bloated the prompt with potentially redundant or conflicting instructions. Now only ONE project context type is loaded, using priority order: 1. .hermes.md / HERMES.md (walk to git root) 2. AGENTS.md / agents.md (recursive directory walk) 3. CLAUDE.md / claude.md (cwd only, NEW) 4. .cursorrules / .cursor/rules/*.mdc (cwd only) SOUL.md from HERMES_HOME remains independent and always loads. Also adds CLAUDE.md as a recognized context file format, matching the convention popularized by Claude Code. Refactored the monolithic function into four focused helpers: _load_hermes_md, _load_agents_md, _load_claude_md, _load_cursorrules. Tests: replaced 1 coexistence test with 10 new tests covering priority ordering, CLAUDE.md loading, case sensitivity, injection blocking.	2026-03-21 06:26:20 -07:00
Test	1870069f80	fix(session_search): exclude current session lineage Cherry-picked from PR #2201 by @Gutslabs. session_search resolved hits to parent/root sessions but only excluded the exact current_session_id. If the active session was a child continuation (compression/delegation), its parent could still appear as a 'past' conversation result. Fix: resolve current_session_id to its lineage root before filtering, so the entire active lineage (parent and children) is excluded.	2026-03-20 21:07:48 -07:00
Test	10d719ac1b	fix(security): require opt-in for project plugin discovery	2026-03-20 20:50:30 -07:00
Teknium	4263350c5b	fix: remove post-compression file-read history injection (#2226 ) Remove the [Files already read — do NOT re-read these] user message that was injected into the conversation after context compression. This message used role='user' for system-generated content, creating a fake user turn that confused models about conversation state and could contribute to task-redo behavior. The file_tools.py read tracker (warn on 3rd consecutive read, block on 4th+) already handles re-read prevention inline without injecting synthetic messages. Closes #2224. Co-authored-by: Test <test@test.com>	2026-03-20 14:54:25 -07:00
Teknium	ba0b77a803	Merge pull request #2214 from NousResearch/fix/event-loop-closed-delegate Completes the event loop lifecycle fix trilogy (#2190 → #2207 → #2214). Per-thread persistent loops for worker threads prevent GC crashes on cached async clients.	2026-03-20 12:54:19 -07:00
Teknium	f853e50589	Merge pull request #2199 from llbn/fix/telegram-markdownv2-features Clean PR, well-tested. Adds MarkdownV2 strikethrough, spoiler, and blockquote support to Telegram adapter.	2026-03-20 12:45:47 -07:00
emozilla	ab6abc2c13	fix: use per-thread persistent event loops in worker threads Replace asyncio.run() with thread-local persistent event loops for worker threads (e.g., delegate_task's ThreadPoolExecutor). asyncio.run() creates and closes a fresh loop on every call, leaving cached httpx/AsyncOpenAI clients bound to a dead loop — causing 'Event loop is closed' errors during GC when parallel subagents clean up connections. The fix mirrors the main thread's _get_tool_loop() pattern but uses threading.local() so each worker thread gets its own long-lived loop, avoiding both cross-thread contention and the create-destroy lifecycle. Added 4 regression tests covering worker loop persistence, reuse, per-thread isolation, and separation from the main thread's loop.	2026-03-20 15:41:06 -04:00
llbn	43b3a0ac66	fix(telegram): escape backslashes and backticks inside code entities for MarkdownV2 - Escape \ → \\ inside inline code and fenced code blocks - Escape ` → \` inside fenced code block bodies (not delimiters) - Add regression tests for code entity backslash handling	2026-03-20 18:32:45 +01:00
llbn	02f639e561	fix(telegram): add MarkdownV2 support for strikethrough, spoiler, and blockquotes - Convert ~~text~~ to ~text~ (MarkdownV2 strikethrough) - Protect \|\|text\|\| from pipe escaping (MarkdownV2 spoiler) - Preserve > at line start as blockquote instead of escaping it - Update _strip_mdv2() to strip ~strikethrough~ and \|\|spoiler\|\| markers - Add tests covering new formatting paths and edge cases	2026-03-20 18:21:24 +01:00
Teknium	7a427d7b03	fix: persistent event loop in _run_async prevents 'Event loop is closed' (#2190 ) Cherry-picked from PR #2146 by @crazywriter1. Fixes #2104. asyncio.run() creates and closes a fresh event loop each call. Cached httpx/AsyncOpenAI clients bound to the dead loop crash on GC with 'Event loop is closed'. This hit vision_analyze on first use in CLI. Two-layer fix: - model_tools._run_async(): replace asyncio.run() with persistent loop via _get_tool_loop() + run_until_complete() - auxiliary_client._get_cached_client(): track which loop created each async client, discard stale entries if loop is closed 6 regression tests covering loop lifecycle, reuse, and full vision dispatch chain. Co-authored-by: Test <test@test.com>	2026-03-20 09:44:50 -07:00
Teknium	2ea4dd30c6	fix(gateway): strip orphaned tool_results + let /reset bypass running agent (#2180 ) Two fixes for Telegram/gateway-specific bugs: 1. Anthropic adapter: strip orphaned tool_result blocks (mirror of existing tool_use stripping). Context compression or session truncation can remove an assistant message containing a tool_use while leaving the subsequent tool_result intact. Anthropic rejects these with a 400: 'unexpected tool_use_id found in tool_result blocks'. The adapter now collects all tool_use IDs and filters out any tool_result blocks referencing IDs not in that set. 2. Gateway: /reset and /new now bypass the running-agent guard (like /status already does). Previously, sending /reset while an agent was running caused the raw text to be queued and later fed back as a user message with the same broken history — replaying the corrupted session instead of resetting it. Now the running agent is interrupted, pending messages are cleared, and the reset command dispatches immediately. Tests updated: existing tests now include proper tool_use→tool_result pairs; two new tests cover orphaned tool_result stripping. Co-authored-by: Test <test@test.com>	2026-03-20 08:39:49 -07:00
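The orphaned `tool_result` filter from 2ea4dd30c6 can be illustrated with plain dicts in the Anthropic Messages content-block shape. The function name is hypothetical, and dropping a user message that becomes empty is a simplification chosen for this sketch, not something the commit states.

```python
from typing import Any, Dict, List

Message = Dict[str, Any]


def strip_orphaned_tool_results(messages: List[Message]) -> List[Message]:
    """Remove tool_result blocks whose tool_use_id matches no tool_use block."""
    tool_use_ids = {
        block["id"]
        for msg in messages
        if msg.get("role") == "assistant" and isinstance(msg.get("content"), list)
        for block in msg["content"]
        if isinstance(block, dict) and block.get("type") == "tool_use"
    }

    def is_orphan(block: Any) -> bool:
        return (
            isinstance(block, dict)
            and block.get("type") == "tool_result"
            and block.get("tool_use_id") not in tool_use_ids
        )

    cleaned: List[Message] = []
    for msg in messages:
        content = msg.get("content")
        if msg.get("role") == "user" and isinstance(content, list):
            kept = [block for block in content if not is_orphan(block)]
            if not kept:
                continue  # simplification: skip messages left with no blocks
            msg = {**msg, "content": kept}
        cleaned.append(msg)
    return cleaned
```

Collecting the IDs in a first pass keeps the filter independent of message order, so it still works after compression has removed messages from the middle of the history.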
Teknium	c52353cf8a	feat: context pressure warnings for CLI and gateway (#2159 ) * feat: context pressure warnings for CLI and gateway User-facing notifications as context approaches the compaction threshold. Warnings fire at 60% and 85% of the way to compaction — relative to the configured compression threshold, not the raw context window. CLI: Formatted line with a progress bar showing distance to compaction. Cyan at 60% (approaching), bold yellow at 85% (imminent). ◐ context ▰▰▰▰▰▰▰▰▰▰▰▰▱▱▱▱▱▱▱▱ 60% to compaction 100k threshold (50%) · approaching compaction ⚠ context ▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▰▱▱▱ 85% to compaction 100k threshold (50%) · compaction imminent Gateway: Plain-text notification sent to the user's chat via the new status_callback mechanism (asyncio.run_coroutine_threadsafe bridge, same pattern as step_callback). Does NOT inject into the message stream. The LLM never sees these warnings. Flags reset after each compaction cycle. Files changed: - agent/display.py — format_context_pressure(), format_context_pressure_gateway() - run_agent.py — status_callback param, _context_50/70_warned flags, _emit_context_pressure(), flag reset in _compress_context() - gateway/run.py — _status_callback_sync bridge, wired to AIAgent - tests/test_context_pressure.py — 23 tests * Merge remote-tracking branch 'origin/main' into hermes/hermes-7ea545bf --------- Co-authored-by: Test <test@test.com>	2026-03-20 08:37:36 -07:00
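The 60% and 85% levels in c52353cf8a are measured against the compaction threshold rather than the raw context window. A minimal sketch of that arithmetic, with hypothetical names and integer math to avoid rounding surprises at the boundaries:

```python
from typing import Optional

APPROACHING_PERCENT = 60  # first warning level from the commit message
IMMINENT_PERCENT = 85     # second warning level from the commit message


def pressure_level(used_tokens: int, compaction_threshold_tokens: int) -> Optional[str]:
    """Classify progress toward compaction; None means no warning yet."""
    if compaction_threshold_tokens <= 0:
        raise ValueError("compaction threshold must be positive")
    percent = used_tokens * 100 // compaction_threshold_tokens
    if percent >= IMMINENT_PERCENT:
        return "imminent"
    if percent >= APPROACHING_PERCENT:
        return "approaching"
    return None


# With a 200k window and a 50% compression threshold, the threshold is 100k tokens.
threshold = 200_000 * 50 // 100
print(pressure_level(60_000, threshold))
print(pressure_level(85_000, threshold))
```

In this example the first warning fires at 60k tokens, which is only 30% of the raw 200k window.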
Test	e140c02d51	feat(gateway): add webhook platform adapter for external event triggers Add a generic webhook platform adapter that receives HTTP POSTs from external services (GitHub, GitLab, JIRA, Stripe, etc.), validates HMAC signatures, transforms payloads into agent prompts, and routes responses back to the source or to another platform. Features: - Configurable routes with per-route HMAC secrets, event filters, prompt templates with dot-notation payload access, skill loading, and pluggable delivery (github_comment, telegram, discord, log) - HMAC signature validation (GitHub SHA-256, GitLab token, generic) - Rate limiting (30 req/min per route, configurable) - Idempotency cache (1hr TTL, prevents duplicate runs on retries) - Body size limits (1MB default, checked before reading payload) - Setup wizard integration with security warnings and docs links - 33 tests (29 unit + 4 integration), all passing Security: - HMAC secret required per route (startup validation) - Setup wizard warns about internet exposure for webhook/SMS platforms - Sandboxing (Docker/VM) recommended in docs for public-facing deployments Files changed: - gateway/config.py — Platform.WEBHOOK enum + env var overrides - gateway/platforms/webhook.py — WebhookAdapter (~420 lines) - gateway/run.py — factory wiring + auth bypass for webhook events - hermes_cli/config.py — WEBHOOK_* env var definitions - hermes_cli/setup.py — webhook section in setup_gateway() - tests/gateway/test_webhook_adapter.py — 29 unit tests - tests/gateway/test_webhook_integration.py — 4 integration tests - website/docs/user-guide/messaging/webhooks.md — full user docs - website/docs/reference/environment-variables.md — WEBHOOK_* vars - website/sidebars.ts — nav entry	2026-03-20 06:33:36 -07:00
Teknium	88643a1ba9	feat: overhaul context length detection with models.dev and provider-aware resolution (#2158 ) Replace the fragile hardcoded context length system with a multi-source resolution chain that correctly identifies context windows per provider. Key changes: - New agent/models_dev.py: Fetches and caches the models.dev registry (3800+ models across 100+ providers with per-provider context windows). In-memory cache (1hr TTL) + disk cache for cold starts. - Rewritten get_model_context_length() resolution chain: 0. Config override (model.context_length) 1. Custom providers per-model context_length 2. Persistent disk cache 3. Endpoint /models (local servers) 4. Anthropic /v1/models API (max_input_tokens, API-key only) 5. OpenRouter live API (existing, unchanged) 6. Nous suffix-match via OpenRouter (dot/dash normalization) 7. models.dev registry lookup (provider-aware) 8. Thin hardcoded defaults (broad family patterns) 9. 128K fallback (was 2M) - Provider-aware context: same model now correctly resolves to different context windows per provider (e.g. claude-opus-4.6: 1M on Anthropic, 128K on GitHub Copilot). Provider name flows through ContextCompressor. - DEFAULT_CONTEXT_LENGTHS shrunk from 80+ entries to ~16 broad patterns. models.dev replaces the per-model hardcoding. - CONTEXT_PROBE_TIERS changed from [2M, 1M, 512K, 200K, 128K, 64K, 32K] to [128K, 64K, 32K, 16K, 8K]. Unknown models no longer start at 2M. - hermes model: prompts for context_length when configuring custom endpoints. Supports shorthand (32k, 128K). Saved to custom_providers per-model config. - custom_providers schema extended with optional models dict for per-model context_length (backward compatible). - Nous Portal: suffix-matches bare IDs (claude-opus-4-6) against OpenRouter's prefixed IDs (anthropic/claude-opus-4.6) with dot/dash normalization. Handles all 15 current Nous models. - Anthropic direct: queries /v1/models for max_input_tokens. 
Only works with regular API keys (sk-ant-api*), not OAuth tokens. Falls through to models.dev for OAuth users. Tests: 5574 passed (18 new tests for models_dev + updated probe tiers) Docs: Updated configuration.md context length section, AGENTS.md Co-authored-by: Test <test@test.com>	2026-03-20 06:04:33 -07:00
Teknium	b7b585656b	Merge pull request #2110 from NousResearch/hermes/hermes-5d6932ba fix: session reset + custom provider model switch + honcho base_url	2026-03-20 06:01:44 -07:00
Teknium	3ec6c71e43	fix: update claude 4.6 context length from 200K to 1M (#2155 ) * fix: preserve Ollama model:tag colons in context length detection The colon-split logic in get_model_context_length() and _query_local_context_length() assumed any colon meant provider:model format (e.g. "local:my-model"). But Ollama uses model:tag format (e.g. "qwen3.5:27b"), so the split turned "qwen3.5:27b" into just "27b" — which matches nothing, causing a fallback to the 2M token probe tier. Now only recognised provider prefixes (local, openrouter, anthropic, etc.) are stripped. Ollama model:tag names pass through intact. * fix: update claude-opus-4-6 and claude-sonnet-4-6 context length from 200K to 1M Both models support 1,000,000 token context windows. The hardcoded defaults were set before Anthropic expanded the context for the 4.6 generation. Verified via models.dev and OpenRouter API data. --------- Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com> Co-authored-by: Test <test@test.com>	2026-03-20 04:38:59 -07:00
Test	4ad0083118	fix(honcho): read HONCHO_BASE_URL for local/self-hosted instances Cherry-picked from PR #2120 by @unclebumpy. - from_env() now reads HONCHO_BASE_URL and enables Honcho when base_url is set, even without an API key - from_global_config() reads baseUrl from config root with HONCHO_BASE_URL env var as fallback - get_honcho_client() guard relaxed to allow base_url without api_key for no-auth local instances - Added HONCHO_BASE_URL to OPTIONAL_ENV_VARS registry Result: Setting HONCHO_BASE_URL=http://localhost:8000 in ~/.hermes/.env now correctly routes the Honcho client to a local instance.	2026-03-20 04:36:06 -07:00
Test	1055d4356a	fix: skip model auto-detection for custom/local providers When the user is on a custom provider (provider=custom, localhost, or 127.0.0.1 endpoint), /model <name> no longer tries to auto-detect a provider switch. The model name changes on the current endpoint as-is. To switch away from a custom endpoint, users must use explicit provider:model syntax (e.g. /model openai-codex:gpt-5.2-codex). A helpful tip is printed when changing models on a custom endpoint. This prevents the confusing case where someone on LM Studio types /model gpt-5.2-codex, the auto-detection tries to switch providers, fails or partially succeeds, and requests still go to the old endpoint. Also fixes the missing prompt_toolkit.auto_suggest mock stub in test_cli_init.py (same issue already fixed in test_cli_new_session.py).	2026-03-20 04:35:17 -07:00
Test	5822711ae6	fix: complete session reset — missing compressor counters + test Follow-up to PR #2101 (InB4DevOps). Adds three missing context compressor resets in reset_session_state(): - compression_count (displayed in status bar) - last_total_tokens - _context_probed (stale context-error flag) Also fixes the test_cli_new_session.py prompt_toolkit mock (missing auto_suggest stub) and adds a regression test for #2099 that verifies all token counters and compressor state are zeroed on /new.	2026-03-20 04:35:17 -07:00
Teknium	471ea81a7d	fix: preserve Ollama model:tag colons in context length detection (#2149 ) The colon-split logic in get_model_context_length() and _query_local_context_length() assumed any colon meant provider:model format (e.g. "local:my-model"). But Ollama uses model:tag format (e.g. "qwen3.5:27b"), so the split turned "qwen3.5:27b" into just "27b" — which matches nothing, causing a fallback to the 2M token probe tier. Now only recognised provider prefixes (local, openrouter, anthropic, etc.) are stripped. Ollama model:tag names pass through intact. Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-20 03:19:31 -07:00
Teknium	3a9a1bbb84	Merge pull request #2091 from dusterbloom/fix/lmstudio-context-length-detection feat: query local servers for actual context window size	2026-03-19 19:08:21 -07:00
Teknium	d8081790f3	Merge pull request #2102 from NousResearch/hermes/hermes-6757a563 fix(tools,cli): normalise MCP schemas + expand session list columns	2026-03-19 19:06:56 -07:00
Test	fc061c2fee	fix: harden sentinel guard for /stop during setup and shutdown - /stop during sentinel returns helpful message instead of queuing - Shutdown loop skips sentinel entries instead of catching AttributeError - _handle_stop_command guards against sentinel (defensive) - Added tests for both edge cases (7 total race guard tests)	2026-03-19 18:26:09 -07:00
Gutslabs	aaa96713d4	fix(gateway): prevent concurrent agent runs for the same session Place a sentinel in _running_agents immediately after the "already running" guard check passes — before any await. Without this, the numerous await points between the guard (line 1324) and agent registration (track_agent at line 4790) create a window where a second message for the same session can bypass the guard and start a duplicate agent, corrupting the transcript. The await gap includes: hook emissions, vision enrichment (external API call), audio transcription (external API call), session hygiene compression, and the run_in_executor call itself. For messages with media attachments the window can be several seconds wide. The sentinel is wrapped in try/finally so it is always cleaned up — even if the handler raises or takes an early-return path. When the real AIAgent is created, track_agent() overwrites the sentinel with the actual instance (preserving interrupt support). Also handles the edge case where a message arrives while the sentinel is set but no real agent exists yet: the message is queued via the adapter's pending-message mechanism instead of attempting to call interrupt() on the sentinel object.	2026-03-19 18:23:24 -07:00
Teknium	6bcec1ac25	fix: resolve MiniMax 401 auth error by defaulting to anthropic_messages (#2103) MiniMax's default base URL was /v1 which caused runtime_provider to default to chat_completions mode (OpenAI-style Authorization: Bearer header). MiniMax rejects this with a 401 because they require the Anthropic-style x-api-key header. Changes: - auth.py: Change default inference_base_url for minimax and minimax-cn from /v1 to /anthropic - runtime_provider.py: Auto-correct stale /v1 URLs from existing .env files to /anthropic, and always default minimax/minimax-cn providers to anthropic_messages mode - Update tests to reflect new defaults, add tests for stale URL auto-correction and explicit api_mode override Based on PR #2100 by @devorun. Fixes #2094. Co-authored-by: Test <test@test.com>	2026-03-19 17:47:05 -07:00
hermes	4d2c93a04f	fix: normalize MCP object schemas without properties	2026-03-19 16:23:45 -07:00
Peppi Littera	c030ac1d85	fix: prefer loaded instance context size over max for LM Studio When LM Studio has a model loaded with a custom context size (e.g., 122K), prefer that over the model's max_context_length (e.g., 1M). This makes the TUI status bar show the actual runtime context window.	2026-03-19 21:24:53 +01:00
Peppi Littera	d223f7388d	feat: query local server for actual context window size Instead of defaulting to 2M for unknown local models, query the server API for the real context length. Supports Ollama (/api/show), vLLM (max_model_len), and LM Studio (/v1/models). Results are cached to avoid repeated queries.	2026-03-19 21:24:05 +01:00
Teknium	e84d952dc0	fix(codex): handle reasoning-only responses and replay path (#2070 ) * fix(codex): treat reasoning-only responses as incomplete, not stop When a Codex Responses API response contains only reasoning items (encrypted thinking state) with no message text or tool calls, the _normalize_codex_response method was setting finish_reason='stop'. This sent the response into the empty-content retry loop, which burned 3 retries and then failed — exactly the pattern Nester reported in Discord. Two fixes: 1. _normalize_codex_response: reasoning-only responses (reasoning_items_raw non-empty but no final_text) now get finish_reason='incomplete', routing them to the Codex continuation path instead of the retry loop. 2. Incomplete handling: also checks for codex_reasoning_items when deciding whether to preserve an interim message, so encrypted reasoning state is not silently dropped when there is no visible reasoning text. Adds 4 regression tests covering: - Unit: reasoning-only → incomplete, reasoning+content → stop - E2E: reasoning-only → continuation → final answer succeeds - E2E: encrypted reasoning items preserved in interim messages * fix(codex): ensure reasoning items have required following item in API input Follow-up to the reasoning-only response fix. Three additional issues found by tracing the full replay path: 1. _chat_messages_to_responses_input: when a reasoning-only interim message was converted to Responses API input, the reasoning items were emitted as the last items with no following item. The Responses API requires a following item after each reasoning item (otherwise: 'missing_following_item' error, as seen in OpenHands #11406). Now emits an empty assistant message as the required following item when content is empty but reasoning items were added. 2. 
Duplicate detection: two consecutive reasoning-only incomplete messages with identical empty content/reasoning but different encrypted codex_reasoning_items were incorrectly treated as duplicates, silently dropping the second response's reasoning state. Now includes codex_reasoning_items in the duplicate comparison. 3. Added tests for both the API input conversion path and the duplicate detection edge case. Research context: verified against OpenCode (uses Vercel AI SDK, no retry loop so avoids the issue), Clawdbot (drops orphaned reasoning blocks entirely), and OpenHands (hit the missing_following_item error). Our approach preserves reasoning continuity while satisfying the API constraint. --------- Co-authored-by: Test <test@test.com>	2026-03-19 10:34:44 -07:00
Teknium	388130a122	fix: persist ACP sessions to SessionDB so they survive process restarts * fix: persist ACP sessions to disk so they survive process restarts The ACP adapter stored sessions entirely in-memory. When the editor restarted the ACP subprocess (idle timeout, crash, system sleep/wake, editor restart), all sessions were lost. The editor's load_session / resume_session calls would fail to find the session, forcing a new empty session and losing all conversation history. Changes: - SessionManager now persists each session as a JSON file under ~/.hermes/acp_sessions/<session_id>.json - get_session() transparently restores from disk when not in memory - update_cwd(), fork_session(), list_sessions() all check disk - server.py calls save_session() after prompt completion, /reset, /compact, and model switches - cleanup() and remove_session() delete disk files too - Sessions have a 7-day TTL; expired sessions are pruned on startup - Atomic writes via tempfile + os.replace to prevent corruption - 11 new tests covering persistence, disk restoration, and TTL expiry * refactor: use SessionDB instead of JSON files for ACP session persistence Replace the standalone JSON file persistence layer with SessionDB (~/.hermes/state.db) integration. 
ACP sessions now: - Share the same DB as CLI and gateway sessions - Are searchable via session_search (FTS5) - Get token tracking, cost tracking, and session titles for free - Follow existing session pruning policies Key changes: - _get_db() lazily creates a SessionDB, resolving HERMES_HOME dynamically (not at import time) for test compatibility - _persist() creates session record + replaces messages in DB - _restore() loads from DB with source='acp' filter - cwd stored in model_config JSON field (no schema migration) - Model values coerced to str to handle mock agents in tests - Removed: json files, sessions_dir, ttl_days, _expire logic - Tests updated: DB-backed persistence, FTS search, tool_call round-tripping, source filtering --------- Co-authored-by: Test <test@test.com>	2026-03-19 10:30:50 -07:00
cmcleay	bb59057d5d	fix: normalize live Chrome CDP endpoints for browser tools	2026-03-19 10:17:03 -07:00
Test	7f3a567259	Merge PR #2063: fix(daytona): migrate sandbox lookup from find_one to get/list Authored by Lovre Pešut (rovle). Migrates from deprecated find_one(labels=...) to get(sandbox_name) with deterministic naming (hermes-{task_id}), plus legacy fallback via list(labels=...) for pre-migration sandboxes.	2026-03-19 10:01:40 -07:00
Yannick Stephan	defbe0f9e9	fix(cron): warn and skip missing skills instead of crashing job When a cron job references a skill that is no longer installed, _build_job_prompt() now logs a warning and injects a user-visible notice into the prompt instead of raising RuntimeError. The job continues with any remaining valid skills and the user prompt. Adds 4 regression tests for missing skill handling.	2026-03-19 09:56:16 -07:00
rovle	18862145e4	fix(daytona): migrate sandbox lookup from find_one to get/list find_one is being deprecated. Primary lookup now uses get() with a deterministic sandbox name (hermes-{task_id}). A legacy fallback via list(labels=...) ensures sandboxes created before this migration are still resumable.	2026-03-19 17:54:46 +01:00
Test	35558dadf4	Merge PR #2061: fix(security): eliminate SQL string formatting in execute() calls Authored by dusterbloom. Closes #1911. Pre-computes SQL query strings at class definition time in insights.py, adds identifier quoting for ALTER TABLE DDL in hermes_state.py, and adds 4 regression tests verifying query construction safety.	2026-03-19 09:52:00 -07:00
Test	ae8059ca24	fix(delegate): move _saved_tool_names assignment to correct scope The merge at `e7844e9c` re-introduced a line in _build_child_agent() that references _saved_tool_names — a variable only defined in _run_single_child(). This caused NameError on every delegate_task call, completely breaking subagent delegation. Moves the child._delegate_saved_tool_names assignment to _run_single_child() where _saved_tool_names is actually defined, keeping the save/restore in the same scope as the try/finally block. Adds two regression tests from PR #2038 (YanSte). Also fixes the same issue reported in PR #2048 (Gutslabs). Co-authored-by: Yannick Stephan <yannick.stephan@gmail.com> Co-authored-by: Guts <gutslabs@users.noreply.github.com>	2026-03-19 09:26:05 -07:00
Peppi Littera	219af75704	fix(security): eliminate SQL string formatting in execute() calls Closes #1911 - insights.py: Pre-compute SELECT queries as class constants instead of f-string interpolation at runtime. _SESSION_COLS is now evaluated once at class definition time. - hermes_state.py: Add identifier quoting and whitelist validation for ALTER TABLE column names in schema migrations. - Add 4 tests verifying no injection vectors in SQL query construction.	2026-03-19 15:16:35 +01:00
Teknium	d76fa7fc37	fix: detect context length for custom model endpoints via fuzzy matching + config override (#2051 ) * fix: detect context length for custom model endpoints via fuzzy matching + config override Custom model endpoints (non-OpenRouter, non-known-provider) were silently falling back to 2M tokens when the model name didn't exactly match what the endpoint's /v1/models reported. This happened because: 1. Endpoint metadata lookup used exact match only — model name mismatches (e.g. 'qwen3.5:9b' vs 'Qwen3.5-9B-Q4_K_M.gguf') caused a miss 2. Single-model servers (common for local inference) required exact name match even though only one model was loaded 3. No user escape hatch to manually set context length Changes: - Add fuzzy matching for endpoint model metadata: single-model servers use the only available model regardless of name; multi-model servers try substring matching in both directions - Add model.context_length config override (highest priority) so users can explicitly set their model's context length in config.yaml - Log an informative message when falling back to 2M probe, telling users about the config override option - Thread config_context_length through ContextCompressor and AIAgent init Tests: 6 new tests covering fuzzy match, single-model fallback, config override (including zero/None edge cases). * fix: auto-detect local model name and context length for local servers Cherry-picked from PR #2043 by sudoingX. 
- Auto-detect model name from local server's /v1/models when only one model is loaded (no manual model name config needed) - Add n_ctx_train and n_ctx to context length detection keys for llama.cpp - Query llama.cpp /props endpoint for actual allocated context (not just training context from GGUF metadata) - Strip .gguf suffix from display in banner and status bar - _auto_detect_local_model() in runtime_provider.py for CLI init Co-authored-by: sudo <sudoingx@users.noreply.github.com> * fix: revert accidental summary_target_tokens change + add docs for context_length config - Revert summary_target_tokens from 2500 back to 500 (accidental change during patching) - Add 'Context Length Detection' section to Custom & Self-Hosted docs explaining model.context_length config override --------- Co-authored-by: Test <test@test.com> Co-authored-by: sudo <sudoingx@users.noreply.github.com>	2026-03-19 06:01:16 -07:00
Teknium	7b6d14e62a	fix(gateway): replace bare text approval with /approve and /deny commands (#2002 ) The gateway approval system previously intercepted bare 'yes'/'no' text from the user's next message to approve/deny dangerous commands. This was fragile and dangerous — if the agent asked a clarify question and the user said 'yes' to answer it, the gateway would execute the pending dangerous command instead. (Fixes #1888) Changes: - Remove bare text matching ('yes', 'y', 'approve', 'ok', etc.) from _handle_message approval check - Add /approve and /deny as gateway-only slash commands in the command registry - /approve supports scoping: /approve (one-time), /approve session, /approve always (permanent) - Add 5-minute timeout for stale approvals - Gateway appends structured instructions to the agent response when a dangerous command is pending, telling the user exactly how to respond - 9 tests covering approve, deny, timeout, scoping, and verification that bare 'yes' no longer triggers execution Credit to @solo386 and @FlyByNight69420 for identifying and reporting this security issue in PR #1971 and issue #1888. Co-authored-by: Test <test@test.com>	2026-03-18 16:58:20 -07:00
Teknium	a7cc1cf309	fix: support Anthropic-compatible endpoints for third-party providers (#1997) Three bugs prevented providers like MiniMax from using their Anthropic-compatible endpoints (e.g. api.minimax.io/anthropic): 1. _VALID_API_MODES was missing 'anthropic_messages', so explicit api_mode config was silently rejected and defaulted to chat_completions. 2. API-key provider resolution hardcoded api_mode to 'chat_completions' without checking model config or detecting Anthropic-compatible URLs. 3. run_agent.py auto-detection only recognized api.anthropic.com, not third-party endpoints using the /anthropic URL convention. Fixes: - Add 'anthropic_messages' to _VALID_API_MODES - API-key providers now check model config api_mode and auto-detect URLs ending in /anthropic - run_agent.py and fallback logic detect /anthropic URL convention - 5 new tests covering all scenarios Users can now either: - Set MINIMAX_BASE_URL=https://api.minimax.io/anthropic (auto-detected) - Set api_mode: anthropic_messages in model config (explicit) - Use custom_providers with api_mode: anthropic_messages Co-authored-by: Test <test@test.com>	2026-03-18 16:26:06 -07:00
Teknium	f24db23458	fix: custom provider uses config base_url and api_key over env vars (#1760) (#1994) When provider: custom is set in config.yaml with base_url and api_key, those values are now used instead of falling back to OPENAI_BASE_URL and OPENAI_API_KEY env vars. Also reads the 'api' field as an alternative to 'api_key' for config compatibility. Cherry-picked from PR #1762 by crazywriter1. Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>	2026-03-18 16:00:14 -07:00
Teknium	d132e344d7	fix(agent): prevent silent tool result loss during context compression (#1993) _align_boundary_backward only checked messages[idx-1] to decide if the compress-end boundary splits a tool_call/result group. When an assistant issues 3+ parallel tool calls, their results span multiple consecutive messages. If the boundary fell in the middle of that group, the parent assistant was summarized away and orphaned tool results were silently deleted by _sanitize_tool_pairs. Now walks backward through all consecutive tool results to find the parent assistant, then pulls the boundary before the entire group. 6 regression tests added in tests/test_compression_boundary.py. Co-authored-by: Guts <Gutslabs@users.noreply.github.com>	2026-03-18 15:22:51 -07:00
Test	e7844e9c8d	Merge origin/main, resolve conflicts (self._base_url_lower)	2026-03-18 04:09:00 -07:00
Teknium	0a247a50f2	feat: support ignoring unauthorized gateway DMs (#1919) Add unauthorized_dm_behavior config (pair|ignore) with global default and per-platform override. WhatsApp can silently drop unknown DMs instead of sending pairing codes. Adapted config bridging to work with gw_data dict (pre-construction) rather than config object. Dropped implementation plan document. Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>	2026-03-18 04:06:08 -07:00
Teknium	0e2714acea	fix(cron): recover recent one-shot jobs (#1918) Co-authored-by: Frederico Ribeiro <fr@tecompanytea.com>	2026-03-18 04:06:02 -07:00
Test	36921a3e98	fix: correct Copilot API mode selection to match opencode The previous copilot_model_api_mode() checked the catalog's supported_endpoints first and picked /chat/completions when a model supported both endpoints. This is wrong — GPT-5+ models should use the Responses API even when the catalog lists both. Replicate opencode's shouldUseCopilotResponsesApi() logic: - GPT-5+ models (gpt-5.4, gpt-5.3-codex, etc.) → Responses API - gpt-5-mini → Chat Completions (explicit exception) - Everything else (gpt-4o, claude, gemini, etc.) → Chat Completions - Model ID pattern is the primary signal, catalog is secondary The catalog fallback now only matters for non-GPT-5 models that might exclusively support /v1/messages (e.g. Claude via Copilot). Models are auto-detected from the live catalog at api.githubcopilot.com/models — no hardcoded list required for supported models, only a static fallback for when the API is unreachable.	2026-03-18 03:54:50 -07:00
Test	21c45ba0ac	feat: proper Copilot auth with OAuth device code flow and token validation Builds on PR #1879's Copilot integration with critical auth improvements modeled after opencode's implementation: - Add hermes_cli/copilot_auth.py with: - OAuth device code flow (copilot_device_code_login) using the same client_id (Ov23li8tweQw6odWQebz) as opencode and Copilot CLI - Token type validation: reject classic PATs (ghp_*) with a clear error message explaining supported token types - Proper env var priority: COPILOT_GITHUB_TOKEN > GH_TOKEN > GITHUB_TOKEN (matching Copilot CLI documentation) - copilot_request_headers() with Openai-Intent, x-initiator, and Copilot-Vision-Request headers (matching opencode) - Update auth.py: - PROVIDER_REGISTRY copilot entry uses correct env var order - _resolve_api_key_provider_secret delegates to copilot_auth for the copilot provider with proper token validation - Update models.py: - copilot_default_headers() now includes Openai-Intent and x-initiator - Update main.py: - _model_flow_copilot offers OAuth device code login when no token is found, with manual token entry as fallback - Shows supported vs unsupported token types - 22 new tests covering token validation, env var priority, header generation, and integration with existing auth infrastructure	2026-03-18 03:25:58 -07:00
Teknium	c0c14e60b4	fix: make concurrent tool batching path-aware for file mutations (#1914) * Improve tool batching independence checks * fix: address review feedback on path-aware batching - Log malformed/non-dict tool arguments at debug level before falling back to sequential, instead of silently swallowing the error into an empty dict - Guard empty paths in _paths_overlap (unreachable in practice due to upstream filtering, but makes the invariant explicit) - Add tests: malformed JSON args, non-dict args, _paths_overlap unit tests including empty path edge cases - web_crawl is not a registered tool (only web_search/web_extract are); no addition needed to _PARALLEL_SAFE_TOOLS --------- Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-18 03:25:38 -07:00
Test	f814787144	fix(banner): normalize toolset labels and use skin colors - Strip '_tools' suffix from internal toolset identifiers in the banner (e.g. 'web_tools' -> 'web', 'homeassistant_tools' -> 'homeassistant') - Stop appending '_tools' to unavailable toolset names - Replace 6 hardcoded hex colors (#B8860B, #FFBF00, #FFF8DC) in toolset rows, overflow line, and MCP server rows with the skin variables (dim, accent, text) already resolved at the top of the function Inspired by PR #1871 by @kshitijk4poor. Adds 4 tests.	2026-03-18 03:22:58 -07:00
Test	8422196e89	Merge PR #1879: feat: integrate GitHub Copilot providers	2026-03-18 03:18:33 -07:00
Teknium	b70dd51cfa	fix: disabled skills respected across banner, system prompt, slash commands, and skill_view (#1897 ) * fix: banner skill count now respects disabled skills and platform filtering The banner's get_available_skills() was doing a raw rglob scan of ~/.hermes/skills/ without checking: - Whether skills are disabled (skills.disabled config) - Whether skills match the current platform (platforms: frontmatter) This caused the banner to show inflated skill counts (e.g. '100 skills' when many are disabled) and list macOS-only skills on Linux. Fix: delegate to _find_all_skills() from tools/skills_tool which already handles both platform gating and disabled-skill filtering. * fix: system prompt and slash commands now respect disabled skills Two more places where disabled skills were still surfaced: 1. build_skills_system_prompt() in prompt_builder.py — disabled skills appeared in the <available_skills> system prompt section, causing the agent to suggest/load them despite being disabled. 2. scan_skill_commands() in skill_commands.py — disabled skills still registered as /skill-name slash commands in CLI help and could be invoked. Both now load _get_disabled_skill_names() and filter accordingly. * fix: skill_view blocks disabled skills skill_view() checked platform compatibility but not disabled state, so the agent could still load and read disabled skills directly. Now returns a clear error when a disabled skill is requested, telling the user to enable it via hermes skills or inspect the files manually. --------- Co-authored-by: Test <test@test.com>	2026-03-18 03:17:37 -07:00
TheSameCat2	5c4c4b8b7d	fix(gateway): detect script-style gateway processes for --replace Recognize hermes_cli/main.py gateway command lines in gateway process detection and PID validation so --replace reliably finds existing gateway instances. Adds a regression test covering script-style cmdline detection. Closes #1830	2026-03-18 03:12:59 -07:00
Teknium	ee4cc8ee3b	Merge pull request #1907 from NousResearch/hermes/hermes-b29f73b2 feat(mcp): expose MCP servers as standalone toolsets	2026-03-18 03:04:34 -07:00
Test	4b53b89f09	feat(mcp): expose MCP servers as standalone toolsets Each configured MCP server now registers as its own toolset in TOOLSETS (e.g. TOOLSETS['github'] = {tools: ['mcp_github_list_files', ...]}), making raw server names resolvable in platform_toolsets overrides. Previously MCP tools were only injected into hermes-* umbrella toolsets, so gateway sessions using raw toolset names like ['terminal', 'github'] in platform_toolsets couldn't resolve MCP tools. Skips server names that collide with built-in toolsets. Also handles idempotent reloads (syncs toolsets even when no new servers connect). Inspired by PR #1876 by @kshitijk4poor. Adds 2 tests (standalone toolset creation + built-in collision guard).	2026-03-18 03:04:17 -07:00
Teknium	a2440f72f6	feat: use endpoint metadata for custom model context and pricing (#1906 ) * perf: cache base_url.lower() via property, consolidate triple load_config(), hoist set constant run_agent.py: - Add base_url property that auto-caches _base_url_lower on every assignment, eliminating 12+ redundant .lower() calls per API cycle across __init__, _build_api_kwargs, _supports_reasoning_extra_body, and the main conversation loop - Consolidate three separate load_config() disk reads in __init__ (memory, skills, compression) into a single call, reusing the result dict for all three config sections model_tools.py: - Hoist _READ_SEARCH_TOOLS set to module level (was rebuilt inside handle_function_call on every tool invocation) * Use endpoint metadata for custom model context and pricing --------- Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-18 03:04:07 -07:00
Test	ace2cc6257	fix(gateway): PID-based wait with force-kill for gateway restart Add _wait_for_gateway_exit() that polls get_running_pid() to confirm the old gateway process has actually exited before starting a new one. If the process doesn't exit within 5s, sends SIGKILL to the specific PID. Uses the saved PID from gateway.pid (not launchd labels) so it works correctly with multiple gateway instances under separate HERMES_HOME directories. Applied to both launchd_restart() and the manual restart path (replaces the blind time.sleep(2)). Inspired by PR #1881 by @AzothZephyr (race condition diagnosis). Adds 4 tests.	2026-03-18 02:54:18 -07:00
Teknium	24ac577046	fix: respect model.default from config.yaml for openai-codex provider (#1896 ) When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the provider was openai-codex, _normalize_model_for_provider() would replace it with the latest available codex model because _model_is_default only checked the CLI argument, not the config value. Now _model_is_default is False when config.yaml has a model that differs from the global fallback (anthropic/claude-opus-4.6), so the user's explicit config choice is preserved. Fixes #1887 Co-authored-by: Test <test@test.com>	2026-03-18 02:50:31 -07:00
octo-patch	e4043633fc	feat: upgrade MiniMax default to M2.7 + add new OpenRouter models MiniMax: Add M2.7 and M2.7-highspeed as new defaults across provider model lists, auxiliary client, metadata, setup wizard, RL training tool, fallback tests, and docs. Retain M2.5/M2.1 as alternatives. OpenRouter: Add grok-4.20-beta, nemotron-3-super-120b-a12b:free, trinity-large-preview:free, glm-5-turbo, and hunter-alpha to the model catalog. MiniMax changes based on PR #1882 by @octo-patch (applied manually due to stale conflicts in refactored pricing module).	2026-03-18 02:42:58 -07:00
Test	a8132d1252	fix: respect model.default from config.yaml for openai-codex provider When config.yaml had a non-default model (e.g. gpt-5.3-codex) and the provider was openai-codex, _normalize_model_for_provider() would replace it with the latest available codex model because _model_is_default only checked the CLI argument, not the config value. Now _model_is_default is False when config.yaml has a model that differs from the global fallback (anthropic/claude-opus-4.6), so the user's explicit config choice is preserved. Fixes #1887	2026-03-18 02:24:41 -07:00
Teknium	6fc4e36625	fix: search all sources by default in session_search (#1892 ) * fix: include ACP sessions in default search sources * fix: remove hardcoded source allowlist from session search The default source_filter was a hardcoded list that silently excluded any platform not explicitly listed. Instead of maintaining an ever-growing allowlist, remove it entirely so all sources are searched by default. Callers can still pass source_filter explicitly to narrow results. Follow-up to cherry-picked PR #1817. --------- Co-authored-by: someoneexistsontheinternet <154079416+someoneexistsontheinternet@users.noreply.github.com> Co-authored-by: Test <test@test.com>	2026-03-18 02:21:29 -07:00
Test	5b74df2bfc	fix: OAuth flag stale after refresh/fallback, memory nudge never fires, dead code - Update _is_anthropic_oauth in _try_refresh_anthropic_client_credentials() when token type changes during credential refresh - Set _is_anthropic_oauth in _try_activate_fallback() Anthropic path - Move _turns_since_memory and _iters_since_skill init to __init__ so nudge counters accumulate across run_conversation() calls in CLI mode - Remove unreachable retry_count >= max_retries block after raise Adds 7 regression tests. Salvaged from PR #1797 by @0xbyt4.	2026-03-18 02:19:57 -07:00
max	0c392e7a87	feat: integrate GitHub Copilot providers across Hermes Add first-class GitHub Copilot and Copilot ACP provider support across model selection, runtime provider resolution, CLI sessions, delegated subagents, cron jobs, and the Telegram gateway. This also normalizes Copilot model catalogs and API modes, introduces a Copilot ACP OpenAI-compatible shim, and fixes service-mode auth by resolving Homebrew-installed gh binaries under launchd. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>	2026-03-17 23:40:22 -07:00
Test	0fab46f65c	fix: allow agent-created skills with caution-level findings Agent-created skills were using the same policy as community hub installs, blocking any skill with medium/high severity findings (e.g. docker pull, pip install, git clone). This meant the agent couldn't create skills that reference Docker or other common tools. Changed agent-created policy from (allow, block, block) to (allow, allow, block) — matching the trusted policy. Caution-level findings (medium/high severity) are now allowed through, while dangerous findings (critical severity like exfiltration, prompt injection, reverse shells) remain blocked. Added 4 tests covering the agent-created policy: safe allowed, caution allowed, dangerous blocked, force override.	2026-03-17 16:32:25 -07:00
Teknium	7f85b2914d	Merge pull request #1824 from cutepawss/fix/search-files-pagination Clean fix — adds pagination args to search_key for parity with read_file. Thanks @cutepawss!	2026-03-17 16:16:47 -07:00
Test	d35d923c76	feat: cron agents can suppress delivery with [SILENT] response Every cron job prompt now includes guidance that the agent can respond with [SILENT] when it has nothing new or noteworthy to report. The scheduler checks for this marker and skips delivery, while still saving output to disk for audit. Failed jobs always deliver regardless. This replaces the notify parameter approach from PR #1807 with a simpler always-on design — the model is smart enough to decide when there's nothing worth reporting without needing a per-job flag.	2026-03-17 16:06:49 -07:00
darya	a654bc04f7	fix(file_tools): include pagination args in repeated search key	2026-03-18 01:19:05 +03:00
Teknium	dd60bcbfb7	feat: OpenAI-compatible API server + WhatsApp configurable reply prefix (#1756 ) * feat: OpenAI-compatible API server platform adapter Salvaged from PR #956, updated for current main. Adds an HTTP API server as a gateway platform adapter that exposes hermes-agent via the OpenAI Chat Completions and Responses APIs. Any OpenAI-compatible frontend (Open WebUI, LobeChat, LibreChat, AnythingLLM, NextChat, ChatBox, etc.) can connect by pointing at http://localhost:8642/v1. Endpoints: - POST /v1/chat/completions — stateless Chat Completions API - POST /v1/responses — stateful Responses API with chaining - GET /v1/responses/{id} — retrieve stored response - DELETE /v1/responses/{id} — delete stored response - GET /v1/models — list hermes-agent as available model - GET /health — health check Features: - Real SSE streaming via stream_delta_callback (uses main's streaming) - In-memory LRU response store for Responses API conversation chaining - Named conversations via 'conversation' parameter - Bearer token auth (optional, via API_SERVER_KEY) - CORS support for browser-based frontends - System prompt layering (frontend system messages on top of core) - Real token usage tracking in responses Integration points: - Platform.API_SERVER in gateway/config.py - _create_adapter() branch in gateway/run.py - API_SERVER_* env vars in hermes_cli/config.py - Env var overrides in gateway/config.py _apply_env_overrides() Changes vs original PR #956: - Removed streaming infrastructure (already on main via stream_consumer.py) - Removed Telegram reply_to_mode (separate feature, not included) - Updated _resolve_model() -> _resolve_gateway_model() - Updated stream_callback -> stream_delta_callback - Updated connect()/disconnect() to use _mark_connected()/_mark_disconnected() - Adapted to current Platform enum (includes MATTERMOST, MATRIX, DINGTALK) Tests: 72 new tests, all passing Docs: API server guide, Open WebUI integration guide, env var reference * feat(whatsapp): make reply prefix configurable via config.yaml Reworked from PR #1764 (ifrederico) to use config.yaml instead of .env. The WhatsApp bridge prepends a header to every outgoing message. This was hardcoded to '⚕ Hermes Agent'. Users can now customize or disable it via config.yaml: whatsapp: reply_prefix: '' # disable header reply_prefix: '🤖 My Bot\n───\n' # custom prefix How it works: - load_gateway_config() reads whatsapp.reply_prefix from config.yaml and stores it in PlatformConfig.extra['reply_prefix'] - WhatsAppAdapter reads it from config.extra at init - When spawning bridge.js, the adapter passes it as WHATSAPP_REPLY_PREFIX in the subprocess environment - bridge.js handles undefined (default), empty (no header), or custom values with \\n escape support - Self-chat echo suppression uses the configured prefix Also fixes _config_version: was 9 but ENV_VARS_BY_VERSION had a key 10 (TAVILY_API_KEY), so existing users at v9 would never be prompted for Tavily. Bumped to 10 to close the gap. Added a regression test to prevent this from happening again. Credit: ifrederico (PR #1764) for the bridge.js implementation and the config version gap discovery. --------- Co-authored-by: Test <test@test.com>	2026-03-17 10:44:37 -07:00
Teknium	b5cf0f0aef	fix: preserve parent agent's tool list after subagent delegation (#1778 ) Save and restore the process-global _last_resolved_tool_names in _run_single_child() so the parent's execute_code sandbox generates correct tool imports after delegation completes. The global was already mostly mitigated (run_agent.py passes enabled_tools via self.valid_tool_names), but the global itself remained corrupted — a footgun for any code that reads it directly. Co-authored-by: shane9coy <shane9coy@users.noreply.github.com>	2026-03-17 10:31:38 -07:00
Teknium	9a1e971126	fix(stt): respect explicit provider config instead of env-var fallback (#1775 ) * fix(session): skip corrupt lines in load_transcript instead of crashing Wrap json.loads() in load_transcript() with try/except JSONDecodeError so that partial JSONL lines (from mid-write crashes like OOM/SIGKILL) are skipped with a warning instead of crashing the entire transcript load. The rest of the history loads fine. Adds a logger.warning with the session ID and truncated corrupt line content for debugging visibility. Salvaged from PR #1193 by alireza78a. Closes #1193 * fix(stt): respect explicit provider config instead of env-var fallback Rework _get_provider() to separate explicit config from auto-detect. When stt.provider is explicitly set in config.yaml, that choice is authoritative — no silent cross-provider fallback based on which env vars happen to be set. When no provider is configured, auto-detect still tries: local > groq > openai. This fixes the reported scenario where provider: local + a placeholder OPENAI_API_KEY caused the system to silently select OpenAI and fail with a 401. Closes #1774	2026-03-17 10:30:58 -07:00
teknium1	c881209b92	Revert "feat(cli): skin-aware light/dark theme mode with terminal auto-detection" This reverts commit `a1c81360a5`.	2026-03-17 10:04:53 -07:00
Teknium	d7a2e3ddae	fix: handle hyphenated FTS5 queries and preserve quoted literals (#1776 ) _sanitize_fts5_query() was stripping ALL double quotes (including properly paired ones), breaking user-provided quoted phrases like "exact phrase". Hyphenated terms like chat-send also silently expanded to chat AND send, returning unexpected or zero results. Fix: 1. Extract balanced quoted phrases into placeholders before stripping FTS5-special characters, then restore them. 2. Wrap unquoted hyphenated terms (word-word) in double quotes so FTS5 matches them as exact phrases instead of splitting on the hyphen. 3. Unmatched quotes are still stripped as before. Based on issue report by @bailob (#1770) and PR #1773 by @Jah-yee (whose branch contained unrelated changes and couldn't be merged directly). Closes #1770 Closes #1773 Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>	2026-03-17 09:44:01 -07:00
Teknium	d5af593769	Merge pull request #1769 from sai-samarth/fix/whatsapp-send-message-support Clean merge — PR is current against main, tests pass, implementation matches existing gateway WhatsApp bridge pattern.	2026-03-17 09:42:01 -07:00
Teknium	df74f86955	Merge pull request #1767 from sai-samarth/fix/systemd-node-path-whatsapp Clean fix for nvm/non-standard Node.js paths in systemd units. Merges cleanly.	2026-03-17 09:41:39 -07:00
sai-samarth	a3de843fdb	test: replace real-looking WhatsApp jid in regression test	2026-03-17 15:38:37 +00:00
sai-samarth	dc15bc508f	fix(tools): add outbound WhatsApp send_message routing	2026-03-17 15:31:13 +00:00
sai-samarth	b8eb7c5fed	fix(gateway): include resolved node path in systemd unit	2026-03-17 15:11:28 +00:00
Teknium	548cedb869	fix(context_compressor): prevent consecutive same-role messages after compression (#1743 ) compress() checks both the head and tail neighbors when choosing the summary message role. When only the tail collides, the role is flipped. When BOTH roles would create consecutive same-role messages (e.g. head=assistant, tail=user), the summary is merged into the first tail message instead of inserting a standalone message that breaks role alternation and causes API 400 errors. The previous code handled head-side collision but left the tail-side uncovered — long conversations would crash mid-reply with no useful error, forcing the user to /reset and lose session history. Based on PR #1186 by @alireza78a, with improved double-collision handling (merge into tail instead of unconditional 'user' fallback). Co-authored-by: alireza78a <alireza78.crypto@gmail.com>	2026-03-17 05:18:52 -07:00
Teknium	702191049f	fix(session): skip corrupt lines in load_transcript instead of crashing (#1744 ) Wrap json.loads() in load_transcript() with try/except JSONDecodeError so that partial JSONL lines (from mid-write crashes like OOM/SIGKILL) are skipped with a warning instead of crashing the entire transcript load. The rest of the history loads fine. Adds a logger.warning with the session ID and truncated corrupt line content for debugging visibility. Salvaged from PR #1193 by alireza78a. Closes #1193	2026-03-17 05:18:12 -07:00
Teknium	d1d17f4f0a	feat(compression): add summary_base_url + move compression config to YAML-only - Add summary_base_url config option to compression block for custom OpenAI-compatible endpoints (e.g. zai, DeepSeek, Ollama) - Remove compression env var bridges from cli.py and gateway/run.py (CONTEXT_COMPRESSION_* env vars no longer set from config) - Switch run_agent.py to read compression config directly from config.yaml instead of env vars - Fix backwards-compat block in _resolve_task_provider_model to also fire when auxiliary.compression.provider is 'auto' (DEFAULT_CONFIG sets this, which was silently preventing the compression section's summary_* keys from being read) - Add test for summary_base_url config-to-client flow - Update docs to show compression as config.yaml-only Closes #1591 Based on PR #1702 by @uzaylisak	2026-03-17 04:46:15 -07:00
teknium1	0897e4350e	merge: resolve conflicts with origin/main	2026-03-17 04:30:37 -07:00
Teknium	d2b10545db	feat(web): add Tavily as web search/extract/crawl backend (#1731 ) Salvage of PR #1707 by @kshitijk4poor (cherry-picked with authorship preserved). Adds Tavily as a third web backend alongside Firecrawl and Parallel, using the Tavily REST API via httpx. - Backend selection via hermes tools → saved as web.backend in config.yaml - All three tools supported: search, extract, crawl - TAVILY_API_KEY in config registry, doctor, status, setup wizard - 15 new Tavily tests + 9 backend selection tests + 5 config tests - Backward compatible Closes #1707	2026-03-17 04:28:03 -07:00
Teknium	85993fbb5a	feat: pre-call sanitization and post-call tool guardrails (#1732 ) Salvage of PR #1321 by @alireza78a (cherry-picked concept, reimplemented against current main). Phase 1 — Pre-call message sanitization: _sanitize_api_messages() now runs unconditionally before every LLM call. Previously gated on context_compressor being present, so sessions loaded from disk or running without compression could accumulate dangling tool_call/tool_result pairs causing API errors. Phase 2a — Delegate task cap: _cap_delegate_task_calls() truncates excess delegate_task calls per turn to MAX_CONCURRENT_CHILDREN. The existing cap in delegate_tool.py only limits the task array within a single call; this catches multiple separate delegate_task tool_calls in one turn. Phase 2b — Tool call deduplication: _deduplicate_tool_calls() drops duplicate (tool_name, arguments) pairs within a single turn when models stutter. All three are static methods on AIAgent, independently testable. 29 tests covering happy paths and edge cases.	2026-03-17 04:24:27 -07:00
Teknium	618ed2c65f	fix(update): use .[all] extras with fallback in hermes update (#1728 ) Both update paths now try .[all] first, fall back to . if extras fail. Fixes #1336. Inspired by PR #1342 by @baketnk.	2026-03-17 04:22:37 -07:00
ch3ronsa	695eb04243	feat(agent): .hermes.md per-repository project config discovery Adds .hermes.md / HERMES.md discovery for per-project agent configuration. When the agent starts, it walks from cwd to the git root looking for .hermes.md (preferred) or HERMES.md, strips any YAML frontmatter, and injects the markdown body into the system prompt as project context. - Nearest-first discovery (subdirectory configs shadow parent) - Stops at git root boundary (no leaking into parent repos) - YAML frontmatter stripped (structured config deferred to Phase 2) - Same injection scanning and 20K truncation as other context files - 22 comprehensive tests Original implementation by ch3ronsa. Cherry-picked and adapted for current main. Closes #681 (Phase 1)	2026-03-17 04:16:32 -07:00
teknium1	e5fc916814	feat: auto-generate session titles after first exchange After the first user→assistant exchange, Hermes now generates a short descriptive session title via the auxiliary LLM (compression task config). Title generation runs in a background thread so it never delays the user-facing response. Key behaviors: - Fires only on the first 1-2 exchanges (checks user message count) - Skips if a title already exists (user-set titles are never overwritten) - Uses call_llm with compression task config (cheapest/fastest model) - Truncates long messages to keep the title generation request small - Cleans up LLM output: strips quotes, 'Title:' prefixes, enforces 80 char max - Works in both CLI and gateway (Telegram/Discord/etc.) Also updates /title (no args) to show the session ID alongside the title in both CLI and gateway. Implements #1426	2026-03-17 04:14:40 -07:00
Teknium	4433b83378	feat(web): add Parallel as alternative web search/extract backend (#1696 ) * feat(web): add Parallel as alternative web search/extract backend Adds Parallel (parallel.ai) as a drop-in alternative to Firecrawl for web_search and web_extract tools using the official parallel-web SDK. - Backend selection via WEB_SEARCH_BACKEND env var (auto/parallel/firecrawl) - Auto mode prefers Firecrawl when both keys present; Parallel when sole backend - web_crawl remains Firecrawl-only with clear error when unavailable - Lazy SDK imports, interrupt support, singleton clients - 16 new unit tests for backend selection and client config Co-authored-by: s-jag <s-jag@users.noreply.github.com> * fix: add PARALLEL_API_KEY to config registry and fix web_crawl policy tests Follow-up for Parallel backend integration: - Add PARALLEL_API_KEY to OPTIONAL_ENV_VARS (hermes doctor, env blocklist) - Add to set_config_value api_keys list (hermes config set) - Add to doctor keys display - Fix 2 web_crawl policy tests that didn't set FIRECRAWL_API_KEY (needed now that web_crawl has a Firecrawl availability guard) * refactor: explicit backend selection via hermes tools, not auto-detect Replace the auto-detect backend selection with explicit user choice: - hermes tools saves WEB_SEARCH_BACKEND to .env when user picks a provider - _get_backend() reads the explicit choice first - Fallback only for manual/legacy config (uses whichever key is present) - _is_provider_active() shows [active] for the selected web backend - Updated tests, docs, and .env.example to remove 'auto' mode language * refactor: use config.yaml for web backend, not env var Match the TTS/browser pattern — web.backend is stored in config.yaml (set by hermes tools), not as a WEB_SEARCH_BACKEND env var. - _load_web_config() reads web: section from config.yaml - _get_backend() reads web.backend from config, falls back to key detection - _configure_provider() saves to config dict (saved to config.yaml) - _is_provider_active() reads from config dict - Removed WEB_SEARCH_BACKEND from .env.example, set_config_value, docs - Updated all tests to mock _load_web_config instead of env vars --------- Co-authored-by: s-jag <s-jag@users.noreply.github.com>	2026-03-17 04:02:02 -07:00
crazywriter1	7049dba778	fix(docker): remove container on cleanup when container_persistent=false When container_persistent=false, the inner mini-swe-agent cleanup only runs 'docker stop' in the background, leaving containers in Exited state. Now cleanup() also runs 'docker rm -f' to fully remove the container. Also fixes pre-existing test failures in model_metadata (gpt-4.1 1M context), setup tests (TTS provider step), and adds MockInnerDocker.cleanup(). Original fix by crazywriter1. Cherry-picked and adapted for current main. Fixes #1679	2026-03-17 04:02:01 -07:00
Teknium	6405d389aa	test: align Hermes setup and full-suite expectations (#1710 ) Salvaged from PR #1708 by @kartikkabadi. Cherry-picked with authorship preserved. Fixes pre-existing test failures from setup TTS prompt flow changes and environment-sensitive assumptions. Co-authored-by: Kartik <user2@RentKars-MacBook-Air.local>	2026-03-17 04:01:37 -07:00
Teknium	b16186a32a	feat(telegram): auto-detect HTML tags and use parse_mode=HTML in send_message (#1709 ) * feat: interactive MCP tool configuration in hermes tools Add the ability to selectively enable/disable individual MCP server tools through the interactive 'hermes tools' TUI. Changes: - tools/mcp_tool.py: Add probe_mcp_server_tools() — lightweight function that temporarily connects to configured MCP servers, discovers their tools (names + descriptions), and disconnects. No registry side effects. - hermes_cli/tools_config.py: Add 'Configure MCP tools' option to the interactive menu. When selected: 1. Probes all enabled MCP servers for their available tools 2. Shows a per-server curses checklist with tool descriptions 3. Pre-selects tools based on existing include/exclude config 4. Writes changes back as tools.exclude entries in config.yaml 5. Reports which servers failed to connect The existing CLI commands (hermes tools enable/disable server:tool) continue to work unchanged. This adds the interactive TUI counterpart so users can browse and toggle MCP tools visually. Tests: 22 new tests covering probe function edge cases and interactive flow (pre-selection, exclude/include modes, description truncation, multi-server handling, error paths). * feat(telegram): auto-detect HTML tags and use parse_mode=HTML in send_message When _send_telegram detects HTML tags in the message body, it now sends with parse_mode='HTML' instead of converting to MarkdownV2. This allows cron jobs and agents to send rich HTML-formatted Telegram messages with bold, italic, code blocks, etc. that render correctly. Detection uses the same regex from PR #1568 by @ashaney: re.search(r'<[a-zA-Z/][^>]*>', message) Plain-text and markdown messages continue through the existing MarkdownV2 pipeline. The HTML fallback path also catches HTML parse errors and falls back to plain text, matching the existing MarkdownV2 error handling. Inspired by: github.com/ashaney — PR #1568	2026-03-17 03:56:06 -07:00
Teknium	d87655afff	fix(gateway): persist watcher metadata in checkpoint for crash recovery (#1706 ) Salvaged from PR #1573 by @eren-karakus0. Cherry-picked with authorship preserved. Fixes #1143 — background process notifications resume after gateway restart. Co-authored-by: Muhammet Eren Karakuş <erenkar950@gmail.com>	2026-03-17 03:52:15 -07:00
Teknium	ce7418e274	feat: interactive MCP tool configuration in hermes tools (#1694 ) Add the ability to selectively enable/disable individual MCP server tools through the interactive 'hermes tools' TUI. Changes: - tools/mcp_tool.py: Add probe_mcp_server_tools() — lightweight function that temporarily connects to configured MCP servers, discovers their tools (names + descriptions), and disconnects. No registry side effects. - hermes_cli/tools_config.py: Add 'Configure MCP tools' option to the interactive menu. When selected: 1. Probes all enabled MCP servers for their available tools 2. Shows a per-server curses checklist with tool descriptions 3. Pre-selects tools based on existing include/exclude config 4. Writes changes back as tools.exclude entries in config.yaml 5. Reports which servers failed to connect The existing CLI commands (hermes tools enable/disable server:tool) continue to work unchanged. This adds the interactive TUI counterpart so users can browse and toggle MCP tools visually. Tests: 22 new tests covering probe function edge cases and interactive flow (pre-selection, exclude/include modes, description truncation, multi-server handling, error paths).	2026-03-17 03:48:44 -07:00
Teknium	d417ba2a48	feat: add route-aware pricing estimates (#1695 ) Salvaged from PR #1563 by @kshitijk4poor. Cherry-picked with authorship preserved. - Route-aware pricing architecture replacing static MODEL_PRICING + heuristics - Canonical usage normalization (Anthropic/OpenAI/Codex API shapes) - Cache-aware billing (separate cache_read/cache_write rates) - Cost status tracking (estimated/included/unknown/actual) - OpenRouter live pricing via models API - Schema migration v4→v5 with billing metadata columns - Removed speculative forward-looking entries - Removed cost display from CLI status bar - Threaded OpenRouter metadata pre-warm Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-17 03:44:44 -07:00
teknium1	c3ce6108e3	test: add comprehensive tests for Mattermost and Matrix adapters 77 tests covering: Mattermost (37 tests): - Platform enum and config loading - Message formatting (image markdown stripping) - Message chunking at 4000 chars - Send with mocked aiohttp (payload, threading, errors) - WebSocket event parsing (double-encoded JSON!) - File upload flow - Post dedup cache (TTL, pruning) - Requirements check Matrix (40 tests): - Platform enum and config loading (token + password auth, E2EE) - mxc:// to HTTP URL conversion (authenticated v1.11+ endpoint) - DM detection via m.direct cache - Reply fallback stripping - Thread detection from m.relates_to - Message formatting and markdown to HTML - Display name resolution - Requirements check	2026-03-17 03:18:16 -07:00
Teknium	07549c967a	feat: add SMS (Twilio) platform adapter Add SMS as a first-class messaging platform via the Twilio API. Shares credentials with the existing telephony skill — same TWILIO_ACCOUNT_SID, TWILIO_AUTH_TOKEN, TWILIO_PHONE_NUMBER env vars. Adapter (gateway/platforms/sms.py): - aiohttp webhook server for inbound (Twilio form-encoded POSTs) - Twilio REST API with Basic auth for outbound - Markdown stripping, smart chunking at 1600 chars - Echo loop prevention, phone number redaction in logs Integration (13 files): - gateway config, run, channel_directory - agent prompt_builder (SMS platform hint) - cron scheduler, cronjob tools - send_message_tool (_send_sms via Twilio API) - toolsets (hermes-sms + hermes-gateway) - gateway setup wizard, status display - pyproject.toml (sms optional extra) - 21 tests Docs: - website/docs/user-guide/messaging/sms.md (full setup guide) - Updated messaging index (architecture, toolsets, security, links) - Updated environment-variables.md reference Inspired by PR #1575 (@sunsakis), rewritten for Twilio.	2026-03-17 03:14:53 -07:00
teknium1	6fc76ef954	fix: harden website blocklist — default off, TTL cache, fail-open, guarded imports - Default enabled: false (zero overhead when not configured) - Fast path: cached disabled state skips all work immediately - TTL cache (30s) for parsed policy — avoids re-reading config.yaml on every URL check - Missing shared files warn + skip instead of crashing all web tools - Lazy yaml import — missing PyYAML doesn't break browser toolset - Guarded browser_tool import — fail-open lambda fallback - check_website_access never raises for default path (fail-open with warning log); only raises with explicit config_path (test mode) - Simplified enforcement code in web_tools/browser_tool — no more try/except wrappers since errors are handled internally	2026-03-17 03:11:26 -07:00
Teknium	a6dcc231f8	feat(gateway): add DingTalk platform adapter (#1685 ) Add DingTalk as a messaging platform using the dingtalk-stream SDK for real-time message reception via Stream Mode (no webhook needed). Replies are sent via session webhook using markdown format. Features: - Stream Mode connection (long-lived WebSocket, no public URL needed) - Text and rich text message support - DM and group chat support - Message deduplication with 5-minute window - Auto-reconnection with exponential backoff - Session webhook caching for reply routing Configuration: export DINGTALK_CLIENT_ID=your-app-key export DINGTALK_CLIENT_SECRET=your-app-secret # or in config.yaml: platforms: dingtalk: enabled: true extra: client_id: your-app-key client_secret: your-app-secret Files: - gateway/platforms/dingtalk.py (340 lines) — adapter implementation - gateway/config.py — add DINGTALK to Platform enum - gateway/run.py — add DingTalk to _create_adapter - hermes_cli/config.py — add env vars to _EXTRA_ENV_KEYS - hermes_cli/tools_config.py — add dingtalk to PLATFORMS - tests/gateway/test_dingtalk.py — 21 tests	2026-03-17 03:04:58 -07:00
Teknium	c3d626eb07	Revert "feat: add inference.sh integration (infsh tool + skill) (#1682 )" (#1684 ) This reverts commit `6020db0243`.	2026-03-17 03:01:30 -07:00
teknium1	30c417fe70	feat: add website blocklist enforcement for web/browser tools (#1064 ) Adds security.website_blocklist config for user-managed domain blocking across URL-capable tools. Enforced at the tool level (not monkey-patching) so it's safe and predictable. - tools/website_policy.py: shared policy loader with domain normalization, wildcard support (.tracking.example), shared file imports, and structured block metadata - web_extract: pre-fetch URL check + post-redirect recheck - web_crawl: pre-crawl URL check + per-page URL recheck - browser_navigate: pre-navigation URL check - Blocked responses include blocked_by_policy metadata so the agent can explain exactly what was denied Config: security: website_blocklist: enabled: true domains: ["evil.com", ".tracking.example"] shared_files: ["team-blocklist.txt"] Salvaged from PR #1086 by @kshitijk4poor. Browser post-redirect checks deferred (browser_tool was fully rewritten since the PR branched). Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-17 02:59:39 -07:00
Teknium	6020db0243	feat: add inference.sh integration (infsh tool + skill) (#1682 ) Add inference.sh CLI (infsh) as a tool integration, giving agents access to 150+ AI apps through a single CLI — image gen (FLUX, Reve, Seedream), video (Veo, Wan, Seedance), LLMs, search (Tavily, Exa), 3D, avatar/lipsync, and more. One API key manages all services. Tools: - infsh: run any infsh CLI command (app list, app run, etc.) - infsh_install: install the CLI if not present Registered as an 'inference' toolset (opt-in, not in core tools). Includes comprehensive skill docs with examples for all app categories. Changes from original PR: - NOT added to _HERMES_CORE_TOOLS (available via --toolsets inference) - Added 12 tests covering tool registration, command execution, error handling, timeout, JSON parsing, and install flow Inspired by PR #1021 by @okaris. Co-authored-by: okaris <okaris@users.noreply.github.com>	2026-03-17 02:59:21 -07:00
Teknium	1d5a39e002	fix: thread safety for concurrent subagent delegation (#1672) * fix: thread safety for concurrent subagent delegation Five thread-safety fixes that prevent crashes and data races when running multiple subagents concurrently via delegate_task: 1. Remove redirect_stdout/stderr from delegate_tool — mutating global sys.stdout races with the spinner thread when multiple children start concurrently, causing segfaults. Children already run with quiet_mode=True so the redirect was redundant. 2. Split _run_single_child into _build_child_agent (main thread) + _run_single_child (worker thread). AIAgent construction creates httpx/SSL clients which are not thread-safe to initialize concurrently. 3. Add threading.Lock to SessionDB — subagents share the parent's SessionDB and call create_session/append_message from worker threads with no synchronization. 4. Add _active_children_lock to AIAgent — interrupt() iterates _active_children while worker threads append/remove children. 5. Add _client_cache_lock to auxiliary_client — multiple subagent threads may resolve clients concurrently via call_llm(). Based on PR #1471 by peteromallet. * feat: Honcho base_url override via config.yaml + quick command alias type Two features salvaged from PR #1576: 1. Honcho base_url override: allows pointing Hermes at a remote self-hosted Honcho deployment via config.yaml: honcho: base_url: "http://192.168.x.x:8000" When set, this overrides the Honcho SDK's environment mapping (production/local), enabling LAN/VPN Honcho deployments without requiring the server to live on localhost. Uses config.yaml instead of env var (HONCHO_URL) per project convention. 2. Quick command alias type: adds a new 'alias' quick command type that rewrites to another slash command before normal dispatch: quick_commands: sc: type: alias target: /context Supports both CLI and gateway. Arguments are forwarded to the target command. 
--------- Co-authored-by: peteromallet <peteromallet@users.noreply.github.com> Co-authored-by: redhelix <redhelix@users.noreply.github.com>	2026-03-17 02:53:33 -07:00
Teknium	fd61ae13e5	revert: revert SMS (Telnyx) platform adapter for review This reverts commit `ef67037f8e`.	2026-03-17 02:53:30 -07:00
Teknium	ef67037f8e	feat: add SMS (Telnyx) platform adapter Implement SMS as a first-class messaging platform following ADDING_A_PLATFORM.md checklist. All 16 integration points covered: - gateway/platforms/sms.py: Core adapter with aiohttp webhook server, Telnyx REST API send, markdown stripping, 1600-char chunking, echo loop prevention, multi-number reply-from tracking - gateway/config.py: Platform.SMS enum + env override block - gateway/run.py: Adapter factory + auth maps (SMS_ALLOWED_USERS, SMS_ALLOW_ALL_USERS) - toolsets.py: hermes-sms toolset + included in hermes-gateway - cron/scheduler.py: SMS in platform_map for cron delivery - tools/send_message_tool.py: SMS routing + _send_sms() standalone sender - tools/cronjob_tools.py: 'sms' in deliver description - gateway/channel_directory.py: SMS in session-based discovery - agent/prompt_builder.py: SMS platform hint (plain text, concise) - hermes_cli/status.py: SMS in platforms status display - hermes_cli/gateway.py: SMS in setup wizard with Telnyx instructions - pyproject.toml: sms optional dependency group (aiohttp>=3.9.0) - tests/gateway/test_sms.py: Unit tests for config, format, truncate, echo prevention, requirements, toolset integration Co-authored-by: sunsakis <teo@sunsakis.com> Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-17 02:52:34 -07:00
teknium1	a1c81360a5	feat(cli): skin-aware light/dark theme mode with terminal auto-detection Add display.theme_mode setting (auto/light/dark) that makes the CLI readable on light terminal backgrounds. - Auto-detect terminal background via COLORFGBG, OSC 11, and macOS appearance (fallback chain in hermes_cli/colors.py) - Add colors_light overrides to all 7 built-in skins with dark/readable colors for light backgrounds - SkinConfig.get_color() now returns light overrides when theme is light - get_prompt_toolkit_style_overrides() uses light bg colors for completion menus in light mode - init_skin_from_config() reads display.theme_mode from config - 7 new tests covering theme mode resolution, detection fallbacks, and light-mode skin overrides Salvaged from PR #1187 by @peteromallet. Core design preserved; adapted to current main (kept all existing helpers, tool_emojis, convenience functions that were added after the PR branched). Co-authored-by: Peter O'Mallet <peteromallet@users.noreply.github.com>	2026-03-17 02:51:40 -07:00
Teknium	d156942419	fix(telegram): aggregate split text messages before dispatching (#1674 ) When a user sends a long message, Telegram clients split it into multiple updates that arrive within milliseconds of each other. Previously each chunk was dispatched independently — the first would start the agent, and subsequent chunks would interrupt or queue as separate turns, causing the agent to only see part of the message. Add text message batching to TelegramAdapter following the same pattern as the existing photo burst batching: - _enqueue_text_event() buffers text by session key, concatenating chunks that arrive in rapid succession - _flush_text_batch() dispatches the combined message after a 0.6s quiet period (configurable via HERMES_TELEGRAM_TEXT_BATCH_DELAY_SECONDS) - Timer resets on each new chunk, so all parts of a split arrive before the batch is dispatched Reported by NulledVector on Discord.	2026-03-17 02:49:57 -07:00
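The quiet-period batching described above is a debounce. A minimal asyncio sketch, assuming a hypothetical `TextBatcher` class and an async `dispatch(session_key, text)` callback; joining chunks with a newline is also an assumption, since the commit only says the chunks are concatenated.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List


class TextBatcher:
    """Buffer text per session and dispatch once after a quiet period (hypothetical helper)."""

    def __init__(self, dispatch: Callable[[str, str], Awaitable[None]],
                 delay: float = 0.6) -> None:
        self._dispatch = dispatch
        self._delay = delay
        self._buffers: Dict[str, List[str]] = {}
        self._timers: Dict[str, asyncio.Task] = {}

    def enqueue(self, session_key: str, text: str) -> None:
        """Add a chunk and restart the quiet-period timer for this session."""
        self._buffers.setdefault(session_key, []).append(text)
        timer = self._timers.get(session_key)
        if timer is not None:
            timer.cancel()  # a new chunk resets the timer
        loop = asyncio.get_running_loop()
        self._timers[session_key] = loop.create_task(self._flush_later(session_key))

    async def _flush_later(self, session_key: str) -> None:
        await asyncio.sleep(self._delay)
        # No await between these pops, so a late chunk starts a fresh batch.
        chunks = self._buffers.pop(session_key, [])
        self._timers.pop(session_key, None)
        if chunks:
            await self._dispatch(session_key, "\n".join(chunks))  # joiner is an assumption
```

Because each new chunk cancels the pending timer, the combined message is dispatched only after no chunk has arrived for `delay` seconds.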
Teknium	35d948b6e1	feat: add Kilo Code (kilocode) as first-class inference provider (#1666 ) Add Kilo Gateway (kilo.ai) as an API-key provider with OpenAI-compatible endpoint at https://api.kilo.ai/api/gateway. Supports 500+ models from Anthropic, OpenAI, Google, xAI, Mistral, MiniMax via a single API key. - Register kilocode in PROVIDER_REGISTRY with aliases (kilo, kilo-code, kilo-gateway) and KILOCODE_API_KEY / KILOCODE_BASE_URL env vars - Add to model catalog, CLI provider menu, setup wizard, doctor checks - Add google/gemini-3-flash-preview as default aux model - 12 new tests covering registration, aliases, credential resolution, runtime config - Documentation updates (env vars, config, fallback providers) - Fix setup test index shift from provider insertion Inspired by PR #1473 by @amanning3390. Co-authored-by: amanning3390 <amanning3390@users.noreply.github.com>	2026-03-17 02:40:34 -07:00
Teknium	556e0f4b43	fix(docker): add explicit env allowlist for container credentials (#1436 ) Docker terminal sessions are secret-dark by default. This adds terminal.docker_forward_env as an explicit allowlist for env vars that may be forwarded into Docker containers. Values resolve from the current shell first, then fall back to ~/.hermes/.env. Only variables the user explicitly lists are forwarded — nothing is auto-exposed. Cherry-picked from PR #1449 by @teknium1, conflict-resolved onto current main. Fixes #1436 Supersedes #1439	2026-03-17 02:34:35 -07:00
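The resolution order above (current shell first, then the .env file, and only for listed names) might look like this. The function name and the pre-parsed `dotenv_values` mapping are illustrative assumptions, not the project's API.

```python
import os
from typing import Dict, Iterable, Mapping, Optional


def resolve_forwarded_env(allowlist: Iterable[str],
                          dotenv_values: Mapping[str, str],
                          environ: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Return only allowlisted variables, preferring the shell over the .env fallback."""
    environ = os.environ if environ is None else environ
    forwarded: Dict[str, str] = {}
    for name in allowlist:
        if name in environ:
            forwarded[name] = environ[name]
        elif name in dotenv_values:
            forwarded[name] = dotenv_values[name]
        # Names found in neither source are skipped; nothing is auto-exposed.
    return forwarded
```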
Teknium	36a76bf9db	Merge pull request #1661 from NousResearch/fix/discord-thread-persistence fix(discord): persist thread participation across gateway restarts	2026-03-17 02:27:09 -07:00
teknium1	c8582fc4a2	fix(discord): persist thread participation across gateway restarts _bot_participated_threads was an in-memory set — lost on every restart. After restart, the bot forgot which threads it was active in, requiring fresh @mentions and potentially creating duplicate threads instead of continuing existing conversations. Changes: - Persist thread IDs to ~/.hermes/discord_threads.json - Load on adapter init, save on every new thread participation - _track_thread() replaces direct .add() calls for atomic persist - Cap at 500 tracked threads to prevent unbounded growth - /thread slash command also tracks participation - 7 new tests covering persistence, restart survival, corruption recovery, cap enforcement	2026-03-17 02:26:34 -07:00
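A rough sketch of capped, corruption-tolerant persistence in the spirit of the change above. The helper names are hypothetical, and an ordered list replaces the in-memory set so the oldest entries can be evicted at the cap; that eviction policy is an assumption.

```python
import json
import os
from pathlib import Path
from typing import List

MAX_TRACKED_THREADS = 500  # cap taken from the commit message


def load_threads(path: Path) -> List[str]:
    """Load tracked thread IDs; a missing or corrupt file yields an empty list."""
    try:
        data = json.loads(path.read_text())
    except (OSError, ValueError):
        return []
    return [str(t) for t in data] if isinstance(data, list) else []


def track_thread(path: Path, threads: List[str], thread_id: str,
                 cap: int = MAX_TRACKED_THREADS) -> List[str]:
    """Record a thread ID and persist the list, keeping only the newest `cap` entries."""
    if thread_id in threads:
        return threads
    threads = (threads + [thread_id])[-cap:]
    tmp = path.with_suffix(".tmp")
    tmp.write_text(json.dumps(threads))
    os.replace(tmp, path)  # atomic swap so a crash never leaves a half-written file
    return threads
```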
Teknium	2c7c30be69	fix(security): harden terminal safety and sandbox file writes (#1653 ) * fix(security): harden terminal safety and sandbox file writes Two security improvements: 1. Dangerous command detection: expand shell -c pattern to catch combined flags (bash -lc, bash -ic, ksh -c) that were previously undetected. Pattern changed from matching only 'bash -c' to matching any shell invocation with -c anywhere in the flags. 2. File write sandboxing: add HERMES_WRITE_SAFE_ROOT env var that constrains all write_file/patch operations to a configured directory tree. Opt-in — when unset, behavior is unchanged. Useful for gateway/messaging deployments that should only touch a workspace. Based on PR #1085 by ismoilh. * fix: correct "POSIDEON" typo to "POSEIDON" in banner ASCII art The poseidon skin's banner_logo had the E and I letters swapped, spelling "POSIDEON-AGENT" instead of "POSEIDON-AGENT". --------- Co-authored-by: ismoilh <ismoilh@users.noreply.github.com> Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>	2026-03-17 02:22:12 -07:00
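The opt-in write sandbox could be approximated as below. `is_write_allowed` is a hypothetical helper; it assumes the value of HERMES_WRITE_SAFE_ROOT is passed in by the caller and that paths are resolved before comparison so `..` segments and symlinks cannot escape the root.

```python
from pathlib import Path
from typing import Optional


def is_write_allowed(target: str, safe_root: Optional[str]) -> bool:
    """Allow every write when no root is configured; otherwise require target under root."""
    if not safe_root:
        return True  # opt-in: unset means behavior is unchanged
    root = Path(safe_root).expanduser().resolve()
    resolved = Path(target).expanduser().resolve()
    return resolved == root or root in resolved.parents
```

Comparing resolved path components, not string prefixes, keeps a sibling such as `/tmp/ws-other` from passing a check against `/tmp/ws`.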
Teknium	6a320e8bfe	fix(security): block sandbox backend creds from subprocess env (#1264 ) * fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. 
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552) Long messages sent via send_message tool or cron delivery silently failed when exceeding platform limits. Gateway adapters handle this via truncate_message(), but the standalone senders in send_message_tool bypassed that entirely. - Apply truncate_message() chunking in _send_to_platform() before dispatching to individual platform senders - Remove naive message[i:i+2000] character split in _send_discord() in favor of centralized smart splitting - Attach media files to last chunk only for Telegram - Add regression tests for chunking and media placement Cherry-picked from PR #1557 by llbn. * fix(approval): show full command in dangerous command approval (#1553) Previously the command was truncated to 80 chars in CLI (with a [v]iew full option), 500 chars in Discord embeds, and missing entirely in Telegram/Slack approval messages. Now the full command is always displayed everywhere: - CLI: removed 80-char truncation and [v]iew full menu option - Gateway (TG/Slack): approval_required message includes full command in a code block - Discord: embed shows full command up to 4096-char limit - Windows: skip SIGALRM-based test timeout (Unix-only) - Updated tests: replaced view-flow tests with direct approval tests Cherry-picked from PR #1566 by crazywriter1. * fix(cli): flush stdout during agent loop to prevent macOS display freeze (#1624) The interrupt polling loop in chat() waited on the queue without invalidating the prompt_toolkit renderer. On macOS, the StdoutProxy buffer only flushed on input events, causing the CLI to appear frozen during tool execution until the user typed a key. Fix: call _invalidate() on each queue timeout (every ~100ms, throttled to 150ms) to force the renderer to flush buffered agent output. 
* fix(claw): warn when API keys are skipped during OpenClaw migration (#1580) When --migrate-secrets is not passed (the default), API keys like OPENROUTER_API_KEY are silently skipped with no warning. Users don't realize their keys weren't migrated until the agent fails to connect. Add a post-migration warning with actionable instructions: either re-run with --migrate-secrets or add the key manually via hermes config set. Cherry-picked from PR #1593 by ygd58. * fix(security): block sandbox backend creds from subprocess env (#1264) Add Modal and Daytona sandbox credentials to the subprocess env blocklist so they're not leaked to agent terminal sessions via printenv/env. Cherry-picked from PR #1571 by ygd58. --------- Co-authored-by: buray <ygd58@users.noreply.github.com> Co-authored-by: lbn <llbn@users.noreply.github.com> Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>	2026-03-17 02:20:42 -07:00
Teknium	cb0deb5f9d	feat: add NeuTTS optional skill + local TTS provider backend * feat(skills): add bundled neutts optional skill Add NeuTTS optional skill with CLI scaffold, bootstrap helper, and sample voice profile. Also fixes skills_hub.py to handle binary assets (WAV files) during skill installation. Changes: - optional-skills/mlops/models/neutts/ — skill + CLI scaffold - tools/skills_hub.py — binary asset support (read_bytes, write_bytes) - tests/tools/test_skills_hub.py — regression tests for binary assets * feat(tts): add NeuTTS as local TTS provider backend Add NeuTTS as a fourth TTS provider option alongside Edge, ElevenLabs, and OpenAI. NeuTTS runs fully on-device via neutts_cli — no API key needed. Provider behavior: - Explicit: set tts.provider to 'neutts' in config.yaml - Fallback: when Edge TTS is unavailable and neutts_cli is installed, automatically falls back to NeuTTS instead of failing - check_tts_requirements() now includes NeuTTS in availability checks NeuTTS outputs WAV natively. For Telegram voice bubbles, ffmpeg converts to Opus (same pattern as Edge TTS). Changes: - tools/tts_tool.py — _generate_neutts(), _check_neutts_available(), provider dispatch, fallback logic, Opus conversion - hermes_cli/config.py — tts.neutts config defaults --------- Co-authored-by: unmodeled-tyler <unmodeled.tyler@proton.me>	2026-03-17 02:13:34 -07:00
Teknium	766f4aae2b	refactor: tie api_mode to provider config instead of env var (#1656 ) Remove HERMES_API_MODE env var. api_mode is now configured where the endpoint is defined: - model.api_mode in config.yaml (for the active model config) - custom_providers[].api_mode (for named custom providers) Replace _get_configured_api_mode() with _parse_api_mode() which just validates a value against the whitelist without reading env vars. Both paths (model config and named custom providers) now read api_mode from their respective config entries rather than a global override.	2026-03-17 02:13:26 -07:00
Teknium	49043b7b7d	feat: add /tools disable/enable/list slash commands with session reset (#1652 ) Add in-session tool management via /tools disable/enable/list, plus hermes tools list/disable/enable CLI subcommands. Supports both built-in toolsets (web, memory) and MCP tools (github:create_issue). To preserve prompt caching, /tools disable/enable in a chat session saves the change to config and resets the session cleanly — the user is asked to confirm before the reset happens. Also improves prefix matching: /qui now dispatches to /quit instead of showing ambiguous when longer skill commands like /quint-pipeline are installed. Based on PR #1520 by @YanSte. Co-authored-by: Yannick Stephan <YanSte@users.noreply.github.com>	2026-03-17 02:05:26 -07:00
Teknium	f2414bfd45	feat: allow custom endpoints to use responses API via api_mode override (#1651 ) Add HERMES_API_MODE env var and model.api_mode config field to let custom OpenAI-compatible endpoints opt into codex_responses mode without requiring the OpenAI Codex OAuth provider path. - _get_configured_api_mode() reads HERMES_API_MODE env (precedence) then model.api_mode from config.yaml; validates against whitelist - Applied in both _resolve_openrouter_runtime() and _resolve_named_custom_runtime() (original PR only covered openrouter) - Fix _dump_api_request_debug() to show /responses URL when in codex_responses mode instead of always showing /chat/completions - Tests for config override, env override, invalid values, named custom providers, and debug dump URL for both API modes Inspired by PR #1041 by @mxyhi. Co-authored-by: mxyhi <mxyhi@users.noreply.github.com>	2026-03-17 02:04:36 -07:00
0xbyt4	68fbcdaa06	fix: add browser_console to browser toolset and core tools list (#1084 ) browser_console was registered in the tool registry but missing from all toolset definitions (TOOLSETS, _HERMES_CORE_TOOLS, _LEGACY_TOOLSET_MAP), so the agent could never discover or use it. Added to all 4 locations + 4 wiring tests. Cherry-picked from PR #1084 by @0xbyt4 (authorship preserved in tests).	2026-03-17 02:02:57 -07:00
teknium1	7d91b436e4	fix: exclude hidden directories from find/grep search backends (#1558) The primary injection vector in #1558 was search_files discovering catalog cache files in .hub/index-cache/ via find or grep, which don't skip hidden directories like ripgrep does by default. Three-layer fix: 1. _search_files (find): add -not -path '*/.*' to exclude hidden directories, matching ripgrep's default behavior. 2. _search_with_grep: add --exclude-dir='.*' to skip hidden directories in the grep fallback path. 3. _write_index_cache: write a .ignore file to .hub/ so ripgrep also skips it even when invoked with --hidden (belt-and-suspenders). This makes all three search backends (rg, grep, find) consistently exclude hidden directories, preventing the agent from discovering and reading unvetted community content in hub cache files.	2026-03-17 02:02:57 -07:00
Teknium	4cb6735541	fix(approval): show full command in dangerous command approval (#1553 ) * fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. 
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552) Long messages sent via send_message tool or cron delivery silently failed when exceeding platform limits. Gateway adapters handle this via truncate_message(), but the standalone senders in send_message_tool bypassed that entirely. - Apply truncate_message() chunking in _send_to_platform() before dispatching to individual platform senders - Remove naive message[i:i+2000] character split in _send_discord() in favor of centralized smart splitting - Attach media files to last chunk only for Telegram - Add regression tests for chunking and media placement Cherry-picked from PR #1557 by llbn. * fix(approval): show full command in dangerous command approval (#1553) Previously the command was truncated to 80 chars in CLI (with a [v]iew full option), 500 chars in Discord embeds, and missing entirely in Telegram/Slack approval messages. Now the full command is always displayed everywhere: - CLI: removed 80-char truncation and [v]iew full menu option - Gateway (TG/Slack): approval_required message includes full command in a code block - Discord: embed shows full command up to 4096-char limit - Windows: skip SIGALRM-based test timeout (Unix-only) - Updated tests: replaced view-flow tests with direct approval tests Cherry-picked from PR #1566 by crazywriter1. --------- Co-authored-by: buray <ygd58@users.noreply.github.com> Co-authored-by: lbn <llbn@users.noreply.github.com> Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>	2026-03-17 02:02:33 -07:00
Teknium	1b2d6c424c	fix: add --yes flag to bypass confirmation in /skills install and uninstall (#1647 ) Fixes hanging when using /skills install or /skills uninstall from the TUI — bare input() calls hang inside prompt_toolkit's event loop. Changes: - Add skip_confirm parameter to do_install() and do_uninstall() - Separate --yes/-y (confirmation bypass) from --force (scan override) in both argparse and slash command handlers - Update usage hint for /skills uninstall to show [--yes] The original PR (#1595) accidentally deleted the install_from_quarantine() call, which would have broken all installs. That bug is not present here. Based on PR #1595 by 333Alden333. Co-authored-by: 333Alden333 <333Alden333@users.noreply.github.com>	2026-03-17 01:59:07 -07:00
Teknium	12afccd9ca	fix(tools): chunk long messages in send_message_tool before dispatch (#1552 ) * fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. 
* fix(tools): chunk long messages in send_message_tool before dispatch (#1552) Long messages sent via send_message tool or cron delivery silently failed when exceeding platform limits. Gateway adapters handle this via truncate_message(), but the standalone senders in send_message_tool bypassed that entirely. - Apply truncate_message() chunking in _send_to_platform() before dispatching to individual platform senders - Remove naive message[i:i+2000] character split in _send_discord() in favor of centralized smart splitting - Attach media files to last chunk only for Telegram - Add regression tests for chunking and media placement Cherry-picked from PR #1557 by llbn. --------- Co-authored-by: buray <ygd58@users.noreply.github.com> Co-authored-by: lbn <llbn@users.noreply.github.com>	2026-03-17 01:52:43 -07:00
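For contrast with the naive `message[i:i+2000]` split that was removed, a boundary-aware splitter might look like this. It is an illustrative stand-in, not the project's truncate_message(), whose signature is not shown here; `split_message` and its default limit are assumptions.

```python
from typing import List


def split_message(text: str, limit: int = 2000) -> List[str]:
    """Split text into chunks of at most `limit` chars, preferring newline then space breaks."""
    chunks: List[str] = []
    remaining = text
    while len(remaining) > limit:
        cut = remaining.rfind("\n", 0, limit + 1)
        if cut <= 0:
            cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no natural boundary: fall back to a hard cut
        chunks.append(remaining[:cut])
        remaining = remaining[cut:].lstrip("\n ")
    if remaining:
        chunks.append(remaining)
    return chunks
```

With chunks produced this way, attaching media only to the last chunk (as the commit does for Telegram) is a matter of indexing `chunks[-1]`.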
Teknium	81f76111b0	Merge pull request #1560 from eren-karakus0/fix/singularity-preflight-check fix(terminal): add Singularity/Apptainer preflight availability check	2026-03-17 01:52:03 -07:00
Teknium	96dac22194	fix: prevent infinite 400 loop on context overflow + block prompt injection via cache files (#1630 , #1558 ) * fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. 
--------- Co-authored-by: buray <ygd58@users.noreply.github.com>	2026-03-17 01:50:59 -07:00
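The fallback heuristic in step 1 above can be expressed as a small predicate. The function name and the 30-character cutoff for a "very short generic message" are assumptions; the 40% and 80-message thresholds come from the commit message.

```python
def looks_like_context_overflow(status_code: int, error_message: str,
                                used_tokens: int, context_window: int,
                                message_count: int) -> bool:
    """Guess whether a generic 400 is really a context overflow (hypothetical helper)."""
    if status_code != 400:
        return False
    is_generic = len(error_message.strip()) < 30  # cutoff is an assumption
    is_large = used_tokens > 0.4 * context_window or message_count > 80
    return is_generic and is_large
```

When the predicate returns `True`, the described behavior is to trigger compression and retry instead of treating the 400 as a non-retryable client error.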
Teknium	4920c5940f	feat: auto-detect local file paths in gateway responses for native media delivery (#1640 ) Small models (7B-14B) can't reliably use MEDIA: or IMAGE: syntax. This adds extract_local_files() to BasePlatformAdapter that regex-detects bare local file paths ending in image/video extensions, validates them with os.path.isfile(), and delivers them as native platform attachments. Hardened over the original PR: - Code-block exclusion: paths inside fenced blocks and inline code are skipped so code samples are never mutilated - URL rejection: negative lookbehind prevents matching path segments inside HTTP URLs - Relative path rejection: ./foo.png no longer matches - Tilde path cleanup: raw ~/... form is removed from response text - Deduplication by expanded path - Added .webm to _VIDEO_EXTS - Fallback to send_document for unrecognized media extensions Based on PR #1636 by sudoingX. Co-authored-by: sudoingX <sudoingX@users.noreply.github.com>	2026-03-17 01:47:34 -07:00
Teknium	3744118311	feat(cli): two-stage /model autocomplete with ghost text suggestions (#1641 ) * feat(cli): two-stage /model autocomplete with ghost text suggestions - SlashCommandCompleter: Tab-complete providers first (anthropic:, openrouter:, etc.) then models within the selected provider - SlashCommandAutoSuggest: inline ghost text for slash commands, subcommands, and /model provider:model two-stage suggestions - Custom Tab key binding: accepts provider completion and immediately re-triggers completions to show that provider's models - COMMANDS_BY_CATEGORY: structured format with explicit subcommands for tab completion and ghost text (prompt, reasoning, voice, skills, cron, browser) - SUBCOMMANDS dict auto-extracted from command definitions - Model/provider info cached 60s for responsive completions * fix: repair test regression and restore gold color from PR #1622 - Fix test_unknown_command_still_shows_error: patch _cprint instead of console.print to match the _cprint switch in process_command() - Restore gold color on 'Type /help' hint using _DIM + _GOLD constants instead of bare \033[2m (was losing the #B8860B gold) - Use _GOLD constant for ambiguous command message for consistency - Add clarifying comment on SUBCOMMANDS regex fallback --------- Co-authored-by: Lars van der Zande <lmvanderzande@gmail.com>	2026-03-17 01:47:32 -07:00
Teknium	5ada0b95e9	Merge pull request #1609 from 0xbyt4/fix/context-counter-cache-tokens fix: context counter shows cached token count in status bar	2026-03-17 01:45:12 -07:00
teknium1	19eaf5d956	test: fix telegram mock to include ParseMode constant The MarkdownV2 formatting change imports telegram.constants.ParseMode, which the test mock didn't provide. Add ParseMode to the mock so existing tests continue working.	2026-03-17 01:44:11 -07:00
Teknium	c3ca68d25b	Merge pull request #1614 from PeterFile/fix/launchd-service-recovery fix(gateway): recover stale launchd service state	2026-03-17 01:43:07 -07:00
Teknium	eaa9ceeb43	Merge pull request #1621 from Death-Incarnate/main fix: isolate test_anthropic_adapter from local credentials	2026-03-17 01:40:39 -07:00
Teknium	949fac192f	fix(tools): remove unnecessary crontab requirement from cronjob tool (#1638) * fix(tools): remove unnecessary crontab requirement from cronjob tool The hermes cron system is internal — it uses a JSON-based scheduler ticked by the gateway (cron/scheduler.py), not system crontab. The check for shutil.which('crontab') was preventing the cronjob tool from being available in environments without crontab installed (e.g. minimal Ubuntu containers). Changes: - Remove shutil.which('crontab') check from check_cronjob_requirements() - Remove unused shutil import - Update docstring to clarify internal scheduler is used - Update tests to reflect new behavior and add coverage for all session modes (interactive, gateway, exec_ask) Fixes #1589 * test: add HERMES_EXEC_ASK coverage for cronjob requirements Adds missing test for the exec_ask session mode, complementing the cherry-picked fix from PR #1633. --------- Co-authored-by: Bartok9 <bartokmagic@proton.me>	2026-03-17 01:40:02 -07:00
teknium1	c16870277c	test: add regression test for stale PID in gateway_state.json (#1631) Verifies that write_runtime_status() overwrites pid and start_time from a previous process rather than preserving them via setdefault(). Covers the fix from PR #1632.	2026-03-17 01:35:02 -07:00
Teknium	2af4af6390	Merge pull request #1635 from NousResearch/hermes/hermes-a86162db fix: sanitize corrupted .env files on read and during migration	2026-03-17 01:33:36 -07:00
teknium1	1c61ab6bd9	fix: unconditionally clear ANTHROPIC_TOKEN on v8→v9 migration No conditional checks — just clear it. The new auth flow doesn't use this env var. Anyone upgrading gets it wiped once, then it's done.	2026-03-17 01:31:20 -07:00
teknium1	e9f1a8e39b	fix: gate ANTHROPIC_TOKEN cleanup to config version 8→9 migration - Bump _config_version 8 → 9 - Move stale ANTHROPIC_TOKEN clearing into 'if current_ver < 9' block so it only runs once during the upgrade, not on every migrate_config() - ANTHROPIC_TOKEN is still a valid auth path (OAuth flow), so we don't want to clear it repeatedly — only during the one-time migration from old setups that left it stale - Add test_skips_on_version_9_or_later to verify one-time behavior - All tests set config version 8 to trigger migration	2026-03-17 01:28:38 -07:00
teknium1	b6a51c955e	fix: clear stale ANTHROPIC_TOKEN during migration, remove false * detection - Remove * placeholder detection from _sanitize_env_lines (was based on confusing terminal redaction with literal file content) - Add migrate_config() logic to clear stale ANTHROPIC_TOKEN when better credentials exist (ANTHROPIC_API_KEY or Claude Code auto-discovery) - Old ANTHROPIC_TOKEN values shadow Claude Code credential fallthrough, breaking auth for users who updated without re-running setup - Preserves ANTHROPIC_TOKEN when it's the only auth method available - 3 new migration tests, updated existing tests	2026-03-17 01:26:23 -07:00
teknium1	634c1f6752	fix: sanitize corrupted .env files on read and during migration Fixes two corruption patterns that break API keys during updates: 1. Concatenated KEY=VALUE pairs on a single line due to missing newlines (e.g. ANTHROPIC_API_KEY=sk-...OPENAI_BASE_URL=https://...). Uses a known-keys set to safely detect and split concatenated entries without false-splitting values that contain uppercase text. 2. Stale KEY=* placeholder entries left by incomplete setup runs that never get updated and shadow real credentials. Changes: - Add _sanitize_env_lines() that splits concatenated known keys and drops * placeholders - Add sanitize_env_file() public API for explicit repair - Call sanitization in save_env_value() on every read (self-healing) - Call sanitize_env_file() at the start of migrate_config() so existing corrupted files are repaired on update - 12 new tests covering splits, placeholders, edge cases, and integration	2026-03-17 01:13:34 -07:00
Teknium	3576f44a57	feat: add Vercel AI Gateway provider (#1628) * feat: add Vercel AI Gateway as a first-class provider Adds AI Gateway (ai-gateway.vercel.sh) as a new inference provider with AI_GATEWAY_API_KEY authentication, live model discovery, and reasoning support via extra_body.reasoning. Based on PR #1492 by jerilynzheng. * feat: add AI Gateway to setup wizard, doctor, and fallback providers * test: add AI Gateway to api_key_providers test suite * feat: add AI Gateway to hermes model CLI and model metadata Wire AI Gateway into the interactive model selection menu and add context lengths for AI Gateway model IDs in model_metadata.py. * feat: use claude-haiku-4.5 as AI Gateway auxiliary model * revert: use gemini-3-flash as AI Gateway auxiliary model * fix: move AI Gateway below established providers in selection order --------- Co-authored-by: jerilynzheng <jerilynzheng@users.noreply.github.com> Co-authored-by: jerilynzheng <zheng.jerilyn@gmail.com>	2026-03-17 00:12:16 -07:00
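
Note on commit 634c1f6752: the known-keys splitting it describes for concatenated `.env` entries might look roughly like the sketch below. This is not the repository's `_sanitize_env_lines()`; the function name `split_concatenated`, the `KNOWN_KEYS` subset, and the rule of only splitting after the first `=` are all assumptions made for illustration.

```python
import re

# Hypothetical subset of recognised keys; the real known-keys set is not shown in the log.
KNOWN_KEYS = {"ANTHROPIC_API_KEY", "OPENAI_API_KEY", "OPENAI_BASE_URL"}


def split_concatenated(line, known_keys=KNOWN_KEYS):
    """Split one .env line wherever a known KEY= starts inside a previous value."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#") or "=" not in line:
        return [line]  # blank lines, comments, and non-assignments pass through

    # Try longer key names first so a short key cannot shadow a longer one.
    alternatives = "|".join(
        re.escape(key) for key in sorted(known_keys, key=len, reverse=True)
    )
    pattern = re.compile("(?:" + alternatives + ")=")

    # Only split inside the value, i.e. after the first '=' on the line.
    # This keeps an unknown key such as MY_OPENAI_API_KEY in one piece.
    first_eq = line.index("=")
    cuts = [m.start() for m in pattern.finditer(line) if m.start() > first_eq]

    bounds = [0] + cuts + [len(line)]
    return [line[a:b] for a, b in zip(bounds, bounds[1:])]
```

Because only names from the known-keys set trigger a split, a value that merely contains uppercase text (for example `NOTE=SOME UPPERCASE TEXT`) is left untouched, which matches the false-split concern raised in the commit message.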

1205 Commits