hermes-agent-features

Author	SHA1	Message	Date
0xbyt4	68fbcdaa06	fix: add browser_console to browser toolset and core tools list (#1084) browser_console was registered in the tool registry but missing from all toolset definitions (TOOLSETS, _HERMES_CORE_TOOLS, _LEGACY_TOOLSET_MAP), so the agent could never discover or use it. Added to all 4 locations + 4 wiring tests. Cherry-picked from PR #1084 by @0xbyt4 (authorship preserved in tests).	2026-03-17 02:02:57 -07:00
teknium1	7d91b436e4	fix: exclude hidden directories from find/grep search backends (#1558) The primary injection vector in #1558 was search_files discovering catalog cache files in .hub/index-cache/ via find or grep, which don't skip hidden directories like ripgrep does by default. Three-layer fix: 1. _search_files (find): add -not -path '*/.*' to exclude hidden directories, matching ripgrep's default behavior. 2. _search_with_grep: add --exclude-dir='.*' to skip hidden directories in the grep fallback path. 3. _write_index_cache: write a .ignore file to .hub/ so ripgrep also skips it even when invoked with --hidden (belt-and-suspenders). This makes all three search backends (rg, grep, find) consistently exclude hidden directories, preventing the agent from discovering and reading unvetted community content in hub cache files.	2026-03-17 02:02:57 -07:00
Teknium	4cb6735541	fix(approval): show full command in dangerous command approval (#1553) * fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. * fix(tools): chunk long messages in send_message_tool before dispatch (#1552) Long messages sent via send_message tool or cron delivery silently failed when exceeding platform limits. Gateway adapters handle this via truncate_message(), but the standalone senders in send_message_tool bypassed that entirely. - Apply truncate_message() chunking in _send_to_platform() before dispatching to individual platform senders - Remove naive message[i:i+2000] character split in _send_discord() in favor of centralized smart splitting - Attach media files to last chunk only for Telegram - Add regression tests for chunking and media placement Cherry-picked from PR #1557 by llbn. * fix(approval): show full command in dangerous command approval (#1553) Previously the command was truncated to 80 chars in CLI (with a [v]iew full option), 500 chars in Discord embeds, and missing entirely in Telegram/Slack approval messages. Now the full command is always displayed everywhere: - CLI: removed 80-char truncation and [v]iew full menu option - Gateway (TG/Slack): approval_required message includes full command in a code block - Discord: embed shows full command up to 4096-char limit - Windows: skip SIGALRM-based test timeout (Unix-only) - Updated tests: replaced view-flow tests with direct approval tests Cherry-picked from PR #1566 by crazywriter1. --------- Co-authored-by: buray <ygd58@users.noreply.github.com> Co-authored-by: lbn <llbn@users.noreply.github.com> Co-authored-by: crazywriter1 <53251494+crazywriter1@users.noreply.github.com>	2026-03-17 02:02:33 -07:00
Teknium	12afccd9ca	fix(tools): chunk long messages in send_message_tool before dispatch (#1552) * fix: prevent infinite 400 failure loop on context overflow (#1630) When a gateway session exceeds the model's context window, Anthropic may return a generic 400 invalid_request_error with just 'Error' as the message. This bypassed the phrase-based context-length detection, causing the agent to treat it as a non-retryable client error. Worse, the failed user message was still persisted to the transcript, making the session even larger on each attempt — creating an infinite loop. Three-layer fix: 1. run_agent.py — Fallback heuristic: when a 400 error has a very short generic message AND the session is large (>40% of context or >80 messages), treat it as a probable context overflow and trigger compression instead of aborting. 2. run_agent.py + gateway/run.py — Don't persist failed messages: when the agent returns failed=True before generating any response, skip writing the user's message to the transcript/DB. This prevents the session from growing on each failure. 3. gateway/run.py — Smarter error messages: detect context-overflow failures and suggest /compact or /reset specifically, instead of a generic 'try again' that will fail identically. * fix(skills): detect prompt injection patterns and block cache file reads Adds two security layers to prevent prompt injection via skills hub cache files (#1558): 1. read_file: blocks direct reads of ~/.hermes/skills/.hub/ directory (index-cache, catalog files). The 3.5MB clawhub_catalog_v1.json was the original injection vector — untrusted skill descriptions in the catalog contained adversarial text that the model executed. 2. skill_view: warns when skills are loaded from outside the trusted ~/.hermes/skills/ directory, and detects common injection patterns in skill content ("ignore previous instructions", "<system>", etc.). Cherry-picked from PR #1562 by ygd58. * fix(tools): chunk long messages in send_message_tool before dispatch (#1552) Long messages sent via send_message tool or cron delivery silently failed when exceeding platform limits. Gateway adapters handle this via truncate_message(), but the standalone senders in send_message_tool bypassed that entirely. - Apply truncate_message() chunking in _send_to_platform() before dispatching to individual platform senders - Remove naive message[i:i+2000] character split in _send_discord() in favor of centralized smart splitting - Attach media files to last chunk only for Telegram - Add regression tests for chunking and media placement Cherry-picked from PR #1557 by llbn. --------- Co-authored-by: buray <ygd58@users.noreply.github.com> Co-authored-by: lbn <llbn@users.noreply.github.com>	2026-03-17 01:52:43 -07:00
Teknium	81f76111b0	Merge pull request #1560 from eren-karakus0/fix/singularity-preflight-check fix(terminal): add Singularity/Apptainer preflight availability check	2026-03-17 01:52:03 -07:00
teknium1	19eaf5d956	test: fix telegram mock to include ParseMode constant The MarkdownV2 formatting change imports telegram.constants.ParseMode, which the test mock didn't provide. Add ParseMode to the mock so existing tests continue working.	2026-03-17 01:44:11 -07:00
Teknium	949fac192f	fix(tools): remove unnecessary crontab requirement from cronjob tool (#1638) * fix(tools): remove unnecessary crontab requirement from cronjob tool The hermes cron system is internal — it uses a JSON-based scheduler ticked by the gateway (cron/scheduler.py), not system crontab. The check for shutil.which('crontab') was preventing the cronjob tool from being available in environments without crontab installed (e.g. minimal Ubuntu containers). Changes: - Remove shutil.which('crontab') check from check_cronjob_requirements() - Remove unused shutil import - Update docstring to clarify internal scheduler is used - Update tests to reflect new behavior and add coverage for all session modes (interactive, gateway, exec_ask) Fixes #1589 * test: add HERMES_EXEC_ASK coverage for cronjob requirements Adds missing test for the exec_ask session mode, complementing the cherry-picked fix from PR #1633. --------- Co-authored-by: Bartok9 <bartokmagic@proton.me>	2026-03-17 01:40:02 -07:00
Teknium	e3f9894caf	fix: send_animation metadata, MarkdownV2 inline code splitting, tirith cosign-free install (#1626) * fix: Anthropic OAuth compatibility — Claude Code identity fingerprinting Anthropic routes OAuth/subscription requests based on Claude Code's identity markers. Without them, requests get intermittent 500 errors (~25% failure rate observed). This matches what pi-ai (clawdbot) and OpenCode both implement for OAuth compatibility. Changes (OAuth tokens only — API key users unaffected): 1. Headers: user-agent 'claude-cli/2.1.2 (external, cli)' + x-app 'cli' 2. System prompt: prepend 'You are Claude Code, Anthropic's official CLI' 3. System prompt sanitization: replace Hermes/Nous references 4. Tool names: prefix with 'mcp_' (Claude Code convention for non-native tools) 5. Tool name stripping: remove 'mcp_' prefix from response tool calls Before: 9/12 OK, 1 hard fail, 4 needed retries (~25% error rate) After: 16/16 OK, 0 failures, 0 retries (0% error rate) * fix: three gateway issues from user error logs 1. send_animation missing metadata kwarg (base.py) - Base class send_animation lacked the metadata parameter that the call site in base.py line 917 passes. Telegram's override accepted it, but any platform without an override (Discord, Slack, etc.) hit TypeError. Added metadata to base class signature. 2. MarkdownV2 split-inside-inline-code (base.py truncate_message) - truncate_message could split at a space inside an inline code span (e.g. `function(arg1, arg2)`), leaving an unpaired backtick and unescaped parentheses in the chunk. Telegram rejects with 'character ( is reserved'. Added inline code awareness to the split-point finder — detects odd backtick counts and moves the split before the code span. 3. tirith auto-install without cosign (tirith_security.py) - Previously required cosign on PATH for auto-install, blocking install entirely with a warning if missing. Now proceeds with SHA-256 checksum verification only when cosign is unavailable. Cosign is still used for full supply chain verification when present. If cosign IS present but verification explicitly fails, install is still aborted (tampered release).	2026-03-16 23:39:41 -07:00
Muhammet Eren Karakuş	43b8ecd172	fix(tests): use case-insensitive regex in singularity preflight tests pytest.raises(match=...) is case-sensitive by default. The error message starts with "Neither" (capital N) but the regex used lowercase "neither", causing CI failures on Linux.	2026-03-16 19:01:39 +03:00
Muhammet Eren Karakuş	606f57a3ab	fix(terminal): add Singularity/Apptainer preflight availability check When neither apptainer nor singularity is installed, the Singularity backend silently defaults to "singularity" and fails with a cryptic FileNotFoundError inside _start_instance(). Add a preflight check that resolves the executable and verifies it responds, raising a clear RuntimeError with install instructions on failure. Closes #1511	2026-03-16 18:25:20 +03:00
teknium1	b72f522e30	test: fake minisweagent for docker cwd mount regressions Make the new Docker cwd-mount tests pass in CI environments that do not have the minisweagent package installed by injecting a fake module instead of monkeypatching an import path that may not exist.	2026-03-16 05:40:05 -07:00
teknium1	780ddd102b	fix(docker): gate cwd workspace mount behind config Keep Docker sandboxes isolated by default. Add an explicit terminal.docker_mount_cwd_to_workspace opt-in, thread it through terminal/file environment creation, and document the security tradeoff and config.yaml workflow clearly.	2026-03-16 05:20:56 -07:00
Bartok9	8cdbbcaaa2	fix(docker): auto-mount host CWD to /workspace Fixes #1445 — When using Docker backend, the user's current working directory is now automatically bind-mounted to /workspace inside the container. This allows users to run `cd my-project && hermes` and have their project files accessible to the agent without manual volume config. Changes: - Add host_cwd and auto_mount_cwd parameters to DockerEnvironment - Capture original host CWD in _get_env_config() before container fallback - Pass host_cwd through _create_environment() to Docker backend - Add TERMINAL_DOCKER_NO_AUTO_MOUNT env var to disable if needed - Skip auto-mount when /workspace is already explicitly mounted - Add tests for auto-mount behavior - Add documentation for the new feature The auto-mount is skipped when: 1. TERMINAL_DOCKER_NO_AUTO_MOUNT=true is set 2. User configured docker_volumes with :/workspace 3. persistent_filesystem=true (persistent sandbox mode) This makes the Docker backend behave more intuitively — the agent operates on the user's actual project directory by default.	2026-03-16 05:20:21 -07:00
Teknium	c1da1fdcd5	feat: auto-detect provider when switching models via /model (#1506) When typing /model deepseek-chat while on a different provider, the model name now auto-resolves to the correct provider instead of silently staying on the wrong one and causing API errors. Detection priority: 1. Direct provider with credentials (e.g. DEEPSEEK_API_KEY set) 2. OpenRouter catalog match with proper slug remapping 3. Direct provider without creds (clear error beats silent failure) Also adds DeepSeek as a first-class API-key provider — just set DEEPSEEK_API_KEY and /model deepseek-chat routes directly. Bare model names get remapped to proper OpenRouter slugs: /model gpt-5.4 → openai/gpt-5.4 /model claude-opus-4.6 → anthropic/claude-opus-4.6 Salvages the concept from PR #1177 by @virtaava with credential awareness and OpenRouter slug mapping added. Co-authored-by: virtaava <virtaava@users.noreply.github.com>	2026-03-16 04:34:45 -07:00
Teknium	dd7921d514	fix(honcho): isolate session routing for multi-user gateway (#1500) Salvaged from PR #1470 by adavyas. Core fix: Honcho tool calls in a multi-session gateway could route to the wrong session because honcho_tools.py relied on process-global state. Now threads session context through the call chain: AIAgent._invoke_tool() → handle_function_call() → registry.dispatch() → handler **kw → _resolve_session_context() Changes: - Add _resolve_session_context() to prefer per-call context over globals - Plumb honcho_manager + honcho_session_key through handle_function_call - Add sync_honcho=False to run_conversation() for synthetic flush turns - Pass honcho_session_key through gateway memory flush lifecycle - Harden gateway PID detection when /proc cmdline is unreadable - Make interrupt test scripts import-safe for pytest-xdist - Wrap BibTeX examples in Jekyll raw blocks for docs build - Fix thread-order-dependent assertion in client lifecycle test - Expand Honcho docs: session isolation, lifecycle, routing internals Dropped from original PR: - Indentation change in _create_request_openai_client that would move client creation inside the lock (causes unnecessary contention) Co-authored-by: adavyas <adavyas@users.noreply.github.com>	2026-03-16 00:23:47 -07:00
teknium1	1f72ce71b7	fix: restore local STT fallback for gateway voice notes Restore local STT command fallback for voice transcription, detect whisper and ffmpeg in common local install paths, and avoid bogus no-provider messaging when only a backend-specific key is missing.	2026-03-15 21:51:40 -07:00
teknium1	01e62c067b	merge: resolve conflicts with origin/main (SSH preflight check)	2026-03-15 21:13:40 -07:00
Teknium	ceb970c559	fix(terminal): add SSH preflight check (#1486)	2026-03-15 21:09:07 -07:00
teknium1	210d5ade1e	feat(tools): centralize tool emoji metadata in registry + skin integration - Add 'emoji' field to ToolEntry and 'get_emoji()' to ToolRegistry - Add emoji= to all 50+ registry.register() calls across tool files - Add get_tool_emoji() helper in agent/display.py with 3-tier resolution: skin override → registry default → hardcoded fallback - Replace hardcoded emoji maps in run_agent.py, delegate_tool.py, and gateway/run.py with centralized get_tool_emoji() calls - Add 'tool_emojis' field to SkinConfig so skins can override per-tool emojis (e.g. ares skin could use swords instead of wrenches) - Add 11 tests (5 registry emoji, 6 display/skin integration) - Update AGENTS.md skin docs table Based on the approach from PR #1061 by ForgingAlex (emoji centralization in registry). This salvage fixes several issues from the original: - Does NOT split the cronjob tool (which would crash on missing schemas) - Does NOT change image_generate toolset/requires_env/is_async - Does NOT delete existing tests - Completes the centralization (gateway/run.py was missed) - Hooks into the skin system for full customizability	2026-03-15 20:21:21 -07:00
teknium1	33ebedc76d	feat: enable persistent shell by default for SSH, add config option SSH persistent shell now defaults to true — non-local backends benefit most from state persistence across execute() calls. Local backend remains opt-in via TERMINAL_LOCAL_PERSISTENT env var. New config.yaml option: terminal.persistent_shell (default: true) Controls the default for non-local backends. Users can disable with: hermes config set terminal.persistent_shell false Precedence: per-backend env var > TERMINAL_PERSISTENT_SHELL > default. Wired through cli.py, gateway/run.py, and hermes_cli/config.py so the config.yaml value reaches terminal_tool via env var bridge.	2026-03-15 20:17:13 -07:00
teknium1	5b80654198	feat(tools): add persistent shell mode to local and SSH backends Cherry-picked from PR #1067 by alt-glitch. Adds PersistentShellMixin with file-based IPC protocol for long-lived bash shells. LocalEnvironment and SSHEnvironment gain persistent=True option. Controlled via TERMINAL_LOCAL_PERSISTENT / TERMINAL_SSH_PERSISTENT env vars. Fixes latent stderr pipe buffer deadlock. Co-authored-by: alt-glitch <balyan.sid@gmail.com>	2026-03-15 20:13:02 -07:00
Teknium	471c663fdf	fix(cli): silence tirith prefetch install warnings at startup (#1452)	2026-03-15 18:07:03 -07:00
Teknium	64d333204b	Merge pull request #1242 from NousResearch/fix/file-tool-log-noise fix: reduce file tool log noise	2026-03-15 11:11:18 -07:00
alt-glitch	4511322f56	Merge origin/main into sid/persistent-backend Resolve conflict in local.py: keep refactored _make_run_env helper over inline _sanitize_subprocess_env logic.	2026-03-15 21:08:11 +05:30
teyrebaz33	20f381cfb6	fix: preserve thread context for cronjob deliver=origin When a cronjob is created from within a Telegram or Slack thread, deliver=origin was posting to the parent channel instead of the thread. Root cause: the gateway never set HERMES_SESSION_THREAD_ID in the session environment, so cronjob_tools.py could not capture thread_id into the job's origin metadata — even though the scheduler already reads origin.get('thread_id'). Fix: - gateway/run.py: set HERMES_SESSION_THREAD_ID when thread_id is present on the session context, and clear it in _clear_session_env - tools/cronjob_tools.py: read HERMES_SESSION_THREAD_ID into origin Closes #1219	2026-03-15 06:57:00 -07:00
teknium1	b177b4abad	fix(security): block gateway and tool env vars in subprocesses Extend subprocess env sanitization beyond provider credentials by blocking Hermes-managed tool, messaging, and related gateway runtime vars. Reuse a shared sanitizer in LocalEnvironment and ProcessRegistry so background and PTY processes honor the same blocklist and _HERMES_FORCE_ escape hatch. Add regression coverage for local env execution and process_registry spawning.	2026-03-15 02:51:04 -07:00
Teknium	fd0e1aac72	Merge pull request #1400 from NousResearch/hermes/hermes-45b79a59-clawhub-search fix: harden ClawHub skill search exact matches	2026-03-14 23:17:24 -07:00
teknium1	8ccd14a0d4	fix: improve clawhub skill search matching	2026-03-14 23:15:04 -07:00
teknium1	df9020dfa3	fix: harden clawhub skill search exact matches	2026-03-14 22:31:09 -07:00
Teknium	c6fb7f6463	Merge pull request #1399 from NousResearch/hermes/hermes-629f8bde fix(#1002): expand environment blocklist for terminal isolation	2026-03-14 22:30:05 -07:00
teknium1	672dc1666f	test: cover extra provider env blocklist vars	2026-03-14 22:29:35 -07:00
Teknium	5b11570517	Merge pull request #1398 from NousResearch/hermes/hermes-1b6f4583 fix(cron): support per-job runtime overrides	2026-03-14 22:29:30 -07:00
Synergy	28b3764d1e	fix(cron): support per-job runtime overrides Salvaged from PR #1292 onto current main. Preserve per-job model, provider, and base_url overrides in cron execution, persist them in job records, expose them through the cronjob tool create/update paths, and add regression coverage. Deliberately does not persist per-job api_key values.	2026-03-14 22:22:31 -07:00
Teknium	62f1c2b622	Merge pull request #1397 from NousResearch/hermes/hermes-629f8bde fix: escape parens and braces in fork bomb regex pattern	2026-03-14 22:17:16 -07:00
Teknium	84d99f7754	Merge pull request #1394 from NousResearch/hermes/hermes-eca4a640 fix: honor stt.enabled false across gateway transcription	2026-03-14 22:11:47 -07:00
teknium1	d5b64ebdb3	fix: preserve legacy approval keys after pattern key migration	2026-03-14 22:10:39 -07:00
teknium1	f8ceadbad0	fix: propagate STT disable through shared transcription config - add stt.enabled to the default user config - make transcription_tools respect the disabled flag globally - surface disabled state cleanly in voice mode diagnostics - add regression coverage for disabled STT provider selection	2026-03-14 22:09:59 -07:00
0xbyt4	4a93cfd889	fix: use description as pattern_key to prevent approval collisions pattern_key was derived by splitting the regex on \b and taking [1], so patterns starting with the same word (e.g. find -exec rm and find -delete) produced the same key "find". Approving one silently approved the other. Using the unique description string as the key eliminates all collisions.	2026-03-14 22:07:58 -07:00
0xbyt4	e6417cb7bc	fix: escape parens and braces in fork bomb regex pattern The fork bomb regex used `()` (empty capture group) and unescaped `{}` instead of literal `\(\)` and `\{\}`. This meant the classic fork bomb `:(){ :|:& };:` was never detected. Also added `\s*` between `:` and `&` and between `;` and trailing `:` to catch whitespace variants.	2026-03-14 22:06:44 -07:00
Teknium	f9a61a0d9e	Merge pull request #1383 from NousResearch/hermes/hermes-7ef7cb6a fix: add project root to PYTHONPATH in execute_code sandbox	2026-03-14 21:41:50 -07:00
teknium1	0614969f7b	test: cover repo-root imports in execute_code sandbox	2026-03-14 21:41:12 -07:00
teknium1	f6ff6639e8	fix: complete salvaged cronjob dependency check Add regression coverage for cronjob availability and import shutil for the crontab PATH check added from PR #1380.	2026-03-14 21:39:59 -07:00
teknium1	9f6bccd76a	feat: add direct endpoint overrides for auxiliary and delegation Add base_url/api_key overrides for auxiliary tasks and delegation so users can route those flows straight to a custom OpenAI-compatible endpoint without having to rely on provider=main or named custom providers. Also clear gateway session env vars in test isolation so the full suite stays deterministic when run from a messaging-backed agent session.	2026-03-14 21:11:37 -07:00
Teknium	88a48037d1	Merge pull request #1367 from NousResearch/hermes/hermes-aa701810 refactor: unify vision backend gating	2026-03-14 20:31:58 -07:00
teknium1	dc11b86e4b	refactor: unify vision backend gating	2026-03-14 20:22:13 -07:00
teknium1	3229e434b8	Merge origin/main into hermes/hermes-5d160594	2026-03-14 19:34:05 -07:00
teknium1	2536ff328b	fix: prefer prompt names for multi-skill cron jobs	2026-03-14 19:28:52 -07:00
teknium1	c3ea620796	feat: add multi-skill cron editing and docs	2026-03-14 19:18:10 -07:00
teknium1	7b140b31e6	fix: suppress duplicate cron sends to auto-delivery targets Allow cron runs to keep using send_message for additional destinations, but skip same-target sends when the scheduler will already auto-deliver the final response there. Add prompt/tool guidance, docs, and regression coverage for origin/home-channel resolution and thread-aware comparisons.	2026-03-14 19:07:50 -07:00
alt-glitch	879b7d3fbf	fix(tests): update mock stdout in env blocklist tests The fake_popen mock used iter([]) for proc.stdout which doesn't support .close(). Use MagicMock with __iter__ instead, since _drain_stdout now calls proc.stdout.close() in its finally block. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-03-15 02:48:05 +05:30
balyan.sid@gmail.com	9001b34146	simplify docstrings, fix some bugs	2026-03-15 01:20:42 +05:30
balyan.sid@gmail.com	861202b56c	wip: add persistent shell to ssh and local terminal backends	2026-03-15 01:20:42 +05:30
teknium1	df5c61b37c	feat: compress cron management into one tool	2026-03-14 12:21:50 -07:00
Teknium	1114841a2c	Merge pull request #1329 from NousResearch/hermes/hermes-2f2b4807 fix: tighten memory and session recall guidance	2026-03-14 11:38:54 -07:00
teknium1	5319bb6ac4	fix: tighten memory and session recall guidance Remove diary-style memory framing from the system prompt and memory tool schema, explicitly steer task/session logs to session_search, and clarify that session_search is for cross-session recall after checking the current conversation first. Add regression tests for the updated guidance text.	2026-03-14 11:36:47 -07:00
Teknium	80a243efe6	Merge pull request #1333 from NousResearch/hermes/hermes-1fc28d17 fix: improve browser cleanup, local browser PATH setup, and screenshot recovery	2026-03-14 11:36:09 -07:00
Dave Tist	895fe5a5d3	Fix browser cleanup consistency and screenshot recovery Unify browser session teardown so manual close, inactivity cleanup, and emergency shutdown all follow the same cleanup path instead of partially duplicating logic. This changes browser_close() to delegate to cleanup_browser(), which means recording shutdown, Browserbase release, activity bookkeeping cleanup, and local socket-directory removal now happen consistently. It also updates emergency cleanup to route through cleanup_all_browsers() and explicitly clear in-memory tracking state after teardown so stale active-session, last-activity, and recording entries are not left behind on exit. The screenshot fallback path has also been fixed. _extract_screenshot_path_from_text() now matches real absolute PNG paths, including quoted output, so browser_vision() can recover screenshots when agent-browser emits human-readable text instead of JSON. Regression coverage was added in tests/tools/test_browser_cleanup.py for screenshot path extraction, cleanup_browser() state removal, browser_close() delegation, and emergency cleanup state clearing. Verified with: - python -m pytest tests/tools/test_browser_cleanup.py -q - python -m pytest tests/tools/test_browser_console.py tests/gateway/test_send_image_file.py -q	2026-03-14 11:28:26 -07:00
Stable Genius	3325e51e53	fix(skills): honor policy table for dangerous verdicts Salvaged from PR #1007 by stablegenius49. - let INSTALL_POLICY decide dangerous verdict handling for builtin skills - allow --force to override blocked dangerous decisions for trusted and community sources - accept --yes / -y as aliases for --force in /skills install - update regression tests to match the intended policy precedence	2026-03-14 11:27:02 -07:00
Teknium	681f1068ea	Merge pull request #1303 from NousResearch/hermes/hermes-aa653753 feat(skills): integrate skills.sh as a hub source	2026-03-14 09:48:18 -07:00
teknium1	05770520af	test(skills): isolate well-known cache in adapter tests Prevent the mocked well-known adapter tests from sharing index-cache state across runs or xdist workers.	2026-03-14 08:24:59 -07:00
teknium1	43d25af964	feat(skills): add update checks and well-known support Round out the skills hub integration with: - richer skills.sh metadata and security surfacing during inspect/install - generic check/update flows for hub-installed skills - support for well-known Agent Skills endpoints via /.well-known/skills/index.json Also persist upstream bundle metadata in the lock file and add regression coverage plus live-compatible path handling for both skills.sh aliases and well-known endpoints.	2026-03-14 08:21:16 -07:00
Teknium	707f3ff41f	refactor: tighten MoA traceback logging scope (#1307) * improve: add exc_info to MoA error logging * refactor: tighten MoA traceback logging scope Follow up on salvaged PR #998 by limiting exc_info logging to terminal failure paths, avoiding duplicate aggregator errors, and refreshing the MoA default OpenRouter model lineup to current frontier options. --------- Co-authored-by: aydnOktay <xaydinoktay@gmail.com>	2026-03-14 07:53:56 -07:00
teknium1	02c307b004	fix(skills): resolve skills.sh alias installs Harden the skills.sh hub adapter by parsing skill detail pages when search slugs do not map cleanly onto GitHub skill folder names. This adds detail-page resolution for alias-style skills, improves inspect metadata from the page itself, and covers the behavior with regression tests plus live smoke validation for json-render-react.	2026-03-14 06:50:25 -07:00
Teknium	95c0bee7f8	Merge pull request #1299 from NousResearch/hermes/hermes-f5fb1d3b fix: salvage PR #327 voice mode onto current main	2026-03-14 06:45:20 -07:00
teknium1	483a0b5233	feat(skills): integrate skills.sh as a hub source Add a skills.sh-backed source adapter for the Hermes Skills Hub. The new adapter uses skills.sh search results for discovery, falls back to featured homepage links for browse-style queries, and resolves installs / inspects through the underlying GitHub repo using common Agent Skills layout conventions. Also expose skills-sh in CLI source filters and add regression coverage for search, alias resolution, and source routing.	2026-03-14 06:23:36 -07:00
teknium1	04e151714f	feat(mcp): make selective tool loading capability-aware Extend the salvaged MCP filtering work so utility tools are also governed by policy and server capabilities. Store the registered tool subset per server so rediscovery and status reporting stay accurate after filtering.	2026-03-14 06:22:02 -07:00
teyrebaz33	3198cc8fd9	feat(mcp): per-server tool filtering via include/exclude and enabled flag Add optional config keys under each mcp_servers entry: - tools.include: whitelist, only listed tools are registered - tools.exclude: blacklist, all tools except listed are registered - enabled: false: skip server entirely, no connection attempt Backward-compatible: no config keys = all tools registered as before. Tests: TestMCPSelectiveToolLoading (4 tests), 134 passed total.	2026-03-14 06:12:17 -07:00
Oktay Aydin	00a0f18544	fix: clearer terminal backend requirement errors Salvaged from PR #979 onto current main. Preserve the current terminal backend checks while surfacing actionable preflight errors for unknown TERMINAL_ENV values, missing SSH host/user configuration, and missing Modal credentials/config. Tighten the modal regression test so it deterministically exercises the config-missing path.	2026-03-14 06:04:39 -07:00
teknium1	523a1b6faf	merge: salvage PR #327 voice mode branch Merge contributor branch feature/voice-mode onto current main for follow-up fixes.	2026-03-14 06:03:07 -07:00
Teknium	b646440ca0	fix(mcp): resolve npx stdio connection failures (#1291) Salvaged from PR #977 onto current main. Preserves the MCP stdio command resolution and improved error diagnostics, with deterministic regression tests for the npx/node PATH cases. Co-authored-by: kshitij <82637225+kshitijk4poor@users.noreply.github.com>	2026-03-14 05:44:00 -07:00
0xbyt4	eb34c0b09a	fix: voice pipeline hardening — 7 bug fixes with tests 1. Anthropic + ElevenLabs TTS silence: forward full response to TTS callback for non-streaming providers (choices first, then native content blocks fallback). 2. Subprocess timeout kill: play_audio_file now kills the process on TimeoutExpired instead of leaving zombie processes. 3. Discord disconnect cleanup: leave all voice channels before closing the client to prevent leaked state. 4. Audio stream leak: close InputStream if stream.start() fails. 5. Race condition: read/write _on_silence_stop under lock in audio callback thread. 6. _vprint force=True: show API error, retry, and truncation messages even during streaming TTS. 7. _refresh_level lock: read _voice_recording under _voice_lock.	2026-03-14 14:27:21 +03:00
0xbyt4	35748a2fb0	fix: address PR review round 4 — remove web UI, fix audio/import/interface issues Remove web UI gateway (web.py, tests, docs, toolset, env vars, Platform.WEB enum) per maintainer request — Nous is building their own official chat UI. Fix 1: Replace sd.wait() with polling pattern in play_audio_file() to prevent indefinite hang when audio device stalls (consistent with play_beep()). Fix 2: Use importlib.util.find_spec() for faster_whisper/openai availability checks instead of module-level imports that trigger heavy native library loading (CUDA/cuDNN) at import time. Fix 3: Remove inspect.signature() hack in _send_voice_reply() — add **kwargs to Telegram send_voice() so all adapters accept metadata uniformly. Fix 4: Make session loading resilient to removed platform enum values — skip entries with unknown platforms instead of crashing the entire gateway.	2026-03-14 14:27:21 +03:00
0xbyt4	69cb373864	fix: update /voice status to show correct STT provider Voice status was hardcoded to check API keys only. Now uses the actual provider resolution (local/groq/openai) so it correctly shows "local faster-whisper" when installed instead of "Groq" or "MISSING".	2026-03-14 14:27:21 +03:00
0xbyt4	b8f8d3ef9e	feat: integrate faster-whisper local STT with three-provider fallback Merge main's faster-whisper (local, free) with our Groq support into a unified three-provider STT pipeline: local > groq > openai. Provider priority ensures free options are tried first. Each provider has its own transcriber function with model auto-correction, env- overridable endpoints, and proper error handling. 74 tests cover the full provider matrix, fallback chains, model correction, config loading, validation edge cases, and dispatch.	2026-03-14 14:27:21 +03:00
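The `local > groq > openai` priority in the commit above amounts to picking the first available provider from a fixed order. A minimal sketch, assuming availability has already been determined elsewhere; `resolve_stt_provider` and the `available` mapping are hypothetical names, not the project's API.

```python
from typing import Mapping, Optional

# Priority taken from the commit message: free options are tried first.
PROVIDER_ORDER = ("local", "groq", "openai")


def resolve_stt_provider(available: Mapping[str, bool]) -> Optional[str]:
    """Return the highest-priority provider marked available, or None if there is none."""
    for name in PROVIDER_ORDER:
        if available.get(name):
            return name
    return None
```

Returning `None` rather than raising leaves it to the caller to show a user-facing "no STT provider" message.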
0xbyt4	2c84979d77	refactor: extract get_stt_model_from_config helper to eliminate DRY violation Duplicated YAML config parsing for stt.model existed in gateway/run.py and gateway/platforms/discord.py. Moved to a single helper in transcription_tools.py and added 5 tests covering all edge cases.	2026-03-14 14:27:21 +03:00
0xbyt4	34c324ff59	fix(test): use real _strip_markdown_for_tts instead of duplicated copy - Import from tools.tts_tool instead of reimplementing the logic - Fix test_truncates_long_text: truncation is the caller's job, not the function's - Remove unused re import	2026-03-14 14:27:20 +03:00
0xbyt4	eb79dda04b	fix: persistent audio stream and silence detection improvements - Keep InputStream alive across recordings to avoid CoreAudio hang on repeated open/close cycles on macOS. New _ensure_stream() creates the stream once; start()/stop()/cancel() only toggle frame collection. - Add _close_stream_with_timeout() with daemon thread to prevent stream.stop()/close() from blocking indefinitely. - Add generation counter to detect stale stream-open completions after cancel or restart. - Run recorder.cancel() in background thread from Ctrl+C handler to keep the event loop responsive. - Add shutdown() method called on /voice off to release audio resources. - Fix silence timer reset during active speech: use dip tolerance for _resume_start tracker so natural speech pauses (< 0.3s) don't prevent the silence timer from being reset. - Update tests to match persistent stream behavior.	2026-03-14 14:27:20 +03:00
0xbyt4	eec04d180a	fix(test): update play_beep test to match polling-based implementation play_beep was changed from sd.wait() to a poll loop + sd.stop() in 302e1fe but the test was not updated. Now asserts sd.stop() instead of sd.wait().	2026-03-14 14:27:20 +03:00
0xbyt4	9d58cafec9	fix: move process_loop voice restart to daemon thread, use _cprint consistently - process_loop's continuous mode restart called _voice_start_recording() directly, blocking the loop if play_beep/sd.wait hangs — queued user input would stall silently. Dispatch to daemon thread like Ctrl+B handler. - Replace print() with _cprint() in _handle_voice_command for consistency with the rest of the voice mode code.	2026-03-14 14:27:20 +03:00
0xbyt4	d0e3b39e69	fix: prevent Ctrl+B key handler from blocking prompt_toolkit event loop The handle_voice_record key binding runs in prompt_toolkit's event-loop thread. When silence auto-stopped recording, _voice_recording was False but recorder.stop() still held AudioRecorder._lock. A concurrent Ctrl+B press entered the START path and blocked on that lock, freezing all keyboard input. Three changes: - Set _voice_processing atomically with _voice_recording=False in _voice_stop_and_transcribe to close the race window - Add _voice_processing guard in the START path to prevent starting while stop/transcribe is still running - Dispatch _voice_start_recording to a daemon thread so play_beep (sd.wait) and AudioRecorder.start (lock acquire) never block the event loop	2026-03-14 14:27:20 +03:00
0xbyt4	ecc3dd7c63	test: add comprehensive voice mode test coverage (86 tests) - Add TestStreamingApiCall (11 tests) for _streaming_api_call in test_run_agent.py - Add regression tests for all 7 bug fixes (edge_tts lazy import, output_stream cleanup, ctrl+c continuous reset, disable stops TTS, config key, chat cleanup, browser_tool signal handler removal) - Add real behavior tests for CLI voice methods via _make_voice_cli() fixture: TestHandleVoiceCommandReal (7), TestEnableVoiceModeReal (7), TestDisableVoiceModeReal (6), TestVoiceSpeakResponseReal (7), TestVoiceStopAndTranscribeReal (12)	2026-03-14 14:27:20 +03:00
0xbyt4	6e51729c4c	fix: remove browser_tool signal handlers that cause voice mode deadlock browser_tool.py registered SIGINT/SIGTERM handlers that called sys.exit() at module import time. When a signal arrived during a lock acquisition (e.g. AudioRecorder._lock in voice mode), SystemExit was raised inside prompt_toolkit's async event loop, corrupting coroutine state and making the process unkillable (required SIGKILL). atexit handler already ensures browser sessions are cleaned up on any normal exit path, so the signal handlers were redundant and harmful.	2026-03-14 14:27:20 +03:00
0xbyt4	ddfd6e0c59	fix: resolve 6 voice mode bugs found during audit - edge_tts NameError: _generate_edge_tts now calls _import_edge_tts() instead of referencing bare module name (tts_tool.py) - TTS thread leak: chat() finally block sends sentinel to text_queue, sets stop_event, and joins tts_thread on exception paths (cli.py) - output_stream leak: moved close() into finally block so audio device is released even on exception (tts_tool.py) - Ctrl+C continuous mode: cancel handler now resets _voice_continuous to prevent auto-restart after user cancels recording (cli.py) - _disable_voice_mode: now calls stop_playback() and sets _voice_tts_done so TTS stops when voice mode is turned off (cli.py) - _show_voice_status: reads record key from config instead of hardcoding Ctrl+B (cli.py)	2026-03-14 14:27:20 +03:00
0xbyt4	a78249230c	fix: address voice mode PR review (streaming TTS, prompt cache, _vprint) Bug A: Replace stale _HAS_ELEVENLABS/_HAS_AUDIO boolean imports with lazy import function calls (_import_elevenlabs, _import_sounddevice). The old constants no longer exist in tts_tool -- the try/except silently swallowed the ImportError, leaving streaming TTS dead. Bug B: Use user message prefix instead of modifying system prompt for voice mode instruction. Changing ephemeral_system_prompt mid-session invalidates the prompt cache. Now the concise-response hint is prepended to the user_message passed to run_conversation while conversation_history keeps the original text. Minor: Add force parameter to _vprint so critical error messages (max retries, non-retryable errors, API failures) are always shown even during streaming TTS playback. Tests: 15 new tests in test_voice_cli_integration.py covering all three fixes -- lazy import activation, message prefix behavior, history cleanliness, system prompt stability, and AST verification that all critical _vprint calls use force=True.	2026-03-14 14:27:20 +03:00
0xbyt4	b859dfab16	fix: address voice mode review feedback 1. Fully lazy imports: sounddevice, numpy, elevenlabs, edge_tts, and openai are never imported at module level. Each is imported only when the feature is explicitly activated, preventing crashes in headless environments (SSH, Docker, WSL, no PortAudio). 2. No core agent loop changes: streaming TTS path extracted from _interruptible_api_call() into separate _streaming_api_call() method. The original method is restored to its upstream form. 3. Configurable key binding: push-to-talk key changed from Ctrl+R (conflicts with readline reverse-search) to Ctrl+B by default. Configurable via voice.push_to_talk_key in config.yaml. 4. Environment detection: new detect_audio_environment() function checks for SSH, Docker, WSL, and missing audio devices before enabling voice mode. Auto-disables with clear warnings in incompatible environments. 5. Graceful degradation: every audio touchpoint (sd.play, sd.InputStream, sd.OutputStream) wrapped in try/except with ImportError/OSError handling. Failures produce warnings, not crashes.	2026-03-14 14:27:20 +03:00
0xbyt4	dad865e920	fix: fix silence detection bugs and add Phase 4 voice mode features Fix 3 critical bugs in silence detection: - Micro-pause tolerance now tracks dip duration (not time since speech start) - Peak RMS check in stop() prevents discarding recordings with real speech - Reduced min_speech_duration from 0.5s to 0.3s for reliable speech confirmation Phase 4 features: configurable silence params, visual audio level indicator, voice system prompt, tool call audio cues, TTS interrupt, continuous mode auto-restart, interruptable playback via Popen tracking.	2026-03-14 14:26:30 +03:00
0xbyt4	32b033c11c	feat: add silence filter, hallucination guard, and continuous mode control - Skip silent recordings before STT call (RMS check in AudioRecorder.stop) - Filter known Whisper hallucinations ("Thank you.", "Bye." etc.) - Continuous mode: Ctrl+R starts loop, Ctrl+R during recording exits it - Wait for TTS to finish before auto-restart to avoid recording speaker - Silence timeout increased to 3s for natural pauses - Tests: hallucination filter, silent recording skip, real speech passthrough	2026-03-14 14:25:28 +03:00
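The silence filter and hallucination guard from the commit above could look roughly like this. It is a sketch under stated assumptions: the threshold value and all function names are hypothetical, and the phrase set contains only the two examples quoted in the commit message, not the project's full list.

```python
import math
from typing import Optional, Sequence

# Only the two phrases quoted in the commit message; the real list is longer.
KNOWN_HALLUCINATIONS = {"thank you.", "bye."}
# Hypothetical threshold for 16-bit PCM samples.
SILENCE_RMS_THRESHOLD = 200.0


def rms(samples: Sequence[int]) -> float:
    """Root-mean-square amplitude of a block of PCM samples."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def is_silent(samples: Sequence[int], threshold: float = SILENCE_RMS_THRESHOLD) -> bool:
    """True when the recording is too quiet to be worth sending to STT."""
    return rms(samples) < threshold


def filter_transcript(text: str) -> Optional[str]:
    """Drop empty transcripts and known Whisper hallucinations."""
    cleaned = text.strip()
    if not cleaned or cleaned.lower() in KNOWN_HALLUCINATIONS:
        return None
    return cleaned
```

Checking RMS before the STT call avoids both the API cost and the hallucinated phrases that Whisper-style models tend to produce on silent input.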
0xbyt4	bfd9c97705	feat: add Phase 4 low-latency features for voice mode - Audio cues: beep on record start (880Hz), double beep on stop (660Hz) - Silence detection: auto-stop recording after 3s of silence (RMS-based) - Continuous mode: auto-restart recording after agent responds - Ctrl+R starts continuous mode, Ctrl+R during recording exits it - Waits for TTS to finish before restarting to avoid recording speaker - Tests: 7 new tests for beep generation and silence detection	2026-03-14 14:25:28 +03:00
0xbyt4	a69bd55b5a	fix: isolate GROQ_API_KEY in test_missing_stt_key test The test was failing because GROQ_API_KEY leaked from the environment. Now both VOICE_TOOLS_OPENAI_KEY and GROQ_API_KEY are removed to properly test the "no STT key" scenario.	2026-03-14 14:25:28 +03:00
0xbyt4	c23928d089	fix: improve voice mode robustness and add integration tests - Show TTS errors to user instead of silently logging - Improve markdown stripping: code blocks, URLs, links, horizontal rules - Fix stripping order: process markdown links before removing URLs - Add threading.Lock for voice state variables (cross-thread safety) - Add 14 CLI integration tests (markdown stripping, command parsing, thread safety) - Total: 47 voice-related tests	2026-03-14 14:25:28 +03:00
0xbyt4	37b01ab964	test: add transcription_tools tests for multi-provider STT - Provider resolution: OpenAI priority, Groq fallback, no keys - Model auto-correction: Groq corrects OpenAI models and vice versa - Success path: transcription, API errors, whitespace stripping - 12 new tests, 33 total voice-related tests	2026-03-14 14:25:28 +03:00
0xbyt4	1a6fbef8a9	feat: add voice mode with push-to-talk and TTS output for CLI Implements Issue #314 Phase 2 & 3: - /voice command to toggle voice mode (on/off/tts/status) - Ctrl+Space push-to-talk recording via sounddevice - Whisper STT transcription via existing transcription_tools - Optional TTS response playback via existing tts_tool - Visual indicators in prompt (recording/transcribing/voice) - 21 unit tests, all mocked (no real mic/API) - Optional deps: sounddevice, numpy (pip install hermes-agent[voice])	2026-03-14 14:25:28 +03:00
teknium1	5c9a84219d	fix: complete send_message MEDIA delivery salvage - prevent raw MEDIA tag leakage outside the gateway pipeline - make extract_media handle quoted/backticked paths and optional whitespace - send Telegram media natively with explicit error/warning handling - add regression tests for Telegram media dispatch and MEDIA parsing	2026-03-14 04:02:03 -07:00
teknium1	96c250e538	test: cover pipe characters in v4a patch apply Add a regression test for apply_v4a_operations when read content contains a literal pipe character outside a line-number prefix.	2026-03-14 03:54:46 -07:00
Teknium	6036793f60	fix: clearer docker backend preflight errors (#1276) * feat: improve context compaction handoff summaries Adapt PR #916 onto current main by replacing the old context summary marker with a clearer handoff wrapper, updating the summarization prompt for resume-oriented summaries, and preserving the current call_llm-based compression path. * fix: clearer error when docker backend is unavailable * fix: preserve docker discovery in backend preflight Follow up on salvaged PR #940 by reusing find_docker() during the new availability check so non-PATH Docker Desktop installs still work. Add a regression test covering the resolved executable path. --------- Co-authored-by: aydnOktay <xaydinoktay@gmail.com>	2026-03-14 02:53:02 -07:00
teknium1	6f1889b0fa	fix: preserve current approval semantics for tirith guard Restore gateway/run.py to current main behavior while keeping tirith startup and pattern_keys replay, preserve yolo and non-interactive bypass semantics in the combined guard, and add regression tests for yolo and view-full flows.	2026-03-14 00:17:04 -07:00
sheeki003	375ce8a881	feat(security): add tirith pre-exec command scanning Integrate tirith as a pre-execution security scanner that detects homograph URLs, pipe-to-interpreter patterns, terminal injection, zero-width Unicode, and environment variable manipulation — threats the existing 50-pattern dangerous command detector doesn't cover. Architecture: gather-then-decide — both tirith and the dangerous command detector run before any approval prompt, preventing gateway force=True replay from bypassing one check when only the other was shown to the user. New files: - tools/tirith_security.py: subprocess wrapper with auto-installer, mandatory cosign provenance verification, non-blocking background download, disk-persistent failure markers with retryable-cause tracking (cosign_missing auto-clears when cosign appears on PATH) - tests/tools/test_tirith_security.py: 62 tests covering exit code mapping, fail_open, cosign verification, background install, HERMES_HOME isolation, and failure recovery - tests/tools/test_command_guards.py: 21 integration tests for the combined guard orchestration Modified files: - tools/approval.py: add check_all_command_guards() orchestrator, add allow_permanent parameter to prompt_dangerous_approval() - tools/terminal_tool.py: replace _check_dangerous_command with consolidated check_all_command_guards - cli.py: update _approval_callback for allow_permanent kwarg, call ensure_installed() at startup - gateway/run.py: iterate pattern_keys list on replay approval, call ensure_installed() at startup - hermes_cli/config.py: add security config defaults, split commented sections for independent fallback - cli-config.yaml.example: document tirith security config	2026-03-14 00:11:27 -07:00
Teknium	a20d373945	fix: worktree-aware minisweagent path discovery + clean up requirements check (#1248) Salvage of PR #1246 by ChatGPT (teknium1 session), resolved against current main which already includes #1239. Changes: - Add minisweagent_path.py: worktree-aware helper that finds mini-swe-agent/src from either the current checkout or the main checkout behind a git worktree - Use the helper in tools/terminal_tool.py and mini_swe_runner.py instead of naive path-relative lookup that fails in worktrees - Clean up check_terminal_requirements(): - local: return True (no minisweagent dep, per #1239) - singularity/ssh: remove unnecessary minisweagent imports - docker/modal: use importlib.util.find_spec with clear error - Add regression tests for worktree path discovery and tool resolution	2026-03-13 23:39:51 -07:00
Teknium	21422dba44	Merge pull request #1239 from NousResearch/hermes/hermes-07d947aa fix: stop local terminal warning without minisweagent	2026-03-13 22:14:44 -07:00
teknium1	b59da08730	fix: reduce file tool log noise - treat git diff --cached --quiet rc=1 as an expected checkpoint state instead of logging it as an error - downgrade expected write PermissionError/EROFS/EACCES failures out of error logging while keeping unexpected exceptions at error level - add regression tests for both logging behaviors	2026-03-13 22:14:00 -07:00
teknium1	329f83ff2d	fix: stop local terminal warning without minisweagent	2026-03-13 22:00:36 -07:00
0xIbra	437ec17125	fix(cli): respect HERMES_HOME in all remaining hardcoded ~/.hermes paths Several files resolved paths via Path.home() / ".hermes" or os.path.expanduser("~/.hermes/..."), bypassing the HERMES_HOME environment variable. This broke isolation when running multiple Hermes instances with distinct HERMES_HOME directories. Replace all hardcoded paths with calls to get_hermes_home() from hermes_cli.config, consistent with the rest of the codebase. Files fixed: - tools/process_registry.py (processes.json) - gateway/pairing.py (pairing/) - gateway/sticker_cache.py (sticker_cache.json) - gateway/channel_directory.py (channel_directory.json, sessions.json) - gateway/config.py (gateway.json, config.yaml, sessions_dir) - gateway/mirror.py (sessions/) - gateway/hooks.py (hooks/) - gateway/platforms/base.py (image_cache/, audio_cache/, document_cache/) - gateway/platforms/whatsapp.py (whatsapp/session) - gateway/delivery.py (cron/output) - agent/auxiliary_client.py (auth.json) - agent/prompt_builder.py (SOUL.md) - cli.py (config.yaml, images/, pastes/, history) - run_agent.py (logs/) - tools/environments/base.py (sandboxes/) - tools/environments/modal.py (modal_snapshots.json) - tools/environments/singularity.py (singularity_snapshots.json) - tools/tts_tool.py (audio_cache) - hermes_cli/status.py (cron/jobs.json, sessions.json) - hermes_cli/gateway.py (logs/, whatsapp session) - hermes_cli/main.py (whatsapp/session) Tests updated to use HERMES_HOME env var instead of patching Path.home(). Closes #892 (cherry picked from commit 78ac1bba43b8b74a934c6172f2c29bb4d03164b9)	2026-03-13 21:32:53 -07:00
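The pattern the commit above applies is to resolve every path through one helper instead of hardcoding `~/.hermes`. The commit names `get_hermes_home()` in `hermes_cli.config`, but its body is not shown in this log, so the version below is only an assumed sketch; `hermes_path` is a hypothetical convenience wrapper.

```python
import os
from pathlib import Path


def get_hermes_home() -> Path:
    """Assumed behavior: honor HERMES_HOME, otherwise fall back to ~/.hermes."""
    override = os.environ.get("HERMES_HOME")
    return Path(override) if override else Path.home() / ".hermes"


def hermes_path(*parts: str) -> Path:
    """Hypothetical helper: build a path under the resolved Hermes home."""
    return get_hermes_home().joinpath(*parts)
```

Reading the environment variable on every call, rather than caching it at import time, is what lets tests set `HERMES_HOME` instead of patching `Path.home()`.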
Teknium	07927f6bf2	feat(stt): add free local whisper transcription via faster-whisper (#1185) * fix: Home Assistant event filtering now closed by default Previously, when no watch_domains or watch_entities were configured, ALL state_changed events passed through to the agent, causing users to be flooded with notifications for every HA entity change. Now events are dropped by default unless the user explicitly configures: - watch_domains: list of domains to monitor (e.g. climate, light) - watch_entities: list of specific entity IDs to monitor - watch_all: true (new option — opt-in to receive all events) A warning is logged at connect time if no filters are configured, guiding users to set up their HA platform config. All 49 gateway HA tests + 52 HA tool tests pass. * docs: update Home Assistant integration documentation - homeassistant.md: Fix event filtering docs to reflect closed-by-default behavior. Add watch_all option. Replace Python dict config example with YAML. Fix defaults table (was incorrectly showing 'all'). Add required configuration warning admonition. - environment-variables.md: Add HASS_TOKEN and HASS_URL to Messaging section. - messaging/index.md: Add Home Assistant to description, architecture diagram, platform toolsets table, and Next Steps links. * fix(terminal): strip provider env vars from background and PTY subprocesses Extends the env var blocklist from #1157 to also cover the two remaining leaky paths in process_registry.py: - spawn_local() PTY path (line 156) - spawn_local() background Popen path (line 197) Both were still using raw os.environ, leaking provider vars to background processes and interactive PTY sessions. Now uses the same dynamic _HERMES_PROVIDER_ENV_BLOCKLIST from local.py. Explicit env_vars passed to spawn_local() still override the blocklist, matching the existing behavior for callers that intentionally need these. Gap identified by PR #1004 (@PeterFile). 
* feat(delegate): add observability metadata to subagent results Enrich delegate_task results with metadata from the child AIAgent: - model: which model the child used - exit_reason: completed | interrupted | max_iterations - tokens.input / tokens.output: token counts - tool_trace: per-tool-call trace with byte sizes and ok/error status Tool trace uses tool_call_id matching to correctly pair parallel tool calls with their results, with a fallback for messages without IDs. Cherry-picked from PR #872 by @omerkaz, with fixes: - Fixed parallel tool call trace pairing (was always updating last entry) - Removed redundant 'iterations' field (identical to existing 'api_calls') - Added test for parallel tool call trace correctness Co-authored-by: omerkaz <omerkaz@users.noreply.github.com> * feat(stt): add free local whisper transcription via faster-whisper Replace OpenAI-only STT with a dual-provider system mirroring the TTS architecture (Edge TTS free / ElevenLabs paid): STT: faster-whisper local (free, default) / OpenAI Whisper API (paid) Changes: - tools/transcription_tools.py: Full rewrite with provider dispatch, config loading, local faster-whisper backend, and OpenAI API backend. Auto-downloads model (~150MB for 'base') on first voice message. Singleton model instance reused across calls. - pyproject.toml: Add faster-whisper>=1.0.0 as core dependency - hermes_cli/config.py: Expand stt config to match TTS pattern with provider selection and per-provider model settings - agent/context_compressor.py: Fix .strip() crash when LLM returns non-string content (dict from llama.cpp, None). Fixes #1100 partially. 
- tests/: 23 new tests for STT providers + 2 for compressor fix - docs/: Updated Voice & TTS page with STT provider table, model sizes, config examples, and fallback behavior Fallback behavior: - Local not installed → OpenAI API (if key set) - OpenAI key not set → local whisper (if installed) - Neither → graceful error message to user Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com> --------- Co-authored-by: omerkaz <omerkaz@users.noreply.github.com> Co-authored-by: Jah-yee <Jah-yee@users.noreply.github.com>	2026-03-13 11:11:05 -07:00
Teknium	02a819b16e	feat(delegate): add observability metadata to subagent results (#1175) * fix: Home Assistant event filtering now closed by default Previously, when no watch_domains or watch_entities were configured, ALL state_changed events passed through to the agent, causing users to be flooded with notifications for every HA entity change. Now events are dropped by default unless the user explicitly configures: - watch_domains: list of domains to monitor (e.g. climate, light) - watch_entities: list of specific entity IDs to monitor - watch_all: true (new option — opt-in to receive all events) A warning is logged at connect time if no filters are configured, guiding users to set up their HA platform config. All 49 gateway HA tests + 52 HA tool tests pass. * docs: update Home Assistant integration documentation - homeassistant.md: Fix event filtering docs to reflect closed-by-default behavior. Add watch_all option. Replace Python dict config example with YAML. Fix defaults table (was incorrectly showing 'all'). Add required configuration warning admonition. - environment-variables.md: Add HASS_TOKEN and HASS_URL to Messaging section. - messaging/index.md: Add Home Assistant to description, architecture diagram, platform toolsets table, and Next Steps links. * fix(terminal): strip provider env vars from background and PTY subprocesses Extends the env var blocklist from #1157 to also cover the two remaining leaky paths in process_registry.py: - spawn_local() PTY path (line 156) - spawn_local() background Popen path (line 197) Both were still using raw os.environ, leaking provider vars to background processes and interactive PTY sessions. Now uses the same dynamic _HERMES_PROVIDER_ENV_BLOCKLIST from local.py. Explicit env_vars passed to spawn_local() still override the blocklist, matching the existing behavior for callers that intentionally need these. Gap identified by PR #1004 (@PeterFile). 
* feat(delegate): add observability metadata to subagent results Enrich delegate_task results with metadata from the child AIAgent: - model: which model the child used - exit_reason: completed | interrupted | max_iterations - tokens.input / tokens.output: token counts - tool_trace: per-tool-call trace with byte sizes and ok/error status Tool trace uses tool_call_id matching to correctly pair parallel tool calls with their results, with a fallback for messages without IDs. Cherry-picked from PR #872 by @omerkaz, with fixes: - Fixed parallel tool call trace pairing (was always updating last entry) - Removed redundant 'iterations' field (identical to existing 'api_calls') - Added test for parallel tool call trace correctness Co-authored-by: omerkaz <omerkaz@users.noreply.github.com> --------- Co-authored-by: omerkaz <omerkaz@users.noreply.github.com>	2026-03-13 08:07:12 -07:00
Muhammet Eren Karakuş	c92507e53d	fix(terminal): strip Hermes provider env vars from subprocess environment (#1157) Terminal subprocesses inherit OPENAI_BASE_URL and other provider env vars loaded from ~/.hermes/.env, silently misrouting external CLIs like codex. Build a blocklist dynamically from the provider registry so new providers are automatically covered. Callers that truly need a blocked var can opt in via the _HERMES_FORCE_ prefix. Closes #1002 Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-03-13 07:52:03 -07:00
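A rough sketch of the blocklist idea from the commit above. The commit does not spell out how the `_HERMES_FORCE_` opt-in is interpreted; this example assumes a variable named `_HERMES_FORCE_X` is passed to the subprocess as `X`. The function name `build_subprocess_env` is hypothetical, and the blocklist is passed in directly rather than derived from a provider registry.

```python
from typing import Dict, Iterable, Mapping

FORCE_PREFIX = "_HERMES_FORCE_"


def build_subprocess_env(base: Mapping[str, str], blocklist: Iterable[str]) -> Dict[str, str]:
    """Copy an environment, dropping blocked provider vars (hypothetical helper)."""
    blocked = set(blocklist)
    # Drop blocked names and the prefixed opt-in markers themselves.
    env = {
        key: value
        for key, value in base.items()
        if key not in blocked and not key.startswith(FORCE_PREFIX)
    }
    # Assumption: _HERMES_FORCE_X re-exports its value as X for the child process.
    for key, value in base.items():
        if key.startswith(FORCE_PREFIX):
            env[key[len(FORCE_PREFIX):]] = value
    return env
```

For example, with `OPENAI_BASE_URL` on the blocklist, an external CLI spawned from the terminal tool would no longer inherit the Hermes endpoint unless the caller explicitly forces it.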
teknium1	06a5cc484c	fix: improve gateway secret capture guidance message The old message referenced 'hermes setup' which doesn't handle skill-specific env vars. Updated to direct users to load the skill in the local CLI (which triggers the secure prompt) or add the key to ~/.hermes/.env manually.	2026-03-13 04:10:22 -07:00
Teknium	0157253145	Merge pull request #1152 from NousResearch/hermes/hermes-f47f71c0 feat: concurrent tool execution with ThreadPoolExecutor	2026-03-13 03:20:38 -07:00
kshitijk4poor	ccfbf42844	feat: secure skill env setup on load (core #688) When a skill declares required_environment_variables in its YAML frontmatter, missing env vars trigger a secure TUI prompt (identical to the sudo password widget) when the skill is loaded. Secrets flow directly to ~/.hermes/.env, never entering LLM context. Key changes: - New required_environment_variables frontmatter field for skills - Secure TUI widget (masked input, 120s timeout) - Gateway safety: messaging platforms show local setup guidance - Legacy prerequisites.env_vars normalized into new format - Remote backend handling: conservative setup_needed=True - Env var name validation, file permissions hardened to 0o600 - Redact patterns extended for secret-related JSON fields - 12 existing skills updated with prerequisites declarations - ~48 new tests covering skip, timeout, gateway, remote backends - Dynamic panel widget sizing (fixes hardcoded width from original PR) Cherry-picked from PR #723 by kshitijk4poor, rebased onto current main with conflict resolution. Fixes #688 Co-authored-by: kshitijk4poor <kshitijk4poor@users.noreply.github.com>	2026-03-13 03:14:04 -07:00
teknium1	5d0d5b191c	feat: concurrent tool execution with ThreadPoolExecutor When the model returns multiple tool calls in a single response, they are now executed concurrently using a thread pool instead of sequentially. This significantly reduces wall-clock time when multiple independent tools are batched (e.g. parallel web_search, read_file, terminal calls). Architecture: - _execute_tool_calls() dispatches to sequential or concurrent path - Single tool calls and batches containing 'clarify' use sequential path - Multiple non-interactive tools use ThreadPoolExecutor (max 8 workers) - Results are collected and appended to messages in original order - _invoke_tool() extracted as shared tool invocation helper Safety: - Pre-flight interrupt check skips all tools if interrupted - Per-tool exception handling: one failure doesn't crash the batch - Result truncation (100k char limit) applied per tool - Budget pressure injection after all tools complete - Checkpoints taken before file-mutating tools - CLI spinner shows batch progress, then per-tool completion messages Tests: 10 new tests covering dispatch logic, ordering, error handling, interrupt behavior, truncation, and _invoke_tool routing.	2026-03-13 02:51:51 -07:00
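The dispatch logic in the commit above can be approximated as follows. This is a simplified sketch, not the actual `_execute_tool_calls()`: the function name `run_tool_calls` and the dict shape of a call are assumptions, while the worker cap, the `clarify` rule, and the 100k character limit come from the commit message. Interrupt checks, checkpoints, and budget injection are omitted.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List

MAX_WORKERS = 8             # "max 8 workers" per the commit message
MAX_RESULT_CHARS = 100_000  # "100k char limit" per the commit message


def run_tool_calls(
    calls: List[Dict[str, Any]],
    invoke: Callable[[Dict[str, Any]], Any],
) -> List[str]:
    """Run tool calls and return their results in the original call order."""

    def safe_invoke(call: Dict[str, Any]) -> str:
        # Per-tool exception handling: one failure does not crash the batch.
        try:
            result = str(invoke(call))
        except Exception as exc:
            result = f"Error: {exc}"
        return result[:MAX_RESULT_CHARS]

    # Single calls and batches containing the interactive 'clarify' tool stay sequential.
    if len(calls) <= 1 or any(call.get("name") == "clarify" for call in calls):
        return [safe_invoke(call) for call in calls]

    with ThreadPoolExecutor(max_workers=min(MAX_WORKERS, len(calls))) as pool:
        # Executor.map yields results in input order, not completion order.
        return list(pool.map(safe_invoke, calls))
```

Using `Executor.map` keeps ordering simple; collecting futures with `as_completed` would need an explicit index to restore the original order before appending to the message list.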
Teknium	2a62514d17	feat: add 'View full command' option to dangerous command approval (#887) When a dangerous command is detected and the user is prompted for approval, long commands are truncated (80 chars in fallback, 70 chars in the TUI). Users had no way to see the full command before deciding. This adds a 'View full command' option across all approval interfaces: - CLI fallback (tools/approval.py): [v]iew option in the prompt menu. Shows the full command and re-prompts for approval decision. - CLI TUI (cli.py): 'Show full command' choice in the arrow-key selection panel. Expands the command display in-place and removes the view option after use. - CLI callbacks (callbacks.py): 'view' choice added to the list when the command exceeds 70 characters. - Gateway (gateway/run.py): 'full', 'show', 'view' responses reveal the complete command while keeping the approval pending. Includes 7 new tests covering view-then-approve, view-then-deny, short command fallthrough, and double-view behavior. Closes community feedback about the 80-char cap on dangerous commands.	2026-03-12 06:27:21 -07:00
teknium1	a37fc05171	fix: skip hanging tests + add global test timeout 4 test files spawn real processes or make live API calls that hang indefinitely in batch/CI runs. Skip them with pytestmark: - tests/tools/test_code_execution.py (subprocess spawns) - tests/tools/test_file_tools_live.py (live LocalEnvironment) - tests/test_413_compression.py (blocks on process) - tests/test_agent_loop_tool_calling.py (live OpenRouter API calls) Also added global 30s signal.alarm timeout in conftest.py as a safety net, and removed stale nous-api test that hung on OAuth browser login. Suite now runs in ~55s with no hangs.	2026-03-12 01:23:28 -07:00
teknium1	2192b17670	merge: resolve conflicts with origin/main - gateway/run.py: Take main's _resolve_gateway_model() helper - hermes_cli/setup.py: Re-apply nous-api removal after merge brought it back. Fix provider_idx offset (Custom is now index 3, not 4). - tests/hermes_cli/test_setup.py: Fix custom setup test index (3→4)	2026-03-12 00:29:04 -07:00
teknium1	29ef69c703	fix: update all test mocks for call_llm migration Update 14 test files to use the new call_llm/async_call_llm mock patterns instead of the old get_text_auxiliary_client/ get_vision_auxiliary_client tuple returns. - vision_tools tests: mock async_call_llm instead of _aux_async_client - browser tests: mock call_llm instead of _aux_vision_client - flush_memories tests: mock call_llm instead of get_text_auxiliary_client - session_search tests: mock async_call_llm with RuntimeError - mcp_tool tests: fix whitelist model config, use side_effect for multi-response tests - auxiliary_config_bridge: update for model=None (resolved in router) 3251 passed, 2 pre-existing unrelated failures.	2026-03-11 21:06:54 -07:00
teknium1	0aa31cd3cb	feat: call_llm/async_call_llm + config slots + migrate all consumers Add centralized call_llm() and async_call_llm() functions that own the full LLM request lifecycle: 1. Resolve provider + model from task config or explicit args 2. Get or create a cached client for that provider 3. Format request args (max_tokens handling, provider extra_body) 4. Make the API call with max_tokens/max_completion_tokens retry 5. Return the response Config: expanded auxiliary section with provider:model slots for all tasks (compression, vision, web_extract, session_search, skills_hub, mcp, flush_memories). Config version bumped to 7. Migrated all auxiliary consumers: - context_compressor.py: uses call_llm(task='compression') - vision_tools.py: uses async_call_llm(task='vision') - web_tools.py: uses async_call_llm(task='web_extract') - session_search_tool.py: uses async_call_llm(task='session_search') - browser_tool.py: uses call_llm(task='vision'/'web_extract') - mcp_tool.py: uses call_llm(task='mcp') - skills_guard.py: uses call_llm(provider='openrouter') - run_agent.py flush_memories: uses call_llm(task='flush_memories') Tests updated for context_compressor and MCP tool. Some test mocks still need updating (15 remaining failures from mock pattern changes, 2 pre-existing).	2026-03-11 20:52:19 -07:00
0xbyt4	4a8f23eddf	fix: correctly track failed MCP server connections in discovery _discover_one() caught all exceptions and returned [], making asyncio.gather(return_exceptions=True) redundant. The isinstance(result, Exception) branch in _discover_all() was dead code, so failed_count was always 0. This caused: - No summary printed when all servers fail (silent failure) - ok_servers always equaling total_servers (misleading count) - Unused variables transport_desc and transport_type Fix: let exceptions propagate to gather() so failed_count increments correctly. Move per-server failure logging to _discover_all(). Remove dead variables.	2026-03-11 18:24:45 +03:00
dmahan93	59b53f0a23	fix: skip tests when atroposlib/minisweagent unavailable in CI - test_agent_loop_tool_calling.py: import atroposlib at module level to trigger skip (environments.agent_loop is now importable without atroposlib due to __init__.py graceful fallback) - test_modal_sandbox_fixes.py: skip TestToolResolution tests when minisweagent not installed	2026-03-11 06:52:55 -07:00
dmahan93	0f53275169	test: skip atropos-dependent tests when atroposlib not installed Guard all test files that import from environments/ or atroposlib with try/except + pytest.skip(allow_module_level=True) so they gracefully skip instead of crashing when deps aren't available.	2026-03-11 06:52:55 -07:00
dmahan93	b03aefaf20	test: 13 tests for Modal sandbox infra fixes	2026-03-11 06:51:42 -07:00
teknium1	184aa5b2b3	fix: tighten exc_info assertion in vision test (from PR #803) The weaker assertion (r.exc_info is not None) passes even when exc_info is (None, None, None). Check r.exc_info[0] is not None to verify actual exception info is present. The _aux_async_client mock was already applied on main. Co-authored-by: OutThisLife <nickolasgustafsson@gmail.com>	2026-03-11 06:32:01 -07:00
teknium1	9423fda5cb	feat: configurable subagent provider:model with full credential resolution Adds delegation.model and delegation.provider config fields so subagents can run on a completely different provider:model pair than the parent agent. When delegation.provider is set, the system resolves the full credential bundle (base_url, api_key, api_mode) via resolve_runtime_provider() — the same path used by CLI/gateway startup. This means all configured providers work out of the box: openrouter, nous, zai, kimi-coding, minimax, minimax-cn. Key design decisions: - Provider resolution uses hermes_cli.runtime_provider (single source of truth for credential resolution across CLI, gateway, cron, and now delegation) - When only delegation.model is set (no provider), the model name changes but parent credentials are inherited (for switching models within the same provider like OpenRouter) - When delegation.provider is set, full credentials are resolved independently — enabling cross-provider delegation (e.g. parent on Nous Portal, subagents on OpenRouter) - Clear error messages if provider resolution fails (missing API key, unknown provider name) - _load_config() now falls back to hermes_cli.config.load_config() for gateway/cron contexts where CLI_CONFIG is unavailable Based on PR #791 by 0xbyt4 (closes #609), reworked to use proper provider credential resolution instead of passing provider as metadata. Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>	2026-03-11 06:12:21 -07:00
teknium1	09336a6710	Merge PR #795: fix: handle empty choices in MCP sampling callback Adds defensive guard against empty/None/missing choices in SamplingHandler.__call__ before accessing response.choices[0]. Returns proper ErrorData instead of crashing with IndexError/TypeError on content filtering, provider errors, or rate limits. Authored by 0xbyt4. Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>	2026-03-11 05:47:51 -07:00
Teknium	fe9da5280f	Merge pull request #766 from spanishflu-est1918/codex/telegram-topic-session-pr Isolate Telegram forum topic sessions — each topic gets its own independent session key, history, and interrupt tracking. Progress, hygiene, and cron messages all route to the correct topic.	2026-03-11 03:14:43 -07:00
alireza78a	f1510ec33e	test(terminal): add tests for env var validation in _get_env_config	2026-03-11 02:59:12 -07:00
SPANISH FLU	0d6b25274c	fix(gateway): isolate telegram forum topic sessions	2026-03-11 09:15:34 +01:00
teknium1	a9241f3e3e	fix: head+tail truncation for execute_code stdout Replaces head-only stdout capture with a two-buffer approach (40% head, 60% tail rolling window) so scripts that print() their final results at the end never lose them. Adds truncation notice between sections. Cherry-picked from PR #755, conflict resolved (test file additions). 3 new tests for short output, head+tail preservation, and notice format.	2026-03-11 00:26:13 -07:00
teknium1	586fe5d62d	Merge PR #724: feat: --yolo flag to bypass all approval prompts Authored by dmahan93. Adds HERMES_YOLO_MODE env var and --yolo CLI flag to auto-approve all dangerous command prompts. Post-merge: renamed --fuck-it-ship-it to --yolo for brevity, resolved conflict with --checkpoints flag.	2026-03-10 20:56:30 -07:00
Teknium	b76cae94d4	Merge pull request #889 from NousResearch/hermes/hermes-b0162f8d fix: Docker backend fails when docker is not in PATH (macOS gateway)	2026-03-10 20:45:34 -07:00
teknium1	24479625a2	fix: Docker backend fails when docker is not in PATH (macOS gateway) On macOS, Docker Desktop installs the CLI to /usr/local/bin/docker, but when Hermes runs as a gateway service (launchd) or in other non-login contexts, /usr/local/bin is often not in PATH. This causes the Docker requirements check to fail with 'No such file or directory: docker' even though docker works fine from the user's terminal. Add find_docker() helper that uses shutil.which() first, then probes common Docker Desktop install paths on macOS (/usr/local/bin, /opt/homebrew/bin, Docker.app bundle). The resolved path is cached and passed to mini-swe-agent via its 'executable' parameter. - tools/environments/docker.py: add find_docker(), use it in _storage_opt_supported() and pass to _Docker(executable=...) - tools/terminal_tool.py: use find_docker() in requirements check - tests/tools/test_docker_find.py: 4 tests (PATH, fallback, not found, cache) 2877 tests pass.	2026-03-10 20:45:13 -07:00
teknium1	03a4f184e6	fix: call _stop_training_run on early-return failure paths The 4 early-return paths in _spawn_training_run (API exit, trainer exit, env not found, env exit) were doing manual process.terminate() or returning without cleanup, leaking open log file handles. Now all paths call _stop_training_run() which handles both process termination and file handle closure. Also adds 12 tests for _stop_training_run covering file handle cleanup, process termination, status transitions, and edge cases. Inspired by PR #715 (0xbyt4) which identified the early-return issue. Core file handle fix was already on main via `e28dc13` (memosr.eth).	2026-03-10 17:09:51 -07:00
teknium1	a458b535c9	fix: improve read-loop detection — consecutive-only, correct thresholds, fix bugs Follow-up to PR #705 (merged from 0xbyt4). Addresses several issues: 1. CONSECUTIVE-ONLY TRACKING: Redesigned the read/search tracker to only warn/block on truly consecutive identical calls. Any other tool call in between (write, patch, terminal, etc.) resets the counter via notify_other_tool_call(), called from handle_function_call() in model_tools.py. This prevents false blocks in read→edit→verify flows. 2. THRESHOLD ADJUSTMENT: Warn on 3rd consecutive (was 2nd), block on 4th+ consecutive (was 3rd+). Gives the model more room before intervening. 3. TUPLE UNPACKING BUG: Fixed get_read_files_summary() which crashed on search keys (5-tuple) when trying to unpack as 3-tuple. Now uses a separate read_history set that only tracks file reads. 4. WEB_EXTRACT DOCSTRING: Reverted incorrect removal of 'title' from web_extract return docs in code_execution_tool.py — the field IS returned by web_tools.py. 5. TESTS: Rewrote test_read_loop_detection.py (35 tests) to cover consecutive-only behavior, notify_other_tool_call, interleaved read/search, and summary-unaffected-by-searches.	2026-03-10 16:25:41 -07:00
teknium1	b53d5dad67	Merge PR #705: fix: detect, warn, and block file re-read/search loops after context compression Authored by 0xbyt4. Adds read/search loop detection, file history injection after compression, and todo filtering for active items only.	2026-03-10 16:17:03 -07:00
teknium1	c1171fe666	fix: eliminate 3x SQLite message duplication in gateway sessions (#860 ) Three separate code paths all wrote to the same SQLite state.db with no deduplication, inflating session transcripts by 3-4x: 1. _log_msg_to_db() — wrote each message individually after append 2. _flush_messages_to_session_db() — re-wrote ALL new messages at every _persist_session() call (~18 exit points), with no tracking of what was already written 3. gateway append_to_transcript() — wrote everything a third time after the agent returned Since load_transcript() prefers SQLite over JSONL, the inflated data was loaded on every session resume, causing proportional token waste. Fix: - Remove _log_msg_to_db() and all 16 call sites (redundant with flush) - Add _last_flushed_db_idx tracking in _flush_messages_to_session_db() so repeated _persist_session() calls only write truly new messages - Reset flush cursor on compression (new session ID) - Add skip_db parameter to SessionStore.append_to_transcript() so the gateway skips SQLite writes when the agent already persisted them - Gateway now passes skip_db=True for agent-managed messages, still writes to JSONL as backup Verified: a 12-message CLI session with tool calls produces exactly 12 SQLite rows with zero duplicates (previously would be 36-48). Tests: 9 new tests covering flush deduplication, skip_db behavior, compression reset, and initialization. Full suite passes (2869 tests).	2026-03-10 15:22:44 -07:00
SHL0MS	0229e6b407	Fix test_analysis_error_logs_exc_info: mock _aux_async_client so download path is reached	2026-03-10 16:03:19 -04:00
0xbyt4	52e3580cd4	refactor: merge new tests into test_code_execution.py Move all new tests (schema, env filtering, edge cases, interrupt) into the existing test_code_execution.py instead of a separate file. Delete the now-redundant test_code_execution_schema.py.	2026-03-10 06:18:27 -07:00
0xbyt4	694a3ebdd5	fix(code_execution): handle empty enabled_sandbox_tools in schema description build_execute_code_schema(set()) produced "from hermes_tools import , ..." in the code property description — invalid Python syntax shown to the model. This triggers when a user enables only the code_execution toolset without any of the sandbox-allowed tools (e.g. `hermes tools code_execution`), because SANDBOX_ALLOWED_TOOLS & {"execute_code"} = empty set. Also adds 29 unit tests covering build_execute_code_schema, environment variable filtering, execute_code edge cases, and interrupt handling.	2026-03-10 06:18:27 -07:00
teknium1	5e6c7bc205	Merge PR #602: fix: prevent data loss in clipboard PNG conversion when ImageMagick fails Authored by 0xbyt4. Only deletes temp .bmp after confirmed successful conversion, restores original on failure. Adds 3 tests.	2026-03-10 04:15:05 -07:00
teknium1	c1775de56f	feat: filesystem checkpoints and /rollback command Automatic filesystem snapshots before destructive file operations, with user-facing rollback. Inspired by PR #559 (by @alireza78a). Architecture: - Shadow git repos at ~/.hermes/checkpoints/{hash}/ via GIT_DIR - CheckpointManager: take/list/restore, turn-scoped dedup, pruning - Transparent — the LLM never sees it, no tool schema, no tokens - Once per turn — only first write_file/patch triggers a snapshot Integration: - Config: checkpoints.enabled + checkpoints.max_snapshots - CLI flag: hermes --checkpoints - Trigger: run_agent.py _execute_tool_calls() before write_file/patch - /rollback slash command in CLI + gateway (list, restore by number) - Pre-rollback snapshot auto-created on restore (undo the undo) Safety: - Never blocks file operations — all errors silently logged - Skips root dir, home dir, dirs >50K files - Disables gracefully when git not installed - Shadow repo completely isolated from project git Tests: 35 new tests, all passing (2798 total suite) Docs: feature page, config reference, CLI commands reference	2026-03-10 00:49:15 -07:00
teknium1	5212644861	fix(security): prevent shell injection in tilde-username path expansion Validate that the username portion of ~username paths contains only valid characters (alphanumeric, dot, hyphen, underscore) before passing to shell echo for expansion. Previously, paths like '~; rm -rf /' would be passed unquoted to self._exec(f'echo {path}'), allowing arbitrary command execution. The approach validates the username rather than using shlex.quote(), which would prevent tilde expansion from working at all since echo '~user' outputs the literal string instead of expanding it. Added tests for injection blocking and valid ~username/path expansion. Credit to @alireza78a for reporting (PR #442, issue #442).	2026-03-09 17:33:19 -07:00
0xbyt4	4e3a8a0637	fix: handle empty choices in MCP sampling callback SamplingHandler.__call__ accessed response.choices[0] without checking if the list was non-empty. LLM APIs can return empty choices on content filtering, provider errors, or rate limits, causing an unhandled IndexError that propagates to the MCP SDK and may crash the connection. Add a defensive guard that returns a proper ErrorData when choices is empty, None, or missing. Includes three test cases covering all variants.	2026-03-10 02:24:53 +03:00
teknium1	2d44ed1c5b	test: add comprehensive tests for vision_tools (42 tests) Covers PR #428 changes and existing vision_tools functionality: - _validate_image_url: 20 tests for urlparse-based validation - _determine_mime_type: 6 tests for MIME type detection - _image_to_base64_data_url: 3 tests for base64 conversion - _handle_vision_analyze: 5 tests for type hints, prompt building, AUXILIARY_VISION_MODEL env var override - Error logging exc_info: 3 async tests verifying stack traces are logged on download failure, analysis error, and cleanup error - check_vision_requirements & get_debug_session_info: 2 basic tests - Registry integration: 3 tests for tool registration	2026-03-09 15:32:02 -07:00
Teknium	654e16187e	feat(mcp): add sampling support — server-initiated LLM requests (#753) Add MCP sampling/createMessage capability via SamplingHandler class. Text-only sampling + tool use in sampling with governance (rate limits, model whitelist, token caps, tool loop limits). Per-server audit metrics. Based on concept from PR #366 by eren-karakus0. Restructured as class-based design with bug fixes and tests using real MCP SDK types. 50 new tests, 2600 total passing.	2026-03-09 03:37:38 -07:00
0xbyt4	912efe11b5	fix(tests): add content attribute to fake result objects _FakeReadResult and _FakeSearchResult now expose the attributes that read_file_tool/search_tool access after the redact_sensitive_text integration from main.	2026-03-09 13:25:52 +03:00
0xbyt4	4684aaffdc	merge: resolve file_tools.py conflict with origin/main Combine read/search loop detection with main's redact_sensitive_text and truncation hint features. Add tracker reset to TestSearchHints to prevent cross-test state leakage.	2026-03-09 13:21:46 +03:00
teknium1	7af33accf1	fix: apply secret redaction to file tool outputs Terminal output was already redacted via redact_sensitive_text() but read_file and search_files returned raw content. Now both tools redact secrets before returning results to the LLM. Based on PR #372 by @teyrebaz33 (closes #363) — applied manually due to branch conflicts with the current codebase.	2026-03-09 00:49:46 -07:00
teknium1	a8bf414f4a	feat: browser console/errors tool, annotated screenshots, auto-recording, and dogfood QA skill New browser capabilities and a built-in skill for agent-driven web QA. ## New tool: browser_console Returns console messages (log/warn/error/info) AND uncaught JavaScript exceptions in a single call. Uses agent-browser's 'console' and 'errors' commands through the existing session plumbing. Supports --clear to reset buffers. Verified working in both local and Browserbase cloud modes. ## Enhanced tool: browser_vision(annotate=True) New boolean parameter on browser_vision. When true, agent-browser overlays numbered [N] labels on interactive elements — each [N] maps to ref @eN. Annotation data (element name, role, bounding box) returned alongside the vision analysis. Useful for QA reports and spatial reasoning. ## Config: browser.record_sessions Auto-record browser sessions as WebM video files when enabled: - Starts recording on first browser_navigate - Stops and saves on browser_close - Saves to ~/.hermes/browser_recordings/ - Works in both local and cloud modes (verified) - Disabled by default ## Built-in skill: dogfood Systematic exploratory QA testing for web applications. Teaches the agent a 5-phase workflow: 1. Plan — accept URL, create output dirs, set scope 2. Explore — systematic crawl with annotated screenshots 3. Collect Evidence — screenshots, console errors, JS exceptions 4. Categorize — severity (Critical/High/Medium/Low) and category (Functional/Visual/Accessibility/Console/UX/Content) 5. Report — structured markdown with per-issue evidence Includes: - skills/dogfood/SKILL.md — full workflow instructions - skills/dogfood/references/issue-taxonomy.md — severity/category defs - skills/dogfood/templates/dogfood-report-template.md — report template ## Tests 21 new tests covering: - browser_console message/error parsing, clear flag, empty/failed states - browser_console schema registration - browser_vision annotate schema and flag passing - record_sessions config defaults and recording lifecycle - Dogfood skill file existence and content validation Addresses #315.	2026-03-08 21:28:12 -07:00
0xbyt4	d8df91dfa8	fix: resolve merge conflict with main in clipboard.py	2026-03-09 03:50:29 +03:00
teknium1	491605cfea	feat: add high-value tool result hints for patch and search_files (#722 ) Add contextual [Hint: ...] suffixes to tool results where they save real iterations: - patch (no match): suggests read_file/search_files to verify content before retrying — addresses the common pattern where the agent retries with stale old_string instead of re-reading the file. - search_files (truncated): provides explicit next offset and suggests narrowing the search — clearer than relying on total_count inference. Other hints proposed in #722 (terminal, web_search, web_extract, browser_snapshot, search zero-results, search content-matches) were evaluated and found to be low-value: either already covered by existing mechanisms (read_file pagination, similar-files, schema descriptions) or guidance the agent already follows from its own reasoning. 5 new tests covering hint presence/absence for both tools.	2026-03-08 17:46:28 -07:00
teknium1	c0520223fd	fix: clipboard BMP conversion file loss and broken test Source code (hermes_cli/clipboard.py): - _convert_to_png() lost the file when both Pillow and ImageMagick were unavailable: path.rename(tmp) moved the file to .bmp, then subprocess.run raised FileNotFoundError, but the file was never renamed back. The final fallback 'return path.exists()' returned False. - Fix: restore the original file in both except handlers by renaming tmp back to path when the original is missing. Test (tests/tools/test_clipboard.py): - test_file_still_usable_when_no_converter expected 'from PIL import Image' to raise an Exception, but Pillow is installed so pytest.raises fired 'DID NOT RAISE'. The test also never called _convert_to_png(). - Fix: properly mock PIL unavailability via patch.dict(sys.modules), actually call _convert_to_png(), and assert the correct result.	2026-03-08 17:22:27 -07:00
teknium1	3fb8938cd3	fix: search_files now reports error for non-existent paths instead of silent empty results Previously, search_files would silently return 0 results when the search path didn't exist (e.g., /root/.hermes/... when HOME is /home/user). The path was passed to rg/grep/find which would fail silently, and the empty stdout was parsed as 'no matches found'. Changes: - Add path existence check at the top of search() using test -e. Returns SearchResult with a clear error message when path doesn't exist. - Add exit code 2 checks in _search_with_rg() and _search_with_grep() as secondary safety net for other error types (bad regex, permissions). - Add 4 new tests covering: nonexistent path (content mode), nonexistent path (files mode), existing path proceeds normally, rg error exit code. Tests: 37 → 41 in test_file_operations.py, full suite 2330 passed.	2026-03-08 16:47:20 -07:00
dmahan93	7791174ced	feat: add --fuck-it-ship-it flag to bypass dangerous command approvals Adds a fun alias for skipping all dangerous command approval prompts. When passed, sets HERMES_YOLO_MODE=1 which causes check_dangerous_command() to auto-approve everything. Available on both top-level and chat subcommand: hermes --fuck-it-ship-it hermes chat --fuck-it-ship-it Includes 5 tests covering normal blocking, yolo bypass, all patterns, and edge cases (empty string env var).	2026-03-08 18:36:37 -05:00
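
Note on 5d0d5b191c (concurrent tool execution): the dispatch rule described in that commit message can be sketched roughly as below. This is an illustration only, not the code from run_agent.py. The names run_one and execute_tool_calls, the shape of the call dicts, and the error-string format are assumptions made for the example; only the 8-worker cap, the 100k character truncation, the sequential path for single calls and 'clarify' batches, and the in-order result collection come from the commit message.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_WORKERS = 8             # worker cap stated in the commit message
MAX_RESULT_CHARS = 100_000  # per-tool truncation limit stated in the commit message


def run_one(tool, call):
    """Run a single tool call; turn any exception into an error string."""
    try:
        result = str(tool(**call["args"]))
    except Exception as exc:  # one failure must not crash the whole batch
        result = f"Error: {type(exc).__name__}: {exc}"
    return result[:MAX_RESULT_CHARS]


def execute_tool_calls(calls, tools):
    """Hypothetical dispatcher: sequential for one call or 'clarify', pooled otherwise."""
    if len(calls) <= 1 or any(c["name"] == "clarify" for c in calls):
        return [run_one(tools[c["name"]], c) for c in calls]
    with ThreadPoolExecutor(max_workers=min(MAX_WORKERS, len(calls))) as pool:
        futures = [pool.submit(run_one, tools[c["name"]], c) for c in calls]
        # Reading the futures in submission order (not as_completed)
        # keeps the results in the original call order.
        return [f.result() for f in futures]
```

Collecting results by walking the submitted futures in order is one simple way to get "appended to messages in original order" without any extra bookkeeping; the real implementation may do this differently.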

342 Commits
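
Note on a9241f3e3e (head+tail truncation for execute_code stdout): a minimal sketch of the two-buffer idea, assuming a character-based limit. The class name HeadTailBuffer, its methods, and the wording of the truncation notice are hypothetical; only the 40% head / 60% rolling tail split and the notice between the two sections come from the commit message.

```python
from collections import deque


class HeadTailBuffer:
    """Keep the first 40% and the last 60% of a character budget (illustrative)."""

    def __init__(self, limit, head_fraction=0.4):
        self.head_cap = int(limit * head_fraction)
        self.head = ""
        # A bounded deque drops the oldest characters automatically,
        # which gives the rolling tail window.
        self.tail = deque(maxlen=limit - self.head_cap)
        self.total = 0

    def write(self, text):
        self.total += len(text)
        room = self.head_cap - len(self.head)
        if room > 0:
            self.head += text[:room]
            text = text[room:]
        self.tail.extend(text)

    def getvalue(self):
        tail = "".join(self.tail)
        dropped = self.total - len(self.head) - len(tail)
        if dropped == 0:
            return self.head + tail
        # The notice format here is an assumption, not the project's wording.
        return f"{self.head}\n[... {dropped} characters truncated ...]\n{tail}"


buf = HeadTailBuffer(limit=10)
buf.write("abcdefghijklmnopqrstuvwxyz")
print(buf.getvalue())
```

With a head-only capture the final "uvwxyz" would be lost; the rolling tail is what keeps results printed at the end of a script.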