Commit Graph

1072 Commits

Author SHA1 Message Date
Teknium
24342813fe
fix(qqbot): correct Authorization header format in send_message REST path (#11569)
The send_message tool's direct-REST QQBot path used "QQBotAccessToken {token}"
which QQ's API rejects with 401. The correct format is "QQBot {token}" — the
gateway adapter at gateway/platforms/qqbot.py uses this format in all 5 header
sites (lines 341, 551, 579, 1068, 1467); this was the one outlier.

Credit to @Quon for surfacing this in #10257 (that PR had unrelated issues in
its media-upload logic and was closed; this salvages the genuine 1-line fix).
2026-04-17 04:25:47 -07:00
yeyitech
a97b08e30c fix: allow trusted QQ CDN benchmark IP resolution 2026-04-17 04:22:40 -07:00
Teknium
e5cde568b7
feat(skills): add 'hermes skills reset' to un-stick bundled skills (#11468)
When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.

Adds an escape hatch for this case.

  hermes skills reset <name>
      Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
      re-baselines against the user's current copy. Future 'hermes update'
      runs accept upstream changes again. Non-destructive.

  hermes skills reset <name> --restore
      Also deletes the user's copy and re-copies the bundled version.
      Use when you want the pristine upstream skill back.

Also available as /skills reset in chat.

- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
  handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
  repro, --restore, unknown-skill error, upstream-removed-skill, and
  no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
  section explaining the origin-hash mechanic + reset usage
2026-04-17 00:41:31 -07:00
Teknium
220fa7db90
feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro (#11406)
* feat(image_gen): upgrade Recraft V3 → V4 Pro, Nano Banana → Pro

Upstream asked for these two upgrades ASAP — the old entries show
stale models when newer, higher-quality versions are available on FAL.

Recraft V3 → Recraft V4 Pro
  ID:    fal-ai/recraft-v3 → fal-ai/recraft/v4/pro/text-to-image
  Price: $0.04/image → $0.25/image (6x — V4 Pro is premium tier)
  Schema: V4 dropped the required `style` enum entirely; defaults
          handle taste now. Added `colors` and `background_color`
          to supports for brand-palette control. `seed` is not
          supported by V4 per the API docs.

Nano Banana → Nano Banana Pro
  ID:    fal-ai/nano-banana → fal-ai/nano-banana-pro
  Price: $0.08/image → $0.15/image (1K); $0.30 at 4K
  Schema: Aspect ratio family unchanged. Added `resolution`
          (1K/2K/4K, default 1K for billing predictability),
          `enable_web_search` (real-time info grounding, +$0.015),
          and `limit_generations` (force exactly 1 image).
  Architecture: Gemini 2.5 Flash → Gemini 3 Pro Image. Quality
                and reasoning depth improved; slower (~6s → ~8s).

Migration: users who had the old IDs in `image_gen.model` will
fall through the existing 'unknown model → default' warning path
in `_resolve_fal_model()` and get the Klein 9B default on the next
run. Re-run `hermes tools` → Image Generation to pick the new
version. No silent cost-upgrade aliasing — the 2-6x price jump
on these tiers warrants explicit user re-selection.

Portal note: both new model IDs need to be allowlisted on the
Nous fal-queue-gateway alongside the previous 7 additions, or
users on Nous Subscription will see the 'managed gateway rejected
model' error we added previously (which is clear and
self-remediating, just noisy).

* docs: wrap '<1s' in backticks to unblock MDX compilation

Docusaurus's MDX parser treats unquoted '<' as the start of JSX, and
'<1s' fails because '1' isn't a valid tag-name start character. This
was broken on main since PR #11265 (never noticed because
docs-site-checks was failing on OTHER issues at the time and we
admin-merged through it).

Wrapping in backticks also gives the cell monospace styling which
reads more cleanly alongside the inline-code model ID in the same row.

The other '<1s' occurrence (line 52) is inside a fenced code block
and is already safe — code fences bypass MDX parsing.
2026-04-16 22:05:41 -07:00
Teknium
70768665a4
fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383)
* feat(mcp-oauth): scaffold MCPOAuthManager

Central manager for per-server MCP OAuth state. Provides
get_or_build_provider (cached), remove (evicts cache + deletes
disk), invalidate_if_disk_changed (mtime watch, core fix for
external-refresh workflow), and handle_401 (dedup'd recovery).

No behavior change yet — existing call sites still use
build_oauth_auth directly. Task 1 of 8 in the MCP OAuth
consolidation (fixes Cthulhu's BetterStack reliability issues).

* feat(mcp-oauth): add HermesMCPOAuthProvider with pre-flow disk watch

Subclasses the MCP SDK's OAuthClientProvider to inject a disk
mtime check before every async_auth_flow, via the central
manager. When a subclass instance is used, external token
refreshes (cron, another CLI instance) are picked up before
the next API call.

Still dead code: the manager's _build_provider still delegates
to build_oauth_auth and returns the plain OAuthClientProvider.
Task 4 wires this subclass in. Task 2 of 8.

* refactor(mcp-oauth): extract build_oauth_auth helpers

Decomposes build_oauth_auth into _configure_callback_port,
_build_client_metadata, _maybe_preregister_client, and
_parse_base_url. Public API preserved. These helpers let
MCPOAuthManager._build_provider reuse the same logic in Task 4
instead of duplicating the construction dance.

Also updates the SDK version hint in the warning from 1.10.0 to
1.26.0 (which is what we actually require for the OAuth types
used here). Task 3 of 8.

* feat(mcp-oauth): manager now builds HermesMCPOAuthProvider directly

_build_provider constructs the disk-watching subclass using the
helpers from Task 3, instead of delegating to the plain
build_oauth_auth factory. Any consumer using the manager now gets
pre-flow disk-freshness checks automatically.

build_oauth_auth is preserved as the public API for backwards
compatibility. The code path is now:

    MCPOAuthManager.get_or_build_provider  ->
      _build_provider  ->
        _configure_callback_port
        _build_client_metadata
        _maybe_preregister_client
        _parse_base_url
        HermesMCPOAuthProvider(...)

Task 4 of 8.

* feat(mcp): wire OAuth manager + add _reconnect_event

MCPServerTask gains _reconnect_event alongside _shutdown_event.
When set, _run_http / _run_stdio exit their async-with blocks
cleanly (no exception), and the outer run() loop re-enters the
transport to rebuild the MCP session with fresh credentials.
This is the recovery path for OAuth failures that the SDK's
in-place httpx.Auth cannot handle (e.g. cron externally consumed
the refresh_token, or server-side session invalidation).

_run_http now asks MCPOAuthManager for the OAuth provider
instead of calling build_oauth_auth directly. Config-time,
runtime, and reconnect paths all share one provider instance
with pre-flow disk-watch active.

shutdown() defensively sets both events so there is no race
between reconnect and shutdown signalling.

Task 5 of 8.

* feat(mcp): detect auth failures in tool handlers, trigger reconnect

All 5 MCP tool handlers (tool call, list_resources, read_resource,
list_prompts, get_prompt) now detect auth failures and route
through MCPOAuthManager.handle_401:

  1. If the manager says recovery is viable (disk has fresh tokens,
     or SDK can refresh in-place), signal MCPServerTask._reconnect_event
     to tear down and rebuild the MCP session with fresh credentials,
     then retry the tool call once.

  2. If no recovery path exists, return a structured needs_reauth
     JSON error so the model stops hallucinating manual refresh
     attempts (the 'let me curl the token endpoint' loop Cthulhu
     pasted from Discord).

_is_auth_error catches OAuthFlowError, OAuthTokenError,
OAuthNonInteractiveError, and httpx.HTTPStatusError(401). Non-auth
exceptions still surface via the generic error path unchanged.

Task 6 of 8.

* feat(mcp-cli): route add/remove through manager, add 'hermes mcp login'

cmd_mcp_add and cmd_mcp_remove now go through MCPOAuthManager
instead of calling build_oauth_auth / remove_oauth_tokens
directly. This means CLI config-time state and runtime MCP
session state are backed by the same provider cache — removing
a server evicts the live provider, adding a server populates
the same cache the MCP session will read from.

New 'hermes mcp login <name>' command:
  - Wipes both the on-disk tokens file and the in-memory
    MCPOAuthManager cache
  - Triggers a fresh OAuth browser flow via the existing probe
    path
  - Intended target for the needs_reauth error Task 6 returns
    to the model

Task 7 of 8.

* test(mcp-oauth): end-to-end integration tests

Five new tests exercising the full consolidation with real file
I/O and real imports (no transport mocks):

  1. external_refresh_picked_up_without_restart — Cthulhu's cron
     workflow. External process writes fresh tokens to disk;
     on the next auth flow the manager's mtime-watch flips
     _initialized and the SDK re-reads from storage.

  2. handle_401_deduplicates_concurrent_callers — 10 concurrent
     handlers for the same failed token fire exactly ONE recovery
     attempt (thundering-herd protection).

  3. handle_401_returns_false_when_no_provider — defensive path
     for unknown servers.

  4. invalidate_if_disk_changed_handles_missing_file — pre-auth
     state returns False cleanly.

  5. provider_is_reused_across_reconnects — cache stickiness so
     reconnects preserve the disk-watch baseline mtime.

Task 8 of 8 — consolidation complete.
2026-04-16 21:57:10 -07:00
Brooklyn Nicholson
41d3d7afb7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 22:35:27 -05:00
Teknium
01906e99dd
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265)
* feat(image_gen): multi-model FAL support with picker in hermes tools

Adds 8 FAL text-to-image models selectable via `hermes tools` →
Image Generation → (FAL.ai | Nous Subscription) → model picker.

Models supported:
- fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP)
- fal-ai/flux-2-pro (previous default, kept backward-compat upscaling)
- fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN)
- fal-ai/nano-banana (Gemini 2.5 Flash Image)
- fal-ai/gpt-image-1.5 (with quality tier: low/medium/high)
- fal-ai/ideogram/v3 (best typography)
- fal-ai/recraft-v3 (vector, brand styles)
- fal-ai/qwen-image (LLM-based)

Architecture:
- FAL_MODELS catalog declares per-model size family, defaults, supports
  whitelist, and upscale flag. Three size families handled uniformly:
  image_size_preset (flux family), aspect_ratio (nano-banana), and
  gpt_literal (gpt-image-1.5).
- _build_fal_payload() translates unified inputs (prompt + aspect_ratio)
  into model-specific payloads, merges defaults, applies caller overrides,
  wires GPT quality_setting, then filters to the supports whitelist — so
  models never receive rejected keys.
- IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen
  providers (Replicate, Stability, etc.); each provider entry tags itself
  with imagegen_backend: 'fal' to select the right catalog.
- Upscaler (Clarity) defaults off for new models (preserves <1s value
  prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS.

Config:
  image_gen.model           = fal-ai/flux-2/klein/9b  (new)
  image_gen.quality_setting = medium                  (new, GPT only)
  image_gen.use_gateway     = bool                    (existing)

Agent-facing schema unchanged (prompt + aspect_ratio only) — model
choice is a user-level config decision, not an agent-level arg.

Picker uses curses_radiolist (arrow keys, auto numbered-fallback on
non-TTY). Column-aligned: Model / Speed / Strengths / Price.

Docs: image-generation.md rewritten with the model table and picker
walkthrough. tools-reference, tool-gateway, overview updated to drop
the stale "FLUX 2 Pro" wording.

Tests: 42 new in tests/tools/test_image_generation.py covering catalog
integrity, all 3 size families, supports filter, default merging, GPT
quality wiring, model resolution fallback. 8 new in
tests/hermes_cli/test_tools_config.py for picker wiring (registry,
config writes, GPT quality follow-up prompt, corrupt-config repair).

* feat(image_gen): translate managed-gateway 4xx to actionable error

When the Nous Subscription managed FAL proxy rejects a model with 4xx
(likely portal-side allowlist miss or billing gate), surface a clear
message explaining:
  1. The rejected model ID + HTTP status
  2. Two remediation paths: set FAL_KEY for direct access, or
     pick a different model via `hermes tools`

5xx, connection errors, and direct-FAL errors pass through unchanged
(those have different root causes and reasonable native messages).

Motivation: new FAL models added to this release (flux-2-klein-9b,
z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3,
qwen-image) are untested against the Nous Portal proxy. If the portal
allowlists model IDs, users on Nous Subscription will hit cryptic
4xx errors without guidance on how to work around it.

Tests: 8 new cases covering status extraction across httpx/fal error
shapes and 4xx-vs-5xx-vs-ConnectionError translation policy.

Docs: brief note in image-generation.md for Nous subscribers.

Operator action (Nous Portal side): verify that fal-queue-gateway
passes through these 7 new FAL model IDs. If the proxy has an
allowlist, add them; otherwise Nous Subscription users will see the
new translated error and fall back to direct FAL.

* feat(image_gen): pin GPT-Image quality to medium (no user choice)

Previously the tools picker asked a follow-up question for GPT-Image
quality tier (low / medium / high) and persisted the answer to
`image_gen.quality_setting`. This created two problems:

1. Nous Portal billing complexity — the 22x cost spread between tiers
   ($0.009 low / $0.20 high) forces the gateway to meter per-tier per
   user, which the portal team can't easily support at launch.
2. User footgun — anyone picking `high` by mistake burns through
   credit ~6x faster than `medium`.

This commit pins quality at medium by baking it into FAL_MODELS
defaults for gpt-image-1.5 and removes all user-facing override paths:

- Removed `_resolve_gpt_quality()` runtime lookup
- Removed `honors_quality_setting` flag on the model entry
- Removed `_configure_gpt_quality_setting()` picker helper
- Removed `_GPT_QUALITY_CHOICES` constant
- Removed the follow-up prompt call in `_configure_imagegen_model()`
- Even if a user manually edits `image_gen.quality_setting` in
  config.yaml, no code path reads it — always sends medium.

Tests:
- Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium
  (5 tests) — proves medium is baked in, config is ignored, flag is
  removed, helper is removed, non-gpt models never get quality.
- Replaced test_picker_with_gpt_image_also_prompts_quality with
  test_picker_with_gpt_image_does_not_prompt_quality — proves only 1
  picker call fires when gpt-image is selected (no quality follow-up).

Docs updated: image-generation.md replaces the quality-tier table
with a short note explaining the pinning decision.

* docs(image_gen): drop stale 'wires GPT quality tier' line from internals section

Caught in a cleanup sweep after pinning quality to medium. The
"How It Works Internally" walkthrough still described the removed
quality-wiring step.
2026-04-16 20:19:53 -07:00
Teknium
7fd508979e fix: harden sync_back — PID-suffix temp path, size cap, lifecycle guards
Follow-ups on top of kshitijk4poor's cherry-picked salvage of PR #8018:

tools/environments/daytona.py
  - PID-suffix /tmp/.hermes_sync.<pid>.tar so concurrent sync_back calls
    against the same sandbox don't collide on the remote temp path
  - Move sync_back() inside the cleanup lock and after the _sandbox-None
    guard, with its own try/except. Previously a no-op cleanup (sandbox
    already cleared) still fired sync_back → 3-attempt retry storm against
    a nil sandbox (~6s of sleep). Now short-circuits cleanly.

tools/environments/file_sync.py
  - Add _SYNC_BACK_MAX_BYTES (2 GiB) defensive cap: refuse to extract a
    tar larger than the limit. Protects against runaway sandboxes
    producing arbitrary-size archives.
  - Add 'nothing previously pushed' guard at the top of sync_back(). If
    _pushed_hashes and _synced_files are both empty, the FileSyncManager
    was never initialized from the host side — there is nothing coherent
    to sync back. Skips the retry/backoff machinery on uninitialized
    managers and eliminates test-suite slowdown from pre-existing cleanup
    tests that don't mock the sync layer.

tests/tools/test_file_sync_back.py
  - Update _make_manager helper to seed a _pushed_hashes entry by default
    so sync_back() exercises its real path. A seed_pushed_state=False
    opt-out is available for noop-path tests.
  - Add TestSyncBackSizeCap with positive and negative coverage of the
    new cap.

tests/tools/test_sync_back_backends.py
  - Update Daytona bulk download test to assert the PID-suffixed path
    pattern instead of the fixed /tmp/.hermes_sync.tar.
2026-04-16 19:39:21 -07:00
kshitijk4poor
d64446e315 feat(file-sync): sync remote changes back to host on teardown
Salvage of PR #8018 by @alt-glitch onto current main.

On sandbox teardown, FileSyncManager now downloads the remote .hermes/
directory, diffs against SHA-256 hashes of what was originally pushed,
and applies only changed files back to the host.

Core (tools/environments/file_sync.py):
- sync_back(): orchestrates download -> unpack -> diff -> apply with:
  - Retry with exponential backoff (3 attempts, 2s/4s/8s)
  - SIGINT trap + defer (prevents partial writes on Ctrl-C)
  - fcntl.flock serialization (concurrent gateway sandboxes)
  - Last-write-wins conflict resolution with warning
  - New remote files pulled back via _infer_host_path prefix matching

Backends:
- SSH: _ssh_bulk_download — tar cf - piped over SSH
- Modal: _modal_bulk_download — exec tar cf - -> proc.stdout.read
- Daytona: _daytona_bulk_download — exec tar cf -> SDK download_file
- All three call sync_back() at the top of cleanup()

Fixes applied during salvage (vs original PR #8018):

| # | Issue | Fix |
|---|-------|-----|
| C1 | import fcntl unconditional — crashes Windows | try/except with fallback; _sync_back_locked skips locking when fcntl=None |
| W1 | assert for runtime guard (stripped by -O) | Replaced with proper if/raise RuntimeError |
| W2 | O(n*m) from _get_files_fn() called per file | Cache mapping once at start of _sync_back_impl, pass to resolve/infer |
| W3 | Dead BulkDownloadFn imports in 3 backends | Removed unused imports |
| W4 | Modal hardcodes root/.hermes, no explanation | Added docstring comment explaining Modal always runs as root |
| S1 | SHA-256 computed for new files where pushed_hash=None | Skip hashing when pushed_hash is None (comparison always False) |
| S2 | Daytona /tmp/.hermes_sync.tar never cleaned up | Added rm -f after download (best-effort) |

Tests: 49 passing (17 new: _infer_host_path edge cases, SIGINT
main/worker thread, Windows fcntl=None fallback, Daytona tar cleanup).

Based on #8018 by @alt-glitch.
2026-04-16 19:39:21 -07:00
Brooklyn Nicholson
3746c60439 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 18:25:49 -05:00
Teknium
edefec4e68
fix(checkpoints): isolate shadow git repo from user's global config (#11261)
Users with 'commit.gpgsign = true' in their global git config got a
pinentry popup (or a failed commit) every time the agent took a
background filesystem snapshot — every write_file, patch, or diff
mid-session. With GPG_TTY unset, pinentry-qt/gtk would spawn a GUI
window, constantly interrupting the session.

The shadow repo is internal Hermes infrastructure.  It must not
inherit user-level git settings (signing, hooks, aliases, credential
helpers, etc.) under any circumstance.

Fix is layered:

1. _git_env() sets GIT_CONFIG_GLOBAL=os.devnull,
   GIT_CONFIG_SYSTEM=os.devnull, and GIT_CONFIG_NOSYSTEM=1.  Shadow
   git commands no longer see ~/.gitconfig or /etc/gitconfig at all
   (uses os.devnull for Windows compat).

2. _init_shadow_repo() explicitly writes commit.gpgsign=false and
   tag.gpgSign=false into the shadow's own config, so the repo is
   correct even if inspected or run against directly without the
   env vars, and for older git versions (<2.32) that predate
   GIT_CONFIG_GLOBAL.

3. _take() passes --no-gpg-sign inline on the commit call.  This
   covers existing shadow repos created before this fix — they will
   never re-run _init_shadow_repo (it is gated on HEAD not existing),
   so they would miss layer 2.  Layer 1 still protects them, but the
   inline flag guarantees correctness at the commit call itself.

Existing checkpoints, rollback, list, diff, and restore all continue
to work — history is untouched.  Users who had the bug stop getting
pinentry popups; users who didn't see no observable change.

Tests: 5 new regression tests in TestGpgAndGlobalConfigIsolation,
including a full E2E repro with fake HOME, global gpgsign=true, and
a deliberately broken GPG binary — checkpoint succeeds regardless.
2026-04-16 16:06:49 -07:00
Teknium
387aa9afc9
fix(approval): heartbeat activity during gateway approval wait (#11245)
The blocking gateway approval wait at tools/approval.py called
`entry.event.wait(timeout=...)` which never touched the agent's
activity tracker.  When a user was slow to respond to a /approve prompt
(or the gateway_timeout config was set higher than the default 300s),
the agent thread sat silent long enough for the gateway's inactivity
watchdog (agent.gateway_timeout, default 1800s) to kill it — even
though the agent was doing exactly the right thing and the user was
the one causing the delay.

The fix polls the event in 1s slices and calls touch_activity_if_due
between slices, mirroring the _wait_for_process() pattern in
tools/environments/base.py that covers the subprocess-waiting side of
the same problem.  At the default 10s heartbeat cadence, a 300s
approval wait now pings activity ~30 times, well under the 1800s
idle threshold.

Observed in community user logs: 12 repeated 'Agent idle 1800s,
last_activity=executing tool: terminal' events across April 12-14.
Companion to PR #10501 which covered streaming / concurrent-tool /
Modal-backend gaps but did not touch approval.py.

Test: tests/tools/test_approval_heartbeat.py — verifies (1) heartbeats
fire during the wait, (2) user responses are still near-instant, and
(3) the approval path stays functional when the heartbeat helper
can't be imported.
2026-04-16 14:48:50 -07:00
Teknium
fce6c3cdf6
feat(tts): add Google Gemini TTS provider (#11229)
Adds Google Gemini TTS as the seventh voice provider, with 30 prebuilt
voices (Zephyr, Puck, Kore, Enceladus, Gacrux, etc.) and natural-language
prompt control. Integrates through the existing provider chain:

- tools/tts_tool.py: new _generate_gemini_tts() calls the
  generativelanguage REST endpoint with responseModalities=[AUDIO],
  wraps the returned 24kHz mono 16-bit PCM (L16) in a WAV RIFF header,
  then ffmpeg-converts to MP3 or Opus depending on output extension.
  For .ogg output, libopus is forced explicitly so Telegram voice
  bubbles get Opus (ffmpeg defaults to Vorbis for .ogg).
- hermes_cli/tools_config.py: exposes 'Google Gemini TTS' as a provider
  option in the curses-based 'hermes tools' UI.
- hermes_cli/setup.py: adds gemini to the setup wizard picker, tool
  status display, and API key prompt branch (accepts existing
  GEMINI_API_KEY or GOOGLE_API_KEY, falls back to Edge if neither set).
- tests/tools/test_tts_gemini.py: 15 unit tests covering WAV header
  wrap correctness, env var fallback (GEMINI/GOOGLE), voice/model
  overrides, snake_case vs camelCase inlineData handling, HTTP error
  surfacing, and empty-audio edge cases.
- docs: TTS features page updated to list seven providers with the new
  gemini config block and ffmpeg notes.

Live-tested against api key against gemini-2.5-flash-preview-tts: .wav,
.mp3, and Telegram-compatible .ogg (Opus codec) all produce valid
playable audio.
2026-04-16 14:23:16 -07:00
Brooklyn Nicholson
cb2a737bc8 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 14:48:33 -05:00
emozilla
f188ac74f0 feat: ungate Tool Gateway — subscription-based access with per-tool opt-in
Replace the HERMES_ENABLE_NOUS_MANAGED_TOOLS env-var feature flag with
subscription-based detection. The Tool Gateway is now available to any
paid Nous subscriber without needing a hidden env var.

Core changes:
- managed_nous_tools_enabled() checks get_nous_auth_status() +
  check_nous_free_tier() instead of an env var
- New use_gateway config flag per tool section (web, tts, browser,
  image_gen) records explicit user opt-in and overrides direct API
  keys at runtime
- New prefers_gateway(section) shared helper in tool_backend_helpers.py
  used by all 4 tool runtimes (web, tts, image gen, browser)

UX flow:
- hermes model: after Nous login/model selection, shows a curses
  prompt listing all gateway-eligible tools with current status.
  User chooses to enable all, enable only unconfigured tools, or skip.
  Defaults to Enable for new users, Skip when direct keys exist.
- hermes tools: provider selection now manages use_gateway flag —
  selecting Nous Subscription sets it, selecting any other provider
  clears it
- hermes status: renamed section to Nous Tool Gateway, added
  free-tier upgrade nudge for logged-in free users
- curses_radiolist: new description parameter for multi-line context
  that survives the screen clear

Runtime behavior:
- Each tool runtime (web_tools, tts_tool, image_generation_tool,
  browser_use) checks prefers_gateway() before falling back to
  direct env-var credentials
- get_nous_subscription_features() respects use_gateway flags,
  suppressing direct credential detection when the user opted in

Removed:
- HERMES_ENABLE_NOUS_MANAGED_TOOLS env var and all references
- apply_nous_provider_defaults() silent TTS auto-set
- get_nous_subscription_explainer_lines() static text
- Override env var warnings (use_gateway handles this properly now)
2026-04-16 12:36:49 -07:00
Brooklyn Nicholson
9c71f3a6ea Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 10:47:41 -05:00
Teknium
45fc0bd83a
fix: UnboundLocalError on 'entry' in parallel subagent polling loop (#11050)
The completion-line printing block (idx = entry['task_index'] etc.)
was outside the 'for future in done:' loop but referenced 'entry'
which is only assigned inside that loop. When concurrent.futures.wait()
returns with an empty 'done' set (timeout expired, no futures finished),
the loop body never executes and 'entry' is unbound.

Moved the completion-line printing and spinner-update code inside
the for loop so each completed future gets its own status line,
and empty poll cycles simply loop back without accessing 'entry'.
2026-04-16 06:53:44 -07:00
kshitijk4poor
a6142a8e08 fix: follow-up for salvaged PR #10854
- Extract duplicated activity-callback polling into shared
  touch_activity_if_due() helper in tools/environments/base.py
- Use helper from both base.py _wait_for_process and
  code_execution_tool.py local polling loop (DRY)
- Add test assertion that timeout output field contains the
  timeout message and emoji (#10807)
- Add stream_consumer test for tool-boundary fallback scenario
  where continuation is empty but final_text differs from
  visible prefix (#10807)
2026-04-16 06:42:45 -07:00
konsisumer
3e3ec35a5e fix: surface execute_code timeout to user instead of silently dropping (#10807)
When execute_code times out, the result JSON had status="timeout" and an
error field, but the output field was empty.  Many models treat empty
output as "nothing happened" and produce an empty/minimal response.  The
gateway stream consumer then considers the response "already sent" (from
pre-tool streaming) and silently drops it — leaving the user staring at
silence.

Three changes:

1. Include the timeout message in the output field (both local and remote
   paths) so the model always has visible content to relay to the user.

2. Add periodic activity callbacks to the local execution polling loop so
   the gateway's inactivity monitor knows execute_code is alive during
   long runs.

3. Fix stream_consumer._send_fallback_final to not silently drop content
   when the continuation appears empty but the final text differs from
   what was previously streamed (e.g. after a tool boundary reset).
2026-04-16 06:42:45 -07:00
Brooklyn Nicholson
f81dba0da2 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 08:23:20 -05:00
Teknium
f726b9b843 fix(browser): runtime fallback to local Chromium when cloud provider fails
Wraps provider.create_session() in _get_session_info() with try/except
to catch cloud provider runtime failures (timeouts, auth errors, rate
limits, invalid responses). Falls back to _create_local_session() so
browser automation continues working when cloud APIs are down.

Marks fallback sessions with fallback_from_cloud, fallback_reason, and
fallback_provider metadata for observability. If both cloud and local
fail, raises RuntimeError with chained context from both errors.

Closes #10883
Co-authored-by: konsisumer <konsisumer@users.noreply.github.com>
2026-04-16 04:19:34 -07:00
Teknium
e66b373351
fix: word-wrap spinner, interruptable agent join, and delegate_task interrupt (#10940)
* fix: stop /model from silently rerouting direct providers to OpenRouter (#10300)

detect_provider_for_model() silently remapped models to OpenRouter when
the direct provider's credentials weren't found via env vars. Three bugs:

1. Credential check only looked at env vars from PROVIDER_REGISTRY,
   missing credential pool entries, auth store, and OAuth tokens
2. When env var check failed, silently returned ('openrouter', slug)
   instead of the direct provider the model actually belongs to
3. Users with valid credentials via non-env-var mechanisms (pool,
   OAuth, Claude Code tokens) got silently rerouted

Fix:
- Expand credential check to also query credential pool and auth store
- Always return the direct provider match regardless of credential
  status -- let client init handle missing creds with a clear error
  rather than silently routing through the wrong provider

Same philosophy as the provider-required fix: don't guess, don't
silently reroute, error clearly when something is missing.

Closes #10300

* fix: word-wrap spinner, interruptable agent join, and delegate_task interrupt

Three fixes:

1. Spinner widget clips long tool commands — prompt_toolkit Window had
   height=1 and wrap_lines=False. Now uses wrap_lines=True with dynamic
   height from text length / terminal width. Long commands wrap naturally.

2. agent_thread.join() blocked forever after interrupt — if the agent
   thread took time to clean up, the process_loop thread froze. Now polls
   with 0.2s timeout on the interrupt path, checking _should_exit so
   double Ctrl+C breaks out immediately.

3. Root cause of 5-hour CLI hang: delegate_task() used as_completed()
   with no interrupt check. When subagent children got stuck, the parent
   blocked forever inside the ThreadPoolExecutor. Now polls with
   wait(timeout=0.5) and checks parent_agent._interrupt_requested each
   iteration. Stuck children are reported as interrupted, and the parent
   returns immediately.
2026-04-16 03:50:49 -07:00
Markus Corazzione
c928ebb1b1 retry transient telegram send failures 2026-04-16 03:47:00 -07:00
Teknium
e4cd62d07d
fix(tests): resolve remaining CI failures — commit_memory_session, already_sent, timezone leak, session env (#10785)
Fixes 12 CI test failures:

1. test_cli_new_session (4): _FakeAgent missing commit_memory_session
   attribute added in the memory provider refactoring. Added MagicMock.

2. test_run_progress_topics (1): already_sent detection only checked
   stream consumer flags, missing the response_previewed path from
   interim_assistant_callback. Restructured guard to check both paths.

3. test_timezone (1): HERMES_TIMEZONE leaked into child processes via
   _SAFE_ENV_PREFIXES matching HERMES_*. The code correctly converts
   it to TZ but didn't remove the original. Added child_env.pop().

4. test_session_env (1): contextvars baseline captured from a different
   context couldn't be restored after clear. Changed assertion to verify
   the test's value was removed rather than comparing to a fragile baseline.

5. test_discord_slash_commands (5): already fixed on current main.
2026-04-16 02:26:14 -07:00
Teknium
0c1217d01e feat(xai): upgrade to Responses API, add TTS provider
Cherry-picked and trimmed from PR #10600 by Jaaneek.

- Switch xAI transport from openai_chat to codex_responses (Responses API)
- Add codex_responses detection for xAI in all runtime_provider resolution paths
- Add xAI api_mode detection in AIAgent.__init__ (provider name + URL auto-detect)
- Add extra_headers passthrough for codex_responses requests
- Add x-grok-conv-id session header for xAI prompt caching
- Add xAI reasoning support (encrypted_content include, no effort param)
- Move x-grok-conv-id from chat_completions path to codex_responses path
- Add xAI TTS provider (dedicated /v1/tts endpoint with Opus conversion)
- Add xAI provider aliases (grok, x-ai, x.ai) across auth, models, providers, auxiliary
- Trim xAI model list to agentic models (grok-4.20-reasoning, grok-4-1-fast-reasoning)
- Add XAI_API_KEY/XAI_BASE_URL to OPTIONAL_ENV_VARS
- Add xAI TTS config section, setup wizard entry, tools_config provider option
- Add shared xai_http.py helper for User-Agent string

Co-authored-by: Jaaneek <Jaaneek@users.noreply.github.com>
2026-04-16 02:24:08 -07:00
Teknium
3ff18ffe14
fix: add circuit breaker to MCP tool handler to prevent retry burn loops (#10447) (#10776)
When an MCP server returns errors consistently (crashed, disconnected,
auth expired), the model sees each error and retries the tool call.
With no circuit breaker, this burned through all 90 iterations — each
one a full LLM API call plus failed MCP call — producing 15-45 minutes
of zero useful output while the gateway inactivity timeout never fired
(because the agent WAS active, just uselessly).

Fix: track consecutive error counts per MCP server. After 3 consecutive
failures (connection errors, MCP-level errors, or transport exceptions),
the handler short-circuits with a message telling the model to stop
retrying and use alternative approaches. The counter resets to 0 on
any successful call.

Closes #10447
2026-04-15 22:33:48 -07:00
Kovyrin Family Claw
00ff9a26cd Fix Telegram link preview suppression for bot sends 2026-04-15 17:54:43 -07:00
jneeee
4936b19144 fix(cron): guard telegram import in _send_to_platform against ImportError
Wrap the TelegramAdapter import in _send_to_platform() with a try/except
ImportError guard, matching the existing Feishu pattern in the same function.

When python-telegram-bot is not installed, the import no longer crashes the
cron scheduler. Instead, MAX_MESSAGE_LENGTH falls back to a hardcoded 4096.

The _send_telegram() function already had its own ImportError guard for the
telegram package; this fixes the remaining bare import of TelegramAdapter
in the platform-routing function.
2026-04-15 17:54:33 -07:00
Teknium
c850a40e4e fix: gate Matrix adapter path on media_files presence
Text-only Matrix sends should continue using the lightweight _send_matrix()
HTTP helper (~100ms). Only route through the heavy MatrixAdapter (full sync +
E2EE setup) when media files are present. Adds test verifying text-only
messages don't take the adapter path.
2026-04-15 17:37:43 -07:00
Teknium
276ed5c399 fix(send_message): deliver Matrix media via adapter
Matrix media delivery was silently dropped by send_message because Matrix
wasn't wired into the native adapter-backed media path. Only Telegram,
Discord, and Weixin had native media support.

Adds _send_matrix_via_adapter() which creates a MatrixAdapter instance,
connects, sends text + media via the adapter's native upload methods
(send_document, send_image_file, send_video, send_voice), then disconnects.

Also fixes a stale URL-encoding assertion in test_send_message_missing_platforms
that broke after PR #10151 added quote() to room IDs.

Cherry-picked from PR #10486 by helix4u.
2026-04-15 17:37:43 -07:00
handsdiff
933fbd8fea fix: prevent agent hang when backgrounding processes via terminal tool
bash -lic with a PTY enables job control (set -m), which waits for all
background jobs before the shell exits. A command like
`python3 -m http.server &>/dev/null &` hangs forever because the shell
never completes.

Prefix `set +m;` to disable job control while keeping -i for .bashrc
sourcing and PTY for interactive tools.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-15 17:26:31 -07:00
Teknium
4fdcae6c91
fix: use absolute skill_dir for external skills (#10313) (#10587)
_load_skill_payload() reconstructed skill_dir as SKILLS_DIR / relative_path,
which is wrong for external skills from skills.external_dirs — they live
outside SKILLS_DIR entirely. Scripts and linked files failed to load.

Fix: skill_view() now includes the absolute skill_dir in its result dict.
_load_skill_payload() uses that directly when available, falling back to
the SKILLS_DIR-relative reconstruction only for legacy responses.

Closes #10313
2026-04-15 17:22:55 -07:00
shin4
63d045b51a
fix: pass HERMES_HOME to execute_code subprocess (#6644)
Add "HERMES_" to _SAFE_ENV_PREFIXES in code_execution_tool.py so HERMES_HOME and other Hermes env vars pass through to execute_code subprocesses. Fixes vision_analyze and other tools that rely on get_hermes_home() failing in Docker environments with non-default HERMES_HOME.

Authored by @shin4.
2026-04-15 17:13:11 -07:00
Brooklyn Nicholson
097702c8a7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 19:11:07 -05:00
Teknium
e402906d48
fix: five HERMES_HOME profile-isolation leaks (#10570)
* fix: show correct env var name in provider API key error (#9506)

The error message for missing provider API keys dynamically built
the env var name as PROVIDER_API_KEY (e.g. ALIBABA_API_KEY), but
some providers use different names (alibaba uses DASHSCOPE_API_KEY).
Users following the error message set the wrong variable.

Fix: look up the actual env var from PROVIDER_REGISTRY before
building the error. Falls back to the dynamic name if the registry
lookup fails.

Closes #9506

* fix: five HERMES_HOME profile-isolation leaks (#5947)

Bug A: Thread session_title from session_db to memory provider init kwargs
so honcho can derive chat-scoped session keys instead of falling back to
cwd-based naming that merges all gateway users into one session.

Bug B: Replace 14 hardcoded ~/.hermes/skills/ paths across 10 skill files
with HERMES_HOME-aware alternatives (${HERMES_HOME:-$HOME/.hermes} in
shell, os.environ.get('HERMES_HOME', ...) in Python).

Bug C: install.sh now respects HERMES_HOME env var and adds --hermes-home
flag. Previously --dir only set INSTALL_DIR while HERMES_HOME was always
hardcoded to $HOME/.hermes.

Bug D: Remove hardcoded ~/.hermes/honcho.json fallback in resolve_config_path().
Non-default profiles no longer silently inherit the default profile's honcho
config. Falls through to ~/.honcho/config.json (global) instead.

Bug E: Guard _edit_skill, _patch_skill, _delete_skill, _write_file, and
_remove_file against writing to skills found in external_dirs. Skills
outside the local SKILLS_DIR are now read-only from the agent's perspective.

Closes #5947
2026-04-15 17:09:41 -07:00
Brooklyn Nicholson
72aebfbb24 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 17:43:41 -05:00
Ruzzgar
de3f8bc6ce fix terminal workdir validation for Windows paths 2026-04-15 15:06:51 -07:00
Teknium
18396af31e
fix: handle cross-device shutil.move failure in tirith auto-install (#10127) (#10524)
_install_tirith() uses shutil.move() to place the binary from tmpdir
to ~/.hermes/bin/.  When these are on different filesystems (common in
Docker, NFS), shutil.move() falls back to copy2 + unlink, but copy2's
metadata step can raise PermissionError.  This exception propagated
past the fail_open guard, crashing the terminal tool entirely.

Additionally, a failed install could leave a non-executable tirith
binary at the destination, causing a retry loop on every subsequent
terminal command.

Fix:
- Catch OSError from shutil.move() and fall back to shutil.copy()
  (skips metadata/xattr copying that causes PermissionError)
- If even copy fails, clean up the partial dest file to prevent
  the non-executable retry loop
- Return (None, 'cross_device_copy_failed') so the failure routes
  through the existing install-failure caching and fail_open logic

Closes #10127
2026-04-15 14:50:07 -07:00
Brooklyn Nicholson
baa0de7649 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 16:35:01 -05:00
Teknium
305a702e09
fix: /browser connect CDP override now takes priority over Camofox (#10523)
When a user runs /browser connect to attach browser tools to their real
Chrome instance via CDP, the BROWSER_CDP_URL env var is set. However,
every browser tool function checks _is_camofox_mode() first, which
short-circuits to the Camofox backend before _get_session_info() ever
checks for the CDP override.

Fix: is_camofox_mode() now returns False when BROWSER_CDP_URL is set,
so the explicit CDP connection takes priority. This is the correct
behavior — /browser connect is an intentional user override.

Reported by SkyLinx on Discord.
2026-04-15 14:11:18 -07:00
Teknium
824c33729d
fix(session_search): coerce limit to int to prevent TypeError with non-int values (#10522)
Models (especially open-source like qwen3.5-plus) may send non-int values
for the limit parameter — None (JSON null), string, or even a type object.
This caused TypeError: '<=' not supported between instances of 'int' and
'type' when the value reached min()/comparison operations.

Changes:
- Add defensive int coercion at session_search() entry with fallback to 3
- Clamp limit to [1, 5] range (was only capped at 5, not floored)
- Add tests for None, type object, string, negative, and zero limit values

Reported by community user ludoSifu via Discord.
2026-04-15 14:11:05 -07:00
Teknium
861efe274b
fix: add ensure_ascii=False to all MCP json.dumps calls (#10234) (#10512)
Python's json.dumps() defaults to ensure_ascii=True, escaping non-ASCII
characters to \uXXXX sequences.  For CJK characters this inflates
token count 3-4x — a single Chinese character like '中' becomes
'\u4e2d' (6 chars vs 3 bytes, ~6 tokens vs ~1 token).

Since MCP tool results feed directly into the model's conversation
context, this silently multiplied API costs for Chinese, Japanese,
and Korean users.

Fix: add ensure_ascii=False to all 20 json.dumps calls in mcp_tool.py.
Raw UTF-8 is valid JSON per RFC 8259 and all downstream consumers
(LLM APIs, display) handle it correctly.

Closes #10234
2026-04-15 13:59:57 -07:00
Teknium
a418ddbd8b
fix: add activity heartbeats to prevent false gateway inactivity timeouts (#10501)
Multiple gaps in activity tracking could cause the gateway's inactivity
timeout to fire while the agent is actively working:

1. Streaming wait loop had no periodic heartbeat — the outer thread only
   touched activity when the stale-stream detector fired (180-300s), and
   for local providers (Ollama) the stale timeout was infinity, meaning
   zero heartbeats. Now touches activity every 30s.

2. Concurrent tool execution never set the activity callback on worker
   threads (threading.local invisible across threads) and never set
   _current_tool. Workers now set the callback, and the concurrent wait
   uses a polling loop with 30s heartbeats.

3. Modal backend's execute() override had its own polling loop without
   any activity callback. Now matches _wait_for_process cadence (10s).
2026-04-15 13:29:05 -07:00
Brooklyn Nicholson
53a024a941 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 14:37:54 -05:00
Brooklyn Nicholson
4b4b4d47bc feat: just more cleaning 2026-04-15 14:14:01 -05:00
kshitijk4poor
2276b72141 fix: follow-up improvements for watch notification routing (#9537)
- Populate watcher_* routing fields for watch-only processes (not just
  notify_on_complete), so watch-pattern events carry direct metadata
  instead of relying solely on session_key parsing fallback
- Extract _parse_session_key() helper to dedupe session key parsing
  at two call sites in gateway/run.py
- Add negative test proving cross-thread leakage doesn't happen
- Add edge-case tests for _build_process_event_source returning None
  (empty evt, invalid platform, short session_key)
- Add unit tests for _parse_session_key helper
2026-04-15 11:16:01 -07:00
etcircle
dee592a0b1 fix(gateway): route synthetic background events by session 2026-04-15 11:16:01 -07:00
Brooklyn Nicholson
371166fe26 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 10:21:00 -05:00
Teknium
722331a57d
fix: replace hardcoded ~/.hermes with display_hermes_home() in agent-facing text (#10285)
Tool schema descriptions and tool return values contained hardcoded
~/.hermes paths that the model sees and uses. When HERMES_HOME is set
to a custom path (Docker containers, profiles), the agent would still
reference ~/.hermes — looking at the wrong directory.

Fixes 6 locations across 5 files:
- tools/tts_tool.py: output_path schema description
- tools/cronjob_tools.py: script path schema description
- tools/skill_manager_tool.py: skill_manage schema description
- tools/skills_tool.py: two tool return messages
- agent/skill_commands.py: skill config injection text

All now use display_hermes_home() which resolves to the actual
HERMES_HOME path (e.g. /opt/data for Docker, ~/.hermes/profiles/X
for profiles, ~/.hermes for default).

Reported by: Sandeep Narahari (PrithviDevs)
2026-04-15 04:57:55 -07:00
Teknium
47e6ea84bb fix: file handle bug, warning text, and tests for Discord media send
- Fix file handle closed before POST: nest session.post() inside
  the 'with open()' block so aiohttp can read the file during upload
- Update warning text to include weixin (also supports media delivery)
- Add 8 unit tests covering: text+media, media-only, missing files,
  upload failures, multiple files, and _send_to_platform routing
2026-04-15 04:16:06 -07:00
sprmn24
4bcb2f2d26 feat(send_message): add native media attachment support for Discord
Previously send_message only supported media delivery for Telegram.
Discord users received a warning that media was omitted.

- Add media_files parameter to _send_discord()
- Upload media via Discord multipart/form-data API (files[0] field)
- Handle Discord in _send_to_platform() same way as Telegram block
- Remove Discord from generic chunk loop (now handled above)
- Update error/warning strings to mention telegram and discord
2026-04-15 04:16:06 -07:00
Teknium
1c4d3216d3
fix(cron): include job_id in delivery and guide models on removal workflow (#10242)
* fix(gateway): suppress duplicate replies on interrupt and streaming flood control

Three fixes for the duplicate reply bug affecting all gateway platforms:

1. base.py: Suppress stale response when the session was interrupted by a
   new message that hasn't been consumed yet. Checks both interrupt_event
   and _pending_messages to avoid false positives. (#8221, #2483)

2. run.py (return path): Remove response_previewed guard from already_sent
   check. Stream consumer's already_sent alone is authoritative — if
   content was delivered via streaming, the duplicate send must be
   suppressed regardless of the agent's response_previewed flag. (#8375)

3. run.py (queued-message path): Same fix — already_sent without
   response_previewed now correctly marks the first response as already
   streamed, preventing re-send before processing the queued message.

The response_previewed field is still produced by the agent (run_agent.py)
but is no longer required as a gate for duplicate suppression. The stream
consumer's already_sent flag is the delivery-level truth about what the
user actually saw.

Concepts from PR #8380 (konsisumer). Closes #8375, #8221, #2483.

* fix(cron): include job_id in delivery and guide models on removal workflow

Users reported cron reminders keep firing after asking the agent to stop.
Root cause: the conversational agent didn't know the job_id (not in delivery)
and models don't reliably do the list→remove two-step without guidance.

1. Include job_id in the cron delivery wrapper so users and agents can
   reference it when requesting removal.

2. Replace confusing footer ('The agent cannot see this message') with
   actionable guidance ('To stop or manage this job, send me a new
   message').

3. Add explicit list→remove guidance in the cronjob tool schema so models
   know to list first and never guess job IDs.
2026-04-15 03:46:58 -07:00
Teknium
e69526be79
fix(send_message): URL-encode Matrix room IDs and add Matrix to schema examples (#10151)
Matrix room IDs contain ! and : which must be percent-encoded in URI
path segments per the Matrix C-S spec. Without encoding, some
homeservers reject the PUT request.

Also adds 'matrix:!roomid:server.org' and 'matrix:@user:server.org'
to the tool schema examples so models know the correct target format.
2026-04-15 00:10:59 -07:00
bkadish
03446e06bb fix(send_message): accept Matrix room IDs and user MXIDs as explicit targets
`_parse_target_ref` has explicit-reference branches for Telegram, Feishu,
and numeric IDs, but none for Matrix. As a result, callers of
`send_message(target="matrix:!roomid:server")` or
`send_message(target="matrix:@user:server")` fall through to
`(None, None, False)` and the tool errors out with a resolution failure —
even though a raw Matrix room ID or MXID is the most unambiguous possible
target.

Three-line fix: recognize `!…` as a room ID and `@…` as a user MXID when
platform is `matrix`, and return them as explicit targets. Alias-based
targets (`#…`) continue to go through the normal resolve path.
2026-04-15 00:08:14 -07:00
Brooklyn Nicholson
561cea0d4a Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 00:02:31 -05:00
Teknium
8548893d14
feat: entry-level Podman support — find_docker() + rootless entrypoint (#10066)
- find_docker() now checks HERMES_DOCKER_BINARY env var first, then
  docker on PATH, then podman on PATH, then macOS known locations
- Entrypoint respects HERMES_HOME env var (was hardcoded to /opt/data)
- Entrypoint uses groupmod -o to tolerate non-unique GIDs (fixes macOS
  GID 20 conflict with Debian's dialout group)
- Entrypoint makes chown best-effort so rootless Podman continues
  instead of failing with 'Operation not permitted'
- 5 new tests covering env var override, podman fallback, precedence

Based on work by alanjds (PR #3996) and malaiwah (PR #8115).
Closes #4084.
2026-04-14 21:20:37 -07:00
Teknium
ba24f058ed docs: fix stale docstring reference to _discover_tools in mcp_tool.py 2026-04-14 21:12:29 -07:00
Teknium
fc6cb5b970 fix: tighten AST check to module-level only
The original tree-wide ast.walk() would match registry.register() calls
inside functions too. Restrict to top-level ast.Expr statements so helper
modules that call registry.register() inside a function are never picked
up as tool modules.
2026-04-14 21:12:29 -07:00
Greer Guthrie
4b2a1a4337 fix(tools): auto-discover built-in tool modules 2026-04-14 21:12:29 -07:00
Brooklyn Nicholson
77cd5bf565 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 19:33:03 -05:00
Greer Guthrie
c10fea8d26 fix(mcp): make server aliases explicit 2026-04-14 17:19:20 -07:00
Greer Guthrie
cda64a5961 fix(mcp): resolve toolsets from live registry 2026-04-14 17:19:20 -07:00
adybag14-cyber
56c34ac4f7 fix(browser): add termux PATH fallbacks
Refactor browser tool PATH construction to include Termux directories
(/data/data/com.termux/files/usr/bin, /data/data/com.termux/files/usr/sbin)
so agent-browser and npx are discoverable on Android/Termux.

Extracts _browser_candidate_path_dirs() and _merge_browser_path() helpers
to centralize PATH construction shared between _find_agent_browser() and
_run_browser_command(), replacing duplicated inline logic.

Also fixes os.pathsep usage (was hardcoded ':') for cross-platform correctness.

Cherry-picked from PR #9846.
2026-04-14 16:55:55 -07:00
Brooklyn Nicholson
bf54f1fb2f Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 18:26:05 -05:00
Teknium
1525624904 fix: block agent from self-destructing gateway via terminal (#6666)
Add dangerous command patterns that require approval when the agent
tries to run gateway lifecycle commands via the terminal tool:

- hermes gateway stop/restart — kills all running agents mid-work
- hermes update — pulls code and restarts the gateway
- systemctl restart/stop (with optional flags like --user)

These patterns fire the approval prompt so the user must explicitly
approve before the agent can kill its own gateway process. In YOLO
mode, the commands run without approval (by design — YOLO means the
user accepts all risks).

Also fixes the existing systemctl pattern to handle flags between
the command and action (e.g. 'systemctl --user restart' was previously
undetected because the regex expected the action immediately after
'systemctl').

Root cause: issue #6666 reported agents running 'hermes gateway
restart' via terminal, killing the gateway process mid-agent-loop.
The user sees the agent suddenly stop responding with no explanation.
Combined with the SIGTERM auto-recovery from PR #9875, the gateway
now both prevents accidental self-destruction AND recovers if it
happens anyway.

Test plan:
- Updated test_systemctl_restart_not_flagged → test_systemctl_restart_flagged
- All 119 approval tests pass
- E2E verified: hermes gateway restart, hermes update, systemctl
  --user restart all detected; hermes gateway status, systemctl
  status remain safe
2026-04-14 15:43:31 -07:00
Teknium
eed891f1bb
security: supply chain hardening — CI pinning, dep pinning, and code fixes (#9801)
CI/CD Hardening:
- Pin all 12 GitHub Actions to full commit SHAs (was mutable @vN tags)
- Add explicit permissions: {contents: read} to 4 workflows
- Pin CI pip installs to exact versions (pyyaml==6.0.2, httpx==0.28.1)
- Extend supply-chain-audit.yml to scan workflow, Dockerfile, dependency
  manifest, and Actions version changes

Dependency Pinning:
- Pin git-based Python deps to commit SHAs (atroposlib, tinker, yc-bench)
- Pin WhatsApp Baileys from mutable branch to commit SHA

Tool Registry:
- Reject tool name shadowing from different tool families (plugins/MCP
  cannot overwrite built-in tools). MCP-to-MCP overwrites still allowed.

MCP Security:
- Add tool description content scanning for prompt injection patterns
- Log detailed change diff on dynamic tool refresh at WARNING level

Skill Manager:
- Fix dangerous verdict bug: agent-created skills with dangerous
  findings were silently allowed (ask->None->allow). Now blocked.
2026-04-14 14:23:37 -07:00
N0nb0at
b21b3bfd68 feat(plugins): namespaced skill registration for plugin skill bundles
Add ctx.register_skill() API so plugins can ship SKILL.md files under
a 'plugin:skill' namespace, preventing name collisions with built-in
Hermes skills. skill_view() detects the ':' separator and routes to
the plugin registry while bare names continue through the existing
flat-tree scan unchanged.

Key additions:
- agent/skill_utils: parse_qualified_name(), is_valid_namespace()
- hermes_cli/plugins: PluginContext.register_skill(), PluginManager
  skill registry (find/list/remove)
- tools/skills_tool: qualified name dispatch in skill_view(),
  _serve_plugin_skill() with full guards (disabled, platform,
  injection scan), bundle context banner with sibling listing,
  stale registry self-heal
- Hoisted _INJECTION_PATTERNS to module level (dedup)
- Updated skill_view schema description

Based on PR #9334 by N0nb0at. Lean P1 salvage — omits autogen shim
(P2) for a simpler first merge.

Closes #8422
2026-04-14 10:42:58 -07:00
Teknium
0e7dd30acc
fix(browser): fix Camofox JS eval endpoint, userId, and package rename (#9774)
- Fix _camofox_eval() endpoint: /tabs/{id}/eval → /tabs/{id}/evaluate
  (correct Camofox REST API path)
- Add required userId field to JS eval request body (all other Camofox
  endpoints already include it)
- Update npm package from @askjo/camoufox-browser ^1.0.0 to
  @askjo/camofox-browser ^1.5.2 (upstream package was renamed)
- Update tools_config.py post-setup to reference new package directory
  and npx command
- Bump Node engine requirement from >=18 to >=20 (required by
  camoufox-js dependency in camofox-browser v1.5.2)
- Regenerate package-lock.json

Fixes issues reported in PRs #9472, #8267, #7208 (stale).
2026-04-14 10:21:54 -07:00
Teknium
5f36b42b2e fix: nest msvcrt import inside fcntl except block
Match cron/scheduler.py pattern — only attempt msvcrt import when
fcntl is unavailable. Pre-declare msvcrt = None at module level so
_file_lock() references don't NameError on Linux.
2026-04-14 10:18:05 -07:00
Dusk1e
420d27098f fix(tools): keep memory tool available when fcntl is unavailable 2026-04-14 10:18:05 -07:00
Brooklyn Nicholson
9a3a2925ed feat: scroll aware sticky prompt 2026-04-14 11:49:32 -05:00
Teknium
2558d28a9b
fix: resolve CI test failures — add missing functions, fix stale tests (#9483)
Production fixes:
- Add clear_session_context() to hermes_logging.py (fixes 48 teardown errors)
- Add clear_session() to tools/approval.py (fixes 9 setup errors)
- Add SyncError M_UNKNOWN_TOKEN check to Matrix _sync_loop (bug fix)
- Fall back to inline api_key in named custom providers when key_env
  is absent (runtime_provider.py)

Test fixes:
- test_memory_user_id: use builtin+external provider pair, fix honcho
  peer_name override test to match production behavior
- test_display_config: remove TestHelpers for non-existent functions
- test_auxiliary_client: fix OAuth tokens to match _is_oauth_token
  patterns, replace get_vision_auxiliary_client with resolve_vision_provider_client
- test_cli_interrupt_subagent: add missing _execution_thread_id attr
- test_compress_focus: add model/provider/api_key/base_url/api_mode
  to mock compressor
- test_auth_provider_gate: add autouse fixture to clean Anthropic env
  vars that leak from CI secrets
- test_opencode_go_in_model_list: accept both 'built-in' and 'hermes'
  source (models.dev API unavailable in CI)
- test_email: verify email Platform enum membership instead of source
  inspection (build_channel_directory now uses dynamic enum loop)
- test_feishu: add bot_added/bot_deleted handler mocks to _Builder
- test_ws_auth_retry: add AsyncMock for sync_store.get_next_batch,
  add _pending_megolm and _joined_rooms to Matrix adapter mocks
- test_restart_drain: monkeypatch-delete INVOCATION_ID (systemd sets
  this in CI, changing the restart call signature)
- test_session_hygiene: add user_id to SessionSource
- test_session_env: use relative baseline for contextvar clear check
  (pytest-xdist workers share context)
2026-04-14 01:43:45 -07:00
Teknium
8d545da3ff fix: add platform lock, send retry, message splitting, REST one-shot, shared strip_markdown
Improvements from our earlier #8269 salvage work applied to #7616:

- Platform token lock: acquire_scoped_lock/release_scoped_lock prevents
  two profiles from double-connecting the same QQ bot simultaneously
- Send retry with exponential backoff (3 attempts, 1s/2s/4s) with
  permanent vs transient error classification (matches Telegram pattern)
- Proper long-message splitting via truncate_message() instead of
  hard-truncating at MAX_MESSAGE_LENGTH (preserves code blocks, adds 1/N)
- REST-based one-shot send in send_message_tool — uses QQ Bot REST API
  directly with httpx instead of creating a full WebSocket adapter per
  message (fixes the connect→send race condition)
- Use shared strip_markdown() from helpers.py instead of 15 lines of
  inline regex with import-inside-method (DRY, same as BlueBubbles/SMS)
- format_message() now wired into send() pipeline
2026-04-14 00:11:49 -07:00
walli
884cd920d4 feat(gateway): unify QQBot branding, add PLATFORM_HINTS, fix streaming, restore missing setup functions
- Rename platform from 'qq' to 'qqbot' across all integration points
  (Platform enum, toolset, config keys, import paths, file rename qq.py → qqbot.py)
- Add PLATFORM_HINTS for QQBot in prompt_builder (QQ supports markdown)
- Set SUPPORTS_MESSAGE_EDITING = False to skip streaming on QQ
  (prevents duplicate messages from non-editable partial + final sends)
- Add _send_qqbot() standalone send function for cron/send_message tool
- Add interactive _setup_qq() wizard in hermes_cli/setup.py
- Restore missing _setup_signal/email/sms/dingtalk/feishu/wecom/wecom_callback
  functions that were lost during the original merge
2026-04-14 00:11:49 -07:00
Junjun Zhang
87bfc28e70 feat: add QQ Bot platform adapter (Official API v2)
Add full QQ Bot integration via the Official QQ Bot API (v2):
- WebSocket gateway for inbound events (C2C, group, guild, DM)
- REST API for outbound text/markdown/media messages
- Voice transcription (Tencent ASR + configurable STT provider)
- Attachment processing (images, voice, files)
- User authorization (allowlist + allow-all + DM pairing)

Integration points:
- gateway: Platform.QQ enum, adapter factory, allowlist maps
- CLI: setup wizard, gateway config, status display, tools config
- tools: send_message cross-platform routing, toolsets
- cron: delivery platform support
- docs: QQ Bot setup guide
2026-04-14 00:11:49 -07:00
Teknium
eb44abd6b1
feat: improve file search UX — fuzzy @ completions, mtime sorting, better suggestions (#9467)
Three improvements to file search based on user feedback:

1. Fuzzy @ completions (commands.py):
   - Bare @query now does project-wide fuzzy file search instead of
     prefix-only directory listing
   - Uses rg --files with 5-second cache for responsive completions
   - Scoring: exact name (100) > prefix (80) > substring (60) >
     path contains (40) > subsequence with boundary bonus (35/25)
   - Bare @ with no query shows recently modified files first

2. Mtime-sorted file search (file_operations.py):
   - _search_files_rg now uses --sortr=modified (rg 13+) to surface
     recently edited files first
   - Falls back to unsorted on older rg versions

3. Improved file-not-found suggestions (file_operations.py):
   - Replaced crude character-set overlap with ranked scoring:
     same basename (90) > prefix (70) > substring (60) >
     reverse substring (40) > same extension (30)
   - search_files path-not-found now suggests similar directories
     from the parent
2026-04-13 23:54:45 -07:00
Greer Guthrie
c7e2fe655a fix: make tool registry reads thread-safe 2026-04-13 23:52:32 -07:00
helix4u
e08590888a fix: honor interrupts during MCP tool waits 2026-04-13 22:14:55 -07:00
haileymarshall
f0b353bade feat(skills): add fitness-nutrition skill to optional-skills
Cherry-picked from PR #9177 by @haileymarshall.

Adds a fitness and nutrition skill for gym-goers and health-conscious users:
- Exercise search via wger API (690+ exercises, free, no auth)
- Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback)
- Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %)
- Pure stdlib Python, no pip dependencies

Changes from original PR:
- Moved from skills/ to optional-skills/health/ (correct location)
- Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5)
- Fixed author attribution to match PR submitter
- Marked USDA_API_KEY as optional (DEMO_KEY works without signup)

Also adds optional env var support to the skill readiness checker:
- New 'optional: true' field in required_environment_variables entries
- Optional vars are preserved in metadata but don't block skill readiness
- Optional vars skip the CLI capture prompt flow
- Skills with only optional missing vars show as 'available' not 'setup_needed'
2026-04-13 22:10:00 -07:00
Brooklyn Nicholson
1b573b7b21 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 21:17:41 -05:00
Teknium
f324222b79
fix: add vLLM/local server error patterns + MCP initial connection retry (#9281)
Port two improvements inspired by Kilo-Org/kilocode analysis:

1. Error classifier: add context overflow patterns for vLLM, Ollama,
   and llama.cpp/llama-server. These local inference servers return
   different error formats than cloud providers (e.g., 'exceeds the
   max_model_len', 'context length exceeded', 'slot context'). Without
   these patterns, context overflow errors from local servers are
   misclassified as format errors, causing infinite retries instead
   of triggering compression.

2. MCP initial connection retry: previously, if the very first
   connection attempt to an MCP server failed (e.g., transient DNS
   blip at startup), the server was permanently marked as failed with
   no retry. Post-connect reconnection had 5 retries with exponential
   backoff, but initial connection had zero. Now initial connections
   retry up to 3 times with backoff before giving up, matching the
   resilience of post-connect reconnection.
   (Inspired by Kilo Code's MCP server disappearing fix in v1.3.3)

Tests: 6 new error classifier tests, 4 new MCP retry tests, 1
updated existing test. All 276 affected tests pass.
2026-04-13 18:46:14 -07:00
Brooklyn Nicholson
7e4dd6ea02 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 18:32:13 -05:00
Teknium
8d023e43ed
refactor: remove dead code — 1,784 lines across 77 files (#9180)
Deep scan with vulture, pyflakes, and manual cross-referencing identified:
- 41 dead functions/methods (zero callers in production)
- 7 production-dead functions (only test callers, tests deleted)
- 5 dead constants/variables
- ~35 unused imports across agent/, hermes_cli/, tools/, gateway/

Categories of dead code removed:
- Refactoring leftovers: _set_default_model, _setup_copilot_reasoning_selection,
  rebuild_lookups, clear_session_context, get_logs_dir, clear_session
- Unused API surface: search_models_dev, get_pricing, skills_categories,
  get_read_files_summary, clear_read_tracker, menu_labels, get_spinner_list
- Dead compatibility wrappers: schedule_cronjob, list_cronjobs, remove_cronjob
- Stale debug helpers: get_debug_session_info copies in 4 tool files
  (centralized version in debug_helpers.py already exists)
- Dead gateway methods: send_emote, send_notice (matrix), send_reaction
  (bluebubbles), _normalize_inbound_text (feishu), fetch_room_history
  (matrix), _start_typing_indicator (signal), parse_feishu_post_content
- Dead constants: NOUS_API_BASE_URL, SKILLS_TOOL_DESCRIPTION,
  FILE_TOOLS, VALID_ASPECT_RATIOS, MEMORY_DIR
- Unused UI code: _interactive_provider_selection,
  _interactive_model_selection (superseded by prompt_toolkit picker)

Test suite verified: 609 tests covering affected files all pass.
Tests for removed functions deleted. Tests using removed utilities
(clear_read_tracker, MEMORY_DIR) updated to use internal APIs directly.
2026-04-13 16:32:04 -07:00
Teknium
0dd26c9495
fix(tests): fix 78 CI test failures and remove dead test (#9036)
Production fixes:
- voice_mode.py: add is_recording property to AudioRecorder (parity with TermuxAudioRecorder)
- cronjob_tools.py: add sms example to deliver description

Test fixes:
- test_real_interrupt_subagent: add missing _execution_thread_id (fixes 19 cascading failures from leaked _build_system_prompt patch)
- test_anthropic_error_handling: add _FakeMessages, override _interruptible_streaming_api_call (6 fixes)
- test_ctx_halving_fix: add missing request_overrides attribute (4 fixes)
- test_context_token_tracking: set _disable_streaming=True for non-streaming test path (4 fixes)
- test_dict_tool_call_args: set _disable_streaming=True (1 fix)
- test_provider_parity: add model='gpt-4o' for AIGateway tests to meet 64K minimum context (4 fixes)
- test_session_race_guard: add user_id to SessionSource (5 fixes)
- test_restart_drain/helpers: add user_id to SessionSource (2 fixes)
- test_telegram_photo_interrupts: add user_id to SessionSource
- test_interrupt: target thread_id for per-thread interrupt system (2 fixes)
- test_zombie_process_cleanup: rewrite with object.__new__ for refactored GatewayRunner.stop() (1 fix)
- test_browser_camofox_state: update config version 15->17 (1 fix)
- test_trajectory_compressor_async: widen lookback window 10->20 for line-shifted AsyncOpenAI (1 fix)
- test_voice_mode: fixed by production is_recording addition (5 fixes)
- test_voice_cli_integration: add _attached_images to CLI stub (2 fixes)
- test_hermes_logging: explicit propagation/level reset for cross-test pollution defense (1 fix)
- test_run_agent: add base_url for OpenRouter detection tests (2 fixes)

Deleted:
- test_inline_think_blocks_reasoning_only_accepted: tested unimplemented inline <think> handling
2026-04-13 10:50:24 -07:00
konsisumer
311dac1971 fix(file_tools): block /private/etc writes on macOS symlink bypass
On macOS, /etc is a symlink to /private/etc, so os.path.realpath()
resolves /etc/hosts to /private/etc/hosts. The sensitive path check
only matched /etc/ prefixes against the resolved path, allowing
writes to system files on macOS.

- Add /private/etc/ and /private/var/ to _SENSITIVE_PATH_PREFIXES
- Check both realpath-resolved and normpath-normalized paths
- Add regression tests for macOS symlink bypass

Closes #8734
Co-authored-by: ElhamDevelopmentStudio (PR #8829)
2026-04-13 05:15:05 -07:00
Al Sayed Hoota
a5bc698b9a fix(session_search): improve truncation to center on actual query matches
Three-tier match strategy for _truncate_around_matches():
1. Full-phrase search (exact query string positions)
2. Proximity co-occurrence (all terms within 200 chars)
3. Individual terms (fallback, preserves existing behavior)

Sliding window picks the start offset covering the most matches.

Moved inline import re to module level.

Co-authored-by: Al Sayed Hoota <78100282+AlsayedHoota@users.noreply.github.com>
2026-04-13 04:54:42 -07:00
Teknium
8dfee98d06 fix: clean up description escaping, add string-data tests
Follow-up for cherry-picked PR #8918.
2026-04-13 04:45:07 -07:00
dippwho
bca22f3090 fix(homeassistant): #8912 resolve XML tool calling loop by casting nested object to JSON string 2026-04-13 04:45:07 -07:00
Teknium
39b83f3443
fix: remove sandbox language from tool descriptions
The terminal and execute_code tool schemas unconditionally mentioned
'cloud sandboxes' in their descriptions sent to the model. This caused
agents running on local backends to believe they were in a sandboxed
environment, refusing networking tasks and other operations. Worse,
agents sometimes saved this false belief to persistent memory, making
it persist across sessions.

Reported by multiple users (XLion, 林泽).
2026-04-13 04:23:27 -07:00
Dusk1e
c052cf0eea fix(security): validate domain/service params in ha_call_service to prevent path traversal 2026-04-12 22:26:15 -07:00
Teknium
9e992df8ae
fix(telegram): use UTF-16 code units for message length splitting (#8725)
Port from nearai/ironclaw#2304: Telegram's 4096 character limit is
measured in UTF-16 code units, not Unicode codepoints. Characters
outside the Basic Multilingual Plane (emoji like 😀, CJK Extension B,
musical symbols) are surrogate pairs: 1 Python char but 2 UTF-16 units.

Previously, truncate_message() used Python's len() which counts
codepoints. This could produce chunks exceeding Telegram's actual limit
when messages contain many astral-plane characters.

Changes:
- Add utf16_len() helper and _prefix_within_utf16_limit() for
  UTF-16-aware string measurement and truncation
- Add _custom_unit_to_cp() binary-search helper that maps a custom-unit
  budget to the largest safe codepoint slice position
- Update truncate_message() to accept optional len_fn parameter
- Telegram adapter now passes len_fn=utf16_len when splitting messages
- Fix fallback truncation in Telegram error handler to use
  _prefix_within_utf16_limit instead of codepoint slicing
- Update send_message_tool.py to use utf16_len for Telegram platform
- Add comprehensive tests: utf16_len, _prefix_within_utf16_limit,
  truncate_message with len_fn (emoji splitting, content preservation,
  code block handling)
- Update mock lambdas in reply_mode tests to accept **kw for len_fn
2026-04-12 19:06:20 -07:00
0xbyt4
8ec0656f53 feat(tts): add speed support for Edge TTS and OpenAI TTS
Read tts.speed (global) or tts.<provider>.speed (provider-specific) from
config. Provider-specific takes precedence over global.

- Edge TTS: converts speed float to SSML prosody rate string
- OpenAI TTS: passes speed param clamped to 0.25-4.0
- MiniMax: wired into global tts.speed fallback for consistency

Co-authored-by: 0xbyt4 <0xbyt4@users.noreply.github.com>
2026-04-12 16:46:18 -07:00
Teknium
76019320fb feat(skills): centralized skills index — eliminate GitHub API calls for search/install
Add a CI-built skills index served from the docs site. The index is
crawled daily by GitHub Actions, resolves all GitHub paths upfront, and
is cached locally by the client. When the index is available:

- Search uses the cached index (0 GitHub API calls, was 23+)
- Install uses resolved paths from index (6 API calls for file
  downloads only, was 31-45 for discovery + downloads)

Total: 68 → 6 GitHub API calls for a typical search + install flow.
Unauthenticated users (60 req/hr) can now search and install without
hitting rate limits.

Components:
- scripts/build_skills_index.py: Crawl all sources (skills.sh, GitHub
  taps, official, clawhub, lobehub), batch-resolve GitHub paths via
  tree API, output JSON index
- tools/skills_hub.py: HermesIndexSource class — search/fetch/inspect
  backed by the index, with lazy GitHubSource for file downloads
- parallel_search_sources() skips external API sources when index is
  available (0 GitHub calls for search)
- .github/workflows/skills-index.yml: twice-daily CI build + deploy
- .github/workflows/deploy-site.yml: also builds index during docs deploy

Graceful degradation: when the index is unavailable (first run, network
down, stale), all methods return empty/None and downstream sources
handle the request via direct API as before.
2026-04-12 16:39:04 -07:00
Teknium
7e0e5ea03b fix(skills): cache GitHub repo trees to avoid rate-limit exhaustion on install
Skills.sh installs hit the GitHub API 45 times per install because the
same repo tree was fetched 6 times redundantly. Combined with search
(23 API calls), this totals 68 — exceeding the unauthenticated rate
limit of 60 req/hr, causing 'Could not fetch' errors for users without
a GITHUB_TOKEN.

Changes:
- Add _get_repo_tree() cache to GitHubSource — repo info + recursive
  tree fetched once per repo per source instance, eliminating 10
  redundant API calls (6 tree + 4 candidate 404s)
- _download_directory_via_tree returns {} (not None) when cached tree
  shows path doesn't exist, skipping unnecessary Contents API fallback
- _check_rate_limit_response() detects exhausted quota and sets
  is_rate_limited flag
- do_install() shows actionable hint when rate limited: set
  GITHUB_TOKEN or install gh CLI

Before: 45 API calls per install (68 total with search)
After:  31 API calls per install (54 total with search — under 60/hr)

Reported by community user from Vietnam (no GitHub auth configured).
2026-04-12 16:39:04 -07:00
alt-glitch
5e1197a42e fix(gateway): harden Docker/container gateway pathway
Centralize container detection in hermes_constants.is_container() with
process-lifetime caching, matching existing is_wsl()/is_termux() patterns.
Dedup _is_inside_container() in config.py to delegate to the new function.

Add _run_systemctl() wrapper that converts FileNotFoundError to RuntimeError
for defense-in-depth — all 10 bare subprocess.run(_systemctl_cmd(...)) call
sites now route through it.

Make supports_systemd_services() return False in containers and when
systemctl binary is absent (shutil.which check).

Add Docker-specific guidance in gateway_command() for install/uninstall/start
subcommands — exit 0 with helpful instructions instead of crashing.

Make 'hermes status' show 'Manager: docker (foreground)' and 'hermes dump'
show 'running (docker, pid N)' inside containers.

Fix setup_gateway() to use supports_systemd instead of _is_linux for all
systemd-related branches, and show Docker restart policy instructions in
containers.

Replace inline /.dockerenv check in voice_mode.py with is_container().

Fixes #7420

Co-authored-by: teknium1 <teknium1@users.noreply.github.com>
2026-04-12 16:36:11 -07:00
Brooklyn Nicholson
2aea75e91e Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-12 13:18:55 -05:00
Teknium
c52f6348b6
fix: list all available toolsets in delegate_task schema description (#8231)
* fix: list all available toolsets in delegate_task schema description

The delegate_task tool's toolsets parameter description only mentioned
'terminal', 'file', and 'web' as examples. Models (especially smaller
ones like Gemma) would substitute 'web' for 'browser' because they
didn't know 'browser' was a valid option.

Now dynamically builds the toolset list from the TOOLSETS dict at import
time, excluding blocked, composite, and platform-specific toolsets.
Auto-updates when new toolsets are added.

Reported by jeffutter on Discord.

* chore: exclude moa and rl from delegate_task toolset list
2026-04-12 00:54:35 -07:00
Teknium
f53a5a7fe1
fix: suppress duplicate completion notifications when agent already consumed output via wait/poll/log (#8228)
When the agent calls process(action='wait') or process(action='poll')
and gets the exited status, the completion_queue notification is
redundant — the agent already has the output from the tool return.
Previously, the drain loops in CLI and gateway would still inject
the [SYSTEM: Background process completed] message, causing the
agent to receive the same information twice.

Fix: track session IDs in _completion_consumed set when wait/poll/log
returns an exited process. Drain loops in cli.py and gateway watcher
skip completion events for consumed sessions. Watch pattern events
are never suppressed (they have independent semantics).

Adds 4 tests covering wait/poll/log marking and running-process
negative case.
2026-04-12 00:36:22 -07:00
Teknium
8e00b3a69e
fix(cron): steer model away from explicit deliver targets that lose topic context (#8187)
Rewrite the cronjob tool's 'deliver' parameter description to strongly
guide models toward omitting the parameter (which auto-detects origin
including thread/topic). The previous description listed all platform
names equally, inviting models to construct explicit targets like
'telegram:<chat_id>' which silently drops the thread_id.

New description:
- Leads with 'Omit this parameter' as the recommended path
- Explicitly warns that platform:chat_id without :thread_id loses topics
- Removes the long flat list of platform names that invited construction

Also adds diagnostic logging at two key points:
- _origin_from_env(): logs when thread_id is captured during job creation
- _deliver_result(): warns when origin has thread_id but delivery target
  lost it; logs at debug when delivering to a specific thread

Helps diagnose user-reported issue where cron responses from Telegram
topics are delivered to the main chat instead of the originating topic.
2026-04-11 23:20:39 -07:00
Siddharth Balyan
27eeea0555
perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive (#8014)
* perf(ssh,modal): bulk file sync via tar pipe and tar/base64 archive

SSH: symlink-staging + tar -ch piped over SSH in a single TCP stream.
Eliminates per-file scp round-trips. Handles timeout (kills both
processes), SSH Popen failure (kills tar), and tar create failure.

Modal: in-memory gzipped tar archive, base64-encoded, decoded+extracted
in one exec call. Checks exit code and raises on failure.

Both backends use shared helpers extracted into file_sync.py:
- quoted_mkdir_command() — mirrors existing quoted_rm_command()
- unique_parent_dirs() — deduplicates parent dirs from file pairs

Migrates _ensure_remote_dirs to use the new helpers.

28 new tests (21 SSH + 7 Modal), all passing.

Closes #7465
Closes #7467

* fix(modal): pipe stdin to avoid ARG_MAX, clean up review findings

- Modal bulk upload: stream base64 payload through proc.stdin in 1MB
  chunks instead of embedding in command string (Modal SDK enforces
  64KB ARG_MAX_BYTES — typical payloads are ~4.3MB)
- Modal single-file upload: same stdin fix, add exit code checking
- Remove what-narrating comments in ssh.py and modal.py (keep WHY
  comments: symlink staging rationale, SIGPIPE, deadlock avoidance)
- Remove unnecessary `sandbox = self._sandbox` alias in modal bulk
- Daytona: use shared helpers (unique_parent_dirs, quoted_mkdir_command)
  instead of inlined duplicates

---------

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
2026-04-12 06:18:05 +05:30
Teknium
14ccd32cee
refactor(terminal): remove check_interval parameter (#8001)
The check_interval parameter on terminal_tool sent periodic output
updates to the gateway chat, but these were display-only — the agent
couldn't see or act on them. This added schema bloat and introduced
a bug where notify_on_complete=True was silently dropped when
check_interval was also set (the not-check_interval guard skipped
fast-watcher registration, and the check_interval watcher dict
was missing the notify_on_complete key).

Removing check_interval entirely:
- Eliminates the notify_on_complete interaction bug
- Reduces tool schema size (one fewer parameter for the model)
- Simplifies the watcher registration path
- notify_on_complete (agent wake-on-completion) still works
- watch_patterns (output alerting) still works
- process(action='poll') covers manual status checking

Closes #7947 (root cause eliminated rather than patched).
2026-04-11 17:16:11 -07:00
WAXLYY
6d272ba477 fix(tools): enforce ID uniqueness in TODO store during replace operations
Deduplicate todo items by ID before writing to the store, keeping the
last occurrence. Prevents ghost entries when the model sends duplicate
IDs in a single write() call, which corrupts subsequent merge operations.

Co-authored-by: WAXLYY <WAXLYY@users.noreply.github.com>
2026-04-11 16:22:50 -07:00
Teknium
c8aff74632
fix: prevent agent from stopping mid-task — compression floor, budget overhaul, activity tracking
Three root causes of the 'agent stops mid-task' gateway bug:

1. Compression threshold floor (64K tokens minimum)
   - The 50% threshold on a 100K-context model fired at 50K tokens,
     causing premature compression that made models lose track of
     multi-step plans.  Now threshold_tokens = max(50% * context, 64K).
   - Models with <64K context are rejected at startup with a clear error.

2. Budget warning removal — grace call instead
   - Removed the 70%/90% iteration budget warnings entirely.  These
     injected '[BUDGET WARNING: Provide your final response NOW]' into
     tool results, causing models to abandon complex tasks prematurely.
   - Now: no warnings during normal execution.  When the budget is
     actually exhausted (90/90), inject a user message asking the model
     to summarise, allow one grace API call, and only then fall back
     to _handle_max_iterations.

3. Activity touches during long terminal execution
   - _wait_for_process polls every 0.2s but never reported activity.
     The gateway's inactivity timeout (default 1800s) would fire during
     long-running commands that appeared 'idle.'
   - Now: thread-local activity callback fires every 10s during the
     poll loop, keeping the gateway's activity tracker alive.
   - Agent wires _touch_activity into the callback before each tool call.

Also: docs update noting 64K minimum context requirement.

Closes #7915 (root cause was agent-loop termination, not Weixin delivery limits).
2026-04-11 16:18:57 -07:00
0xbyt4
32519066dc fix(gateway): add HERMES_SESSION_KEY to session_context contextvars
Complete the contextvars migration by adding HERMES_SESSION_KEY to the
unified _VAR_MAP in session_context.py. Without this, concurrent gateway
handlers race on os.environ["HERMES_SESSION_KEY"].

- Add _SESSION_KEY ContextVar to _VAR_MAP, set_session_vars(), clear_session_vars()
- Wire session_key through _set_session_env() from SessionContext
- Replace os.getenv fallback in tools/approval.py with get_session_env()
  (function-level import to avoid cross-layer coupling)
- Keep os.environ set as CLI/cron fallback

Cherry-picked from PR #7878 by 0xbyt4.
2026-04-11 15:35:04 -07:00
chqchshj
5f0caf54d6 feat(gateway): add WeCom callback-mode adapter for self-built apps
Add a second WeCom integration mode for regular enterprise self-built
applications.  Unlike the existing bot/websocket adapter (wecom.py),
this handles WeCom's standard callback flow: WeCom POSTs encrypted XML
to an HTTP endpoint, the adapter decrypts, queues for the agent, and
immediately acknowledges.  The agent's reply is delivered proactively
via the message/send API.

Key design choice: always acknowledge immediately and use proactive
send — agent sessions take 3-30 minutes, so the 5-second inline reply
window is never useful.  The original PR's Future/pending-reply
machinery was removed in favour of this simpler architecture.

Features:
- AES-CBC encrypt/decrypt (BizMsgCrypt-compatible)
- Multi-app routing scoped by corp_id:user_id
- Legacy bare user_id fallback for backward compat
- Access-token management with auto-refresh
- WECOM_CALLBACK_* env var overrides
- Port-in-use pre-check before binding
- Health endpoint at /health

Salvaged from PR #7774 by @chqchshj.  Simplified by removing the
inline reply Future system and fixing: secrets.choice for nonce
generation, immediate plain-text acknowledgment (not encrypted XML
containing 'success'), and initial token refresh error handling.
2026-04-11 15:22:49 -07:00
Brooklyn Nicholson
ec553fdb49 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 17:15:41 -05:00
faishal
90352b2adf fix: normalize checkpoint manager home-relative paths
Adds _normalize_path() helper that calls expanduser().resolve() to
properly handle tilde paths (e.g. ~/.hermes, ~/.config).  Previously
Path.resolve() alone treated ~ as a literal directory name, producing
invalid paths like /root/~/.hermes.

Also improves _run_git() error handling to distinguish missing working
directories from missing git executable, and adds pre-flight directory
validation.

Cherry-picked from PR #7898 by faishal882.
Fixes #7807
2026-04-11 14:50:44 -07:00
Teknium
f2893fe51a
fix(tools): neutralize shell injection in _write_to_sandbox via path quoting (#7940)
_write_to_sandbox interpolated storage_dir and remote_path directly into
a shell command passed to env.execute(). Paths containing shell
metacharacters (spaces, semicolons, $(), backticks) could trigger
arbitrary command execution inside the sandbox.

Fix: wrap both paths with shlex.quote(). Clean paths (alphanumeric +
slashes/hyphens/dots) are left unmodified by shlex.quote, so existing
behavior is unchanged. Paths with unsafe characters get single-quoted.

Tests added for spaces, $(command) substitution, and semicolon injection.
2026-04-11 14:26:11 -07:00
Dusk1e
255f59de18 fix(tools): prevent command argument injection and path traversal in checkpoint manager
This commit addresses a security vulnerability where unsanitized user inputs for commit_hash and file_path were passed directly to git commands in CheckpointManager.restore() and diff(). It validates commit hashes to be strictly hexadecimal characters without leading dashes (preventing flag injection like '--patch') and enforces file paths to stay within the working directory via root resolution. Regression tests test_restore_rejects_argument_injection, test_restore_rejects_invalid_hex_chars, and test_restore_rejects_path_traversal were added.
2026-04-11 14:25:57 -07:00
Teknium
dfc820345d
fix: scope tool interrupt signal per-thread to prevent cross-session leaks (#7930)
The interrupt mechanism in tools/interrupt.py used a process-global
threading.Event. In the gateway, multiple agents run concurrently in
the same process via run_in_executor. When any agent was interrupted
(user sends a follow-up message), the global flag killed ALL agents'
running tools — terminal commands, browser ops, web requests — across
all sessions.

Changes:
- tools/interrupt.py: Replace single threading.Event with a set of
  interrupted thread IDs. set_interrupt() targets a specific thread;
  is_interrupted() checks the current thread. Includes a backward-
  compatible _ThreadAwareEventProxy for legacy _interrupt_event usage.
- run_agent.py: Store execution thread ID at start of run_conversation().
  interrupt() and clear_interrupt() pass it to set_interrupt() so only
  this agent's thread is affected.
- tools/code_execution_tool.py: Use is_interrupted() instead of
  directly checking _interrupt_event.is_set().
- tools/process_registry.py: Same — use is_interrupted().
- tests: Update interrupt tests for per-thread semantics. Add new
  TestPerThreadInterruptIsolation with two tests verifying cross-thread
  isolation.
2026-04-11 14:02:58 -07:00
Teknium
75380de430
fix: reap orphaned browser sessions on startup (#7931)
When a Python process exits uncleanly (SIGKILL, crash, gateway restart
via hermes update), in-memory _active_sessions tracking is lost but the
agent-browser node daemons and their Chromium child processes keep
running indefinitely. On a long-running system this causes unbounded
memory growth — 24 orphaned sessions consumed 7.6 GB on a production
machine over 9 days.

Add _reap_orphaned_browser_sessions() which scans the tmp directory for
agent-browser-{h_*,cdp_*} socket dirs on cleanup thread startup.  For
each dir not tracked by the current process, reads the daemon PID file
and sends SIGTERM if the daemon is still alive.  Handles edge cases:
dead PIDs, corrupt PID files, permission errors, foreign processes.

The reaper runs once on thread startup (not every 30s) to avoid races
with sessions being actively created by concurrent agents.
2026-04-11 14:02:46 -07:00
Teknium
04c1c5d53f
refactor: extract shared helpers to deduplicate repeated code patterns (#7917)
* refactor: add shared helper modules for code deduplication

New modules:
- gateway/platforms/helpers.py: MessageDeduplicator, TextBatchAggregator,
  strip_markdown, ThreadParticipationTracker, redact_phone
- hermes_cli/cli_output.py: print_info/success/warning/error, prompt helpers
- tools/path_security.py: validate_within_dir, has_traversal_component
- utils.py additions: safe_json_loads, read_json_file, read_jsonl,
  append_jsonl, env_str/lower/int/bool helpers
- hermes_constants.py additions: get_config_path, get_skills_dir,
  get_logs_dir, get_env_path

* refactor: migrate gateway adapters to shared helpers

- MessageDeduplicator: discord, slack, dingtalk, wecom, weixin, mattermost
- strip_markdown: bluebubbles, feishu, sms
- redact_phone: sms, signal
- ThreadParticipationTracker: discord, matrix
- _acquire/_release_platform_lock: telegram, discord, slack, whatsapp,
  signal, weixin

Net -316 lines across 19 files.

* refactor: migrate CLI modules to shared helpers

- tools_config.py: use cli_output print/prompt + curses_radiolist (-117 lines)
- setup.py: use cli_output print helpers + curses_radiolist (-101 lines)
- mcp_config.py: use cli_output prompt (-15 lines)
- memory_setup.py: use curses_radiolist (-86 lines)

Net -263 lines across 5 files.

* refactor: migrate to shared utility helpers

- safe_json_loads: agent/display.py (4 sites)
- get_config_path: skill_utils.py, hermes_logging.py, hermes_time.py
- get_skills_dir: skill_utils.py, prompt_builder.py
- Token estimation dedup: skills_tool.py imports from model_metadata
- Path security: skills_tool, cronjob_tools, skill_manager_tool, credential_files
- Non-atomic YAML writes: doctor.py, config.py now use atomic_yaml_write
- Platform dict: new platforms.py, skills_config + tools_config derive from it
- Anthropic key: new get_anthropic_key() in auth.py, used by doctor/status/config/main

* test: update tests for shared helper migrations

- test_dingtalk: use _dedup.is_duplicate() instead of _is_duplicate()
- test_mattermost: use _dedup instead of _seen_posts/_prune_seen
- test_signal: import redact_phone from helpers instead of signal
- test_discord_connect: _platform_lock_identity instead of _token_lock_identity
- test_telegram_conflict: updated lock error message format
- test_skill_manager_tool: 'escapes' instead of 'boundary' in error msgs
2026-04-11 13:59:52 -07:00
Teknium
cac6178104 fix(gateway): propagate user identity through process watcher pipeline
Background process watchers (notify_on_complete, check_interval) created
synthetic SessionSource objects without user_id/user_name. While the
internal=True bypass (1d8d4f28) prevented false pairing for agent-
generated notifications, the missing identity caused:

- Garbage entries in pairing rate limiters (discord:None, telegram:None)
- 'User None' in approval messages and logs
- No user identity available for future code paths that need it

Additionally, platform messages arriving without from_user (Telegram
service messages, channel forwards, anonymous admin actions) could still
trigger false pairing because they are not internal events.

Fix:
1. Propagate user_id/user_name through the full watcher chain:
   session_context.py → gateway/run.py → terminal_tool.py →
   process_registry.py (including checkpoint persistence/recovery)

2. Add None user_id guard in _handle_message() — silently drop
   non-internal messages with no user identity instead of triggering
   the pairing flow.

Salvaged from PRs #7664 (kagura-agent, ContextVar approach),
#6540 (MestreY0d4-Uninter, tests), and #7709 (guang384, None guard).

Closes #6341, #6485, #7643
Relates to #6516, #7392
2026-04-11 13:46:16 -07:00
Brooklyn Nicholson
9ccb490cf3 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 15:30:23 -05:00
0xbyt4
3ec8809b78 fix(vision): preserve aspect ratio during auto-resize
Independent halving of width and height caused aspect ratio distortion
for extreme dimensions (e.g. 8000x200 panoramas). When one axis hit the
64px floor, the other kept shrinking — collapsing the ratio toward 1:1.

Use proportional scaling instead: when either dimension hits the floor,
derive the effective scale factor and apply it to both axes.

Add tests for extreme panorama (8000x200) and tall narrow (200x6000)
images to verify aspect ratio preservation.
2026-04-11 11:53:04 -07:00
Brooklyn Nicholson
bf6af95ff5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-11 13:14:36 -05:00
kshitijk4poor
50bb4fe010 fix(vision): auto-resize oversized images, increase default timeout, fix vision capability detection
Cherry-picked from PR #7749 by kshitijk4poor with modifications:

- Raise hard image limit from 5 MB to 20 MB (matches most restrictive provider)
- Send images at full resolution first; only auto-resize to 5 MB on API failure
- Add _is_image_size_error() helper to detect size-related API rejections
- Auto-resize uses Pillow (soft dep) with progressive downscale + JPEG quality reduction
- Fix get_model_capabilities() to check modalities.input for vision support
- Increase default vision timeout from 30s to 120s (matches hardcoded fallback intent)
- Applied retry-with-resize to both vision_analyze_tool and browser_vision

Closes #7740
2026-04-11 11:12:50 -07:00
Brooklyn Nicholson
b04248f4d5 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor
# Conflicts:
#	gateway/platforms/base.py
#	gateway/run.py
#	tests/gateway/test_command_bypass_active_session.py
2026-04-11 11:39:47 -05:00
Teknium
f459214010
feat: background process monitoring — watch_patterns for real-time output alerts
* feat: add watch_patterns to background processes for output monitoring

Adds a new 'watch_patterns' parameter to terminal(background=true) that
lets the agent specify strings to watch for in process output. When a
matching line appears, a notification is queued and injected as a
synthetic message — triggering a new agent turn, similar to
notify_on_complete but mid-process.

Implementation:
- ProcessSession gets watch_patterns field + rate-limit state
- _check_watch_patterns() in ProcessRegistry scans new output chunks
  from all three reader threads (local, PTY, env-poller)
- Rate limited: max 8 notifications per 10s window
- Sustained overload (45s) permanently disables watching for that process
- watch_queue alongside completion_queue, same consumption pattern
- CLI drains watch_queue in both idle loop and post-turn drain
- Gateway drains after agent runs via _inject_watch_notification()
- Checkpoint persistence + crash recovery includes watch_patterns
- Blocked in execute_code sandbox (like other bg params)
- 20 new tests covering matching, rate limiting, overload kill,
  checkpoint persistence, schema, and handler passthrough

Usage:
  terminal(
      command='npm run dev',
      background=true,
      watch_patterns=['ERROR', 'WARN', 'listening on port']
  )

* refactor: merge watch_queue into completion_queue

Unified queue with 'type' field distinguishing 'completion',
'watch_match', and 'watch_disabled' events. Extracted
_format_process_notification() in CLI and gateway to handle
all event types in a single drain loop. Removes duplication
across both CLI drain sites and the gateway.
2026-04-11 03:13:23 -07:00
Tranquil-Flow
4e56eacdce fix(vision): reject oversized images before API call, handle file:// URIs, improve 400 errors
Three fixes for vision_analyze returning cryptic 400 "Invalid request data":

1. Pre-flight base64 size check — base64 inflates data ~33%, so a 3.8 MB
   file exceeds the 5 MB API limit. Reject early with a clear message
   instead of letting the provider return a generic 400.

2. Handle file:// URIs — strip the scheme and resolve as a local path.
   Previously file:///path/to/image.png fell through to the "invalid
   image source" error since it matched neither is_file() nor http(s).

3. Separate invalid_request errors from "does not support vision" errors
   so the user gets actionable guidance (resize/compress/retry) instead
   of a misleading "model does not support vision" message.

Closes #6677
2026-04-11 02:03:20 -07:00
aaronagent
1909877e6e fix: cap image download size at 50 MB, validate tool call parser fields
vision_tools.py: _download_image() loads the full HTTP response body into
memory via response.content (line 190) with no Content-Length check and no
max file size limit.  An attacker-hosted multi-gigabyte file causes OOM.
Add a 50 MB hard cap: check Content-Length header before download, and
verify actual body size before writing to disk.

hermes_parser.py: tc_data["name"] at line 57 raises KeyError when the LLM
outputs a tool call JSON without a "name" field.  The outer except catches
it silently, causing the entire tool call to be lost with zero diagnostics.
Add "name" field validation before constructing the ChatCompletionMessage.

mistral_parser.py: tc["name"] at line 101 has the same KeyError issue in
the pre-v11 format path.  The fallback decoder (line 112) already checks
"name" correctly, but the primary path does not.  Add validation to match.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 02:03:20 -07:00
aaronagent
307697688e fix: prevent zombie processes, redact cron stderr, skip symlinks in skill enumeration
process_registry.py: _reader_loop() has process.wait() after the try-except
block (line 380).  If the reader thread crashes with an unexpected exception
(e.g. MemoryError, KeyboardInterrupt), control exits the except handler but
skips wait() — leaving the child as a zombie process.  Move wait() and the
cleanup into a finally block so the child is always reaped.

cron/scheduler.py: _run_job_script() only redacts secrets in stdout on the
SUCCESS path (line 417-421).  When a cron script fails (non-zero exit), both
stdout and stderr are returned WITHOUT redaction (lines 407-413).  A script
that accidentally prints an API key to stderr during a failure would leak it
into the LLM context.  Move redaction before the success/failure branch so
both paths benefit.

skill_commands.py: _build_skill_message() enumerates supporting files using
rglob("*") but only checks is_file() (line 171) without filtering symlinks.
PR #6693 added symlink protection to scan_skill_commands() but missed this
function.  A malicious skill can create symlinks in references/ pointing to
arbitrary files, exposing their paths (and potentially content via skill_view)
to the LLM.  Add is_symlink() check to match the guard in scan_skill_commands.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-11 02:03:20 -07:00
jjovalle99
640441b865 feat(tools): add Voxtral TTS provider (Mistral AI) 2026-04-11 01:56:55 -07:00
konsisumer
b87e0f59cc fix(skills): read name from SKILL.md frontmatter in skills_sync
_discover_bundled_skills() used the directory name to identify skills,
but skills_tool.py and skills_hub.py use the `name:` field from SKILL.md
frontmatter.  This mismatch caused 9 builtin skills whose directory name
differs from their SKILL.md name to be written to .bundled_manifest
under the wrong key, so `hermes skills list` showed them as "local"
instead of "builtin".

Read the frontmatter name field (with directory-name fallback) so the
manifest keys match what the rest of the codebase expects.

Closes #6835
2026-04-11 01:21:20 -07:00
luyao618
fc06a0147e fix(tools): remove dead code in _is_likely_binary and harden _check_lint against brace paths
- Remove unreachable `if not content_sample` branch inside the truthy
  `if content_sample` block in `_is_likely_binary()` (dead code that
  could never execute).
- Replace `linter_cmd.format(file=...)` with `linter_cmd.replace("{file}", ...)`
  in `_check_lint()` so file paths containing curly braces (e.g.
  `src/{test}.py`) no longer raise KeyError/ValueError.
- Add 16 unit tests covering both fixes and edge cases.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 21:16:53 -07:00
hermes-agent-dhabibi
718e8ad6fa feat(delegation): add configurable reasoning_effort for subagents
Add delegation.reasoning_effort config key so subagents can run at a
different thinking level than the parent agent. When set, overrides
the parent's reasoning_config; when empty, inherits as before.

Valid values: xhigh, high, medium, low, minimal, none (disables thinking).

Config path: delegation.reasoning_effort in config.yaml

Files changed:
- tools/delegate_tool.py: resolve override in _build_child_agent
- hermes_cli/config.py: add reasoning_effort to DEFAULT_CONFIG
- tests/tools/test_delegate.py: 4 new tests covering all cases
2026-04-10 21:16:53 -07:00
Hermes Agent
830040f937 fix: remove unused BulkUploadFn import from daytona.py 2026-04-10 21:14:32 -07:00
Hermes Agent
223a0623ee fix(daytona): use logger.warning instead of warnings.warn for disk cap
warnings.warn() is suppressed/invisible when running as a gateway
or agent. Switch to logger.warning() so the disk cap message
actually appears in logs.

Fixes #7362 (item 3).
2026-04-10 21:14:32 -07:00
Hermes Agent
bff64858f9 perf(daytona): bulk upload files in single HTTP call
FileSyncManager now accepts an optional bulk_upload_fn callback.
When provided, all changed files are uploaded in one call instead
of iterating one-by-one with individual HTTP POSTs.

DaytonaEnvironment wires this to sandbox.fs.upload_files() which
batches everything into a single multipart POST — ~580 files goes
from ~5 min to <2s on init.

Parent directories are pre-created in one mkdir -p call.

Fixes #7362 (item 1).
2026-04-10 21:14:32 -07:00
0xFrank-eth
e8034e2f6a fix(gateway): replace os.environ session state with contextvars for concurrency safety
When two gateway messages arrived concurrently, _set_session_env wrote
HERMES_SESSION_PLATFORM/CHAT_ID/CHAT_NAME/THREAD_ID into the process-global
os.environ. Because asyncio tasks share the same process, Message B would
overwrite Message A's values mid-flight, causing background-task notifications
and tool calls to route to the wrong thread/chat.

Replace os.environ with Python's contextvars.ContextVar. Each asyncio task
(and any run_in_executor thread it spawns) gets its own copy, so concurrent
messages never interfere.

Changes:
- New gateway/session_context.py with ContextVar definitions, set/clear/get
  helpers, and os.environ fallback for CLI/cron/test backward compatibility
- gateway/run.py: _set_session_env returns reset tokens, _clear_session_env
  accepts them for proper cleanup in finally blocks
- All tool consumers updated: cronjob_tools, send_message_tool, skills_tool,
  terminal_tool (both notify_on_complete AND check_interval blocks), tts_tool,
  agent/skill_utils, agent/prompt_builder
- Tests updated for new contextvar-based API

Fixes #7358

Co-authored-by: teknium1 <127238744+teknium1@users.noreply.github.com>
2026-04-10 17:04:38 -07:00
pefontana
672cc80915 fix(delegate): close child agent after delegation completes
Call child.close() in the _run_single_child finally block after
unregistering the child from the parent's active children list.

Previously child AIAgent instances were only removed from the tracking
list but never had their resources released — the OpenAI/httpx client
and any tool subprocesses relied entirely on garbage collection.

Ref: #7131
2026-04-10 16:51:44 -07:00
KUSH42
0e939af7c2 fix(patch): harden V4A patch parser and fuzzy match — 9 correctness bugs
- Bug 1: replace read_file(limit=10000) with read_file_raw in _apply_update,
  preventing silent truncation of files >2000 lines and corruption of lines
  >2000 chars; add read_file_raw to FileOperations abstract interface and
  ShellFileOperations

- Bug 2: split apply_v4a_operations into validate-then-apply phases; if any
  hunk fails validation, zero writes occur (was: continue after failure,
  leaving filesystem partially modified)

- Bug 3: parse_v4a_patch now returns an error for begin-marker-with-no-ops,
  empty file paths, and moves missing a destination (was: always returned
  error=None)

- Bug 4: raise strategy 7 (block anchor) single-candidate similarity threshold
  from 0.10 to 0.50, eliminating false-positive matches in repetitive code

- Bug 5: add _strategy_unicode_normalized (new strategy 7) with position
  mapping via _build_orig_to_norm_map; smart quotes and em-dashes in
  LLM-generated patches now match via strategies 1-6 before falling through
  to fuzzy strategies

- Bug 6: extend fuzzy_find_and_replace to return 4-tuple (content, count,
  error, strategy); update all 5 call sites across patch_parser.py,
  file_operations.py, and skill_manager_tool.py

- Bug 7: guard in _apply_update returns error when addition-only context hint
  is ambiguous (>1 occurrences); validation phase errors on both 0 and >1

- Bug 8: _apply_delete returns error (not silent success) on missing file

- Bug 9: _validate_operations checks source existence and destination absence
  for MOVE operations before any write occurs
2026-04-10 16:47:44 -07:00
coffee
c1f832a610 fix(tools): guard against ValueError on int() env var and header parsing
Three locations perform `int()` conversion on environment variables or
HTTP headers without error handling, causing unhandled `ValueError` crashes
when the values are non-numeric:

1. `send_message_tool.py` — `EMAIL_SMTP_PORT` env var parsed outside the
   try/except block; a non-numeric value crashes `_send_email()` instead
   of returning a user-friendly error.

2. `process_registry.py` — `TERMINAL_TIMEOUT` env var parsed without
   protection; a non-numeric value crashes the `wait()` method.

3. `skills_hub.py` — HTTP `Retry-After` header can contain date strings
   per RFC 7231; `int()` conversion crashes on non-numeric values.

All three now fall back to their default values on `ValueError`/`TypeError`.
2026-04-10 16:47:44 -07:00
Awsh1
6f63ba9c8f fix(mcp): fall back when SIGKILL is unavailable 2026-04-10 16:47:44 -07:00
angelos
8254b820ec fix(docker): --init for zombie reaping + sleep infinity for idle-based lifetime
Two issues with sandbox container spawning:

1. PID 1 was `sleep 2h` which doesn't call wait() — every background
   process that exited became a zombie (<defunct>), and the process
   tool reported them as "running" because zombie PIDs still exist in
   the process table. Fix: add --init to docker run, which uses
   tini (Docker) or catatonit (Podman) as PID 1 to reap children
   automatically. Both runtimes support --init natively.

2. The fixed 2-hour lifetime was arbitrary and sometimes too short
   for long agent sessions. Fix: replace 'sleep 2h' with
   'sleep infinity'. The idle reaper (_cleanup_inactive_envs, gated
   by terminal.lifetime_seconds, default 300s) already handles
   cleanup based on last activity timestamp — there's no need for
   the container itself to have a fixed death timer.

Fixes #6908.
2026-04-10 15:42:30 -07:00
angelos
7ccdb74364 fix(delegate): make max_concurrent_children configurable + error on excess
`delegate_task` silently truncated batch tasks to 3 — the model sends
5 tasks, gets results for 3, never told 2 were dropped. Now returns a
clear tool_error explaining the limit and how to fix it.

The limit is configurable via:
  - delegation.max_concurrent_children in config.yaml (priority 1)
  - DELEGATION_MAX_CONCURRENT_CHILDREN env var (priority 2)
  - default: 3

Uses the same _load_config() path as the rest of delegate_task for
consistent config priority. Clamps to min 1, warns on non-integer
config values.

Also removes the hardcoded maxItems: 3 from the JSON schema — the
schema was blocking the model from even attempting >3 tasks before
the runtime check could fire. The runtime check gives a much more
actionable error message.

Backwards compatible: default remains 3, existing configs unchanged.
2026-04-10 13:38:14 -07:00
Teknium
4fb42d0193
fix: per-profile subprocess HOME isolation (#4426) (#7357)
Isolate system tool configs (git, ssh, gh, npm) per profile by injecting
a per-profile HOME into subprocess environments only.  The Python
process's own os.environ['HOME'] and Path.home() are never modified,
preserving all existing profile infrastructure.

Activation is directory-based: when {HERMES_HOME}/home/ exists on disk,
subprocesses see it as HOME.  The directory is created automatically for:
- Docker: entrypoint.sh bootstraps it inside the persistent volume
- Named profiles: added to _PROFILE_DIRS in profiles.py

Injection points (all three subprocess env builders):
- tools/environments/local.py _make_run_env() — foreground terminal
- tools/environments/local.py _sanitize_subprocess_env() — background procs
- tools/code_execution_tool.py child_env — execute_code sandbox

Single source of truth: hermes_constants.get_subprocess_home()

Closes #4426
2026-04-10 13:37:45 -07:00
kshitijk4poor
37a1c75716 fix(browser): hardening — dead code, caching, scroll perf, security, thread safety
Salvaged from PR #7276 (hardening-only subset; excluded 6 new tools
and unrelated scope additions from the contributor's commit).

- Remove dead DEFAULT_SESSION_TIMEOUT and unregistered browser_close schema
- Fix _camofox_eval wrong call signatures (_ensure_tab, _post args)
- Cache _find_agent_browser, _get_command_timeout, _discover_homebrew_node_dirs
- Replace 5x subprocess scroll loop with single pixel-arg call
- URL-decode before secret exfiltration check (bypass prevention)
- Protect _recording_sessions with _cleanup_lock (thread safety)
- Return failure on empty stdout instead of silent success
- Structure-aware _truncate_snapshot (cut at line boundaries)

Follow-up improvements over contributor's original:
- Move _EMPTY_OK_COMMANDS to module-level frozenset (avoid per-call allocation)
- Fix list+tuple concat in _run_browser_command PATH construction
- Update test_browser_homebrew_paths.py for tuple returns and cache fixtures

Co-authored-by: kshitijk4poor <82637225+kshitijk4poor@users.noreply.github.com>
Closes #7168, closes #7171, closes #7172, closes #7173
2026-04-10 13:05:44 -07:00
Teknium
7e28b7b5d5
fix: parallelize skills browse/search to prevent hanging (#7301)
hermes skills browse ran all 7 source adapters serially with no overall
timeout and no progress indicator. On a cold cache, GitHubSource alone
could make 100+ sequential HTTP calls (directory listing + inspect per
skill per tap), taking 5+ minutes with no output — appearing to hang.

Changes:
- Add parallel_search_sources() in tools/skills_hub.py that runs all
  source adapters concurrently via ThreadPoolExecutor with a 30s
  overall timeout. Sources that finish in time contribute results;
  slow ones are skipped gracefully with a visible notice.
- Update unified_search() to use parallel_search_sources() internally.
- Update do_browse() and do_search() in hermes_cli/skills_hub.py to
  show a Rich spinner while fetching, so the user sees activity.
- Bump per-source limits (clawhub 50→500, lobehub 50→500, etc.) now
  that fetching is parallel — yields far more results per browse.
- Report timed-out sources and suggest re-running for cached results.
- Replace 'inspect/install' footer with 'search deeper' tip.

Worst-case latency drops from 5+ minutes (serial) to ~30s (parallel
with timeout cap). Result count should jump from ~242 to 1000+.
2026-04-10 12:54:18 -07:00
Teknium
a093eb47f7
fix: propagate child activity to parent during delegate_task (#7295)
When delegate_task runs, the parent agent's activity tracker freezes
because child.run_conversation() blocks and the child's own
_touch_activity() never propagates back to the parent. The gateway
inactivity timeout then fires a spurious 'No activity' warning and
eventually kills the agent, even though the subagent is actively working.

Fix: add a heartbeat thread in _run_single_child that calls
parent._touch_activity() every 30 seconds with detail from the child's
activity summary (current tool, iteration count). The thread is a daemon
that starts before child.run_conversation() and is cleaned up in the
finally block.

This also improves the gateway 'Still working...' status messages —
instead of just 'running: delegate_task', users now see what the
subagent is actually doing (e.g., 'delegate_task: subagent running
terminal (iteration 5/50)').
2026-04-10 12:51:30 -07:00
Teknium
be4f049f46 fix: salvage follow-ups for Weixin adapter (#6747)
- Remove sys.path.insert hack (leftover from standalone dev)
- Add token lock (acquire_scoped_lock/release_scoped_lock) in
  connect()/disconnect() to prevent duplicate pollers across profiles
- Fix get_connected_platforms: WEIXIN check must precede generic
  token/api_key check (requires both token AND account_id)
- Add WEIXIN_HOME_CHANNEL_NAME to _EXTRA_ENV_KEYS
- Add gateway setup wizard with QR login flow
- Add platform status check for partially configured state
- Add weixin.md docs page with full adapter documentation
- Update environment-variables.md reference with all 11 env vars
- Update sidebars.ts to include weixin docs page
- Wire all gateway integration points onto current main

Salvaged from PR #6747 by Zihan Huang.
2026-04-10 05:54:37 -07:00
win4r
aedf6c7964 security(approval): close 4 pattern gaps found by source-grounded audit
Four gaps in DANGEROUS_PATTERNS found by running 10 targeted tests that
each mapped to a specific pattern in approval.py and checked whether the
documented defense actually held.

1. **Heredoc script injection** — `python3 << 'EOF'` bypasses the
   existing `-e`/`-c` flag pattern. Adds pattern for interpreter + `<<`
   covering python{2,3}, perl, ruby, node.

2. **PID expansion self-termination** — `kill -9 $(pgrep hermes)` is
   opaque to the existing `pkill|killall` + name pattern because command
   substitution is not expanded at detection time. Adds structural
   patterns matching `kill` + `$(pgrep` and backtick variants.

3. **Git destructive operations** — `git reset --hard`, `push --force`,
   `push -f`, `clean -f*`, and `branch -D` were entirely absent.
   Note: `branch -d` also triggers because IGNORECASE is global —
   acceptable since -d is still a delete, just a safe one, and the
   prompt is only a confirmation, not a hard block.

4. **chmod +x then execute** — two-step social engineering where a
   script containing dangerous commands is first written to disk (not
   checked by write_file), then made executable and run as `./script`.
   Pattern catches `chmod +x ... [;&|]+ ./` combos. Does not solve the
   deeper architectural issue (write_file not checking content) — that
   is called out in the PR description as a known limitation.

Tests: 23 new cases across 4 test classes, all in test_approval.py:
  - TestHeredocScriptExecution (7 cases, incl. regressions for -c)
  - TestPgrepKillExpansion (5 cases, incl. safe kill PID negative)
  - TestGitDestructiveOps (8 cases, incl. safe git status/push negatives)
  - TestChmodExecuteCombo (3 cases, incl. safe chmod-only negative)

Full suite: 146 passed, 0 failed.
2026-04-10 05:19:21 -07:00
Dusk1e
e683c9db90 fix(security): enforce path boundary checks in skill manager operations 2026-04-10 05:19:21 -07:00
Teknium
7663c98c1e fix: make safe_url_for_log public, add SSRF redirect guards to base.py cache helpers
Follow-up to Dusk1e's PR #7120 (Slack send_image redirect guard):
- Rename _safe_url_for_log -> safe_url_for_log (drop underscore) since
  it is now imported cross-module by the Slack adapter
- Add _ssrf_redirect_guard httpx event hook to cache_image_from_url()
  and cache_audio_from_url() in base.py — same pattern as vision_tools
  and the Slack adapter fix
- Update url_safety.py docstring to reflect broader coverage
- Add regression tests for image/audio redirect blocking + safe passthrough
2026-04-10 05:04:28 -07:00
Teknium
c8e4dcf412
fix: prevent duplicate completion notifications on process kill (#7124)
When kill_process() sends SIGTERM, both it and the reader thread race
to call _move_to_finished() — kill_process sets exit_code=-15 and
enqueues a notification, then the reader thread's process.wait()
returns with exit_code=143 (128+SIGTERM) and enqueues a second one.

Fix: make _move_to_finished() idempotent by tracking whether the
session was actually removed from _running. The second call sees it
was already moved and skips the completion_queue.put().

Adds regression test: test_move_to_finished_idempotent_no_duplicate
2026-04-10 03:52:16 -07:00
alt-glitch
96c060018a fix: remove 115 verified dead code symbols across 46 production files
Automated dead code audit using vulture + coverage.py + ast-grep intersection,
confirmed by Opus deep verification pass. Every symbol verified to have zero
production callers (test imports excluded from reachability analysis).

Removes ~1,534 lines of dead production code across 46 files and ~1,382 lines
of stale test code. 3 entire files deleted (agent/builtin_memory_provider.py,
hermes_cli/checklist.py, tests/hermes_cli/test_setup_model_selection.py).

Co-authored-by: alt-glitch <balyan.sid@gmail.com>
2026-04-10 03:44:43 -07:00
Teknium
04baab5422
fix(mcp): combine content and structuredContent when both present (#7118)
When an MCP server returns both content (model-oriented text) and
structuredContent (machine-oriented JSON), the client now combines
them instead of discarding content.  The text content becomes the
primary result (what the agent reads), and structuredContent is
included as supplementary metadata.

Previously, structuredContent took full precedence — causing data
loss for servers like Desktop Commander that put the actual file
text in content and metadata in structuredContent.

MCP spec guidance: for conversational/agent UX, prefer content.
2026-04-10 03:44:35 -07:00
tars
9a0dfb5a6d fix(gateway): scope /yolo to the active session 2026-04-10 03:38:44 -07:00
Teknium
0f597dd127
fix: STT provider-model mismatch — whisper-1 fed to faster-whisper (#7113)
Legacy flat stt.model config key (from cli-config.yaml.example and older
versions) was passed as a model override to transcribe_audio() by the
gateway, bypassing provider-specific model resolution. When the provider
was 'local' (faster-whisper), this caused:
  ValueError: Invalid model size 'whisper-1'

Changes:
- gateway/run.py, discord.py: stop passing model override — let
  transcribe_audio() handle provider-specific model resolution internally
- get_stt_model_from_config(): now provider-aware, reads from the correct
  nested section (stt.local.model, stt.openai.model, etc.); ignores
  legacy flat key for local provider to prevent model name mismatch
- cli-config.yaml.example: updated STT section to show nested provider
  config structure instead of legacy flat key
- config migration v13→v14: moves legacy stt.model to the correct
  provider section and removes the flat key

Reported by community user on Discord.
2026-04-10 03:27:30 -07:00
maxyangcn
19292eb8bf feat(cron): support Discord thread_id in deliver targets
Add Discord thread support to cron delivery and send_message_tool.

- _parse_target_ref: handle discord platform with chat_id:thread_id format
- _send_discord: add thread_id param, route to /channels/{thread_id}/messages
- _send_to_platform: pass thread_id through for Discord
- Discord adapter send(): read thread_id from metadata for gateway path
- Update tool schema description to document Discord thread targets

Cherry-picked from PR #7046 by pandacooming (maxyangcn).

Follow-up fixes:
- Restore proxy support (resolve_proxy_url/proxy_kwargs_for_aiohttp) that was
  accidentally deleted — would have caused NameError at runtime
- Remove duplicate _DISCORD_TARGET_RE regex; reuse existing _TELEGRAM_TOPIC_TARGET_RE
  via _NUMERIC_TOPIC_RE alias (identical pattern)
- Fix misleading test comments about Discord negative snowflake IDs
  (Discord uses positive snowflakes; negative IDs are a Telegram convention)
- Rewrite misleading scheduler test that claimed to exercise home channel
  fallback but actually tested the explicit platform:chat_id parsing path
2026-04-10 03:20:05 -07:00
Teknium
30ae68dd33 fix: apply hidden_div regex newline bypass fix to skills_guard.py
The same .* pattern vulnerable to newline bypass that was fixed in
prompt_builder.py (PR #6925) also existed in skills_guard.py. Changed
to [\s\S]*? to match across newlines.
2026-04-10 03:05:04 -07:00
aaronagent
9afe1784bd fix: hidden_div regex bypass with newlines, credential config silent failure, webhook route error severity
prompt_builder.py: The `hidden_div` detection pattern uses `.*` which does not
match newlines in Python regex (re.DOTALL is not passed).  An attacker can bypass
detection by splitting the style attribute across lines:
  `<div style="color:red;\ndisplay: none">injected content</div>`
Replace `.*` with `[\s\S]*?` to match across line boundaries.

credential_files.py: `_load_config_files()` catches all exceptions at DEBUG level
(line 171), making YAML parse failures invisible in production logs.  Users whose
credential files silently fail to mount into sandboxes have no diagnostic clue.
Promote to WARNING to match the severity pattern used by the path validation
warnings at lines 150 and 158 in the same function.

webhook.py: `_reload_dynamic_routes()` logs JSON parse failures at WARNING (line
265) but the impact — stale/corrupted dynamic routes persisting silently — warrants
ERROR level to ensure operator visibility in alerting pipelines.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 03:05:04 -07:00
aaronagent
94f5979cc2 fix(approval,mcp): log silent exception handlers, narrow OAuth catches, close server on error
Three silent `except Exception` blocks in approval.py (lines 345, 387, 469) return
fallback values with zero logging — making it impossible to debug callback failures,
allowlist load errors, or config read issues.  Add logger.warning/error calls that
match the pattern already used by save_permanent_allowlist() and _smart_approve()
in the same file.

In mcp_oauth.py, narrow the overly-broad `except Exception` in get_tokens() and
get_client_info() to the specific exceptions Pydantic's model_validate() can raise
(ValueError, TypeError, KeyError), and include the exception message in the warning.
Also wrap the _wait_for_callback() polling loop in try/finally so the HTTPServer is
always closed — previously an asyncio.CancelledError or any exception in the loop
would leak the server socket.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 03:05:04 -07:00
aaronagent
738f0bac13 fix: align auth-by-message classification with status-code path, decode URLs before secret check
error_classifier.py: Message-only auth errors ("invalid api key", "unauthorized",
etc.) were classified as retryable=True (line 707), inconsistent with the HTTP 401
path (line 432) which correctly uses retryable=False + should_fallback=True.  The
mismatch causes 3 wasted retries with the same broken credential before fallback,
while 401 errors immediately attempt fallback.  Align the message-based path to
match: retryable=False, should_fallback=True.

web_tools.py: The _PREFIX_RE secret-detection check in web_extract_tool() runs
against the raw URL string (line 1196).  URL-encoded secrets like %73k-1234... (
sk-1234...) bypass the filter because the regex expects literal ASCII.  Add
urllib.parse.unquote() before the check so percent-encoded variants are also caught.

Co-Authored-By: Claude Sonnet 4.6 (1M context) <noreply@anthropic.com>
2026-04-10 03:05:04 -07:00
alt-glitch
1f1f297528 feat(environments): unified file sync with change tracking and deletion
Replace per-backend ad-hoc file sync with a shared FileSyncManager
that handles mtime-based change detection, remote deletion of
locally-removed files, and transactional state updates.

- New FileSyncManager class (tools/environments/file_sync.py)
  with callbacks for upload/delete, rate limiting, and rollback
- Shared iter_sync_files() eliminates 3 duplicate implementations
- SSH: replace unconditional rsync with scp + mtime skip
- Modal/Daytona: replace inline _synced_files dict with manager
- All 3 backends now sync credentials + skills + cache uniformly
- Remote deletion: files removed locally are cleaned from remote
- HERMES_FORCE_FILE_SYNC=1 env var for debugging
- Base class _before_execute() simplified to empty hook
- 12 unit tests covering mtime skip, deletion, rollback, rate limiting
2026-04-10 03:01:46 -07:00
Teknium
a420235b66 fix: reject foreground timeout above cap instead of clamping
Change behavior from silent clamping to returning an error when the
model requests a foreground timeout exceeding FOREGROUND_MAX_TIMEOUT.
This forces the model to use background=true for long-running commands
rather than silently changing its intent.

- Config default timeouts above the cap are NOT rejected (user's choice)
- Only explicit model-requested timeouts trigger rejection
- Added boundary test for timeout exactly at the limit
2026-04-10 02:58:54 -07:00
kshitijk4poor
6c3565df57 fix(terminal): cap foreground timeout to prevent session deadlocks
When the model calls terminal() in foreground mode without background=true
(e.g. to start a server), the tool call blocks until the command exits or
the timeout expires. Without an upper bound the model can request arbitrarily
high timeouts (the schema had minimum=1 but no maximum), blocking the entire
agent session for hours until the gateway idle watchdog kills it.

Changes:
- Add FOREGROUND_MAX_TIMEOUT (600s, configurable via
  TERMINAL_MAX_FOREGROUND_TIMEOUT env var) that caps foreground timeout
- Clamp effective_timeout to the cap when background=false and timeout
  exceeds the limit
- Include a timeout_note in the tool result when clamped, nudging the
  model to use background=true for long-running processes
- Update schema description to show the max timeout value
- Remove dead clamping code in the background branch that could never
  fire (max_timeout was set to effective_timeout, so timeout > max_timeout
  was always false)
- Add 7 tests covering clamping, no-clamping, config-default-exceeds-cap
  edge case, background bypass, default timeout, constant value, and
  schema content

Self-review fixes:
- Fixed bug where timeout_note said 'Requested timeout Nones' when
  clamping fired from config default exceeding cap (timeout param is
  None). Now uses unclamped_timeout instead of the raw timeout param.
- Removed unused pytest import from test file
- Extracted test config dict into _make_env_config() helper
- Fixed tautological test_default_value assertion
- Added missing test for config default > cap with no model timeout
2026-04-10 02:58:54 -07:00
Austin Pickett
f805323517 chore: merge main 2026-04-09 20:00:34 -04:00
Teknium
69a0092c38 fix: deduplicate _is_termux() into hermes_constants.is_termux()
Replace 6 identical copies of the Termux detection function across
cli.py, browser_tool.py, voice_mode.py, status.py, doctor.py, and
gateway.py with a single shared implementation in hermes_constants.py.

Each call site imports with its original local name to preserve all
existing callers (internal references and test monkeypatches).
2026-04-09 16:24:53 -07:00
adybag14-cyber
c3141429b7 fix(termux): tighten voice setup and mobile chat UX 2026-04-09 16:24:53 -07:00
adybag14-cyber
769ec1ee1a fix(termux): deepen browser, voice, and tui support 2026-04-09 16:24:53 -07:00
adybag14-cyber
3237733ca5 fix(termux): harden execute_code and mobile browser/audio UX 2026-04-09 16:24:53 -07:00
adybag14-cyber
54d5138a54 fix(termux): harden env-backed background jobs 2026-04-09 16:24:53 -07:00
adybag14-cyber
122925a6f2 fix(termux): honor temp dirs for local temp artifacts 2026-04-09 16:24:53 -07:00
adybag14-cyber
e79cc88985 feat: add tested Termux install path and EOF-aware gh auth 2026-04-09 16:24:53 -07:00
Brooklyn Nicholson
99fd3b518d feat: add /copy and /agents 2026-04-09 17:19:36 -05:00
Teknium
49d8c9557f
fix: cleanup_all_camofox_sessions respects managed persistence (#6820)
When managed_persistence is enabled, cleanup_all now only clears local
tracking state without sending DELETE requests to the Camofox server.
This prevents persistent browser profiles (cookies, logins, localStorage)
from being destroyed during process-wide cleanup.

Ephemeral sessions still get full server-side deletion as before.
2026-04-09 14:54:07 -07:00
Teknium
6f8e426275 fix: add SOCKS proxy support, DISCORD_PROXY env var, and send_message proxy coverage
Follow-up improvements on top of the shared resolver from PR #6562:

- Add platform_env_var parameter to resolve_proxy_url() so DISCORD_PROXY
  takes priority over generic HTTPS_PROXY/ALL_PROXY env vars
- Add SOCKS proxy support via aiohttp_socks.ProxyConnector with rdns=True
  (critical for GFW/Shadowrocket/Clash users — issue #6649)
- proxy_kwargs_for_bot() returns connector= for SOCKS, proxy= for HTTP
- proxy_kwargs_for_aiohttp() returns split (session_kw, request_kw) for
  standalone aiohttp sessions
- Add proxy support to send_message_tool.py (Discord REST, Slack, SMS)
  for cron job delivery behind proxies (from PR #2208)
- Add proxy support to Discord image/document downloads
- Fix duplicate import sys in base.py
2026-04-09 14:19:06 -07:00
dashed
7f7b02b764 fix(slack): comprehensive mrkdwn formatting — 6 bug fixes + 52 tests
Fixes blockquote > escaping, edit_message raw markdown, ***bold italic***
handling, HTML entity double-escaping (&amp;amp;), Wikipedia URL parens
truncation, and step numbering format. Also adds format_message to the
tool-layer _send_to_platform for consistent formatting across all
delivery paths.

Changes:
- Protect Slack entities (<@user>, <https://...|label>, <!here>) from
  escaping passes
- Protect blockquote > markers before HTML entity escaping
- Unescape-before-escape for idempotent HTML entity handling
- ***bold italic*** → *_text_* conversion (before **bold** pass)
- URL regex upgraded to handle balanced parentheses
- mrkdwn:True flag on chat_postMessage payloads
- format_message applied in edit_message and send_message_tool
- 52 new tests (format, edit, streaming, splitting, tool chunking)
- Use reversed(dict) idiom for placeholder restoration

Based on PR #3715 by dashed, cherry-picked onto current main.
2026-04-09 14:07:32 -07:00
Lumen Radley
e22416dd9b fix: handle empty sudo password and false prompts 2026-04-09 02:50:07 -07:00
Teknium
d97f6cec7f
feat(gateway): add BlueBubbles iMessage platform adapter (#6437)
Adds Apple iMessage as a gateway platform via BlueBubbles macOS server.

Architecture:
- Webhook-based inbound (event-driven, no polling/dedup needed)
- Email/phone → chat GUID resolution for user-friendly addressing
- Private API safety (checks helper_connected before tapback/typing)
- Inbound attachment downloading (images, audio, documents cached locally)
- Markdown stripping for clean iMessage delivery
- Smart progress suppression for platforms without message editing

Based on PR #5869 by @benjaminsehl (webhook architecture, GUID resolution,
Private API safety, progress suppression) with inbound attachment downloading
from PR #4588 by @1960697431 (attachment cache routing).

Integration points: Platform enum, env config, adapter factory, auth maps,
cron delivery, send_message routing, channel directory, platform hints,
toolset definition, setup wizard, status display.

27 tests covering config, adapter, webhook parsing, GUID resolution,
attachment download routing, toolset consistency, and prompt hints.
2026-04-08 23:54:03 -07:00
helix4u
e94008c404 fix(terminal): guard invalid command values 2026-04-08 21:37:51 -07:00
angelos
e7d3e9d767 fix(terminal): persistent sandbox envs survive between turns
`_cleanup_task_resources` was unconditionally calling `cleanup_vm()` at
the end of every `run_conversation` (i.e. every user turn), tearing down
the docker/daytona/modal sandbox container regardless of its
`persistent_filesystem` setting. This contradicted the documented intent
of `terminal.lifetime_seconds` (idle reaper) and `container_persistent`,
and caused per-turn loss of `/workspace`, `~/.config`, agent CLI auth
state, and any other content living inside the sandbox.

The unconditional teardown was introduced in fbd3a2fd ("prevent leakage
of morph instances between tasks", 2025-11-04) to plug a Morph backend
leak, two days after `lifetime_seconds` shipped in faecbddd. It was
later refactored into `_cleanup_task_resources` in 70dd3a16 without
changing semantics. Code and docs have disagreed since.

Fix: introduce `terminal_tool.is_persistent_env(task_id)` and skip the
per-turn `cleanup_vm` when the active env is persistent. The idle reaper
(`_cleanup_inactive_envs`) still tears persistent envs down once
`terminal.lifetime_seconds` is exceeded. Non-persistent backends (Morph)
are unchanged — still torn down per turn, preserving the original
leak-prevention intent.
2026-04-08 21:31:57 -07:00
alt-glitch
d684d7ee7e feat(environments): unified spawn-per-call execution layer
Replace dual execution model (PersistentShellMixin + per-backend oneshot)
with spawn-per-call + session snapshot for all backends except ManagedModal.

Core changes:
- Every command spawns a fresh bash process; session snapshot (env vars,
  functions, aliases) captured at init and re-sourced before each command
- CWD persists via file-based read (local) or in-band stdout markers (remote)
- ProcessHandle protocol + _ThreadedProcessHandle adapter for SDK backends
- cancel_fn wired for Modal (sandbox.terminate) and Daytona (sandbox.stop)
- Shared utilities extracted: _pipe_stdin, _popen_bash, _load_json_store,
  _save_json_store, _file_mtime_key, _SYNC_INTERVAL_SECONDS
- Rate-limited file sync unified in base _before_execute() with _sync_files() hook
- execute_oneshot() removed; all 11 call sites in code_execution_tool.py
  migrated to execute()
- Daytona timeout wrapper replaced with SDK-native timeout parameter
- persistent_shell.py deleted (291 lines)

Backend-specific:
- Local: process-group kill via os.killpg, file-based CWD read
- Docker: -e env flags only on init_session, not per-command
- SSH: shlex.quote transport, ControlMaster connection reuse
- Singularity: apptainer exec with instance://, no forced --pwd
- Modal: _AsyncWorker + _ThreadedProcessHandle, cancel_fn -> sandbox.terminate
- Daytona: SDK-level timeout (not shell wrapper), cancel_fn -> sandbox.stop
- ManagedModal: unchanged (gateway owns execution); docstring added explaining why
2026-04-08 17:23:15 -07:00
Teknium
7156f8d866
fix: CI test failures — metadata key, cli console, docker env, vision order (#6294)
Fixes 9 test failures on current main, incorporating ideas from PR stack
#6219-#6222 by xinbenlv with corrections:

- model_metadata: sync HF context length key casing
  (minimaxai/minimax-m2.5 → MiniMaxAI/MiniMax-M2.5)

- cli.py: route quick command error output through self.console
  instead of creating a new ChatConsole() instance

- docker.py: explicit docker_forward_env entries now bypass the
  Hermes secret blocklist (intentional opt-in wins over generic filter)

- auxiliary_client: revert _read_main_provider() to simple
  provider.strip().lower() — the _normalize_aux_provider() call
  introduced in 5c03f2e7 stripped the custom: prefix, breaking
  named custom provider resolution

- auxiliary_client: flip vision auto-detection order to
  active provider → OpenRouter → Nous → stop (was OR → Nous → active)

- test: update vision priority test to match new order

Based on PR #6219-#6222 by xinbenlv.
2026-04-08 16:37:05 -07:00
jjovalle99
d46db0a1b4 fix(tools): use correct import path for mistralai SDK
mistralai v2.x is a namespace package — `Mistral` class lives at
`mistralai.client`, not at the top-level `mistralai` module. The
previous `from mistralai import Mistral` raises ImportError at runtime.

Update both production code and test fixture to use the correct path.
2026-04-08 13:47:08 -07:00
jjovalle99
5f4b93c20f feat(tools): add Voxtral Transcribe STT provider (Mistral AI) 2026-04-08 13:47:08 -07:00
Teknium
4f467700d4
fix(doctor): only check the active memory provider, not all providers unconditionally (#6285)
* fix(tools): skip camofox auto-cleanup when managed persistence is enabled

When managed_persistence is enabled, cleanup_browser() was calling
camofox_close() which destroys the server-side browser context via
DELETE /sessions/{userId}, killing login sessions across cron runs.

Add camofox_soft_cleanup() — a public wrapper that drops only the
in-memory session entry when managed persistence is on, returning True.
When persistence is off it returns False so the caller falls back to
the full camofox_close().  The inactivity reaper still handles idle
resource cleanup.

Also surface a logger.warning() when _managed_persistence_enabled()
fails to load config, replacing a silent except-and-return-False.

Salvaged from #6182 by el-analista (Eduardo Perea Fernandez).
Added public API wrapper to avoid cross-module private imports,
and test coverage for both persistence paths.

Co-authored-by: Eduardo Perea Fernandez <el-analista@users.noreply.github.com>

* fix(doctor): only check the active memory provider, not all providers unconditionally

hermes doctor had hardcoded Honcho Memory and Mem0 Memory sections that
always ran regardless of the user's memory.provider config setting. After
the swappable memory provider update (#4623), users with leftover Honcho
config but no active provider saw false 'broken' errors.

Replaced both sections with a single Memory Provider section that reads
memory.provider from config.yaml and only checks the configured provider.
Users with no external provider see a green 'Built-in memory active' check.

Reported by community user michaelruiz001, confirmed by Eri (Honcho).

---------

Co-authored-by: Eduardo Perea Fernandez <el-analista@users.noreply.github.com>
2026-04-08 13:44:58 -07:00
mrshu
19b0ddce40 fix(process): correct detached crash recovery state
Previously crash recovery recreated detached sessions as if they were
fully managed, so polls and kills could lie about liveness and the
checkpoint could forget recovered jobs after the next restart.
This commit refreshes recovered host-backed sessions from real PID
state, keeps checkpoint data durable, and preserves notify watcher
metadata while treating sandbox-only PIDs as non-recoverable.

- Persist `pid_scope` in `tools/process_registry.py` and skip
  recovering sandbox-backed entries without a host-visible PID handle
- Refresh detached sessions on access so `get`/`poll`/`wait` and active
  session queries observe exited processes instead of hanging forever
- Allow recovered host PIDs to be terminated honestly and requeue
  `notify_on_complete` watchers during checkpoint recovery
- Add regression tests for durable checkpoints, detached exit/kill
  behavior, sandbox skip logic, and recovered notify watchers
2026-04-08 03:35:43 -07:00
Vasanthdev2004
085c1c6875 fix(browser): preserve agent-browser paths with spaces 2026-04-08 02:35:48 -07:00
Teknium
3696c74bfb fix: preserve existing thresholds, remove pre-read byte guard
- DEFAULT_RESULT_SIZE_CHARS: 50K -> 100K (match current _LARGE_RESULT_CHARS)
- DEFAULT_PREVIEW_SIZE_CHARS: 2K -> 1.5K (match current _LARGE_RESULT_PREVIEW_CHARS)
- Per-tool overrides all set to 100K (terminal, execute_code, search_files)
- Remove pre-read byte guard (no behavioral regression vs current main)
- Revert limit signature change to int=500 (match current default)
- Restore original read_file schema description
- Update test assertions to match 100K thresholds
2026-04-08 02:24:32 -07:00
alt-glitch
bbcff8dcd0 fix(tools): address PR review — remove _extract_raw_output, BudgetConfig everywhere, read_file hardening
- Remove _extract_raw_output: persist content verbatim (fixes size mismatch bug)
- Drop import aliases: import from budget_config directly, one canonical name
- BudgetConfig param on maybe_persist_tool_result and enforce_turn_budget
- read_file: limit=None signature, pre-read guard fires only when limit omitted (256KB)
- Unify binary extensions: file_operations.py imports from binary_extensions.py
- Exclude .pdf and .svg from binary set (text-based, agents may inspect)
- Remove redundant outer try/except in eval path (internal fallback handles it)
- Fix broken tests: update assertion strings for new persistence format
- Module-level constants: _PRE_READ_MAX_BYTES, _DEFAULT_READ_LIMIT
- Remove redundant pathlib import (Path already at module level)
- Update spec.md with IMPLEMENTED annotations and design decisions
2026-04-08 02:24:32 -07:00
alt-glitch
77c5bc9da9 feat(budget): make tool result persistence thresholds configurable
Add BudgetConfig dataclass to centralize and make overridable the
hardcoded constants (50K per-result, 200K per-turn, 2K preview) that
control when tool outputs get persisted to sandbox. Configurable at
the RL environment level via HermesAgentEnvConfig fields, threaded
through HermesAgentLoop to the storage layer.

Resolution: pinned (read_file=inf) > env config overrides > registry
per-tool > default. CLI override: --env.turn_budget_chars 80000
2026-04-08 02:24:32 -07:00
alt-glitch
65e24c942e wip: tool result fixes -- persistence 2026-04-08 02:24:32 -07:00
Teknium
fff237e111
feat(cron): track delivery failures in job status (#6042)
_deliver_result() now returns Optional[str] — None on success, error
message on failure. All failure paths (unknown platform, platform
disabled, config load error, send failure, unresolvable target)
return descriptive error strings.

mark_job_run() gains delivery_error param, tracked as
last_delivery_error on the job — separate from agent execution errors.
A job where the agent succeeded but delivery failed shows
last_status='ok' + last_delivery_error='...'.

The cronjob list tool now surfaces last_delivery_error so agents and
users can see when cron outputs aren't arriving.

Inspired by PR #5863 (oxngon) — reimplemented with proper wiring.

Tests: 3 new mark_job_run tests + 6 new _deliver_result return tests.
2026-04-07 22:49:01 -07:00
Teknium
b9a5e6e247 fix: use camelCase structuredContent attr, prefer structured over text
- The MCP SDK Pydantic model uses camelCase (structuredContent), not
  snake_case (structured_content). The original getattr was a silent no-op.
- When structuredContent is present, return it AS the result instead of
  alongside text — the structured payload is the machine-readable data.
- Move test file to tests/tools/ and fix fake class to use camelCase.
- Patch _run_on_mcp_loop in tests so the handler actually executes.
2026-04-07 18:00:01 -07:00
r266-tech
2ad7694874 fix(mcp): preserve structured_content in tool call results
MCP CallToolResult may include structured_content (a JSON object) alongside
content blocks. The tool handler previously only forwarded concatenated text
from content blocks, silently dropping the structured payload.

This breaks MCP tools that return a minimal human text in content while
putting the actual machine-usable payload in structured_content.

Now, when structured_content is present, it is included in the returned
JSON under the 'structuredContent' key.

Fixes NousResearch/hermes-agent#5874
2026-04-07 18:00:01 -07:00
Teknium
f3c59321af fix: add _profile_arg tests + move STT language to config.yaml
- Add 7 unit tests for _profile_arg: default home, named profile,
  hash path, nested path, invalid name, systemd integration, launchd integration
- Add stt.local.language to config.yaml (empty = auto-detect)
- Both STT code paths now read config.yaml first, env var fallback,
  then default (auto-detect for faster-whisper, 'en' for CLI command)
- HERMES_LOCAL_STT_LANGUAGE env var still works as backward-compat fallback
2026-04-07 17:59:16 -07:00
Marc Bickel
6e02fa73c2 fix(discord): discard empty placeholder on voice transcription + force STT language
- gateway/run.py: Strip "(The user sent a message with no text content)"
  placeholder when voice transcription succeeds — it was being appended
  alongside the transcript, creating duplicate user turns.
- tools/transcription_tools.py: Wire HERMES_LOCAL_STT_LANGUAGE env var
  into the faster-whisper backend. It was only used by the CLI fallback
  path (_transcribe_local_command), not the primary faster-whisper path.
2026-04-07 17:59:16 -07:00
Teknium
469cd16fe0
fix(security): consolidated security hardening — SSRF, timing attack, tar traversal, credential leakage (#5944)
Salvaged from PRs #5800 (memosr), #5806 (memosr), #5915 (Ruzzgar), #5928 (Awsh1).

Changes:
- Use hmac.compare_digest for API key comparison (timing attack prevention)
- Apply provider env var blocklist to Docker containers (credential leakage)
- Replace tar.extractall() with safe extraction in TerminalBench2 (CVE-2007-4559)
- Add SSRF protection via is_safe_url to ALL platform adapters:
  base.py (cache_image_from_url, cache_audio_from_url),
  discord, slack, telegram, matrix, mattermost, feishu, wecom
  (Signal and WhatsApp protected via base.py helpers)
- Update tests: mock is_safe_url in Mattermost download tests
- Add security tests for tar extraction (traversal, symlinks, safe files)
2026-04-07 17:28:37 -07:00
Teknium
b1a66d55b4
refactor: migrate 10 config.yaml inline loaders to read_raw_config()
Replace 10 callsites across 6 files that manually opened config.yaml,
called yaml.safe_load(), and handled missing-file/parse-error fallbacks
with the new read_raw_config() helper from hermes_cli/config.py.

Each migrated site previously had 5-8 lines of boilerplate:
    config_path = get_hermes_home() / 'config.yaml'
    if config_path.exists():
        import yaml
        with open(config_path) as f:
            cfg = yaml.safe_load(f) or {}

Now reduced to:
    from hermes_cli.config import read_raw_config
    cfg = read_raw_config()

Migrated files:
- tools/browser_tool.py (4 sites): command_timeout, cloud_provider,
  allow_private_urls, record_sessions
- tools/env_passthrough.py: terminal.env_passthrough
- tools/credential_files.py: terminal.credential_files
- tools/transcription_tools.py: stt.model
- hermes_cli/commands.py: config-gated command resolution
- hermes_cli/auth.py (2 sites): model config read + provider reset

Skipped (intentionally):
- gateway/run.py: 10+ sites with local aliases, critical path
- hermes_cli/profiles.py: profile-specific config path
- hermes_cli/doctor.py: reads raw then writes fixes back
- agent/model_metadata.py: different file (context_length_cache.yaml)
- tools/website_policy.py: custom config_path param + error types
2026-04-07 17:28:23 -07:00
Siddharth Balyan
f3006ebef9
refactor(tests): re-architect tests + fix CI failures (#5946)
* refactor: re-architect tests to mirror the codebase

* Update tests.yml

* fix: add missing tool_error imports after registry refactor

* fix(tests): replace patch.dict with monkeypatch to prevent env var leaks under xdist

patch.dict(os.environ) can leak TERMINAL_ENV across xdist workers,
causing test_code_execution tests to hit the Modal remote path.

* fix(tests): fix update_check and telegram xdist failures

- test_update_check: replace patch("hermes_cli.banner.os.getenv") with
  monkeypatch.setenv("HERMES_HOME") — banner.py no longer imports os
  directly, it uses get_hermes_home() from hermes_constants.

- test_telegram_conflict/approval_buttons: provide real exception classes
  for telegram.error mock (NetworkError, TimedOut, BadRequest) so the
  except clause in connect() doesn't fail with "catching classes that do
  not inherit from BaseException" when xdist pollutes sys.modules.

* fix(tests): accept unavailable_models kwarg in _prompt_model_selection mock
2026-04-07 17:19:07 -07:00
Teknium
678a87c477
refactor: add tool_error/tool_result helpers + read_raw_config, migrate 129 callsites
Add three reusable helpers to eliminate pervasive boilerplate:

tools/registry.py — tool_error() and tool_result():
  Every tool handler returns JSON strings. The pattern
  json.dumps({"error": msg}, ensure_ascii=False) appeared 106 times,
  and json.dumps({"success": False, "error": msg}, ...) another 23.
  Now: tool_error(msg) or tool_error(msg, success=False).

  tool_result() handles arbitrary result dicts:
  tool_result(success=True, data=payload) or tool_result(some_dict).

hermes_cli/config.py — read_raw_config():
  Lightweight YAML reader that returns the raw config dict without
  load_config()'s deep-merge + migration overhead. Available for
  callsites that just need a single config value.

Migration (129 callsites across 32 files):
- tools/: browser_camofox (18), file_tools (10), homeassistant (8),
  web_tools (7), skill_manager (7), cronjob (11), code_execution (4),
  delegate (5), send_message (4), tts (4), memory (7), session_search (3),
  mcp (2), clarify (2), skills_tool (3), todo (1), vision (1),
  browser (1), process_registry (2), image_gen (1)
- plugins/memory/: honcho (9), supermemory (9), hindsight (8),
  holographic (7), openviking (7), mem0 (7), byterover (6), retaindb (2)
- agent/: memory_manager (2), builtin_memory_provider (1)
2026-04-07 13:36:38 -07:00
Teknium
ca0459d109
refactor: remove 24 confirmed dead functions — 432 lines of unused code
Each function was verified to have exactly 1 reference in the entire
codebase (its own definition). Zero calls, zero imports, zero string
references anywhere including tests.

Removed by category:

Superseded wrappers (replaced by newer implementations):
- agent/anthropic_adapter.py: run_hermes_oauth_login, refresh_hermes_oauth_token
- hermes_cli/callbacks.py: sudo_password_callback (superseded by CLI method)
- hermes_cli/setup.py: _set_model_provider, _sync_model_from_disk
- tools/file_tools.py: get_file_tools (superseded by registry.register)
- tools/cronjob_tools.py: get_cronjob_tool_definitions (same)
- tools/terminal_tool.py: _check_dangerous_command (_check_all_guards used)

Dead private helpers (lost their callers during refactors):
- agent/anthropic_adapter.py: _convert_user_content_part_to_anthropic
- agent/display.py: honcho_session_line, write_tty
- hermes_cli/providers.py: _build_labels (+ dead _labels_cache var)
- hermes_cli/tools_config.py: _prompt_yes_no
- hermes_cli/models.py: _extract_model_ids
- hermes_cli/uninstall.py: log_error
- gateway/platforms/feishu.py: _is_loop_ready
- tools/file_operations.py: _read_image (64-line method)
- tools/process_registry.py: cleanup_expired
- tools/skill_manager_tool.py: check_skill_manage_requirements

Dead class methods (zero callers):
- run_agent.py: _is_anthropic_url (logic duplicated inline at L618)
- run_agent.py: _classify_empty_content_response (68-line method, never wired)
- cli.py: reset_conversation (callers all use new_session directly)
- cli.py: _clear_current_input (added but never wired in)

Other:
- gateway/delivery.py: build_delivery_context_for_tool
- tools/browser_tool.py: get_active_browser_sessions
2026-04-07 11:41:26 -07:00
Teknium
187e90e425
refactor: replace inline HERMES_HOME re-implementations with get_hermes_home()
16 callsites across 14 files were re-deriving the hermes home path
via os.environ.get('HERMES_HOME', ...) instead of using the canonical
get_hermes_home() from hermes_constants. This breaks profiles — each
profile has its own HERMES_HOME, and the inline fallback defaults to
~/.hermes regardless.

Fixed by importing and calling get_hermes_home() at each site. For
files already inside the hermes process (agent/, hermes_cli/, tools/,
gateway/, plugins/), this is always safe. Files that run outside the
process context (mcp_serve.py, mcp_oauth.py) already had correct
try/except ImportError fallbacks and were left alone.

Skipped: hermes_constants.py (IS the implementation), env_loader.py
(bootstrap), profiles.py (intentionally manipulates the env var),
standalone scripts (optional-skills/, skills/), and tests.
2026-04-07 10:40:34 -07:00
Teknium
d0ffb111c2
refactor: codebase-wide lint cleanup — unused imports, dead code, and inefficient patterns (#5821)
Comprehensive cleanup across 80 files based on automated (ruff, pyflakes, vulture)
and manual analysis of the entire codebase.

Changes by category:

Unused imports removed (~95 across 55 files):
- Removed genuinely unused imports from all major subsystems
- agent/, hermes_cli/, tools/, gateway/, plugins/, cron/
- Includes imports in try/except blocks that were truly unused
  (vs availability checks which were left alone)

Unused variables removed (~25):
- Removed dead variables: connected, inner, channels, last_exc,
  source, new_server_names, verify, pconfig, default_terminal,
  result, pending_handled, temperature, loop
- Dropped unused argparse subparser assignments in hermes_cli/main.py
  (12 instances of add_parser() where result was never used)

Dead code removed:
- run_agent.py: Removed dead ternary (None if False else None) and
  surrounding unreachable branch in identity fallback
- run_agent.py: Removed write-only attribute _last_reported_tool
- hermes_cli/providers.py: Removed dead @property decorator on
  module-level function (decorator has no effect outside a class)
- gateway/run.py: Removed unused MCP config load before reconnect
- gateway/platforms/slack.py: Removed dead SessionSource construction

Undefined name bugs fixed (would cause NameError at runtime):
- batch_runner.py: Added missing logger = logging.getLogger(__name__)
- tools/environments/daytona.py: Added missing Dict and Path imports

Unnecessary global statements removed (14):
- tools/terminal_tool.py: 5 functions declared global for dicts
  they only mutated via .pop()/[key]=value (no rebinding)
- tools/browser_tool.py: cleanup thread loop only reads flag
- tools/rl_training_tool.py: 4 functions only do dict mutations
- tools/mcp_oauth.py: only reads the global
- hermes_time.py: only reads cached values

Inefficient patterns fixed:
- startswith/endswith tuple form: 15 instances of
  x.startswith('a') or x.startswith('b') consolidated to
  x.startswith(('a', 'b'))
- len(x)==0 / len(x)>0: 13 instances replaced with pythonic
  truthiness checks (not x / bool(x))
- in dict.keys(): 5 instances simplified to in dict
- Redefined unused name: removed duplicate _strip_mdv2 import in
  send_message_tool.py

Other fixes:
- hermes_cli/doctor.py: Replaced undefined logger.debug() with pass
- hermes_cli/config.py: Consolidated chained .endswith() calls

Test results: 3934 passed, 17 failed (all pre-existing on main),
19 skipped. Zero regressions.
2026-04-07 10:25:31 -07:00
Ben Barclay
b2f477a30b
feat: switch managed browser provider from Browserbase to Browser Use (#5750)
* feat: switch managed browser provider from Browserbase to Browser Use

The Nous subscription tool gateway now routes browser automation through
Browser Use instead of Browserbase. This commit:

- Adds managed Nous gateway support to BrowserUseProvider (idempotency
  keys, X-BB-API-Key auth header, external_call_id persistence)
- Removes managed gateway support from BrowserbaseProvider (now
  direct-only via BROWSERBASE_API_KEY/BROWSERBASE_PROJECT_ID)
- Updates browser_tool.py fallback: prefers Browser Use over Browserbase
- Updates nous_subscription.py: gateway vendor 'browser-use', auto-config
  sets cloud_provider='browser-use' for new subscribers
- Updates tools_config.py: Nous Subscription entry now uses Browser Use
- Updates setup.py, cli.py, status.py, prompt_builder.py display strings
- Updates all affected tests to match new behavior

Browserbase remains fully functional for users with direct API credentials.
The change only affects the managed/subscription path.

* chore: remove redundant Browser Use hint from system prompt

* fix: upgrade Browser Use provider to v3 API

- Base URL: api/v2 -> api/v3 (v2 is legacy)
- Unified all endpoints to use native Browser Use paths:
  - POST /browsers (create session, returns cdpUrl)
  - PATCH /browsers/{id} with {action: stop} (close session)
- Removed managed-mode branching that used Browserbase-style
  /v1/sessions paths — v3 gateway now supports /browsers directly
- Removed unused managed_mode variable in close_session

* fix(browser-use): use X-Browser-Use-API-Key header for managed mode

The managed gateway expects X-Browser-Use-API-Key, not X-BB-API-Key
(which is a Browserbase-specific header). Using the wrong header caused
a 401 AUTH_ERROR on every managed-mode browser session create.

Simplified _headers() to always use X-Browser-Use-API-Key regardless
of direct vs managed mode.

* fix(nous_subscription): browserbase explicit provider is direct-only

Since managed Nous gateway now routes through Browser Use, the
browserbase explicit provider path should not check managed_browser_available
(which resolves against the browser-use gateway). Simplified to direct-only
with managed=False.

* fix(browser-use): port missing improvements from PR #5605

- CDP URL normalization: resolve HTTP discovery URLs to websocket after
  cloud provider create_session() (prevents agent-browser failures)
- Managed session payload: send timeout=5 and proxyCountryCode=us for
  gateway-backed sessions (prevents billing overruns)
- Update prompt builder, browser_close schema, and module docstring to
  replace remaining Browserbase references with Browser Use
- Dynamic /browser status detection via _get_cloud_provider() instead
  of hardcoded env var checks (future-proof for new providers)
- Rename post_setup key from 'browserbase' to 'agent_browser'
- Update setup hint to mention Browser Use alongside Browserbase
- Add tests: CDP normalization, browserbase direct-only guard,
  managed browser-use gateway, direct browserbase fallback

---------

Co-authored-by: rob-maron <132852777+rob-maron@users.noreply.github.com>
2026-04-07 08:40:22 -04:00
Teknium
8b861b77c1
refactor: remove browser_close tool — auto-cleanup handles it (#5792)
* refactor: remove browser_close tool — auto-cleanup handles it

The browser_close tool was called in only 9% of browser sessions (13/144
navigations across 66 sessions), always redundantly — cleanup_browser()
already runs via _cleanup_task_resources() at conversation end, and the
background inactivity reaper catches anything else.

Removing it saves one tool schema slot in every browser-enabled API call.

Also fixes a latent bug: cleanup_browser() now handles Camofox sessions
too (previously only Browserbase). Camofox sessions were never auto-cleaned
per-task because they live in a separate dict from _active_sessions.

Files changed (13):
- tools/browser_tool.py: remove function, schema, registry entry; add
  camofox cleanup to cleanup_browser()
- toolsets.py, model_tools.py, prompt_builder.py, display.py,
  acp_adapter/tools.py: remove browser_close from all tool lists
- tests/: remove browser_close test, update toolset assertion
- docs/skills: remove all browser_close references

* fix: repeat browser_scroll 5x per call for meaningful page movement

Most backends scroll ~100px per call — barely visible on a typical
viewport. Repeating 5x gives ~500px (~half a viewport), making each
scroll tool call actually useful.

Backend-agnostic approach: works across all 7+ browser backends without
needing to configure each one's scroll amount individually. Breaks
early on error for the agent-browser path.

* feat: auto-return compact snapshot from browser_navigate

Every browser session starts with navigate → snapshot. Now navigate
returns the compact accessibility tree snapshot inline, saving one
tool call per browser task.

The snapshot captures the full page DOM (not viewport-limited), so
scroll position doesn't affect it. browser_snapshot remains available
for refreshing after interactions or getting full=true content.

Both Browserbase and Camofox paths auto-snapshot. If the snapshot
fails for any reason, navigation still succeeds — the snapshot is
a bonus, not a requirement.

Schema descriptions updated to guide models: navigate mentions it
returns a snapshot, snapshot mentions it's for refresh/full content.

* refactor: slim cronjob tool schema — consolidate model/provider, drop unused params

Session data (151 calls across 67 sessions) showed several schema
properties were never used by models. Consolidated and cleaned up:

Removed from schema (still work via backend/CLI):
- skill (singular): use skills array instead
- reason: pause-only, unnecessary
- include_disabled: now defaults to true
- base_url: extreme edge case, zero usage
- provider (standalone): merged into model object

Consolidated:
- model + provider → single 'model' object with {model, provider} fields.
  If provider is omitted, the current main provider is pinned at creation
  time so the job stays stable even if the user changes their default.

Kept:
- script: useful data collection feature
- skills array: standard interface for skill loading

Schema shrinks from 14 to 10 properties. All backend functionality
preserved — the Python function signature and handler lambda still
accept every parameter.

* fix: remove mixture_of_agents from core toolsets — opt-in only via hermes tools

MoA was in _HERMES_CORE_TOOLS and composite toolsets (hermes-cli,
hermes-messaging, safe), which meant it appeared in every session
for anyone with OPENROUTER_API_KEY set. The _DEFAULT_OFF_TOOLSETS
gate only works after running 'hermes tools' explicitly.

Now MoA only appears when a user explicitly enables it via
'hermes tools'. The moa toolset definition and check_fn remain
unchanged — it just needs to be opted into.
2026-04-07 03:28:44 -07:00
Teknium
e120d2afac
feat: notify_on_complete for background processes (#5779)
* feat: notify_on_complete for background processes

When terminal(background=true, notify_on_complete=true), the system
auto-triggers a new agent turn when the process exits — no polling needed.

Changes:
- ProcessSession: add notify_on_complete field
- ProcessRegistry: add completion_queue, populate on _move_to_finished()
- Terminal tool: add notify_on_complete parameter to schema + handler
- CLI: drain completion_queue after agent turn AND during idle loop
- Gateway: enhanced _run_process_watcher injects synthetic MessageEvent
  on completion, triggering a full agent turn
- Checkpoint persistence includes notify_on_complete for crash recovery
- code_execution_tool: block notify_on_complete in sandbox scripts
- 15 new tests covering queue mechanics, checkpoint round-trip, schema

* docs: update terminal tool descriptions for notify_on_complete

- background: remove 'ONLY for servers' language, describe both patterns
  (long-lived processes AND long-running tasks with notify_on_complete)
- notify_on_complete: more prescriptive about when to use it
- TERMINAL_TOOL_DESCRIPTION: remove 'Do NOT use background for builds'
  guidance that contradicted the new feature
2026-04-07 02:40:16 -07:00
Teknium
d9e7e42d0b
fix(approval): load permanent command allowlist on startup (#5076)
Co-authored-by: Timo Karp <timo@timos-macbook-pro.taildbbd26.ts.net>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-07 01:00:02 -07:00