Commit Graph

768 Commits

Author SHA1 Message Date
Teknium
9c416e20ab
feat(skills): install skills from a direct HTTP(S) URL (#16323)
* feat(skills): install skills from a direct HTTP(S) URL

Adds UrlSource adapter so `hermes skills install <url-to-SKILL.md>` and
`/skills install <url>` work as first-class operations — no more
improvising with curl + patch + cp.

- Claims identifiers that start with http(s):// and end in .md
- Skips /.well-known/skills/ URLs (WellKnownSkillSource handles those)
- Skill name from YAML frontmatter, URL-slug fallback
- Single-file SKILL.md only (v1 scope — multi-file skills need a manifest)
- Trust level 'community'; full security scan still runs
- Lock file stores the URL as identifier so `hermes skills update`
  re-fetches from the same URL cleanly

Scope matches real user need from @versun's docx feedback where
`https://sharethis.chat/SKILL.md` had no first-class install path.

* feat(skills): interactive name/category for URL installs + --name override

Follow-up to the UrlSource adapter. The previous commit fell back to weak
heuristics when frontmatter had no ``name:`` and could produce garbage names
like ``SKILL`` or ``unnamed-skill``. Now:

tools/skills_hub.py
- ``UrlSource._is_valid_skill_name()`` — strict identifier check
  (``^[a-z][a-z0-9_-]*$``), rejects sentinel values (``SKILL``, ``README``,
  ``INDEX``, ``unnamed-skill``, empty, non-strings).
- ``_resolve_skill_name()`` returns ``Optional[str]`` — ``None`` when
  nothing valid is resolvable. Also ignores unsafe frontmatter names
  (``../evil``) and falls through to URL slug instead of returning None
  immediately, so a URL with a bad frontmatter but a good path still
  works.
- ``fetch()``/``inspect()`` carry an ``awaiting_name=True`` marker in
  metadata/extra when resolution fails, letting ``do_install`` decide
  whether to prompt, apply an override, or error out.

hermes_cli/skills_hub.py
- ``do_install`` gains a ``name_override`` parameter.
- On URL-sourced bundles with ``awaiting_name=True``:
  1. If ``name_override`` is valid → use it.
  2. If ``name_override`` is invalid → refuse with a clear error.
  3. Else if ``skip_confirm=True`` (non-interactive: slash / TUI /
     gateway / scripts) → refuse with an actionable retry hint pointing
     at ``--name <your-name>`` on both CLI and slash forms.
  4. Else (interactive TTY) → prompt for the name.
- Interactive TTY also prompts for a category when none is given for a
  URL-sourced install, hinting existing category buckets so users can
  reuse ``productivity``, ``devops``, etc. Empty input → flat install.
- ``_existing_categories()`` scans ``~/.hermes/skills/`` for subdirs that
  look like category buckets (contain nested SKILL.md files); skips
  top-level skills and hidden dirs.
- ``_prompt_for_skill_name()`` / ``_prompt_for_category()`` helpers
  (EOF/Ctrl-C-safe, match the existing ``Confirm [y/N]`` prompt style).

hermes_cli/main.py
- ``hermes skills install`` argparse gains ``--name <name>``.

hermes_cli/skills_hub.py (slash)
- ``/skills install <url> --name <x>`` parsing added.

Tests
- tests/tools/test_skills_hub.py: updated ``UrlSource`` tests to assert
  the new ``awaiting_name`` metadata; added 4 new tests for
  ``_is_valid_skill_name`` rejection sets and the awaiting-name marker.
- tests/hermes_cli/test_skills_hub.py: 8 new tests covering --name
  override accept/reject, non-interactive error, interactive name prompt,
  interactive category prompt, cancel-aborts-install, and
  ``_existing_categories`` scan behavior (buckets vs flat skills).
- E2E verified all four paths (no-name/no-override → error;
  --name override → install; frontmatter name → install;
  invalid --name → rejection).

---------

Co-authored-by: teknium1 <teknium@noreply.github.com>
2026-04-26 20:57:10 -07:00
Brooklyn Nicholson
3e1664923d Revert "fix(tui): report actual session on exit"
This reverts commit 1566f1eecc.
2026-04-26 22:43:34 -05:00
Brooklyn Nicholson
c23463fce9 chore(tui): keep MRU resume split out of perf PR
- remove the temporary -c MRU logic and companion test from this branch so PR #15926 stays focused on TUI perf work
- keep the resume-ordering change isolated in the dedicated follow-up PR
2026-04-26 22:40:35 -05:00
Brooklyn Nicholson
625c31fcea fix(tui): run built TUI with production React by default
CPU profiling showed the built TUI loading React development modules unless NODE_ENV was set. Default CLI and dashboard TUI children to production while preserving explicit user overrides.
2026-04-26 21:34:31 -05:00
Brooklyn Nicholson
7da2f07641 Merge remote-tracking branch 'origin/main' into bb/tui-long-session-perf 2026-04-26 21:07:15 -05:00
Yoimex
f66ebe64e8 fix(cli): coerce use_gateway config flags in tool routing 2026-04-26 19:02:55 -07:00
Teknium
5eb6cd82b2
fix(sessions): /save lands under $HERMES_HOME, widen browse+TUI picker, force-refresh ollama-cloud on setup (#16296)
Four independent session-UX bugs reported by an external user (#16294).

/save wrote hermes_conversation_<ts>.json to CWD — invisible to
'hermes sessions browse' and easy to lose. Snapshots now write under
~/.hermes/sessions/saved/ and the command prints the absolute path plus
a 'hermes --resume <id>' hint for the live DB-indexed session.

'hermes sessions browse' default --limit raised from 50 to 500. With the
old ceiling, users with moderately long histories saw only the most
recent 50 rows and assumed older sessions had been lost.

TUI session.list (`/resume` picker) switched from a hardcoded allow-list
of 13 gateway source names to a deny-list of just { 'tool' }. Sessions
tagged acp / webhook / user-defined HERMES_SESSION_SOURCE values and
any newly-added platform now surface. Default limit 20 → 200.

ollama-cloud provider setup passes force_refresh=True to
fetch_ollama_cloud_models() so a user entering their API key sees the
fresh catalog (e.g. deepseek v4 flash, kimi k2.6) immediately instead
of waiting up to an hour for the disk cache TTL to expire.

Closes #16294.
2026-04-26 18:49:48 -07:00
George Glessner
5b5a53a155 fix(cli): check hermes_cli/web_dist/ not web/dist/ for build staleness
_web_ui_build_needed() in PR #14914 checked web_dir/"dist" as the
sentinel, but vite.config.ts sets outDir: "../hermes_cli/web_dist" so
the build output lands in hermes_cli/web_dist/, never in web/dist/.
The sentinel was therefore always missing → _web_ui_build_needed always
returned True → npm install + Vite build ran on every startup → OOM on
low-memory VPS persisted unchanged.

Fix: derive dist_dir as web_dir.parent / "hermes_cli" / "web_dist" so
the sentinel points to the actual build output directory.

Fixes #14898
2026-04-26 18:43:57 -07:00
Teknium
cd2aee36ca test(sessions): wire sessions_dir through auto-prune + file-cleanup regression tests
- TestAutoMaintenance gains 3 tests: auto-prune deletes transcript files
  when sessions_dir is passed, preserves them when it isn't (backward-
  compat), and never touches active-session files during prune.
- FakeDB helpers in test_sessions_delete.py accept **kwargs so they
  don't break when delete_session signature gains sessions_dir.
2026-04-26 18:31:07 -07:00
Brooklyn Nicholson
cb7cfba6de fix(cli): surface last_active in search_sessions so -c works 2026-04-26 16:21:57 -05:00
Brooklyn Nicholson
bde89c169b fix(cli): -c picks the most recently used session 2026-04-26 16:17:39 -05:00
Brooklyn Nicholson
1566f1eecc fix(tui): report actual session on exit 2026-04-26 15:55:01 -05:00
Teknium
4921b26945
fix(cron): keep homeassistant toolset enabled when HASS_TOKEN is set (#16208)
After #14798 made cron honor per-platform `hermes tools` config, the
`_DEFAULT_OFF_TOOLSETS` filter silently stripped `homeassistant` from
cron jobs for users who'd been relying on the previous blanket toolset.
Norbert's HA cron reports regressed as a result.

The HA toolset is already runtime-gated by its `check_fn` (requires
HASS_TOKEN to register any tools). When HASS_TOKEN is set the user has
explicitly opted in — `_DEFAULT_OFF_TOOLSETS` adds nothing in that case,
so stop double-gating and restore HA for cron / cli / other platforms
without an explicit saved toolset list.

moa and rl stay off by default (original #14798 goal preserved).

Fixes HA cron regression reported by Norbert.
2026-04-26 12:55:58 -07:00
Teknium
087e74d4d7
feat(slack): register every gateway command as a native slash (Discord/Telegram parity) (#16164)
Every command in COMMAND_REGISTRY (/btw, /stop, /model, /help, /new,
/bg, /reset, ...) is now a first-class Slack slash command instead of
a /hermes <subcommand>. Users get the same autocomplete-driven slash
picker experience Slack users expect and that Discord and Telegram
already provide.

Previously Slack registered ONE native slash (/hermes) and split on
the first word, so typing /btw in Slack's composer got 'couldn't find
an app for /btw' because the workspace manifest never declared it.

Changes
- hermes_cli/commands.py: slack_native_slashes() + slack_app_manifest()
  generate a Slack manifest from the registry (canonical names +
  aliases + plugin commands), clamped to Slack's 50-slash cap with
  /hermes reserved as the catch-all.
- gateway/platforms/slack.py: single regex matcher dispatches every
  registered slash to _handle_slash_command, which dispatches on
  command['command']. Legacy /hermes <subcommand> keeps working for
  backward compat with older workspace manifests.
- hermes_cli/slack_cli.py + hermes_cli/main.py: new 'hermes slack
  manifest' command prints/writes a full manifest (display info,
  OAuth scopes, event subs, socket mode, slash commands) ready to
  paste into 'Create from manifest' or Features → App Manifest.
- hermes_cli/setup.py: _setup_slack() now writes the manifest up-front
  and points users at the 'From an app manifest' flow; also offers
  to refresh the manifest on reconfigure for picking up new commands.
- Tests: 14 new tests covering native-slash dispatch (/btw, /stop,
  /model), legacy /hermes <sub> compat, manifest structure, and
  telegram<->slack parity (every Telegram command must also register
  as a Slack slash). Existing /hermes-registration test updated to
  assert the new regex matches /hermes, /btw, /stop, /model, /help.
- Docs: slack.md gains a 'Slash Commands' section + Option A manifest
  flow in Step 1; cli-commands.md documents 'hermes slack manifest'.

Users pick up the new slashes by running 'hermes slack manifest --write'
and pasting into Features → App Manifest → Edit in their Slack app
config, then Save (Slack prompts for reinstall if scopes changed).
2026-04-26 11:38:32 -07:00
Teknium
0e2a53eab2
feat(skills): show enabled/disabled status in 'skills list' (#16129)
'hermes skills list' now shows every skill's enabled/disabled status
and accepts --enabled-only to filter down to what will actually load
for the active profile:

    hermes -p dario skills list --enabled-only

Previously the command was a flat catalog — it did not apply
skills.disabled from config.yaml, so there was no way to see the
live skill set for a profile without reading config by hand.
Profile switching already works via -p (swaps HERMES_HOME); this
just surfaces the result visibly.

Changes:
- hermes_cli/skills_hub.py: do_list adds a Status column and an
  enabled_only filter; summary reports enabled/disabled split
- hermes_cli/main.py: --enabled-only flag on 'skills list'
- /skills list slash command accepts --enabled-only too
- tests: 4 new (status column, disabled marking, enabled-only
  hiding, no platform leakage into get_disabled_skill_names);
  existing fixtures updated to accept skip_disabled kwarg

Reported by @mochizukimr on X.
2026-04-26 09:20:53 -07:00
Teknium
06f81752ed
Revert "feat(kanban): durable multi-profile collaboration board (#16081)" (#16098)
This reverts commit 15937a6b46.
2026-04-26 08:29:37 -07:00
Teknium
15937a6b46
feat(kanban): durable multi-profile collaboration board (#16081)
New `hermes kanban` CLI subcommand + `/kanban` slash command + skills for
worker and orchestrator profiles. SQLite-backed task board
(~/.hermes/kanban.db) shared across all profiles on the host. Zero
changes to run_agent.py, no new core tools, no tool-schema bloat.

Motivation: delegate_task is a function call — sync fork/join, anonymous
subagent, no resumability, no human-in-the-loop. Kanban is the durable
shape needed for research triage, scheduled ops, digital twins,
engineering pipelines, and fleet work. They coexist (workers may call
delegate_task internally).

What this adds
- hermes_cli/kanban_db.py — schema, CAS claim, dependency resolution,
  dispatcher, workspace resolution, worker-context builder.
- hermes_cli/kanban.py — 15-verb CLI surface and shared run_slash()
  entry point used by both CLI and gateway.
- skills/devops/kanban-worker — how a profile should work a claimed task.
- skills/devops/kanban-orchestrator — "you are a dispatcher, not a
  worker" template with anti-temptation rules.
- /kanban slash command wired into cli.py and gateway/run.py. Bypasses
  the running-agent guard (board writes don't touch agent state), so
  /kanban unblock can free a stuck worker mid-conversation.
- Design spec at docs/hermes-kanban-v1-spec.pdf — comparative analysis
  vs Cline Kanban, Paperclip, NanoClaw, Gemini Enterprise; 8 patterns;
  4 user stories; implementation plan; concurrency correctness.
- Docs: website/docs/user-guide/features/kanban.md, CLI reference
  updated, sidebar entry added.

Architecture highlights
- Three planes: control (user + gateway), state (board + dispatcher),
  execution (pool of profile processes).
- Every worker is a full OS process, spawned as `hermes -p <profile>`.
  No in-process subagent swarms — solves NanoClaw's SDK-lifecycle
  failure class.
- Atomic claim via SQLite CAS in a BEGIN IMMEDIATE transaction; stale
  claims reclaimed 15 min after their TTL expires.
- Tenant namespacing via one nullable column — one specialist fleet
  can serve many businesses with data isolation by workspace path.

Tests: 60 targeted tests (schema, CAS atomicity, dependency resolution,
dispatcher, workspace kinds, tenancy, CLI + slash surface). All pass
hermetic via scripts/run_tests.sh.
2026-04-26 08:24:26 -07:00
Teknium
1e37ddc929
feat(cli): add 'hermes fallback' command to manage fallback providers (#16052)
Manage the fallback_providers chain from the CLI instead of hand-editing
config.yaml. The picker reuses select_provider_and_model() from 'hermes
model' — same provider list, same credential prompts, same model picker.

  hermes fallback [list]   Show the current chain (primary + fallbacks)
  hermes fallback add      Run the model picker, append selection to chain
  hermes fallback remove   Pick an entry to delete (arrow-key menu)
  hermes fallback clear    Remove all entries (with confirmation)

'add' snapshots config['model'] before calling the picker, extracts the
user's selection from the post-picker state, then restores the primary
and appends {provider, model, base_url?, api_mode?} to fallback_providers.
Auth store's active_provider is snapshot/restored too so OAuth-provider
fallbacks don't silently deactivate the user's primary. Duplicates and
self-as-fallback are rejected. Legacy single-dict 'fallback_model' entries
are auto-migrated to the list format on first write.
2026-04-26 06:19:04 -07:00
Teknium
855366909f
feat(models): remote model catalog manifest for OpenRouter + Nous Portal (#16033)
OpenRouter and Nous Portal curated picker lists now resolve via a JSON
manifest served by the docs site, falling back to the in-repo snapshot
when unreachable. Lets us update model lists without shipping a release.

Live URL: https://hermes-agent.nousresearch.com/docs/api/model-catalog.json
(source at website/static/api/model-catalog.json; auto-deploys via the
existing deploy-site.yml GitHub Pages pipeline on every merge to main).

Schema (v1) carries id + optional description + free-form metadata at
manifest, provider, and model levels. Pricing and context length stay
live-fetched via existing machinery (/v1/models endpoints, models.dev).

Config (new model_catalog section, default enabled):
  model_catalog.url       master manifest URL
  model_catalog.ttl_hours disk cache TTL (default 24h)
  model_catalog.providers.<name>.url   optional per-provider override

Fetch pipeline: in-process cache -> disk cache (fresh < TTL) -> HTTP
fetch -> disk-cache-on-failure fallback -> in-repo snapshot as last
resort. Never raises to callers; at worst returns the bundled list.

Changes:
- website/static/api/model-catalog.json    initial manifest (35 OR + 31 Nous)
- scripts/build_model_catalog.py           regenerator from in-repo lists
- hermes_cli/model_catalog.py              fetch + validate + cache module
- hermes_cli/models.py                     fetch_openrouter_models() +
                                           new get_curated_nous_model_ids()
- hermes_cli/main.py, hermes_cli/auth.py   Nous flows use the helper
- hermes_cli/config.py                     model_catalog defaults
- website/docs/reference/model-catalog.md  + sidebars.ts
- tests/hermes_cli/test_model_catalog.py   21 tests (validation, fetch
                                           success/failure, accessors,
                                           disabled, overrides, integration)
2026-04-26 05:46:43 -07:00
Teknium
438db0c7b0
fix(cli): /model picker honors provider-specific context caps (#16030)
`_apply_model_switch_result` (the interactive `/model` picker's
confirmation path) printed `ModelInfo.context_window` straight from
models.dev, which reports the vendor-wide value (1.05M for gpt-5.5 on
openai). ChatGPT Codex OAuth caps the same slug at 272K, so the picker
showed 1M while the runtime (compressor, gateway `/model`, typed
`/model <name>`) correctly used 272K — the classic 'sometimes 1M,
sometimes 272K' mismatch on a single model.

Both display paths now go through `resolve_display_context_length()`,
matching the fix that `_handle_model_switch` received earlier.

Also bump the stale last-resort fallback in DEFAULT_CONTEXT_LENGTHS
(`gpt-5.5: 400000 -> 1050000`) to match the real OpenAI API value; the
272K Codex cap is already enforced via the Codex-OAuth branch, so the
fallback now reflects what every non-Codex probe-miss should see.

Tests: adds `test_apply_model_switch_result_context.py` with three
scenarios (Codex cap wins, OpenRouter shows 1.05M, resolver-empty falls
back to ModelInfo). Updates the existing non-Codex fallback test to
assert 1.05M (the correct value).

## Validation
| path                          | before    | after     |
|-------------------------------|-----------|-----------|
| picker -> gpt-5.5 on Codex    | 1,050,000 | 272,000   |
| picker -> gpt-5.5 on OpenAI   | 1,050,000 | 1,050,000 |
| picker -> gpt-5.5 on OpenRouter | 1,050,000 | 1,050,000 |
| typed /model gpt-5.5 on Codex | 272,000   | 272,000   |
2026-04-26 05:43:31 -07:00
Teknium
a55de5bcd0
feat(setup): auto-reconfigure on existing installs (#15879)
Bare `hermes setup` on a returning user now drops straight into the
full reconfigure wizard — every prompt shows the current value as its
default, press Enter to keep or type a new value to change it. The
returning-user menu is gone.

Behavior:
- First-time user: first-time wizard (unchanged)
- Returning user, bare command: full reconfigure wizard (new default)
- Returning user, `--quick`: only prompt for missing/unset items
- Returning user, one section: `hermes setup model|terminal|gateway|tools|agent`
- `--reconfigure`: preserved as backwards-compat alias (no-op since it's now default)

The section functions already used current values as prompt defaults —
this change just removes the extra click to get to them.

The 'Quick Setup - configure missing items only' menu option is now
exposed as the explicit `--quick` flag; it's the narrow case of
filling in missing config (e.g. after a partial OpenClaw migration or
when a required API key got cleared).

Inspired by Mercury Agent's `mercury doctor` UX.

Also removes:
- RETURNING_USER_MENU_SECTION_KEYS (orphaned constant)
- Two returning-user menu tests in test_setup_noninteractive.py
  (guarding behavior that no longer exists — covered by
  test_setup_reconfigure.py instead)
2026-04-25 22:02:02 -07:00
Teknium
731e1ef8cb feat(azure-foundry): auto-detect transport, models, context length
The azure-foundry wizard now probes the endpoint before asking the user
to pick anything by hand:

  1. URL path sniff — endpoints ending in /anthropic are Azure Foundry
     Claude routes and skip to anthropic_messages.
  2. GET <base>/models probe — if the endpoint returns an OpenAI-shaped
     model list, we switch to chat_completions and prefill the picker
     with the returned deployment/model IDs.
  3. Anthropic Messages probe — fallback for endpoints that don't expose
     /models but do speak the Anthropic Messages shape.
  4. Manual fallback — private endpoints / custom routes still work;
     the user picks API mode + types a deployment name.

Context length for the selected model is resolved through the existing
agent.model_metadata.get_model_context_length chain (models.dev,
provider metadata, hardcoded family fallbacks) and stored in
model.context_length when a non-default value is found.

Also refactors runtime_provider so Azure Foundry resolution is reused
between the explicit-credentials path and the default top-level path —
previously the /v1 strip for Anthropic-style Azure only ran when the
caller passed explicit_* args, which meant config-driven sessions
hit a double-/v1 URL.

New module hermes_cli/azure_detect.py with 19 unit tests covering:
- path sniff, model ID extraction, probe fallbacks
- HTTP error handling (URLError, HTTPError)
- context-length lookup passthrough
- DEFAULT_FALLBACK_CONTEXT rejection

New runtime tests cover:
- OpenAI-style Azure Foundry
- Anthropic-style Azure Foundry with /v1 stripping
- Missing base_url / API key raising AuthError

Rationale: Microsoft confirms there's no pure-API-key endpoint to list
Azure deployments (that requires ARM management auth).  The v1 Azure
OpenAI endpoint does expose /models with the resource's available
model catalog, which is good enough for picker prefill in the common
case.  Users on private/gated endpoints fall through to manual entry.
2026-04-25 18:48:43 -07:00
Teknium
125de02056
fix(context): honor custom_providers context_length on /model switch + bump probe tier to 256K (#15844)
Fixes #15779. Custom-provider per-model context_length (`custom_providers[].models.<id>.context_length`) is now honored across every resolution path, not just agent startup. Also adds 256K as the top probe tier and default fallback.

## What changed

New helper `hermes_cli.config.get_custom_provider_context_length()` — single source of truth for the per-model override lookup, with trailing-slash-insensitive base-url matching.

`agent.model_metadata.get_model_context_length()` gains an optional `custom_providers=` kwarg (step 0b — runs after explicit `config_context_length` but before every other probe).

Wired through five call sites that previously either duplicated the lookup or ignored it entirely:
- `run_agent.py` startup — refactored to use the new helper (dedups legacy inline loop, keeps invalid-value warning)
- `AIAgent.switch_model()` — re-reads custom_providers from live config on every /model switch
- `hermes_cli.model_switch.resolve_display_context_length()` — new `custom_providers=` kwarg
- `gateway/run.py` /model confirmation (picker callback + text path)
- `gateway/run.py` `_format_session_info` (/info)

## Context probe tiers

`CONTEXT_PROBE_TIERS = [256_000, 128_000, 64_000, 32_000, 16_000, 8_000]` — was `[128_000, ...]`. `DEFAULT_FALLBACK_CONTEXT` follows tier[0], so unknown models now default to 256K. The stale `128000` literal in the OpenRouter metadata-miss path is replaced with `DEFAULT_FALLBACK_CONTEXT` for consistency.

## Repro (from #15779)

```yaml
custom_providers:
  - name: my-custom-endpoint
    base_url: https://example.invalid/v1
    model: gpt-5.5
    models:
      gpt-5.5:
        context_length: 1050000
```

`/model gpt-5.5 --provider custom:my-custom-endpoint` → previously "Context: 128,000", now "Context: 1,050,000".

## Tests

- `tests/hermes_cli/test_custom_provider_context_length.py` — new file, 19 tests covering the helper, step-0b integration, and the 256K tier invariants
- `tests/hermes_cli/test_model_switch_context_display.py` — added regression tests for #15779 through the display resolver
- `tests/gateway/test_session_info.py` — updated default-fallback assertion (128K → 256K)
- `tests/agent/test_model_metadata.py` — updated tier assertions for the new top tier
2026-04-25 18:47:53 -07:00
voidborne-d
45e1228a8a fix(cli): suppress OSError EIO on interrupt shutdown
When the user interrupts a long-running task, prompt_toolkit tries to
flush stdout during emergency shutdown.  If stdout is in a broken state
(redirected to /dev/null, pipe closed, terminal gone), the flush raises
`OSError: [Errno 5] Input/output error` which propagates unhandled and
crashes the CLI.

Two defense layers:

1. `_suppress_closed_loop_errors`: add `OSError` with `errno.EIO` to
   the asyncio exception handler, matching the existing pattern for
   `RuntimeError("Event loop is closed")` and `KeyError("is not
   registered")`.

2. Outer `except (KeyError, OSError)` block: add `errno.EIO` check
   before the existing string-match guards, silently suppressing the
   error instead of printing a misleading stdin-related message.

Fixes #13710.
2026-04-25 18:25:13 -07:00
Teknium
8bbeaea6c7 fix(config): broaden api-key ref lookup to templated base_url
The raw-template lookup added in PR #15817 went through
`get_compatible_custom_providers(read_raw_config())`, which calls
`_normalize_custom_provider_entry` → `urlparse(base_url)`. Any
entry whose `base_url` is itself an env-ref (`${NEURALWATT_API_BASE}`)
was dropped as 'not a valid URL', so `api_key_ref` stayed empty and the
resolved secret was still written to `model.api_key` — the exact case
the original Discord report described.

Replace the normalizer-gated lookup with a direct read of
`raw['custom_providers']` and `raw['providers']`, indexed by name
(case-insensitive, optionally qualified by model) so the loaded
(expanded) entry can be matched regardless of how `base_url` is
written.

Add an integration regression test driving the real
`select_provider_and_model` entry point with the Discord-reported
NeuralWatt config (`${VAR}` in both `base_url` and `api_key`).
This test fails on the PR-only fix and passes with the broadened
lookup.
2026-04-25 18:10:52 -07:00
helix4u
1fdc31b214 fix(config): preserve custom provider api key refs 2026-04-25 18:10:52 -07:00
helix4u
b2d3308f98 fix(doctor): accept bare custom provider 2026-04-25 18:01:36 -07:00
Brooklyn Nicholson
fdcbd2257b fix(tui): resolve startup model aliases statically
- expand short model aliases like sonnet/opus via static catalogs during startup runtime resolution
- keep startup alias resolution network-free and add regression tests in models and tui gateway suites
2026-04-25 14:13:02 -05:00
Brooklyn Nicholson
e9c47c7042 fix(tui): honor launch model overrides 2026-04-25 13:21:59 -05:00
brooklyn!
ee0728c6c4
Merge pull request #15351 from helix4u/fix/tui-rebuild-missing-ink-bundle
fix(tui): rebuild when ink bundle is missing
2026-04-25 13:14:23 -05:00
Teknium
af22421e87
feat(dashboard): page-scoped plugin slots for built-in pages (#15658)
* fix(terminal): three-layer defense against watch_patterns notification spam

Background processes that stack notify_on_complete=True with watch_patterns
can flood the user with duplicate, delayed notifications — matches deliver
asynchronously via the completion queue and continue arriving minutes after
the process has exited. The docstring warning against this (PR #12113) has
proven insufficient; agents still misuse the combination.

Three layered defenses, each sufficient on its own:

1. Mutual exclusion (terminal_tool.py): When both flags are set on a
   background process, drop watch_patterns with a warning. notify_on_complete
   wins because 'let me know when it's done' is the more useful signal and
   fires exactly once. Extracted as _resolve_notification_flag_conflict() so
   the rule is testable in isolation.

2. Suppress-after-exit (process_registry.py): _check_watch_patterns() now
   bails the moment session.exited is True. Post-exit chunks (buffered reads
   draining after the process is gone) no longer produce notifications. This
   is the fix flagged as future work in session 20260418_020302_79881c.

3. Global circuit breaker (process_registry.py): Per-session rate limits don't
   catch the sibling-flood case — N concurrent processes can each stay under
   8/10s and still collectively spam. New WATCH_GLOBAL_MAX_PER_WINDOW=15 cap
   trips a 30-second cooldown across ALL sessions, emits a single
   watch_overflow_tripped event, silently counts dropped events, and emits a
   watch_overflow_released summary when the cooldown ends.

Also updates the tool schema + docstring to document the new behavior.

Tests: 8 new tests covering all three fixes (suppress-after-exit x2,
mutual-exclusion resolver x4, global breaker trip/cooldown/release x2).
All 60 tests across test_watch_patterns.py, test_notify_on_complete.py,
test_terminal_tool.py pass.

Real-world trigger: self-inflicted in session 20260425_051924 — three
concurrent hermes-sweeper review subprocesses each set watch_patterns=
['failed validation', 'errored'] AND notify_on_complete=True, then iterated
over multiple items, producing enough matches per process to defeat the
per-session cap while staying under the global cap that didn't yet exist.

* fix(terminal): aggressive 1-per-15s watch_patterns rate limit + strike-3 promotion

Per Teknium's direction, the watch_patterns rate limit is now much more
aggressive and self-healing.

## New rule — per session

- HARD cap: 1 watch-match notification per 15 seconds per process.
- Any match arriving inside the cooldown window is dropped and counts as
  ONE strike for that window (many drops in the same window still = 1 strike).
- After 3 consecutive strike windows, watch_patterns is permanently disabled
  for the session and the session is auto-promoted to notify_on_complete
  semantics — exactly one notification when the process actually exits.
- A cooldown window that expires with zero drops resets the consecutive
  strike counter — healthy cadence is forgiven.

## Schema + docstring rewritten

The tool schema description now gives the model explicit guidance:
- notify_on_complete is 'the right choice for almost every long-running task'
- watch_patterns is for RARE one-shot signals on LONG-LIVED processes
- Do NOT use watch_patterns with loops/batch jobs — error patterns fire every
  iteration and will hit the strike limit fast
- Mutual exclusion is stated on both parameter descriptions
- 1/15s cooldown and 3-strike promotion are stated in the watch_patterns
  description so the model sees the contract every turn

## Removed

- WATCH_MAX_PER_WINDOW (8/10s) and WATCH_OVERLOAD_KILL_SECONDS (45) — the
  new 1/15s limit subsumes both; keeping them would double-count.
- _watch_window_hits / _watch_window_start / _watch_overload_since fields
  on ProcessSession. Replaced by _watch_last_emit_at / _watch_cooldown_until
  / _watch_strike_candidate / _watch_consecutive_strikes.

## Kept

- Global circuit breaker across all sessions (15/10s → 30s cooldown) as a
  secondary safety net for concurrent siblings. Still valuable when 20
  short-lived processes each fire once — none individually violates the
  per-session limit.
- Suppress-after-exit guard.
- Mutual exclusion resolver at the tool entry point.

## Tests

- 6 new tests in TestPerSessionRateLimit covering: first match delivers,
  second in cooldown suppressed, multi-drop = single strike, 3 strikes
  disables + promotes, clean window resets counter, suppressed count
  carried to next emit.
- Global circuit breaker tests rewritten to use fresh sessions instead of
  hacking removed per-window fields.
- 50/50 watch_patterns + notify_on_complete tests pass.
- 60/60 including test_terminal_tool.py pass.

* feat(dashboard): page-scoped plugin slots for built-in pages

Dashboard plugins can now inject components into specific built-in
pages (Sessions, Analytics, Logs, Cron, Skills, Config, Env, Docs,
Chat) without overriding the whole route.

Previously, plugins could only:
  1. Add new tabs (tab.path)
  2. Replace whole built-in pages (tab.override)
  3. Inject into global shell slots (header-*, footer-*, pre-main, ...)

None of those let a plugin add a banner, card, or widget to an
existing page. The new <page>:top / <page>:bottom slots close that
gap, reusing the existing registerSlot() API.

Changes
- web/src/plugins/slots.ts: 18 new KNOWN_SLOT_NAMES entries
  (sessions:top, sessions:bottom, analytics:top, ..., chat:bottom),
  grouped under "Shell-wide" vs "Page-scoped" in the docblock
- web/src/pages/*: each built-in page now renders
    <PluginSlot name="<page>:top" />
  as the first child of its outer wrapper and
    <PluginSlot name="<page>:bottom" />
  as the last child -- zero visual cost when no plugin registers
- plugins/example-dashboard: registers a demo banner into
  sessions:top via registerSlot(), with matching slots entry in
  the manifest -- so freshly-setup users can see what page-scoped
  slots look like without writing any plugin code
- website/docs: new "Page-scoped slots" table in the plugin
  authoring guide, with a worked example
- tests/hermes_cli/test_web_server.py: round-trip test for
  colon-bearing slot names (sessions:top, analytics:bottom, ...)

Validation
- npm run build: clean (tsc -b + vite build, 2761 modules)
- scripts/run_tests.sh tests/hermes_cli/test_web_server.py::TestDashboardPluginManifestExtensions: 5/5 pass
2026-04-25 06:55:35 -07:00
Teknium
ac05daa189
fix(tools): dedupe bundled plugin toolsets with built-in entries (#15634)
`hermes tools` → "reconfigure existing" listed Spotify twice because
the Apr 24 refactor that moved Spotify into plugins/spotify/ (PR #15174)
left the entry in CONFIGURABLE_TOOLSETS. _get_effective_configurable_toolsets()
unconditionally appended get_plugin_toolsets() on top, so the same
'spotify' key showed up from both sources.

Dedupe by key — built-in CONFIGURABLE_TOOLSETS entry wins (it has the
nicer label and description). Also guards against future bundled plugins
that share a toolset key with a built-in.
2026-04-25 05:53:08 -07:00
Teknium
6ed37e0f42 feat(tools): make discord/discord_admin opt-in, Discord-only
Both discord (read/participate) and discord_admin (server admin) are now
configurable via `hermes tools` with default-OFF. Previously the core
discord tool (fetch_messages, search_members, create_thread) auto-loaded
on every Discord install with DISCORD_BOT_TOKEN set — 19 tools the user
never opted into.

Adds a platform-scoping mechanism (_TOOLSET_PLATFORM_RESTRICTIONS) so
the discord toolsets only show up in the Discord platform's checklist,
not on CLI/Telegram/Slack/etc. Applied at four gates:
  - _prompt_toolset_checklist: checklist filter
  - _get_platform_tools: resolution filter (both branches)
  - _save_platform_tools: save-time filter (covers 'Configure all
    platforms' and hand-edited config.yaml)
  - tools_disable_enable_command: rejects `hermes tools enable discord`
    on non-Discord platforms with a clear error

build_session_context_prompt now injects the Discord IDs block only
when both conditions hold: the discord/discord_admin toolset is
enabled AND DISCORD_BOT_TOKEN is set. Toolset alone isn't enough —
the tool's check_fn gates on the token at registry time, so opting
in without a token yields no tools and the IDs block would lie.
Otherwise keep the stale-API disclaimer.
2026-04-25 04:51:11 -07:00
alt-glitch
db09477b77 feat(feishu): wire feishu doc/drive tools into hermes-feishu composite
The feishu_doc and feishu_drive tools were registered in the tool
registry but never added to the hermes-feishu composite toolset.
The pipeline fix from the prior commit now recovers them automatically
once they are in the composite.
2026-04-25 04:50:14 -07:00
alt-glitch
81987f0350 feat(discord): split discord_server into discord + discord_admin tools
Split the monolithic discord_server tool (14 actions) into two:

- discord: core actions (fetch_messages, search_members, create_thread)
  that are useful for the agent's normal operation. Auto-enabled on
  the discord platform via the pipeline fix.

- discord_admin: server management actions (list channels/roles, pins,
  role assignment) that require explicit opt-in via hermes tools.
  Added to CONFIGURABLE_TOOLSETS and _DEFAULT_OFF_TOOLSETS.
2026-04-25 04:50:14 -07:00
alt-glitch
9830905dab fix(tools): recover non-configurable toolsets from composite resolution
The reverse-mapping loop in _get_platform_tools only checked
CONFIGURABLE_TOOLSETS, silently dropping platform-specific toolsets
like discord and feishu_doc whose tools were in the composite but
had no configurable key. Add a second pass over TOOLSETS that picks
up unclaimed toolsets whose tools are present in the resolved
composite.
2026-04-25 04:50:14 -07:00
alt-glitch
9d7b64b5dd fix(tools): normalize numeric entries and clear stale no_mcp in _save_platform_tools
YAML parses bare numeric toolset names (e.g. 12306:) as int, causing
TypeError in sorted() since the read path normalizes to str but the
save path did not.

The no_mcp sentinel was preserved in existing entries even when the
user re-enabled MCP servers, causing MCP to stay silently disabled.
2026-04-25 04:49:02 -07:00
Teknium
05d8f11085
fix(/model): show provider-enforced context length, not raw models.dev (#15438)
/model gpt-5.5 on openai-codex showed 'Context: 1,050,000 tokens' because
the display block used ModelInfo.context_window directly from models.dev.
Codex OAuth actually enforces 272K for the same slug, and the agent's
compressor already runs at 272K via get_model_context_length() — so the
banner + real context budget said 272K while /model lied with 1M.

Route the display context through a new resolve_display_context_length()
helper that always prefers agent.model_metadata.get_model_context_length
(which knows about Codex OAuth, Copilot, Nous caps) and only falls back
to models.dev when that returns nothing.

Fix applied to all 3 /model display sites:
  cli.py _handle_model_switch
  gateway/run.py picker on_model_selected callback
  gateway/run.py text-fallback confirmation

Reported by @emilstridell (Telegram, April 2026).
2026-04-24 17:21:38 -07:00
helix4u
0738b80833 fix(tui): rebuild when ink bundle is missing 2026-04-24 15:51:38 -06:00
Austin Pickett
63975aa75b fix: mobile chat in new layout 2026-04-24 12:07:46 -04:00
emozilla
f49afd3122 feat(web): add /api/pty WebSocket bridge to embed TUI in dashboard
Exposes hermes --tui over a PTY-backed WebSocket so the dashboard can
embed the real TUI rather than reimplement its surface. The browser
attaches xterm.js to the socket; keystrokes flow in, PTY output bytes
flow out.

Architecture:

    browser <Terminal> (xterm.js)
           │  onData ───► ws.send(keystrokes)
           │  onResize ► ws.send('\x1b[RESIZE:cols;rows]')
           │  write   ◄── ws.onmessage (PTY bytes)
           ▼
    FastAPI /api/pty (token-gated, loopback-only)
           ▼
    PtyBridge (ptyprocess) ── spawns node ui-tui/dist/entry.js ──► tui_gateway + AIAgent

Components
----------

hermes_cli/pty_bridge.py
  Thin wrapper around ptyprocess.PtyProcess: byte-safe read/write on the
  master fd via os.read/os.write (not PtyProcessUnicode — ANSI is
  inherently byte-oriented and UTF-8 boundaries may land mid-read),
  non-blocking select-based reads, TIOCSWINSZ resize, idempotent
  SIGHUP→SIGTERM→SIGKILL teardown, platform guard (POSIX-only; Windows
  is WSL-supported only).

hermes_cli/web_server.py
  @app.websocket("/api/pty") endpoint gated by the existing
  _SESSION_TOKEN (via ?token= query param since browsers can't set
  Authorization on WS upgrades). Loopback-only enforcement. Reader task
  uses run_in_executor to pump PTY bytes without blocking the event
  loop. Writer loop intercepts a custom \x1b[RESIZE:cols;rows] escape
  before forwarding to the PTY. The endpoint resolves the TUI argv
  through a _resolve_chat_argv hook so tests can inject fake commands
  without building the real TUI.

Tests
-----

tests/hermes_cli/test_pty_bridge.py — 12 unit tests: spawn, stdout,
stdin round-trip, EOF, resize (via TIOCSWINSZ + tput readback), close
idempotency, cwd, env forwarding, unavailable-platform error.

tests/hermes_cli/test_web_server.py — TestPtyWebSocket adds 7 tests:
missing/bad token rejection (close code 4401), stdout streaming,
stdin round-trip, resize escape forwarding, unavailable-platform ANSI
error frame + 1011 close, resume parameter forwarding to argv.

96 tests pass under scripts/run_tests.sh.

(cherry picked from commit 29b337bca70fc9efb082a5a852ea2cd5381af1a9)

feat(web): add Chat tab with xterm.js terminal + Sessions resume button

(cherry picked from commit 3d21aee8 by emozilla, conflicts resolved
 against current main: BUILTIN_ROUTES table + plugin slot layout)

fix(tui): replace OSC 52 jargon in /copy confirmation

When the user ran /copy successfully, Ink confirmed with:

  sent OSC52 copy sequence (terminal support required)

That reads like a protocol spec to everyone who isn't a terminal
implementer. The caveat was a historical artifact — OSC 52 wasn't
universally supported when this message was written, so the TUI
honestly couldn't guarantee the copy had landed anywhere.

Today every modern terminal (including the dashboard's embedded
xterm.js) handles OSC 52 reliably. Say what the user actually wants
to know — that it copied, and how much — matching the message the
TUI already uses for selection copy:

  copied 1482 chars

(cherry picked from commit a0701b1d5a598dd1d3b94038a7bcbb2a3ab559fc)

docs: document the dashboard Chat tab

AGENTS.md — new subsection under TUI Architecture explaining that the
dashboard embeds the real hermes --tui rather than rewriting it,
with pointers to the pty_bridge + WebSocket endpoint and the rule
'never add a parallel chat surface in React.'

website/docs/user-guide/features/web-dashboard.md — user-facing Chat
section inside the existing Web Dashboard page, covering how it works
(WebSocket + PTY + xterm.js), the Sessions-page resume flow, and
prerequisites (Node.js, ptyprocess, POSIX kernel / WSL on Windows).

(cherry picked from commit 2c2e32cc4519973c77b63016316b065c0f656704)

feat(tui-gateway): transport-aware dispatch + WebSocket sidecar

Decouples the JSON-RPC dispatcher from its I/O sink so the same handler
surface can drive multiple transports concurrently. The PTY chat tab
already speaks to the TUI binary as bytes — this adds a structured
event channel alongside it for dashboard-side React widgets that need
typed events (tool.start/complete, model picker state, slash catalog)
that PTY can't surface.

- `tui_gateway/transport.py` — `Transport` protocol + `contextvars` binding
  + module-level `StdioTransport` fallback. The stdio stream resolves
  through a lambda so existing tests that monkey-patch `_real_stdout`
  keep passing without modification.
- `tui_gateway/ws.py` — WebSocket transport implementation; FastAPI
  endpoint mounting lives in hermes_cli/web_server.py.
- `tui_gateway/server.py`:
  - `write_json` routes via session transport (for async events) →
    contextvar transport (for in-request writes) → stdio fallback.
  - `dispatch(req, transport=None)` binds the transport for the request
    lifetime and propagates it to pool workers via `contextvars.copy_context`
    so async handlers don't lose their sink.
  - `_init_session` and the manual-session create path stash the
    request's transport so out-of-band events (subagent.complete, etc.)
    fan out to the right peer.

`tui_gateway.entry` (Ink's stdio handshake) is unchanged externally —
it falls through every precedence step into the stdio fallback, byte-
identical to the previous behaviour.

feat(web): ChatSidebar — JSON-RPC sidecar next to xterm.js terminal

Composes the two transports into a single Chat tab:

  ┌─────────────────────────────────────────┬──────────────┐
  │  xterm.js / PTY  (emozilla #13379)      │ ChatSidebar  │
  │  the literal hermes --tui process       │  /api/ws     │
  └─────────────────────────────────────────┴──────────────┘
        terminal bytes                          structured events

The terminal pane stays the canonical chat surface — full TUI fidelity,
slash commands, model picker, mouse, skin engine, wide chars all paint
inside the terminal. The sidebar opens a parallel JSON-RPC WebSocket
to the same gateway and renders metadata that PTY can't surface to
React chrome:

  • model + provider badge with connection state (click → switch)
  • running tool-call list (driven by tool.start / tool.progress /
    tool.complete events)
  • model picker dialog (gateway-driven, reuses ModelPickerDialog)

The sidecar is best-effort. If the WS can't connect (older gateway,
network hiccup, missing token) the terminal pane keeps working
unimpaired — sidebar just shows the connection-state badge in the
appropriate tone.

- `web/src/components/ChatSidebar.tsx` — new component (~270 lines).
  Owns its GatewayClient, drives the model picker through
  `slash.exec`, fans tool events into a capped tool list.
- `web/src/pages/ChatPage.tsx` — split layout: terminal pane
  (`flex-1`) + sidebar (`w-80`, `lg+` only).
- `hermes_cli/web_server.py` — mount `/api/ws` (token + loopback
  guards mirror /api/pty), delegate to `tui_gateway.ws.handle_ws`.

Co-authored-by: emozilla <emozilla@nousresearch.com>

refactor(web): /clean pass on ChatSidebar + ChatPage lint debt

- ChatSidebar: lift gw out of useRef into a useMemo derived from a
  reconnect counter. React 19's react-hooks/refs and react-hooks/
  set-state-in-effect rules both fire when you touch a ref during
  render or call setState from inside a useEffect body. The
  counter-derived gw is the canonical pattern for "external resource
  that needs to be replaceable on user action" — re-creating the
  client comes from bumping `version`, the effect just wires + tears
  down. Drops the imperative `gwRef.current = …` reassign in
  reconnect, drops the truthy ref guard in JSX. modelLabel +
  banner inlined as derived locals (one-off useMemo was overkill).
- ChatPage: lazy-init the banner state from the missing-token check
  so the effect body doesn't have to setState on first run. Drops
  the unused react-hooks/exhaustive-deps eslint-disable. Adds a
  scoped no-control-regex disable on the SGR mouse parser regex
  (the \\x1b is intentional for xterm escape sequences).

All my-touched files now lint clean. Remaining warnings on web/
belong to pre-existing files this PR doesn't touch.

Verified: vitest 249/249, ui-tui eslint clean, web tsc clean,
python imports clean.

chore: uptick

fix(web): drop ChatSidebar tool list — events can't cross PTY/WS boundary

The /api/pty endpoint spawns `hermes --tui` as a child process with its
own tui_gateway and _sessions dict; /api/ws runs handle_ws in-process in
the dashboard server with a separate _sessions dict. Tool events fire on
the child's gateway and never reach the WS sidecar, so the sidebar's
tool.start/progress/complete listeners always observed an empty list.

Drop the misleading list (and the now-orphaned ToolCall primitive),
keep model badge + connection state + model picker + error banner —
those work because they're sidecar-local concerns. Surfacing tool calls
in the sidebar requires cross-process forwarding (PTY child opens a
back-WS to the dashboard, gateway tees emits onto stdio + sidecar
transport) — proper feature for a follow-up.

feat(web): wire ChatSidebar tool list to PTY child via /api/pub broadcast

The dashboard's /api/pty spawns hermes --tui as a child process; tool
events fire in the python tui_gateway grandchild and never crossed the
process boundary into the in-process WS sidecar — so the sidebar tool
list was always empty.

Cross-process forwarding:

- tui_gateway: TeeTransport (transport.py) + WsPublisherTransport
  (event_publisher.py, sync websockets client). entry.py installs the
  tee on _stdio_transport when HERMES_TUI_SIDECAR_URL is set, mirroring
  every dispatcher emit to a back-WS without disturbing Ink's stdio
  handshake.

- hermes_cli/web_server.py: new /api/pub (publisher) + /api/events
  (subscriber) endpoints with a per-channel registry. /api/pty now
  accepts ?channel= and propagates the sidecar URL via env. start_server
  also stashes app.state.bound_port so the URL is constructable.

- web/src/pages/ChatPage.tsx: generates a channel UUID per mount,
  passes it to /api/pty and as a prop to ChatSidebar.

- web/src/components/ChatSidebar.tsx: opens /api/events?channel=, fans
  tool.start/progress/complete back into the ToolCall list. Restores
  the ToolCall primitive.

Tests: 4 new TestPtyWebSocket cases cover channel propagation,
broadcast fan-out, and missing-channel rejection (10 PTY tests pass,
120 web_server tests overall).

fix(web): address Copilot review on #14890

Five threads, all real:

- gatewayClient.ts: register `message`/`close` listeners BEFORE awaiting
  the open handshake.  Server emits `gateway.ready` immediately after
  accept, so a listener attached after the open promise could race past
  the initial skin payload and lose it.

- ChatSidebar.tsx: wire `error`/`close` on the /api/events subscriber
  WS into the existing error banner.  4401/4403 (auth/loopback reject)
  surface as a "reload the page" message; mid-stream drops surface as
  "events feed disconnected" with the existing reconnect button.  Clean
  unmount closes (1000/1001) stay silent.

- web-dashboard.md: install hint was `pip install hermes-agent[web]` but
  ptyprocess lives in the `pty` extra, not `web`.  Switch to
  `hermes-agent[web,pty]` in both prerequisite blocks.

- AGENTS.md: previous "never add a parallel React chat surface" guidance
  was overbroad and contradicted this PR's sidebar.  Tightened to forbid
  re-implementing the transcript/composer/PTY terminal while explicitly
  allowing structured supporting widgets (sidebar / model picker /
  inspectors), matching the actual architecture.

- web/package-lock.json: regenerated cleanly so the wterm sibling
  workspace paths (extraneous machine-local entries) stop polluting CI.

Tests: 249/249 vitest, 10/10 PTY/events, web tsc clean.

refactor(web): /clean pass on ChatSidebar events handler

Spotted in the round-2 review:

- Banner flashed on clean unmount: `ws.close()` from the effect cleanup
  fires `close` with code 1005, opened=true, neither 1000 nor 1001 —
  hit the "unexpected drop" branch.  Track `unmounting` in the effect
  scope and gate the banner through a `surface()` helper so cleanup
  closes stay silent.

- DRY the duplicated "events feed disconnected" string into a local
  const used by both the error and close handlers.

- Drop the `opened` flag (no longer needed once the unmount guard is
  the source of truth for "is this an expected close?").
2026-04-24 10:51:49 -04:00
5park1e
e1106772d9 fix: re-auth on stale OAuth token; read Claude Code credentials from macOS Keychain
Bug 3 — Stale OAuth token not detected in 'hermes model':
- _model_flow_anthropic used 'has_creds = bool(existing_key)' which treats
  any non-empty token (including expired OAuth tokens) as valid.
- Added existing_is_stale_oauth check: if the only credential is an OAuth
  token (sk-ant- prefix) with no valid cc_creds fallback, mark it stale
  and force the re-auth menu instead of silently accepting a broken token.

Bug 4 — macOS Keychain credentials never read:
- Claude Code >=2.1.114 migrated from ~/.claude/.credentials.json to the
  macOS Keychain under service 'Claude Code-credentials'.
- Added _read_claude_code_credentials_from_keychain() using the 'security'
  CLI tool; read_claude_code_credentials() now tries Keychain first then
  falls back to JSON file.
- Non-Darwin platforms return None from Keychain read immediately.

Tests:
- tests/agent/test_anthropic_keychain.py: 11 cases covering Darwin-only
  guard, security command failures, JSON parsing, fallback priority.
- tests/hermes_cli/test_anthropic_model_flow_stale_oauth.py: 8 cases
  covering stale OAuth detection, API key passthrough, cc_creds fallback.

Refs: #12905
2026-04-24 07:14:00 -07:00
Teknium
8d12fb1e6b
refactor(spotify): convert to built-in bundled plugin under plugins/spotify (#15174)
Moves the Spotify integration from tools/ into plugins/spotify/,
matching the existing pattern established by plugins/image_gen/ for
third-party service integrations.

Why:
- tools/ should be reserved for foundational capabilities (terminal,
  read_file, web_search, etc.). tools/providers/ was a one-off
  directory created solely for spotify_client.py.
- plugins/ is already the home for image_gen backends, memory
  providers, context engines, and standalone hook-based plugins.
  Spotify is a third-party service integration and belongs alongside
  those, not in tools/.
- Future service integrations (eventually: Deezer, Apple Music, etc.)
  now have a pattern to copy.

Changes:
- tools/spotify_tool.py → plugins/spotify/tools.py (handlers + schemas)
- tools/providers/spotify_client.py → plugins/spotify/client.py
- tools/providers/ removed (was only used for Spotify)
- New plugins/spotify/__init__.py with register(ctx) calling
  ctx.register_tool() × 7. The handler/check_fn wiring is unchanged.
- New plugins/spotify/plugin.yaml (kind: backend, bundled, auto-load).
- tests/tools/test_spotify_client.py: import paths updated.

tools_config fix — _DEFAULT_OFF_TOOLSETS now wins over plugin auto-enable:
- _get_platform_tools() previously auto-enabled unknown plugin
  toolsets for new platforms. That was fine for image_gen (which has
  no toolset of its own) but bad for Spotify, which explicitly
  requires opt-in (don't ship 7 tool schemas to users who don't use
  it). Added a check: if a plugin toolset is in _DEFAULT_OFF_TOOLSETS,
  it stays off until the user picks it in 'hermes tools'.

Pre-existing test bug fix:
- tests/hermes_cli/test_plugins.py::test_list_returns_sorted
  asserted names were sorted, but list_plugins() sorts by key
  (path-derived, e.g. image_gen/openai). With only image_gen plugins
  bundled, name and key order happened to agree. Adding plugins/spotify
  broke that coincidence (spotify sorts between openai-codex and xai
  by name but after xai by key). Updated test to assert key order,
  which is what the code actually documents.

Validation:
- scripts/run_tests.sh tests/hermes_cli/test_plugins.py \
    tests/hermes_cli/test_tools_config.py \
    tests/hermes_cli/test_spotify_auth.py \
    tests/tools/test_spotify_client.py \
    tests/tools/test_registry.py
  → 143 passed
- E2E plugin load: 'spotify' appears in loaded plugins, all 7 tools
  register into the spotify toolset, check_fn gating intact.
2026-04-24 07:06:11 -07:00
XieNBi
4a51ab61eb fix(cli): non-zero /model counts for native OpenAI and direct API rows 2026-04-24 05:48:15 -07:00
wangshengyang2004
647900e813 fix(cli): support model validation for anthropic_messages and cloudflare-protected endpoints
- probe_api_models: add api_mode param; use x-api-key + anthropic-version
  headers for anthropic_messages mode (Anthropic's native Models API auth)
- probe_api_models: add User-Agent header to avoid Cloudflare 403 blocks
  on third-party OpenAI-compatible endpoints
- validate_requested_model: pass api_mode through from switch_model
- validate_requested_model: for anthropic_messages mode, attempt probe with
  correct auth; if probe fails (many proxies don't implement /v1/models),
  accept the model with an informational warning instead of rejecting
- fetch_api_models: propagate api_mode to probe_api_models
2026-04-24 05:48:15 -07:00
Teknium
05394f2f28
feat(spotify): interactive setup wizard + docs page (#15130)
Previously 'hermes auth spotify' crashed with 'HERMES_SPOTIFY_CLIENT_ID
is required' if the user hadn't manually created a Spotify developer
app and set env vars. Now the command detects a missing client_id and
walks the user through the one-time app registration inline:

- Opens https://developer.spotify.com/dashboard in the browser
- Tells the user exactly what to paste into the Spotify form
  (including the correct default redirect URI, 127.0.0.1:43827)
- Prompts for the Client ID
- Persists HERMES_SPOTIFY_CLIENT_ID to ~/.hermes/.env so subsequent
  runs skip the wizard
- Continues straight into the PKCE OAuth flow

Also prints the docs URL at both the start of the wizard and the end
of a successful login so users can find the full guide.

Adds website/docs/user-guide/features/spotify.md with the complete
setup walkthrough, tool reference, and troubleshooting, and wires it
into the sidebar under User Guide > Features > Advanced.

Fixes a stale redirect URI default in the hermes_cli/tools_config.py
TOOL_CATEGORIES entry (was 8888/callback from the PR description
instead of the actual DEFAULT_SPOTIFY_REDIRECT_URI value
43827/spotify/callback defined in auth.py).
2026-04-24 05:30:05 -07:00
0xbyt4
4ac731c841 fix(model-normalize): pass DeepSeek V-series IDs through instead of folding to deepseek-chat
`_normalize_for_deepseek` was mapping every non-reasoner input into
`deepseek-chat` on the assumption that DeepSeek's API accepts only two
model IDs. That assumption no longer holds — `deepseek-v4-pro` and
`deepseek-v4-flash` are first-class IDs accepted by the direct API,
and on aggregators `deepseek-chat` routes explicitly to V3 (DeepInfra
backend returns `deepseek-chat-v3`). So a user picking V4 Pro through
the model picker was being silently downgraded to V3.

Verified 2026-04-24 against Nous portal's OpenAI-compat surface:
  - `deepseek/deepseek-v4-flash` → provider: DeepSeek,
    model: deepseek-v4-flash-20260423
  - `deepseek/deepseek-chat`     → provider: DeepInfra,
    model: deepseek/deepseek-chat-v3

Fix:
- Add `deepseek-v4-pro` and `deepseek-v4-flash` to
  `_DEEPSEEK_CANONICAL_MODELS` so exact matches pass through.
- Add `_DEEPSEEK_V_SERIES_RE` (`^deepseek-v\d+(...)?$`) so future
  V-series IDs (`deepseek-v5-*`, dated variants) keep passing through
  without another code change.
- Update docstring + module header to reflect the new rule.

Tests:
- New `TestDeepseekVSeriesPassThrough` — 8 parametrized cases covering
  bare, vendor-prefixed, case-variant, dated, and future V-series IDs
  plus end-to-end `normalize_model_for_provider(..., "deepseek")`.
- New `TestDeepseekCanonicalAndReasonerMapping` — regression coverage
  for canonical pass-through, reasoner-keyword folding, and
  fall-back-to-chat behaviour.
- 77/77 pass.

Reported on Discord (Ufonik, Don Piedro): `/model > Deepseek >
deepseek-v4-pro` surfaced
`Normalized 'deepseek-v4-pro' to 'deepseek-chat'`. Picker listing
showed the v4 names, so validation also rejected the post-normalize
`deepseek-chat` as "not in provider listing" — the contradiction
users saw. Normalizer now respects the picker's choice.
2026-04-24 05:24:54 -07:00
Dilee
7e9dd9ca45 Add native Spotify tools with PKCE auth 2026-04-24 05:20:38 -07:00
Michael Steuer
cd221080ec fix: validate nous auth status against runtime credentials 2026-04-24 05:20:05 -07:00
jakubkrcmar
1af44a13c0 fix(model_picker): detect mapped-provider auth-store credentials 2026-04-24 05:20:05 -07:00
Andy
fff7ee31ae fix: clarify auth retry guidance 2026-04-24 05:20:05 -07:00
NiuNiu Xia
76329196c1 fix(copilot): wire live /models max_prompt_tokens into context-window resolver
The Copilot provider resolved context windows via models.dev static data,
which does not include account-specific models (e.g. claude-opus-4.6-1m
with 1M context). This adds the live Copilot /models API as a higher-
priority source for copilot/copilot-acp/github-copilot providers.

New helper get_copilot_model_context() in hermes_cli/models.py extracts
capabilities.limits.max_prompt_tokens from the cached catalog. Results
are cached in-process for 1 hour.

In agent/model_metadata.py, step 5a queries the live API before falling
through to models.dev (step 5b). This ensures account-specific models
get correct context windows while standard models still have a fallback.

Part 1 of #7731.
Refs: #7272
2026-04-24 05:09:08 -07:00
NiuNiu Xia
d7ad07d6fe fix(copilot): exchange raw GitHub token for Copilot API JWT
Raw GitHub tokens (gho_/github_pat_/ghu_) are now exchanged for
short-lived Copilot API tokens via /copilot_internal/v2/token before
being used as Bearer credentials. This is required to access
internal-only models (e.g. claude-opus-4.6-1m with 1M context).

Implementation:
- exchange_copilot_token(): calls the token exchange endpoint with
  in-process caching (dict keyed by SHA-256 fingerprint), refreshed
  2 minutes before expiry. No disk persistence — gateway is long-running
  so in-memory cache is sufficient.
- get_copilot_api_token(): convenience wrapper with graceful fallback —
  returns exchanged token on success, raw token on failure.
- Both callers (hermes_cli/auth.py and agent/credential_pool.py) now
  pipe the raw token through get_copilot_api_token() before use.

12 new tests covering exchange, caching, expiry, error handling,
fingerprinting, and caller integration. All 185 existing copilot/auth
tests pass.

Part 2 of #7731.
2026-04-24 05:09:08 -07:00
Teknium
78450c4bd6
fix(nous-oauth): preserve obtained_at in pool + actionable message on RT reuse (#15111)
Two narrow fixes motivated by #15099.

1. _seed_from_singletons() was dropping obtained_at, agent_key_obtained_at,
   expires_in, and friends when seeding device_code pool entries from the
   providers.nous singleton. Fresh credentials showed up with
   obtained_at=None, which broke downstream freshness-sensitive consumers
   (self-heal hooks, pool pruning by age) — they treated just-minted
   credentials as older than they actually were and evicted them.

2. When the Nous Portal OAuth 2.1 server returns invalid_grant with
   'Refresh token reuse detected' in the error_description, rewrite the
   message to explain the likely cause (an external process consumed the
   rotated RT without persisting it back) and the mitigation. The generic
   reuse message led users to report this as a Hermes persistence bug when
   the actual trigger was typically a third-party monitoring script calling
   /api/oauth/token directly. Non-reuse errors keep their original server
   description untouched.

Closes #15099.

Regression tests:
- tests/agent/test_credential_pool.py::test_nous_seed_from_singletons_preserves_obtained_at_timestamps
- tests/hermes_cli/test_auth_nous_provider.py::test_refresh_token_reuse_detection_surfaces_actionable_message
- tests/hermes_cli/test_auth_nous_provider.py::test_refresh_non_reuse_error_keeps_original_description
2026-04-24 05:08:46 -07:00
Teknium
0e235947b9
fix(redact): honor security.redact_secrets from config.yaml (#15109)
agent/redact.py snapshots _REDACT_ENABLED from HERMES_REDACT_SECRETS at
module-import time. hermes_cli/main.py calls setup_logging() early, which
transitively imports agent.redact — BEFORE any config bridge has run. So
users who set 'security.redact_secrets: false' in config.yaml (instead of
HERMES_REDACT_SECRETS=false in .env) had the toggle silently ignored in
both 'hermes chat' and 'hermes gateway run'.

Bridge config.yaml -> env var in hermes_cli/main.py BEFORE setup_logging.
.env still wins (only set env when unset) — config.yaml is the fallback.

Regression tests in tests/hermes_cli/test_redact_config_bridge.py spawn
fresh subprocesses to verify:
- redact_secrets: false in config.yaml disables redaction
- default (key absent) leaves redaction enabled
- .env HERMES_REDACT_SECRETS=true overrides config.yaml
2026-04-24 05:03:26 -07:00
Teknium
1eb29e6452
fix(opencode): derive api_mode from target model, not stale config default (#15106)
/model kimi-k2.6 on opencode-zen (or glm-5.1 on opencode-go) returned OpenCode's
website 404 HTML page when the user's persisted model.default was a Claude or
MiniMax model. The switched-to chat_completions request hit
https://opencode.ai/zen (or /zen/go) with no /v1 suffix.

Root cause: resolve_runtime_provider() computed api_mode from
model_cfg.get('default') instead of the model being requested. With a Claude
default, it resolved api_mode=anthropic_messages, stripped /v1 from base_url
(required for the Anthropic SDK), then switch_model()'s opencode_model_api_mode
override flipped api_mode back to chat_completions without restoring /v1.

Fix: thread an optional target_model kwarg through resolve_runtime_provider
and _resolve_runtime_from_pool_entry. When the caller is performing an explicit
mid-session model switch (i.e. switch_model()), the target model drives both
api_mode selection and the conditional /v1 strip. Other callers (CLI init,
gateway init, cron, ACP, aux client, delegate, account_usage, tui_gateway) pass
nothing and preserve the existing config-default behavior.

Regression tests added in test_model_switch_opencode_anthropic.py use the REAL
resolver (not a mock) to guard the exact Quentin-repro scenario. Existing tests
that mocked resolve_runtime_provider with 'lambda requested:' had their mock
signatures widened to '**kwargs' to accept the new kwarg.
2026-04-24 04:58:46 -07:00
georgex8001
1dca2e0a28 fix(runtime): resolve bare custom provider to loopback or CUSTOM_BASE_URL
When /model selects Custom but model.provider in YAML still reflects a prior provider, trust model.base_url only for loopback hosts or when provider is custom. Consult CUSTOM_BASE_URL before OpenRouter defaults (#14676).
2026-04-24 04:54:16 -07:00
Matt Maximo
271f0e6eb0 fix(model): let Codex setup reuse or reauthenticate 2026-04-24 04:53:32 -07:00
j3ffffff
f76df30e08 fix(auth): parse OpenAI nested error shape in Codex token refresh
OpenAI's OAuth token endpoint returns errors in a nested shape —
{"error": {"code": "refresh_token_reused", "message": "..."}} —
not the OAuth spec's flat {"error": "...", "error_description": "..."}.
The existing parser only handled the flat shape, so:

- `err.get("error")` returned a dict, the `isinstance(str)` guard
  rejected it, and `code` stayed `"codex_refresh_failed"`.
- The dedicated `refresh_token_reused` branch (with its actionable
  "re-run codex + hermes auth" message and `relogin_required=True`)
  never fired.
- Users saw the generic "Codex token refresh failed with status 401"
  when another Codex client (CLI, VS Code extension) had consumed
  their single-use refresh token — giving no hint that re-auth was
  required.

Parse both shapes, mapping OpenAI's nested `code`/`type` onto the
existing `code` variable so downstream branches (`refresh_token_reused`,
`invalid_grant`, etc.) fire correctly.

Add regression tests covering:
- nested `refresh_token_reused` → actionable message + relogin_required
- nested generic code → code + message surfaced
- flat OAuth-spec `invalid_grant` still handled (back-compat)
- unparseable body → generic fallback message, relogin_required=False

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-24 04:53:32 -07:00
LeonSGP43
ccc8fccf77 fix(cli): validate user-defined providers consistently 2026-04-24 04:48:56 -07:00
Teknium
3aa1a41e88
feat(gemini): block free-tier keys at setup + surface guidance on 429 (#15100)
Google AI Studio's free tier (<= 250 req/day for gemini-2.5-flash) is
exhausted in a handful of agent turns, so the setup wizard now refuses
to wire up Gemini when the supplied key is on the free tier, and the
runtime 429 handler appends actionable billing guidance.

Setup-time probe (hermes_cli/main.py):
- `_model_flow_api_key_provider` fires one minimal generateContent call
  when provider_id == 'gemini' and classifies the response as
  free/paid/unknown via x-ratelimit-limit-requests-per-day header or
  429 body containing 'free_tier'.
- Free  -> print block message, refuse to save the provider, return.
- Paid  -> 'Tier check: paid' and proceed.
- Unknown (network/auth error) -> 'could not verify', proceed anyway.

Runtime 429 handler (agent/gemini_native_adapter.py):
- `gemini_http_error` appends billing guidance when the 429 error body
  mentions 'free_tier', catching users who bypass setup by putting
  GOOGLE_API_KEY directly in .env.

Tests: 21 unit tests for the probe + error path, 4 tests for the
setup-flow block. All 67 existing gemini tests still pass.
2026-04-24 04:46:17 -07:00
Teknium
18f3fc8a6f
fix(tests): resolve 17 persistent CI test failures (#15084)
Make the main-branch test suite pass again. Most failures were tests
still asserting old shapes after recent refactors; two were real source
bugs.

Source fixes:
- tools/mcp_tool.py: _kill_orphaned_mcp_children() slept 2s on every
  shutdown even when no tracked PIDs existed, making test_shutdown_is_parallel
  measure ~3s for 3 parallel 1s shutdowns. Early-return when pids is empty.
- hermes_cli/tips.py: tip 105 was 157 chars; corpus max is 150.

Test fixes (mostly stale mock targets / missing fixture fields):
- test_zombie_process_cleanup, test_agent_cache: patch run_agent.cleanup_vm
  (the local name bound at import), not tools.terminal_tool.cleanup_vm.
- test_browser_camofox: patch tools.browser_camofox.load_config, not
  hermes_cli.config.load_config (the source module, not the resolved one).
- test_flush_memories_codex._chat_response_with_memory_call: add
  finish_reason, tool_call.id, tool_call.type so the chat_completions
  transport normalizer doesn't AttributeError.
- test_concurrent_interrupt: polling_tool signature now accepts
  messages= kwarg that _invoke_tool() passes through.
- test_minimax_provider: add _fallback_chain=[] to the __new__'d agent
  so switch_model() doesn't AttributeError.
- test_skills_config: SKILLS_DIR MagicMock + .rglob stopped working
  after the scanner switched to agent.skill_utils.iter_skill_index_files
  (os.walk-based). Point SKILLS_DIR at a real tmp_path and patch
  agent.skill_utils.get_external_skills_dirs.
- test_browser_cdp_tool: browser_cdp toolset was intentionally split into
  'browser-cdp' (commit 96b0f3700) so its stricter check_fn doesn't gate
  the whole browser toolset; test now expects 'browser-cdp'.
- test_registry: add tools.browser_dialog_tool to the expected
  builtin-discovery set (PR #14540 added it).
- test_file_tools TestPatchHints: patch_tool surfaces hints as a '_hint'
  key on the JSON payload, not inline '[Hint: ...' text.
- test_write_deny test_hermes_env: resolve .env via get_hermes_home() so
  the path matches the profile-aware denylist under hermetic HERMES_HOME.
- test_checkpoint_manager test_falls_back_to_parent: guard the walk-up
  so a stray /tmp/pyproject.toml on the host doesn't pick up /tmp as the
  project root.
- test_quick_commands: set cli.session_id in the __new__'d CLI so the
  alias-args path doesn't trip AttributeError when fuzzy-matching leaks
  a skill command across xdist test distribution.
2026-04-24 03:46:46 -07:00
Nicecsh
fe34741f32 fix(model): repair Discord Copilot /model flow
Keep Discord Copilot model switching responsive and current by refreshing picker data from the live catalog when possible, correcting the curated fallback list, and clearing stale controls before the switch completes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
2026-04-24 03:33:29 -07:00
kshitij
7897f65a94
fix(normalize): lowercase Xiaomi model IDs for case-insensitive config (#15066)
Xiaomi's API (api.xiaomimimo.com) requires lowercase model IDs like
"mimo-v2.5-pro" but rejects mixed-case names like "MiMo-V2.5-Pro"
that users copy from marketing docs or the ProviderEntry description.

Add _LOWERCASE_MODEL_PROVIDERS set and apply .lower() to model names
for providers in this set (currently just xiaomi) after stripping the
provider prefix. This ensures any case variant in config.yaml is
normalized before hitting the API.

Other providers (minimax, zai, etc.) are NOT affected — their APIs
accept mixed case (e.g. MiniMax-M2.7).
2026-04-24 03:33:05 -07:00
Harry Riddle
ac25e6c99a feat(auth-codex): add config-provider fallback detection for logout in hermes-agent/hermes_cli/auth.py 2026-04-24 03:17:18 -07:00
Teknium
b2e124d082
refactor(commands): drop /provider, /plan handler, and clean up slash registry (#15047)
* refactor(commands): drop /provider and clean up slash registry

* refactor(commands): drop /plan special handler — use plain skill dispatch
2026-04-24 03:10:52 -07:00
Keira Voss
1ef1e4c669 feat(plugins): add pre_gateway_dispatch hook
Introduces a new plugin hook `pre_gateway_dispatch` fired once per
incoming MessageEvent in `_handle_message`, after the internal-event
guard but before the auth / pairing chain. Plugins may return a dict
to influence flow:

    {"action": "skip",    "reason": "..."}  -> drop (no reply)
    {"action": "rewrite", "text":   "..."}  -> replace event.text
    {"action": "allow"}  /  None             -> normal dispatch

Motivation: gateway-level message-flow patterns that don't fit cleanly
into any single adapter — e.g. listen-only group-chat windows (buffer
ambient messages, collapse on @mention), or human-handover silent
ingest (record messages while an owner handles the chat manually).
Today these require forking core; with this hook they can live in a
single profile-agnostic plugin.

Hook runs BEFORE auth so plugins can handle unauthorized senders
(e.g. customer-service handover ingest) without triggering the
pairing-code flow. Exceptions in plugin callbacks are caught and
logged; the first non-None action dict wins, remaining results are
ignored.

Includes:
- `VALID_HOOKS` entry + inline doc in `hermes_cli/plugins.py`
- Invocation block in `gateway/run.py::_handle_message`
- 5 new tests in `tests/gateway/test_pre_gateway_dispatch.py`
  (skip, rewrite, allow, exception safety, internal-event bypass)
- 2 additional tests in `tests/hermes_cli/test_plugins.py`
- Table entry in `website/docs/user-guide/features/plugins.md`

Made-with: Cursor
2026-04-24 03:02:03 -07:00
0xbyt4
8aa37a0cf9 fix(auth): honor SSL CA env vars across httpx + requests callsites
- hermes_cli/auth.py: add _default_verify() with macOS Homebrew certifi
  fallback (mirrors weixin 3a0ec1d93). Extend env var chain to include
  REQUESTS_CA_BUNDLE so one env var works across httpx + requests paths.
- agent/model_metadata.py: add _resolve_requests_verify() reading
  HERMES_CA_BUNDLE / REQUESTS_CA_BUNDLE / SSL_CERT_FILE in priority
  order. Apply explicit verify= to all 6 requests.get callsites.
- Tests: 18 new unit tests + autouse platform pin on existing
  TestResolveVerifyFallback to keep its "returns True" assertions
  platform-independent.

Empirically verified against self-signed HTTPS server: requests honors
REQUESTS_CA_BUNDLE only; httpx honors SSL_CERT_FILE only. Hermes now
honors all three everywhere.

Triggered by Discord reports — Nous OAuth SSL failure on macOS
Homebrew Python; custom provider self-signed cert ignored despite
REQUESTS_CA_BUNDLE set in env.
2026-04-24 03:00:33 -07:00
Teknium
6051fba9dc
feat(banner): hyperlink startup banner title to latest GitHub release (#14945)
Wrap the existing version label in the welcome-banner panel title
('Hermes Agent v… · upstream … · local …') with an OSC-8 terminal
hyperlink pointing at the latest git tag's GitHub release page
(https://github.com/NousResearch/hermes-agent/releases/tag/<tag>).

Clickable in modern terminals (iTerm2, WezTerm, Windows Terminal,
GNOME Terminal, Kitty, etc.); degrades to plain text on terminals
without OSC-8 support. No new line added to the banner.

New get_latest_release_tag() helper runs 'git describe --tags
--abbrev=0' in the Hermes checkout (3s timeout, per-process cache,
silent fallback for non-git/pip installs and forks without tags).
2026-04-23 23:28:34 -07:00
0xbyt4
04c489b587 feat(tui): match CLI's voice slash + VAD-continuous recording model
The TUI had drifted from the CLI's voice model in two ways:

- /voice on was lighting up the microphone immediately and Ctrl+B was
  interpreted as a mode toggle.  The CLI separates the two: /voice on
  just flips the umbrella bit, recording only starts once the user
  presses Ctrl+B, which also sets _voice_continuous so the VAD loop
  auto-restarts until the user presses Ctrl+B again or three silent
  cycles pass.
- /voice tts was missing entirely, so users couldn't turn agent reply
  speech on/off from inside the TUI.

This commit brings the TUI to parity.

Python

- hermes_cli/voice.py: continuous-mode API (start_continuous,
  stop_continuous, is_continuous_active) layered on the existing PTT
  wrappers. The silence callback transcribes, fires on_transcript,
  tracks consecutive no-speech cycles, and auto-restarts — mirroring
  cli.py:_voice_stop_and_transcribe + _restart_recording.
- tui_gateway/server.py:
  - voice.toggle now supports on / off / tts / status.  The umbrella
    bit lives in HERMES_VOICE + display.voice_enabled; tts lives in
    HERMES_VOICE_TTS + display.voice_tts.  /voice off also tears down
    any active continuous loop so a toggle-off really releases the
    microphone.
  - voice.record start/stop now drives start_continuous/stop_continuous.
    start is refused with a clear error when the mode is off, matching
    cli.py:handle_voice_record's early return on `not _voice_mode`.
  - New voice.transcript / voice.status events emit through
    _voice_emit (remembers the sid that last enabled the mode so
    events land in the right session).

TypeScript

- gatewayTypes.ts: voice.status + voice.transcript event
  discriminants; VoiceToggleResponse gains tts; VoiceRecordResponse
  gains status for the new "started/stopped" responses.
- interfaces.ts: GatewayEventHandlerContext gains composer.setInput +
  submission.submitRef + voice.{setRecording, setProcessing,
  setVoiceEnabled}; InputHandlerContext.voice gains enabled +
  setVoiceEnabled for the mode-aware Ctrl+B handler.
- createGatewayEventHandler.ts: voice.status drives REC/STT badges;
  voice.transcript auto-submits when the composer is empty (CLI
  _pending_input.put parity) and appends when a draft is in flight.
  no_speech_limit flips voice off + sys line.
- useInputHandlers.ts: Ctrl+B now calls voice.record (start/stop),
  not voice.toggle, and nudges the user with a sys line when the
  mode is off instead of silently flipping it on.
- useMainApp.ts: wires the new event-handler context fields.
- slash/commands/session.ts: /voice handles on / off / tts / status
  with CLI-matching output ("voice: mode on · tts off").

Backward compat preserved for voice.record (was always PTT shape;
gateway still honours start/stop with mode-gating added).
2026-04-23 16:18:15 -07:00
0xbyt4
0bb460b070 fix(tui): add missing hermes_cli.voice wrapper for gateway RPC
tui_gateway/server.py:3486/3491/3509 imports start_recording,
stop_and_transcribe, and speak_text from hermes_cli.voice, but the
module never existed (not in git history — never shipped, never
deleted). Every voice.record / voice.tts RPC call hit the ImportError
branch and the TUI surfaced it as "voice module not available — install
audio dependencies" even on boxes with sounddevice / faster-whisper /
numpy installed.

Adds a thin wrapper on top of tools.voice_mode (recording +
transcription) and tools.tts_tool (text-to-speech):

- start_recording() — idempotent; stores the active AudioRecorder in a
  module-global guarded by a Lock so repeat Ctrl+B presses don't fight
  over the mic.
- stop_and_transcribe() — returns None for no-op / no-speech /
  Whisper-hallucination cases so the TUI's existing "no speech detected"
  path keeps working unchanged.
- speak_text(text) — lazily imports tts_tool (optional provider SDKs
  stay unloaded until the first /voice tts call), parses the tool's
  JSON result, and plays the audio via play_audio_file.

Paired with the Ctrl+B keybinding fix in the prior commit, the TUI
voice pipeline now works end-to-end for the first time.
2026-04-23 16:18:15 -07:00
Teknium
ef5eaf8d87
feat(cron): honor hermes tools config for the cron platform (#14798)
Cron now resolves its toolset from the same per-platform config the
gateway uses — `_get_platform_tools(cfg, 'cron')` — instead of blindly
loading every default toolset.  Existing cron jobs without a per-job
override automatically lose `moa`, `homeassistant`, and `rl` (the
`_DEFAULT_OFF_TOOLSETS` set), which stops the "surprise $4.63
mixture_of_agents run" class of bug (Norbert, Discord).

Precedence inside `run_job`:
  1. per-job `enabled_toolsets` (PR #14767 / #6130) — wins if set
  2. `_get_platform_tools(cfg, 'cron')` — new, the blanket gate
  3. `None` fallback (legacy) — only on resolver exception

Changes:
- hermes_cli/platforms.py: register 'cron' with default_toolset
  'hermes-cron'
- toolsets.py: add 'hermes-cron' toolset (mirrors 'hermes-cli';
  `_get_platform_tools` then filters via `_DEFAULT_OFF_TOOLSETS`)
- cron/scheduler.py: add `_resolve_cron_enabled_toolsets(job, cfg)`,
  call it at the `AIAgent(...)` kwargs site
- tests/cron/test_scheduler.py: replace the 'None when not set' test
  (outdated contract) with an invariant ('moa not in default cron
  toolset') + new per-job-wins precedence test
- tests/hermes_cli/test_tools_config.py: mark 'cron' as non-messaging
  in the gateway-toolset-coverage test
2026-04-23 15:48:50 -07:00
Teknium
f593c367be
feat(dashboard): reskin extension points for themes and plugins (#14776)
Themes and plugins can now pull off arbitrary dashboard reskins (cockpit
HUD, retro terminal, etc.) without touching core code.

Themes gain four new fields:
- layoutVariant: standard | cockpit | tiled — shell layout selector
- assets: {bg, hero, logo, crest, sidebar, header, custom: {...}} —
  artwork URLs exposed as --theme-asset-* CSS vars
- customCSS: raw CSS injected as a scoped <style> tag on theme apply
  (32 KiB cap, cleaned up on theme switch)
- componentStyles: per-component CSS-var overrides (clipPath,
  borderImage, background, boxShadow, ...) for card/header/sidebar/
  backdrop/tab/progress/badge/footer/page

Plugin manifests gain three new fields:
- tab.override: replaces a built-in route instead of adding a tab
- tab.hidden: register component + slots without adding a nav entry
- slots: declares shell slots the plugin populates

10 named shell slots: backdrop, header-left/right/banner, sidebar,
pre-main, post-main, footer-left/right, overlay. Plugins register via
window.__HERMES_PLUGINS__.registerSlot(name, slot, Component). A
<PluginSlot> React helper is exported on the plugin SDK.

Ships a full demo at plugins/strike-freedom-cockpit/ — theme YAML +
slot-only plugin that reproduces a Gundam cockpit dashboard: MS-STATUS
sidebar with live telemetry, COMPASS crest in header, notched card
corners via componentStyles, scanline overlay via customCSS, gold/cyan
palette, Orbitron typography.

Validation:
- 15 new tests in test_web_server.py covering every extended field
- tests/hermes_cli/: 2615 passed (3 pre-existing unrelated failures)
- tsc -b --noEmit: clean
- vite build: 418 kB bundle, ~2 kB delta for slots/theme extensions

Co-authored-by: Teknium <p@nousresearch.com>
2026-04-23 15:31:01 -07:00
helix4u
1cc0bdd5f3 fix(dashboard): avoid auth header collision with reverse proxies 2026-04-23 14:05:23 -07:00
Teknium
97b9b3d6a6
fix(gateway): drain-aware hermes update + faster still-working pings (#14736)
cmd_update no longer SIGKILLs in-flight agent runs, and users get
'still working' status every 3 min instead of 10. Two long-standing
sources of '@user — agent gives up mid-task' reports on Telegram and
other gateways.

Drain-aware update:
- New helper hermes_cli.gateway._graceful_restart_via_sigusr1(pid,
  drain_timeout) sends SIGUSR1 to the gateway and polls os.kill(pid,
  0) until the process exits or the budget expires.
- cmd_update's systemd loop now reads MainPID via 'systemctl show
  --property=MainPID --value' and tries the graceful path first. The
  gateway's existing SIGUSR1 handler -> request_restart(via_service=
  True) -> drain -> exit(75) is wired in gateway/run.py and is
  respawned by systemd's Restart=on-failure (and the explicit
  RestartForceExitStatus=75 on newer units).
- Falls back to 'systemctl restart' when MainPID is unknown, the
  drain budget elapses, or the unit doesn't respawn after exit (older
  units missing Restart=on-failure). Old install behavior preserved.
- Drain budget = max(restart_drain_timeout, 30s) + 15s margin so the
  drain loop in run_agent + final exit have room before fallback
  fires. Composes with #14728's tool-subprocess reaping.

Notification interval:
- agent.gateway_notify_interval default 600 -> 180.
- HERMES_AGENT_NOTIFY_INTERVAL env-var fallback in gateway/run.py
  matched.
- 9-minute weak-model spinning runs now ping at 3 min and 6 min
  instead of 27 seconds before completion, removing the 'is the bot
  dead?' reflex that drives gateway-restart cycles.

Tests:
- Two new tests in tests/hermes_cli/test_update_gateway_restart.py:
  one asserts SIGUSR1 is sent and 'systemctl restart' is NOT called
  when MainPID is known and the helper succeeds; one asserts the
  fallback fires when the helper returns False.
- E2E: spawned detached bash processes confirm the helper returns
  True on SIGUSR1-handling exit (~0.5s) and False on SIGUSR1-ignoring
  processes (timeout). Verified non-existent PID and pid=0 edge cases.
- 41/41 in test_update_gateway_restart.py (was 39, +2 new).
- 154/154 in shutdown-related suites including #14728's new tests.

Reported by @GeoffWellman and @ANT_1515 on X.
2026-04-23 14:01:57 -07:00
Teknium
327b57da91
fix(gateway): kill tool subprocesses before adapter disconnect on drain timeout (#14728)
Closes #8202.

Root cause: stop() reclaimed tool-call bash/sleep children only at the
very end of the shutdown sequence — after a 60s drain, 5s interrupt
grace, and per-adapter disconnect. Under systemd (TimeoutStopSec bounded
by drain_timeout), that meant the cgroup SIGKILL escalation fired first,
and systemd reaped the bash/sleep children instead of us.

Fix:
- Extract tool-subprocess cleanup into a local helper
  _kill_tool_subprocesses() in _stop_impl().
- Invoke it eagerly right after _interrupt_running_agents() on the
  drain-timeout path, before adapter disconnect.
- Keep the existing catch-all call at the end for the graceful path
  and defense in depth against mid-teardown respawns.
- Bump generated systemd unit TimeoutStopSec to drain_timeout + 30s
  so cleanup + disconnect + DB close has headroom above the drain
  budget, matching the 'subprocess timeout > TimeoutStopSec + margin'
  rule from the skill.

Tests:
- New: test_gateway_stop_kills_tool_subprocesses_before_adapter_disconnect_on_timeout
  asserts kill_all() runs before disconnect() when drain times out.
- New: test_gateway_stop_kills_tool_subprocesses_on_graceful_path
  guards that the final catch-all still fires when drain succeeds
  (regression guard against accidental removal during refactor).
- Updated: existing systemd unit generator tests expect TimeoutStopSec=90
  (= 60s drain + 30s headroom) with explanatory comment.
2026-04-23 13:59:29 -07:00
Teknium
255ba5bf26
feat(dashboard): expand themes to fonts, layout, density (#14725)
Dashboard themes now control typography and layout, not just colors.
Each built-in theme picks its own fonts, base size, radius, and density
so switching produces visible changes beyond hue.

Schema additions (per theme):

- typography — fontSans, fontMono, fontDisplay, fontUrl, baseSize,
  lineHeight, letterSpacing. fontUrl is injected as <link> on switch
  so Google/Bunny/self-hosted stylesheets all work.
- layout — radius (any CSS length) and density
  (compact | comfortable | spacious, multiplies Tailwind spacing).
- colorOverrides (optional) — pin individual shadcn tokens that would
  otherwise derive from the palette.

Built-in themes are now distinct beyond palette:

- default  — system stack, 15px, 0.5rem radius, comfortable
- midnight — Inter + JetBrains Mono, 14px, 0.75rem, comfortable
- ember    — Spectral (serif) + IBM Plex Mono, 15px, 0.25rem
- mono     — IBM Plex Sans + Mono, 13px, 0 radius, compact
- cyberpunk— Share Tech Mono everywhere, 14px, 0 radius, compact
- rose     — Fraunces (serif) + DM Mono, 16px, 1rem, spacious

Also fixes two bugs:

1. Custom user themes silently fell back to default. ThemeProvider
   only applied BUILTIN_THEMES[name], so YAML files in
   ~/.hermes/dashboard-themes/ showed in the picker but did nothing.
   Server now ships the full normalised definition; client applies it.
2. Docs documented a 21-token flat colors schema that never matched
   the code (applyPalette reads a 3-layer palette). Rewrote the
   Themes section against the actual shape.

Implementation:

- web/src/themes/types.ts: extend DashboardTheme with typography,
  layout, colorOverrides; ThemeListEntry carries optional definition.
- web/src/themes/presets.ts: 6 built-ins with distinct typography+layout.
- web/src/themes/context.tsx: applyTheme() writes palette+typography+
  layout+overrides as CSS vars, injects fontUrl stylesheet, fixes the
  fallback-to-default bug via resolveTheme(name).
- web/src/index.css: html/body/code read the new theme-font vars;
  --radius-sm/md/lg/xl derive from --theme-radius; --spacing scales
  with --theme-spacing-mul so Tailwind utilities shift with density.
- hermes_cli/web_server.py: _normalise_theme_definition() parses loose
  YAML (bare hex strings, partial blocks) into the canonical wire
  shape; /api/dashboard/themes ships full definitions for user themes.
- tests/hermes_cli/test_web_server.py: 16 new tests covering the
  normaliser and discovery (rejection cases, clamping, defaults).
- website/docs/user-guide/features/web-dashboard.md: rewrite Themes
  section with real schema, per-model tables, full YAML example.
2026-04-23 13:49:51 -07:00
kshitij
82a0ed1afb
feat: add Xiaomi MiMo v2.5-pro and v2.5 model support (#14635)
## Merged

Adds MiMo v2.5-pro and v2.5 support to Xiaomi native provider, OpenCode Go, and setup wizard.

### Changes
- Context lengths: added v2.5-pro (1M) and v2.5 (1M), corrected existing MiMo entries to exact values (262144)
- Provider lists: xiaomi, opencode-go, setup wizard
- Vision: upgraded from mimo-v2-omni to mimo-v2.5 (omnimodal)
- Config description updated for XIAOMI_API_KEY
- Tests updated for new vision model preference

### Verification
- 4322 tests passed, 0 new regressions
- Live API tested on Xiaomi portal: basic, reasoning, tool calling, multi-tool, file ops, system prompt, vision — all pass
- Self-review found and fixed 2 issues (redundant vision check, stale HuggingFace context length)
2026-04-23 10:06:25 -07:00
Teknium
d45c738a52
fix(gateway): preflight user D-Bus before systemctl --user start (#14531)
On fresh RHEL/Debian SSH sessions without linger, `systemctl --user
start hermes-gateway` fails with 'Failed to connect to bus: No medium
found' because /run/user/$UID/bus doesn't exist. Setup previously
showed a raw CalledProcessError and continued claiming success, so the
gateway never actually started.

systemd_start() and systemd_restart() now call _preflight_user_systemd()
for the user scope first:
- Bus socket already there → no-op (desktop / linger-enabled servers)
- Linger off → try loginctl enable-linger (works when polkit permits,
  needs sudo otherwise), wait for socket
- Still unreachable → raise UserSystemdUnavailableError with a clean
  remediation message pointing to sudo loginctl + hermes gateway run
  as the foreground fallback

Setup's start/restart handlers and gateway_command() catch the new
exception and render the multi-line guidance instead of a traceback.
2026-04-23 05:09:38 -07:00
David VV
39fcf1d127 fix(model_switch): group custom_providers by endpoint in /model picker (#9210)
Multiple custom_providers entries sharing the same base_url + api_key
are now grouped into a single picker row. A local Ollama host with
per-model display names ("Ollama — GLM 5.1", "Ollama — Qwen3-coder",
"Ollama — Kimi K2", "Ollama — MiniMax M2.7") previously produced four
near-duplicate picker rows that differed only by suffix; now it appears
as one "Ollama" row with four models.

Key changes:
- Grouping key changed from slug-by-name to (base_url, api_key). Names
  frequently differ per model while the endpoint stays the same.
- When the grouped endpoint matches current_base_url, the row's slug is
  set to current_provider so picker-driven switches route through the
  live credential pipeline (no re-resolution needed).
- Per-model suffix is stripped from the display name ("Ollama — X" →
  "Ollama") via em-dash / " - " separators.
- Two groups with different api_keys at the same base_url (or otherwise
  colliding on cleaned name) are disambiguated with a numeric suffix
  (custom:openai, custom:openai-2) so both stay visible.
- current_base_url parameter plumbed through both gateway call sites.

Existing #8216, #11499, #13509 regressions covered (dict/list shapes
of models:, section-3/section-4 dedup, normalized list-format entries).

Salvaged from @davidvv's PR #9210 — the underlying code had diverged
~1400 commits since that PR was opened, so this is a reconstruction of
the same approach on current main rather than a clean cherry-pick.
Authorship preserved via --author on this commit.

Closes #9210
2026-04-23 03:10:30 -07:00
Aslaaen
51c1d2de16 fix(profiles): stage profile imports to prevent directory clobbering 2026-04-23 03:02:34 -07:00
drstrangerujn
a5b0c7e2ec fix(config): preserve list-format models in custom_providers normalize
_normalize_custom_provider_entry silently drops the models field when it's
a list. Hand-edited configs (and the shape used by older Hermes versions)
still write models as a plain list of ids, so after the normalize pass the
entry reaches list_authenticated_providers() with no models and /model
shows the provider with (0) models — even though the underlying picker
code handles lists fine.

Convert list-format models into the empty-value dict shape the rest of
the pipeline already expects. Dict-format entries keep passing through
unchanged.

Repro (before the fix):

    custom_providers:
    - name: acme
      base_url: https://api.example.com/v1
      models: [foo, bar, baz]

/model shows "acme (0)"; bypassing normalize in list_authenticated_providers
returns three models, confirming the drop happens in normalize.

Adds four unit tests covering list→dict conversion, dict pass-through,
filtering of empty/non-string entries, and the empty-list case.
2026-04-23 02:37:07 -07:00
helix4u
bace220d29 fix(image-gen): persist plugin provider on reconfigure 2026-04-23 01:56:09 -07:00
Teknium
a2a8092e90 feat(cli): add --ignore-user-config and --ignore-rules flags
Port from openai/codex#18646.

Adds two flags to 'hermes chat' that fully isolate a run from user-level
configuration and rules:

* --ignore-user-config: skip ~/.hermes/config.yaml and fall back to
  built-in defaults. Credentials in .env are still loaded so the agent
  can actually call a provider.
* --ignore-rules: skip auto-injection of AGENTS.md, SOUL.md,
  .cursorrules, and persistent memory (maps to AIAgent(skip_context_files=True,
  skip_memory=True)).

Primary use cases:
- Reproducible CI runs that should not pick up developer-local config
- Third-party integrations (e.g. Chronicle in Codex) that bring their
  own config and don't want user preferences leaking in
- Bug-report reproduction without the reporter's personal overrides
- Debugging: bisect 'was it my config?' vs 'real bug' in one command

Both flags are registered on the parent parser AND the 'chat' subparser
(with argparse.SUPPRESS on the subparser to avoid overwriting the parent
value when the flag is placed before the subcommand, matching the
existing --yolo/--worktree/--pass-session-id pattern).

Env vars HERMES_IGNORE_USER_CONFIG=1 and HERMES_IGNORE_RULES=1 are set
by cmd_chat BEFORE 'from cli import main' runs, which is critical
because cli.py evaluates CLI_CONFIG = load_cli_config() at module import
time. The cli.py / hermes_cli.config.load_cli_config() function checks
the env var and skips ~/.hermes/config.yaml when set.

Tests: 11 new tests in tests/hermes_cli/test_ignore_user_config_flags.py
covering the env gate, constructor wiring, cmd_chat simulation, and
argparse flag registration. All pass; existing hermes_cli + cli suites
unaffected (3005 pass, 2 pre-existing unrelated failures).
2026-04-22 19:58:42 -07:00
Teknium
9eb543cafe
feat(/model): merge models.dev entries for lesser-loved providers (#14221)
New and newer models from models.dev now surface automatically in
/model (both hermes model CLI and the gateway Telegram/Discord picker)
for a curated set of secondary providers — no Hermes release required
when the registry publishes a new model.

Primary user-visible fix: on OpenCode Go, typing '/model mimo-v2.5-pro'
no longer silently fuzzy-corrects to 'mimo-v2-pro'. The exact match
against the merged models.dev catalog wins.

Scope (opt-in frozenset _MODELS_DEV_PREFERRED in hermes_cli/models.py):
  opencode-go, opencode-zen, deepseek, kilocode, fireworks, mistral,
  togetherai, cohere, perplexity, groq, nvidia, huggingface, zai,
  gemini, google.

Explicitly NOT merged:
  - openrouter and nous (never): curated list is already a hand-picked
    subset / Portal is source of truth.
  - xai, xiaomi, minimax, minimax-cn, kimi-coding, kimi-coding-cn,
    alibaba, qwen-oauth (per-project decision to keep curated-only).
  - providers with dedicated live-endpoint paths (copilot, anthropic,
    ai-gateway, ollama-cloud, custom, stepfun, openai-codex) — those
    paths already handle freshness themselves.

Changes:
  - hermes_cli/models.py: add _MODELS_DEV_PREFERRED + _merge_with_models_dev
    helper. provider_model_ids() branches on the set at its curated-fallback
    return. Merge is models.dev-first, curated-only extras appended,
    case-insensitive dedup, graceful fallback when models.dev is offline.
  - hermes_cli/model_switch.py: list_authenticated_providers() calls the
    same merge in both its code paths (PROVIDER_TO_MODELS_DEV loop +
    HERMES_OVERLAYS loop). Picker AND validation-fallback both see
    fresh entries.
  - tests/hermes_cli/test_models_dev_preferred_merge.py (new): 13 tests —
    merge-helper unit tests (empty/raise/order/dedup), opencode-go/zen
    behavior, openrouter+nous explicitly guarded from merge.
  - tests/hermes_cli/test_opencode_go_in_model_list.py: converted from
    snapshot-style assertion to a behavior-based floor check, so it
    doesn't break when models.dev publishes additional opencode-go
    entries.

Addresses a report from @pfanis via Telegram: newer Xiaomi variants
on OpenCode Go weren't appearing in the /model picker, and /model
was silently routing requests for new variants to older ones.
2026-04-22 17:33:42 -07:00
helix4u
b52123eb15 fix(gateway): recover stale pid and planned restart state 2026-04-22 16:33:46 -07:00
Teknium
51ca575994 feat(gateway): expose plugin slash commands natively on all platforms + decision-capable command hook
Plugin slash commands now surface as first-class commands in every gateway
enumerator — Discord native slash picker, Telegram BotCommand menu, Slack
/hermes subcommand map — without a separate per-platform plugin API.

The existing 'command:<name>' gateway hook gains a decision protocol via
HookRegistry.emit_collect(): handlers that return a dict with
{'decision': 'deny'|'handled'|'rewrite'|'allow'} can intercept slash
command dispatch before core handling runs, unifying what would otherwise
have been a parallel 'pre_gateway_command' hook surface.

Changes:

- gateway/hooks.py: add HookRegistry.emit_collect() that fires the same
  handler set as emit() but collects non-None return values. Backward
  compatible — fire-and-forget telemetry hooks still work via emit().
- hermes_cli/plugins.py: add optional 'args_hint' param to
  register_command() so plugins can opt into argument-aware native UI
  registration (Discord arg picker, future platforms).
- hermes_cli/commands.py: add _iter_plugin_command_entries() helper and
  merge plugin commands into telegram_bot_commands() and
  slack_subcommand_map(). New is_gateway_known_command() recognizes both
  built-in and plugin commands so the gateway hook fires for either.
- gateway/platforms/discord.py: extract _build_auto_slash_command helper
  from the COMMAND_REGISTRY auto-register loop and reuse it for
  plugin-registered commands. Built-in name conflicts are skipped.
- gateway/run.py: before normal slash dispatch, call emit_collect on
  command:<canonical> and honor deny/handled/rewrite/allow decisions.
  Hook now fires for plugin commands too.
- scripts/release.py: AUTHOR_MAP entry for @Magaav.
- Tests: emit_collect semantics, plugin command surfacing per platform,
  decision protocol (deny/handled/rewrite/allow + non-dict tolerance),
  Discord plugin auto-registration + conflict skipping, is_gateway_known_command.

Salvaged from #14131 (@Magaav). Original PR added a parallel
'pre_gateway_command' hook and a platform-keyed plugin command
registry; this re-implementation reuses the existing 'command:<name>'
hook and treats plugin commands as platform-agnostic so the same
capability reaches Telegram and Slack without new API surface.

Co-authored-by: Magaav <73175452+Magaav@users.noreply.github.com>
2026-04-22 16:23:21 -07:00
Brooklyn Nicholson
b641639e42 fix(debug): distinguish empty-log from missing-log in report placeholder
Copilot on #14138 flagged that the share report says '(file not found)'
when the log exists but is empty (either because the primary is empty
and no .1 rotation exists, or in the rare race where the file is
truncated between _resolve_log_path() and stat()).

- Split _primary_log_path() out of _resolve_log_path so both can share
  the LOG_FILES/home math without duplication.
- _capture_log_snapshot now reports '(file empty)' when the primary
  path exists on disk with zero bytes, and keeps '(file not found)'
  for the truly-missing case.

Tests: rename test_returns_none_for_empty → test_empty_primary_reports_file_empty
with the new assertion, plus a race-path test that monkeypatches
_resolve_log_path to exercise the size==0 branch directly.
2026-04-22 15:27:54 -05:00
kshitij
81a504a4a0 fix: align status bar skin tests with upstream main
Drop rebased test assumptions about theme-mode helpers removed on main and keep the status bar skin integration aligned with the current skin engine model.
2026-04-22 13:20:02 -07:00
kshitij
c323217188 fix: make CLI status bar skin-aware
Route prompt_toolkit status bar colors through the skin engine so /skin updates the status bar alongside the rest of the interactive TUI.

Add regression coverage for the new status bar style override keys and CLI style composition.
2026-04-22 13:20:02 -07:00
kshitijk4poor
de849c410d refactor(debug): remove dead _read_log_tail/_read_full_log wrappers
These thin wrappers around _capture_log_snapshot had zero production
callers after the snapshot refactor — run_debug_share uses snapshots
directly and collect_debug_report captures internally.  The wrappers
also caused a performance regression: _read_log_tail read up to 512KB
and built full_text just to return tail_text.

Remove both wrappers and migrate TestReadFullLog → TestCaptureLogSnapshot
to test _capture_log_snapshot directly.  Same coverage, tests the real
API instead of dead indirection.
2026-04-22 11:59:39 -07:00
kshitijk4poor
8dc936f10e chore: add taosiyuan163 to AUTHOR_MAP, add truncation boundary tests
Add missing AUTHOR_MAP entry for taosiyuan163 whose truncation boundary
fix was adapted into _capture_log_snapshot().

Add regression tests proving: line-boundary truncation keeps the full
first line, mid-line truncation correctly drops the partial fragment.
2026-04-22 11:59:39 -07:00
Junass1
61d0a99c11 fix(debug): sweep expired pending pastes on slash debug paths 2026-04-22 11:59:39 -07:00
helix4u
fc3862bdd6 fix(debug): snapshot logs once for debug share 2026-04-22 11:59:39 -07:00
Teknium
3e652f75b2
fix(plugins+nous): auto-coerce memory plugins; actionable Nous 401 diagnostic (#14005)
* fix(plugins): auto-coerce user-installed memory plugins to kind=exclusive

User-installed memory provider plugins at $HERMES_HOME/plugins/<name>/
were being dispatched to the general PluginManager, which has no
register_memory_provider method on PluginContext. Every startup logged:

  Failed to load plugin 'mempalace': 'PluginContext' object has no
  attribute 'register_memory_provider'

Bundled memory providers were already skipped via skip_names={memory,
context_engine} in discover_and_load, but user-installed ones weren't.

Fix: _parse_manifest now scans the plugin's __init__.py source for
'register_memory_provider' or 'MemoryProvider' (same heuristic as
plugins/memory/__init__.py:_is_memory_provider_dir) and auto-coerces
kind to 'exclusive' when the manifest didn't declare one explicitly.
This routes the plugin to plugins/memory discovery instead of the
general loader.

The escape hatch: if a manifest explicitly declares kind: standalone,
the heuristic doesn't override it.

Reported by Uncle HODL on Discord.

* fix(nous): actionable CLI message when Nous 401 refresh fails

Mirrors the Anthropic 401 diagnostic pattern. When Nous returns 401
and the credential refresh (_try_refresh_nous_client_credentials)
also fails, the user used to see only the raw APIError. Now prints:

  🔐 Nous 401 — Portal authentication failed.
     Response: <truncated body>
     Most likely: Portal OAuth expired, account out of credits, or
                  agent key revoked.
     Troubleshooting:
       • Re-authenticate: hermes login --provider nous
       • Check credits / billing: https://portal.nousresearch.com
       • Verify stored credentials: $HERMES_HOME/auth.json
       • Switch providers temporarily: /model <model> --provider openrouter

Addresses the common 'my hermes model hangs' pattern where the user's
Portal OAuth expired and the CLI gave no hint about the next step.
2026-04-22 05:54:11 -07:00
kshitijk4poor
5fb143169b feat(dashboard): track real API call count per session
Adds schema v7 'api_call_count' column. run_agent.py increments it by 1
per LLM API call, web_server analytics SQL aggregates it, frontend uses
the real counter instead of summing sessions.

The 'API Calls' card on the analytics dashboard previously displayed
COUNT(*) from the sessions table — the number of conversations, not
LLM requests. Each session makes 10-90 API calls through the tool loop,
so the reported number was ~30x lower than real.

Salvaged from PR #10140 (@kshitijk4poor). The cache-token accuracy
portions of the original PR were deferred — per-provider analytics is
the better path there, since cache_write_tokens and actual_cost_usd
are only reliably available from a subset of providers (Anthropic
native, Codex Responses, OpenRouter with usage.include).

Tests:
- schema_version v7 assertion
- migration v2 -> v7 adds api_call_count column with default 0
- update_token_counts increments api_call_count by provided delta
- absolute=True sets api_call_count directly
- /api/analytics/usage exposes total_api_calls in totals
2026-04-22 05:51:58 -07:00
hharry11
83cb9a03ee fix(cli): ensure project .env is sanitized before loading 2026-04-22 05:51:44 -07:00
hengm3467
c6b1ef4e58 feat: add Step Plan provider support (salvage #6005)
Adds a first-class 'stepfun' API-key provider surfaced as Step Plan:

- Support Step Plan setup for both International and China regions
- Discover Step Plan models live from /step_plan/v1/models, with a
  small coding-focused fallback catalog when discovery is unavailable
- Thread StepFun through provider metadata, setup persistence, status
  and doctor output, auxiliary routing, and model normalization
- Add tests for provider resolution, model validation, metadata
  mapping, and StepFun region/model persistence

Based on #6005 by @hengm3467.

Co-authored-by: hengm3467 <100685635+hengm3467@users.noreply.github.com>
2026-04-22 02:59:58 -07:00
Teknium
ff9752410a
feat(plugins): pluggable image_gen backends + OpenAI provider (#13799)
* feat(plugins): pluggable image_gen backends + OpenAI provider

Adds a ImageGenProvider ABC so image generation backends register as
bundled plugins under `plugins/image_gen/<name>/`. The plugin scanner
gains three primitives to make this work generically:

- `kind:` manifest field (`standalone` | `backend` | `exclusive`).
  Bundled `kind: backend` plugins auto-load — no `plugins.enabled`
  incantation. User-installed backends stay opt-in.
- Path-derived keys: `plugins/image_gen/openai/` gets key
  `image_gen/openai`, so a future `tts/openai` cannot collide.
- Depth-2 recursion into category namespaces (parent dirs without a
  `plugin.yaml` of their own).

Includes `OpenAIImageGenProvider` as the first consumer (gpt-image-1.5
default, plus gpt-image-1, gpt-image-1-mini, DALL-E 3/2). Base64
responses save to `$HERMES_HOME/cache/images/`; URL responses pass
through.

FAL stays in-tree for this PR — a follow-up ports it into
`plugins/image_gen/fal/` so the in-tree `image_generation_tool.py`
slims down. The dispatch shim in `_handle_image_generate` only fires
when `image_gen.provider` is explicitly set to a non-FAL value, so
existing FAL setups are untouched.

- 41 unit tests (scanner recursion, kind parsing, gate logic,
  registry, OpenAI payload shapes)
- E2E smoke verified: bundled plugin autoloads, registers, and
  `_handle_image_generate` routes to OpenAI when configured

* fix(image_gen/openai): don't send response_format to gpt-image-*

The live API rejects it: 'Unknown parameter: response_format'
(verified 2026-04-21 with gpt-image-1.5). gpt-image-* models return
b64_json unconditionally, so the parameter was both unnecessary and
actively broken.

* feat(image_gen/openai): gpt-image-2 only, drop legacy catalog

gpt-image-2 is the latest/best OpenAI image model (released 2026-04-21)
and there's no reason to expose the older gpt-image-1.5 / gpt-image-1 /
dall-e-3 / dall-e-2 alongside it — slower, lower quality, or awkward
(dall-e-2 squares only). Trim the catalog down to a single model.

Live-verified end-to-end: landscape 1536x1024 render of a Moog-style
synth matches prompt exactly, 2.4MB PNG saved to cache.

* feat(image_gen/openai): expose gpt-image-2 as three quality tiers

Users pick speed/fidelity via the normal model picker instead of a
hidden quality knob. All three tier IDs resolve to the single underlying
gpt-image-2 API model with a different quality parameter:

  gpt-image-2-low     ~15s   fast iteration
  gpt-image-2-medium  ~40s   default
  gpt-image-2-high    ~2min  highest fidelity

Live-measured on OpenAI's API today: 15.4s / 40.8s / 116.9s for the
same 1024x1024 prompt.

Config:
  image_gen.openai.model: gpt-image-2-high
  # or
  image_gen.model: gpt-image-2-low
  # or env var for scripts/tests
  OPENAI_IMAGE_MODEL=gpt-image-2-medium

Live-verified end-to-end with the low tier: 18.8s landscape render of a
golden retriever in wildflowers, vision-confirmed exact match.

* feat(tools_config): plugin image_gen providers inject themselves into picker

'hermes tools' → Image Generation now shows plugin-registered backends
alongside Nous Subscription and FAL.ai without tools_config.py needing
to know about them. OpenAI appears as a third option today; future
backends appear automatically as they're added.

Mechanism:
- ImageGenProvider gains an optional get_setup_schema() hook
  (name, badge, tag, env_vars). Default derived from display_name.
- tools_config._plugin_image_gen_providers() pulls the schemas from
  every registered non-FAL plugin provider.
- _visible_providers() appends those rows when rendering the Image
  Generation category.
- _configure_provider() handles the new image_gen_plugin_name marker:
  writes image_gen.provider and routes to the plugin's list_models()
  catalog for the model picker.
- _toolset_needs_configuration_prompt('image_gen') stops demanding a
  FAL key when any plugin provider reports is_available().

FAL is skipped in the plugin path because it already has hardcoded
TOOL_CATEGORIES rows — when it gets ported to a plugin in a follow-up
PR the hardcoded rows go away and it surfaces through the same path
as OpenAI.

Verified live: picker shows Nous Subscription / FAL.ai / OpenAI.
Picking OpenAI prompts for OPENAI_API_KEY, then shows the
gpt-image-2-low/medium/high model picker sourced from the plugin.

397 tests pass across plugins/, tools_config, registry, and picker.

* fix(image_gen): close final gaps for plugin-backend parity with FAL

Two small places that still hardcoded FAL:

- hermes_cli/setup.py status line: an OpenAI-only setup showed
  'Image Generation: missing FAL_KEY'. Now probes plugin providers
  and reports '(OpenAI)' when one is_available() — or falls back to
  'missing FAL_KEY or OPENAI_API_KEY' if nothing is configured.

- image_generate tool schema description: said 'using FAL.ai, default
  FLUX 2 Klein 9B'. Rewrote provider-neutral — 'backend and model are
  user-configured' — and notes the 'image' field can be a URL or an
  absolute path, which the gateway delivers either way via
  extract_local_files().
2026-04-21 21:30:10 -07:00
emozilla
29693f9d8e feat(aux): use Portal /api/nous/recommended-models for auxiliary models
Wire the auxiliary client (compaction, vision, session search, web extract)
to the Nous Portal's curated recommended-models endpoint when running on
Nous Portal, with a TTL-cached fetch that mirrors how we pull /models for
pricing.

hermes_cli/models.py
  - fetch_nous_recommended_models(portal_base_url, force_refresh=False)
    10-minute TTL cache, keyed per portal URL (staging vs prod don't
    collide).  Public endpoint, no auth required.  Returns {} on any
    failure so callers always get a dict.
  - get_nous_recommended_aux_model(vision, free_tier=None, ...)
    Tier-aware pick from the payload:
      - Paid tier → paidRecommended{Vision,Compaction}Model, falling back
        to freeRecommended* when the paid field is null (common during
        staged rollouts of new paid models).
      - Free tier → freeRecommended* only, never leaks paid models.
    When free_tier is None, auto-detects via the existing
    check_nous_free_tier() helper (already cached 3 min against
    /api/oauth/account).  Detection errors default to paid so we never
    silently downgrade a paying user.

agent/auxiliary_client.py — _try_nous()
  - Replaces the hardcoded xiaomi/mimo free-tier branch with a single call
    to get_nous_recommended_aux_model(vision=vision).
  - Falls back to _NOUS_MODEL (google/gemini-3-flash-preview) when the
    Portal is unreachable or returns a null recommendation.
  - The Portal is now the source of truth for aux model selection; the
    xiaomi allowlist we used to carry is effectively dead.

Tests (15 new)
  - tests/hermes_cli/test_models.py::TestNousRecommendedModels
    Fetch caching, per-portal keying, network failure, force_refresh;
    paid-prefers-paid, paid-falls-to-free, free-never-leaks-paid,
    auto-detect, detection-error → paid default, null/blank modelName
    handling.
  - tests/agent/test_auxiliary_client.py::TestNousAuxiliaryRefresh
    _try_nous honors Portal recommendation for text + vision, falls
    back to google/gemini-3-flash-preview on None or exception.

Behavior won't visibly change today — both tier recommendations currently
point at google/gemini-3-flash-preview — but the moment the Portal ships
a better paid recommendation, subscribers pick it up within 10 minutes
without a Hermes release.
2026-04-21 20:35:16 -07:00
emozilla
c22f4a76de remove Nous Portal free-model allowlist
Drop _NOUS_ALLOWED_FREE_MODELS + filter_nous_free_models and its two call
sites. Whatever Nous Portal prices as free now shows up in the picker as-is
— no local allowlist gatekeeping. Free-tier partitioning (paid vs free in
the menu) still runs via partition_nous_models_by_tier.
2026-04-21 20:35:16 -07:00
Teknium
b2ba351380 fix(kimi): reconcile sk-kimi- routing with Anthropic SDK URL semantics
Follow-ups after salvaging xiaoqiang243's kimi-for-coding patches:

- KIMI_CODE_BASE_URL: drop trailing /v1 (was /coding/v1).
  The /coding endpoint speaks Anthropic Messages, and the Anthropic SDK
  appends /v1/messages internally. /coding/v1 + SDK suffix produced
  /coding/v1/v1/messages (a 404). /coding + SDK suffix now yields
  /coding/v1/messages correctly.
- kimi-coding ProviderConfig: keep legacy default api.moonshot.ai/v1 so
  non-sk-kimi- moonshot keys still authenticate. sk-kimi- keys are
  already redirected to api.kimi.com/coding via _resolve_kimi_base_url.
- doctor.py: update Kimi UA to claude-code/0.1.0 (was KimiCLI/1.30.0)
  and rewrite /coding base URLs to /coding/v1 for the /models health
  check (Anthropic surface has no /models).
- test_kimi_env_vars: accept KIMI_CODING_API_KEY as a secondary env var.

E2E verified:
  sk-kimi-<key>  → https://api.kimi.com/coding/v1/messages (Anthropic)
  sk-<legacy>    → https://api.moonshot.ai/v1/chat/completions (OpenAI)
  UA: claude-code/0.1.0, x-api-key: <sk-kimi-*>
2026-04-21 19:48:39 -07:00
brooklyn!
3e198f37c9
Merge pull request #13641 from NousResearch/bb/tui-at-folder-filter
fix(tui): @folder: / @file: completions respect the explicit prefix
2026-04-21 16:33:30 -05:00
pefontana
48ecb98f8a feat(delegate): orchestrator role and configurable spawn depth (default flat)
Adds role='leaf'|'orchestrator' to delegate_task. With max_spawn_depth>=2,
an orchestrator child retains the 'delegation' toolset and can spawn its
own workers; leaf children cannot delegate further (identical to today).

Default posture is flat — max_spawn_depth=1 means a depth-0 parent's
children land at the depth-1 floor and orchestrator role silently
degrades to leaf. Users opt into nested delegation by raising
max_spawn_depth to 2 or 3 in config.yaml.

Also threads acp_command/acp_args through the main agent loop's delegate
dispatch (previously silently dropped in the schema) via a new
_dispatch_delegate_task helper, and adds a DelegateEvent enum with
legacy-string back-compat for gateway/ACP/CLI progress consumers.

Config (hermes_cli/config.py defaults):
  delegation.max_concurrent_children: 3   # floor-only, no upper cap
  delegation.max_spawn_depth: 1           # 1=flat (default), 2-3 unlock nested
  delegation.orchestrator_enabled: true   # global kill switch

Salvaged from @pefontana's PR #11215. Overrides vs. the original PR:
concurrency stays at 3 (PR bumped to 5 + cap 8 — we keep the floor only,
no hard ceiling); max_spawn_depth defaults to 1 (PR defaulted to 2 which
silently enabled one level of orchestration for every user).

Co-authored-by: pefontana <fontana.pedro93@gmail.com>
2026-04-21 14:23:45 -07:00
pefontana
7c3c7e50c5 test(delegate): make default_toolsets regression test robust to user config
The prior form of this test asserted on CLI_CONFIG["delegation"] after
importing cli, which only passed by accident of pytest-xdist worker
scheduling. cli._hermes_home is frozen at module import time (cli.py:76),
before the tests/conftest.py autouse HERMES_HOME-isolation fixture can
fire, so CLI_CONFIG ends up populated by deep-merging the contributor's
actual ~/.hermes/config.yaml over the defaults (cli.py:359-366). Any
contributor (like me) who still has the legacy key set in their own
config causes a false failure the moment another test file in the same
xdist worker imports cli at module level.

Asserting on the source of load_cli_config() instead sidesteps all of
that: the test now checks the defaults literal directly and is
independent of user config, HERMES_HOME, import order, and worker
scheduling.

Demonstrated failure mode before this fix:
  pytest tests/hermes_cli/test_config_drift.py \
         tests/hermes_cli/test_skills_hub.py -o addopts=""
  -> FAILED (CLI_CONFIG["delegation"] contained "default_toolsets"
     from the user's ~/.hermes/config.yaml)

Part of Initiative 2 / M0.5.
2026-04-21 13:44:27 -07:00
pefontana
631e8793f4 refactor(delegate): drop dead default_toolsets from CLI default config
delegation.default_toolsets was declared in cli.py's CLI_CONFIG default
dict and documented in cli-config.yaml.example, but never read: none of
tools/delegate_tool.py, _load_config(), or any call site ever looked it
up. The live fallback is the DEFAULT_TOOLSETS module constant at
tools/delegate_tool.py:101, which stays as-is.

hermes_cli/config.py's DEFAULT_CONFIG["delegation"] already omits the
key — this commit aligns cli.py with that.

Adds a regression test in tests/hermes_cli/test_config_drift.py so a
future refactor that re-adds the key without wiring it up to
_load_config() fails loudly.

Part of Initiative 2 / M0.5.
2026-04-21 13:44:27 -07:00
Brooklyn Nicholson
9d9db1e910 fix(tui): @folder: only yields directories, @file: only yields files
Reported during TUI v2 blitz testing: typing `@folder:` in the composer
pulled up .dockerignore, .env, .gitignore, and every other file in the
cwd alongside the actual directories. The completion loop yielded every
entry regardless of the explicit prefix and auto-rewrote each completion
to @file: vs @folder: based on is_dir — defeating the user's choice.

Also fixed a pre-existing adjacent bug: a bare `@file:` or `@folder:`
(no path) used expanded=="." as both search_dir AND match_prefix,
filtering the list to dotfiles only. When expanded is empty or ".",
search in cwd with no prefix filter.

- want_dir = prefix == "@folder:" drives an explicit is_dir filter
- preserve the typed prefix in completion text instead of rewriting
- three regression tests cover: folder-only, file-only, and the bare-
  prefix case where completions keep the `@folder:` prefix
2026-04-21 14:31:48 -05:00
Teknium
244ae6db15
fix(web_server,whatsapp-bridge): validate Host header against bound interface (#13530)
DNS rebinding attack: a victim browser that has the dashboard (or the
WhatsApp bridge) open could be tricked into fetching from an
attacker-controlled hostname that TTL-flips to 127.0.0.1. Same-origin
and CORS checks don't help — the browser now treats the attacker origin
as same-origin with the local service. Validating the Host header at
the app layer rejects any request whose Host isn't one we bound for.

Changes:

hermes_cli/web_server.py:
- New host_header_middleware runs before auth_middleware. Reads
  app.state.bound_host (set by start_server) and rejects requests
  whose Host header doesn't match the bound interface with HTTP 400.
- Loopback binds accept localhost / 127.0.0.1 / ::1. Non-loopback
  binds require exact match. 0.0.0.0 binds skip the check (explicit
  --insecure opt-in; no app-layer defence possible).
- IPv6 bracket notation parsed correctly: [::1] and [::1]:9119 both
  accepted.

scripts/whatsapp-bridge/bridge.js:
- Express middleware rejects non-loopback Host headers. Bridge
  already binds 127.0.0.1-only, this adds the complementary app-layer
  check for DNS rebinding defence.

Tests: 8 new in tests/hermes_cli/test_web_server_host_header.py
covering loopback/non-loopback/zero-zero binds, IPv6 brackets, case
insensitivity, and end-to-end middleware rejection via TestClient.

Reported in GHSA-ppp5-vxwm-4cf7 by @bupt-Yy-young. Hardening — not
CVE per SECURITY.md §3. The dashboard's main trust boundary is the
loopback bind + session token; DNS rebinding defeats the bind assumption
but not the token (since the rebinding browser still sees a first-party
fetch to 127.0.0.1 with the token-gated API). Host-header validation
adds the missing belt-and-braces layer.
2026-04-21 06:26:35 -07:00
Teknium
7fc1e91811
security(runtime_provider): close OLLAMA_API_KEY substring-leak sweep miss (#13522)
Two call sites still used a raw substring check to identify ollama.com:

  hermes_cli/runtime_provider.py:496:
      _is_ollama_url = "ollama.com" in base_url.lower()

  run_agent.py:6127:
      if fb_base_url_hint and "ollama.com" in fb_base_url_hint.lower() ...

Same bug class as GHSA-xf8p-v2cg-h7h5 (OpenRouter substring leak), which
was fixed in commit dbb7e00e via base_url_host_matches() across the
codebase. The earlier sweep missed these two Ollama sites. Self-discovered
during April 2026 security-advisory triage; filed as GHSA-76xc-57q6-vm5m.

Impact is narrow — requires a user with OLLAMA_API_KEY configured AND a
custom base_url whose path or look-alike host contains 'ollama.com'.
Users on default provider flows are unaffected. Filed as a draft advisory
to use the private-fork flow; not CVE-worthy on its own.

Fix is mechanical: replace substring check with base_url_host_matches
at both sites. Same helper the rest of the codebase uses.

Tests: 67 -> 71 passing. 7 new host-matcher cases in
tests/test_base_url_hostname.py (path injection, lookalike host,
localtest.me subdomain, ollama.ai TLD confusion, localhost, genuine
ollama.com, api.ollama.com subdomain) + 4 call-site tests in
tests/hermes_cli/test_runtime_provider_resolution.py verifying
OLLAMA_API_KEY is selected only when base_url actually targets
ollama.com.

Fixes GHSA-76xc-57q6-vm5m
2026-04-21 06:06:16 -07:00
Teknium
3f72b2fe15 fix(/model): accept provider switches when /models is unreachable
Gateway /model <name> --provider opencode-go (or any provider whose /models
endpoint is down, 404s, or doesn't exist) silently failed. validate_requested_model
returned accepted=False whenever fetch_api_models returned None, switch_model
returned success=False, and the gateway never wrote _session_model_overrides —
so the switch appeared to succeed in the error message flow but the next turn
kept calling the old provider.

The validator already had static-catalog fallbacks for MiniMax and Codex
(providers without a /models endpoint). Extended the same pattern as the
terminal fallback: when the live probe fails, consult provider_model_ids()
for the curated catalog. Known models → accepted+recognized. Close typos →
auto-corrected. Unknown models → soft-accepted with a 'Not in curated
catalog' warning. Providers with no catalog at all → soft-accepted with a
generic 'Note:' warning, finally honoring the in-code comment ('Accept and
persist, but warn') that had been lying since it was written.

Tests: 7 new tests in test_opencode_go_validation_fallback.py covering the
catalog lookup, case-insensitive match, auto-correct, unknown-with-suggestion,
unknown-without-suggestion, and no-catalog paths. TestValidateApiFallback in
test_model_validation.py updated — its four 'rejected_when_api_down' tests
were encoding exactly the bug being fixed.
2026-04-21 05:19:43 -07:00
Teknium
c6974043ef
refactor(acp): validate method_id against advertised provider in authenticate() (#13468)
* feat(models): hide OpenRouter models that don't advertise tool support

Port from Kilo-Org/kilocode#9068.

hermes-agent is tool-calling-first — every provider path assumes the
model can invoke tools. Models whose OpenRouter supported_parameters
doesn't include 'tools' (e.g. image-only or completion-only models)
cannot be driven by the agent loop and fail at the first tool call.

Filter them out of fetch_openrouter_models() so they never appear in
the model picker (`hermes model`, setup wizard, /model slash command).

Permissive when the field is missing — OpenRouter-compatible gateways
(Nous Portal, private mirrors, older snapshots) don't always populate
supported_parameters. Treat missing as 'unknown → allow' rather than
silently emptying the picker on those gateways. Only hide models
whose supported_parameters is an explicit list that omits tools.

Tests cover: tools present → kept, tools absent → dropped, field
missing → kept, malformed non-list → kept, non-dict item → kept,
empty list → dropped.

* refactor(acp): validate method_id against advertised provider in authenticate()

Previously authenticate() accepted any method_id whenever the server had
provider credentials configured. This was not a vulnerability under the
personal-assistant trust model (ACP is stdio-only, local-trust — anything
that can reach the transport is already code-execution-equivalent to the
user), but it was sloppy API hygiene: the advertised auth_methods list
from initialize() was effectively ignored.

Now authenticate() only returns AuthenticateResponse when method_id
matches the currently-advertised provider (case-insensitive). Mismatched
or missing method_id returns None, consistent with the no-credentials
case.

Raised by xeloxa via GHSA-g5pf-8w9m-h72x. Declined as a CVE
(ACP transport is stdio, local-trust model), but the correctness fix is
worth having on its own.
2026-04-21 03:39:55 -07:00
Teknium
4fea1769d2
feat(opencode-go): add Kimi K2.6 and Qwen3.5/3.6 Plus to curated catalog (#13429)
OpenCode Go's published model list (opencode.ai/docs/go) includes kimi-k2.6,
qwen3.5-plus, and qwen3.6-plus, but Hermes' curated lists didn't carry them.
When the live /models probe fails during `hermes model`, users fell back to
the stale curated list and had to type newer models via 'Enter custom model
name'.

Adds kimi-k2.6 (now first in the Go list), qwen3.6-plus, and qwen3.5-plus
to both the model picker (hermes_cli/models.py) and setup defaults
(hermes_cli/setup.py). All routed through the existing opencode-go
chat_completions path — no api_mode changes needed.
2026-04-21 01:56:55 -07:00
Teknium
2c69b3eca8
fix(auth): unify credential source removal — every source sticks (#13427)
Every credential source Hermes reads from now behaves identically on
`hermes auth remove`: the pool entry stays gone across fresh load_pool()
calls, even when the underlying external state (env var, OAuth file,
auth.json block, config entry) is still present.

Before this, auth_remove_command was a 110-line if/elif with five
special cases, and three more sources (qwen-cli, copilot, custom
config) had no removal handler at all — their pool entries silently
resurrected on the next invocation.  Even the handled cases diverged:
codex suppressed, anthropic deleted-without-suppressing, nous cleared
without suppressing.  Each new provider added a new gap.

What's new:
  agent/credential_sources.py — RemovalStep registry, one entry per
  source (env, claude_code, hermes_pkce, nous device_code, codex
  device_code, qwen-cli, copilot gh_cli + env vars, custom config).
  auth_remove_command dispatches uniformly via find_removal_step().

Changes elsewhere:
  agent/credential_pool.py — every upsert in _seed_from_env,
  _seed_from_singletons, and _seed_custom_pool now gates on
  is_source_suppressed(provider, source) via a shared helper.
  hermes_cli/auth_commands.py — auth_remove_command reduced to 25
  lines of dispatch; auth_add_command now clears ALL suppressions for
  the provider on re-add (was env:* only).

Copilot is special: the same token is seeded twice (gh_cli via
_seed_from_singletons + env:<VAR> via _seed_from_env), so removing one
entry without suppressing the other variants lets the duplicate
resurrect.  The copilot RemovalStep suppresses gh_cli + all three env
variants (COPILOT_GITHUB_TOKEN, GH_TOKEN, GITHUB_TOKEN) at once.

Tests: 11 new unit tests + 4059 existing pass.  12 E2E scenarios cover
every source in isolated HERMES_HOME with simulated fresh processes.
2026-04-21 01:52:49 -07:00
Teknium
b341b19fff
fix(auth): hermes auth remove sticks for shell-exported env vars (#13418)
Removing an env-seeded credential only cleared ~/.hermes/.env and the
current process's os.environ, leaving shell-exported vars (shell profile,
systemd EnvironmentFile, launchd plist) to resurrect the entry on the
next load_pool() call.  This matched the pre-#11485 codex behaviour.

Now we suppress env:<VAR> in auth.json on remove, gate _seed_from_env()
behind is_source_suppressed(), clear env:* suppressions on auth add,
and print a diagnostic pointing at the shell when the var lives there.

Applies to every env:* seeded credential (xai, deepseek, moonshot, zai,
nvidia, openrouter, anthropic, etc.), not just xai.

Reported by @teknium1 from community user 'Artificial Brain' — couldn't
remove their xAI key via hermes auth remove.
2026-04-21 01:34:50 -07:00
Teknium
62cbeb6367
test: stop testing mutable data — convert change-detectors to invariants (#13363)
Catalog snapshots, config version literals, and enumeration counts are data
that changes as designed. Tests that assert on those values add no
behavioral coverage — they just break CI on every routine update and cost
engineering time to 'fix.'

Replace with invariants where one exists, delete where none does.

Deleted (pure snapshots):
- TestMinimaxModelCatalog (3 tests): 'MiniMax-M2.7 in models' et al
- TestGeminiModelCatalog: 'gemini-2.5-pro in models', 'gemini-3.x in models'
- test_browser_camofox_state::test_config_version_matches_current_schema
  (docstring literally said it would break on unrelated bumps)

Relaxed (keep plumbing check, drop snapshot):
- Xiaomi / Arcee / Kimi moonshot / Kimi coding / HuggingFace static lists:
  now assert 'provider exists and has >= 1 entry' instead of specific names
- HuggingFace main/models.py consistency test: drop 'len >= 6' floor

Dynamicized (follow source, not a literal):
- 3x test_config.py migration tests: raw['_config_version'] ==
  DEFAULT_CONFIG['_config_version'] instead of hardcoded 21

Fixed stale tests against intentional behavior changes:
- test_insights::test_gateway_format_hides_cost: name matches new behavior
  (no dollar figures); remove contradicting '$' in text assertion
- test_config::prefers_api_then_url_then_base_url: flipped per PR #9332;
  rename + update to base_url > url > api
- test_anthropic_adapter: relax assert_called_once() (xdist-flaky) to
  assert called — contract is 'credential flowed through'
- test_interrupt_propagation: add provider/model/_base_url to bare-agent
  fixture so the stale-timeout code path resolves

Fixed stale integration tests against opt-in plugin gate:
- transform_tool_result + transform_terminal_output: write plugins.enabled
  allow-list to config.yaml and reset the plugin manager singleton

Source fix (real consistency invariant):
- agent/model_metadata.py: add moonshotai/Kimi-K2.6 context length
  (262144, same as K2.5). test_model_metadata_has_context_lengths was
  correctly catching the gap.

Policy:
- AGENTS.md Testing section: new subsection 'Don't write change-detector
  tests' with do/don't examples. Reviewers should reject catalog-snapshot
  assertions in new tests.

Covers every test that failed on the last completed main CI run
(24703345583) except test_modal_sandbox_fixes::test_terminal_tool_present
+ test_terminal_and_file_toolsets_resolve_all_tools, which now pass both
alone and with the full tests/tools/ directory (xdist ordering flake that
resolved itself).
2026-04-20 23:20:33 -07:00
Teknium
b4edf9e6be
refactor(ai-gateway): single source of truth for model catalog (#13304)
Delete the stale literal `_PROVIDER_MODELS["ai-gateway"]` (gpt-5,
gemini-2.5-pro, claude-4.5 — outdated the moment PR #13223 landed with
its curated `AI_GATEWAY_MODELS` snapshot) and derive it from
`AI_GATEWAY_MODELS` instead, so the picker tuples and the bare-id
fallback catalog stay in sync automatically. Also fixes
`get_default_model_for_provider('ai-gateway')` to return kimi-k2.6
(the curated recommendation) instead of claude-opus-4.6.
2026-04-20 22:21:21 -07:00
Teknium
cecf84daf7 fix: extend hostname-match provider detection across remaining call sites
Aslaaen's fix in the original PR covered _detect_api_mode_for_url and the
two openai/xai sites in run_agent.py. This finishes the sweep: the same
substring-match false-positive class (e.g. https://api.openai.com.evil/v1,
https://proxy/api.openai.com/v1, https://api.anthropic.com.example/v1)
existed in eight more call sites, and the hostname helper was duplicated
in two modules.

- utils: add shared base_url_hostname() (single source of truth).
- hermes_cli/runtime_provider, run_agent: drop local duplicates, import
  from utils. Reuse the cached AIAgent._base_url_hostname attribute
  everywhere it's already populated.
- agent/auxiliary_client: switch codex-wrap auto-detect, max_completion_tokens
  gate (auxiliary_max_tokens_param), and custom-endpoint max_tokens kwarg
  selection to hostname equality.
- run_agent: native-anthropic check in the Claude-style model branch
  and in the AIAgent init provider-auto-detect branch.
- agent/model_metadata: Anthropic /v1/models context-length lookup.
- hermes_cli/providers.determine_api_mode: anthropic / openai URL
  heuristics for custom/unknown providers (the /anthropic path-suffix
  convention for third-party gateways is preserved).
- tools/delegate_tool: anthropic detection for delegated subagent
  runtimes.
- hermes_cli/setup, hermes_cli/tools_config: setup-wizard vision-endpoint
  native-OpenAI detection (paired with deduping the repeated check into
  a single is_native_openai boolean per branch).

Tests:
- tests/test_base_url_hostname.py covers the helper directly
  (path-containing-host, host-suffix, trailing dot, port, case).
- tests/hermes_cli/test_determine_api_mode_hostname.py adds the same
  regression class for determine_api_mode, plus a test that the
  /anthropic third-party gateway convention still wins.

Also: add asslaenn5@gmail.com → Aslaaen to scripts/release.py AUTHOR_MAP.
2026-04-20 22:14:29 -07:00
Aslaaen
5356797f1b fix: restrict provider URL detection to exact hostname matches 2026-04-20 22:14:29 -07:00
Teknium
fdd0ecaf13
fix(env_loader): warn when non-ASCII stripped from credential env vars (#13300)
Load-time sanitizer silently removed non-ASCII codepoints from any
env var ending in _API_KEY / _TOKEN / _SECRET / _KEY, turning
copy-paste artifacts (Unicode lookalikes, ZWSP, NBSP) into opaque
provider-side API_KEY_INVALID errors.

Warn once per key to stderr with the offending codepoints (U+XXXX)
and guidance to re-copy from the provider dashboard.
2026-04-20 22:14:03 -07:00
jerilynzheng
5bb2d11b07 feat: auto-promote free Moonshot models to top of ai-gateway picker
When the live Vercel AI Gateway catalog exposes a Moonshot model with
zero input AND output pricing, it's promoted to position #1 as the
recommended default — even if the exact ID isn't in the curated
AI_GATEWAY_MODELS list. This enables dynamic discovery of new free
Moonshot variants without requiring a PR to update curation.

Paid Moonshot models are unaffected; falls back to the normal curated
recommended tag when no free Moonshot is live.
2026-04-20 21:02:28 -07:00
jerilynzheng
7004374404 feat: curated picker with live pricing for ai-gateway provider
- Curated AI_GATEWAY_MODELS list in hermes_cli/models.py (OSS first,
  kimi-k2.5 as recommended default).
- fetch_ai_gateway_models() filters the curated list against the live
  /v1/models catalog; falls back to the snapshot on network failure.
- fetch_ai_gateway_pricing() translates Vercel's input/output field
  names to the prompt/completion shape the shared picker expects;
  carries input_cache_read / input_cache_write through unchanged.
- get_pricing_for_provider() now handles ai-gateway.
- _model_flow_ai_gateway() provides a guided URL prompt when no key
  is set and a pricing-column picker; routes ai-gateway to it instead
  of the generic api-key flow.
2026-04-20 21:02:28 -07:00
Peter Fontana
3988c3c245 feat: shell hooks — wire shell scripts as Hermes hook callbacks
Users can declare shell scripts in config.yaml under a hooks: block that
fire on plugin-hook events (pre_tool_call, post_tool_call, pre_llm_call,
subagent_stop, etc). Scripts receive JSON on stdin, can return JSON on
stdout to block tool calls or inject context pre-LLM.

Key design:
- Registers closures on existing PluginManager._hooks dict — zero changes
  to invoke_hook() call sites
- subprocess.run(shell=False) via shlex.split — no shell injection
- First-use consent per (event, command) pair, persisted to allowlist JSON
- Bypass via --accept-hooks, HERMES_ACCEPT_HOOKS=1, or hooks_auto_accept
- hermes hooks list/test/revoke/doctor CLI subcommands
- Adds subagent_stop hook event fired after delegate_task children exit
- Claude Code compatible response shapes accepted

Cherry-picked from PR #13143 by @pefontana.
2026-04-20 20:53:51 -07:00
mavrickdeveloper
1fdf9a730c fix(tools): keep default-off toolsets disabled 2026-04-20 20:52:50 -07:00
IAvecilla
238313068a Update env vars for openclaw migration 2026-04-20 14:56:04 -07:00
Austin Pickett
720e1c65b2
Merge branch 'main' into feat/dashboard-skill-analytics 2026-04-20 05:25:49 -07:00
Teknium
42c30985c7 fix: enable plugins in config.yaml for lazy-discovery tests
The opt-in-by-default change (70111eea) requires plugins to be listed
in plugins.enabled. The cherry-picked test fixtures didn't write this
config, so two tests failed on current main.
2026-04-20 05:11:39 -07:00
Stephen Schoettler
a5e368ebfb fix: publish plugin slash commands in Telegram menu
- discover plugin commands before building Telegram command menus
- make plugin command and context engine accessors lazy-load plugins
- add regression coverage for Telegram menu and plugin lookup paths
2026-04-20 05:11:39 -07:00
Jason
23b81ab243 fix(cli): send User-Agent in /v1/models probe to pass Cloudflare 1010
Custom Claude proxies fronted by Cloudflare with Browser Integrity Check
enabled (e.g. `packyapi.com`) reject requests with the default
`Python-urllib/*` signature, returning HTTP 403 "error code: 1010".
`probe_api_models` swallowed that in its blanket `except Exception:
continue`, so `validate_requested_model` returned the misleading
"Could not reach the <provider> API to validate `<model>`" error even
though the endpoint is reachable and lists the requested model.

Advertise the probe request as `hermes-cli/<version>` so Cloudflare
treats it as a first-party client. This mirrors the pattern already used
by `agent/gemini_native_adapter.py` and `agent/anthropic_adapter.py`,
which set a descriptive UA for the same reason.

Reproduction (pre-fix):

    python3 -c "
    import urllib.request
    req = urllib.request.Request(
        'https://www.packyapi.com/v1/models',
        headers={'Authorization': 'Bearer sk-...'})
    urllib.request.urlopen(req).read()
    "
    urllib.error.HTTPError: HTTP Error 403: Forbidden
    (body: b'error code: 1010')

Any non-urllib UA (Mozilla, curl, reqwest) returns 200 with the
OpenAI-compatible models listing.

Tested on macOS (Python 3.11). No cross-platform concerns — the change
is a single header addition to an existing `urllib.request.Request`.
2026-04-20 04:56:30 -07:00
luyao618
2cdae233e2 fix(config): validate providers config entries — reject non-URL base, accept camelCase aliases (#9332)
Cherry-picked from PR #9359 by @luyao618.

- Accept camelCase aliases (apiKey, baseUrl, apiMode, keyEnv, defaultModel,
  contextLength, rateLimitDelay) with auto-mapping to snake_case + warning
- Validate URL field values with urlparse (scheme + netloc check) — reject
  non-URL strings like 'openai-reverse-proxy' that were silently accepted
- Warn on unknown keys in provider config entries
- Re-order URL field priority: base_url > url > api (was api > url > base_url)
- 12 new tests covering all scenarios

Closes #9332
2026-04-20 04:52:50 -07:00
kshitijk4poor
bc2559c44d fix: remove codex spark model support
Drop gpt-5.3-codex-spark from Codex forward-compat synthesis,
provider catalogs, and context metadata now that the API no longer
supports it.
2026-04-20 04:51:44 -07:00
Teknium
70111eea24 feat(plugins): make all plugins opt-in by default
Plugins now require explicit consent to load. Discovery still finds every
plugin — user-installed, bundled, and pip — so they all show up in
`hermes plugins` and `/plugins`, but the loader only instantiates
plugins whose name appears in `plugins.enabled` in config.yaml. This
removes the previous ambient-execution risk where a newly-installed or
bundled plugin could register hooks, tools, and commands on first run
without the user opting in.

The three-state model is now explicit:
  enabled     — in plugins.enabled, loads on next session
  disabled    — in plugins.disabled, never loads (wins over enabled)
  not enabled — discovered but never opted in (default for new installs)

`hermes plugins install <repo>` prompts "Enable 'name' now? [y/N]"
(defaults to no). New `--enable` / `--no-enable` flags skip the prompt
for scripted installs. `hermes plugins enable/disable` manage both lists
so a disabled plugin stays explicitly off even if something later adds
it to enabled.

Config migration (schema v20 → v21): existing user plugins already
installed under ~/.hermes/plugins/ (minus anything in plugins.disabled)
are auto-grandfathered into plugins.enabled so upgrades don't silently
break working setups. Bundled plugins are NOT grandfathered — even
existing users have to opt in explicitly.

Also: HERMES_DISABLE_BUNDLED_PLUGINS env var removed (redundant with
opt-in default), cmd_list now shows bundled + user plugins together with
their three-state status, interactive UI tags bundled entries
[bundled], docs updated across plugins.md and built-in-plugins.md.

Validation: 442 plugin/config tests pass. E2E: fresh install discovers
disk-cleanup but does not load it; `hermes plugins enable disk-cleanup`
activates hooks; migration grandfathers existing user plugins correctly
while leaving bundled plugins off.
2026-04-20 04:46:45 -07:00
Teknium
04068c5891
feat(plugins): add transform_tool_result hook for generic tool-result rewriting (#12972)
Closes #8933 more fully, extending the per-tool transform_terminal_output
hook from #12929 to a generic seam that fires after every tool dispatch.
Plugins can rewrite any tool's result string (normalize formats, redact
fields, summarize verbose output) without wrapping individual tools.

Changes
- hermes_cli/plugins.py: add "transform_tool_result" to VALID_HOOKS
- model_tools.py: invoke the hook in handle_function_call after
  post_tool_call (which remains observational); first valid str return
  replaces the result; fail-open
- tests/test_transform_tool_result_hook.py: 9 new tests covering no-op,
  None return, non-string return, first-match wins, kwargs, hook
  exception fallback, post_tool_call observation invariant, ordering
  vs post_tool_call, and an end-to-end real-plugin integration
- tests/hermes_cli/test_plugins.py: assert new hook in VALID_HOOKS
- tests/test_model_tools.py: extend the hook-call-sequence assertion
  to include the new hook

Design
- transform_tool_result runs AFTER post_tool_call so observers always
  see the original (untransformed) result. This keeps post_tool_call's
  observational contract.
- transform_terminal_output (from #12929) still runs earlier, inside
  terminal_tool, so plugins can canonicalize BEFORE the 50k truncation
  drops middle content. Both hooks coexist; they target different layers.
2026-04-20 03:48:08 -07:00
sjz-ks
2081b71c42 feat(tools): add terminal output transform hook 2026-04-20 03:04:06 -07:00
Lumen Radley
22655ed1e6 feat(cli): improve multiline previews 2026-04-20 02:53:40 -07:00
helix4u
03e3c22e86 fix(config): add stale timeout settings 2026-04-20 00:52:50 -07:00
salt-555
12c8cefbce fix(backup): handle files with pre-1980 timestamps
ZipFile.write() raises ValueError for files with mtime before 1980-01-01
(the ZIP format uses MS-DOS timestamps which can't represent earlier dates).
This crashes the entire backup. Add ValueError to the existing except clause
so these files are skipped and reported in the warnings summary, matching the
existing behavior for PermissionError and OSError.
2026-04-20 00:47:40 -07:00
helix4u
6ab78401c9 fix(aux): add session_search extra_body and concurrency controls
Adds auxiliary.<task>.extra_body config passthrough so reasoning-heavy
OpenAI-compatible providers can receive provider-specific request fields
(e.g. enable_thinking: false on GLM) on auxiliary calls, and bounds
session_search summary fan-out with auxiliary.session_search.max_concurrency
(default 3, clamped 1-5) to avoid 429 bursts on small providers.

- agent/auxiliary_client.py: extract _get_auxiliary_task_config helper,
  add _get_task_extra_body, merge config+explicit extra_body with explicit winning
- hermes_cli/config.py: extra_body defaults on all aux tasks +
  session_search.max_concurrency; _config_version 19 -> 20
- tools/session_search_tool.py: semaphore around _summarize_all gather
- tests: coverage in test_auxiliary_client, test_session_search, test_aux_config
- docs: user-guide/configuration.md + fallback-providers.md

Co-authored-by: Teknium <teknium@nousresearch.com>
2026-04-20 00:47:39 -07:00
Tranquil-Flow
b53f74a489 fix(auth): use ssl.SSLContext for CA bundle instead of deprecated string path (#12706) 2026-04-19 22:44:35 -07:00
Teknium
65a31ee0d5
fix(anthropic): complete third-party Anthropic-compatible provider support (#12846)
Third-party gateways that speak the native Anthropic protocol (MiniMax,
Zhipu GLM, Alibaba DashScope, Kimi, LiteLLM proxies) now work end-to-end
with the same feature set as direct api.anthropic.com callers.  Synthesizes
eight stale community PRs into one consolidated change.

Five fixes:

- URL detection: consolidate three inline `endswith("/anthropic")`
  checks in runtime_provider.py into the shared _detect_api_mode_for_url
  helper.  Third-party /anthropic endpoints now auto-resolve to
  api_mode=anthropic_messages via one code path instead of three.

- OAuth leak-guard: all five sites that assign `_is_anthropic_oauth`
  (__init__, switch_model, _try_refresh_anthropic_client_credentials,
  _swap_credential, _try_activate_fallback) now gate on
  `provider == "anthropic"` so a stale ANTHROPIC_TOKEN never trips
  Claude-Code identity injection on third-party endpoints.  Previously
  only 2 of 5 sites were guarded.

- Prompt caching: new method `_anthropic_prompt_cache_policy()` returns
  `(should_cache, use_native_layout)` per endpoint.  Replaces three
  inline conditions and the `native_anthropic=(api_mode=='anthropic_messages')`
  call-site flag.  Native Anthropic and third-party Anthropic gateways
  both get the native cache_control layout; OpenRouter gets envelope
  layout.  Layout is persisted in `_primary_runtime` so fallback
  restoration preserves the per-endpoint choice.

- Auxiliary client: `_try_custom_endpoint` honors
  `api_mode=anthropic_messages` and builds `AnthropicAuxiliaryClient`
  instead of silently downgrading to an OpenAI-wire client.  Degrades
  gracefully to OpenAI-wire when the anthropic SDK isn't installed.

- Config hygiene: `_update_config_for_provider` (hermes_cli/auth.py)
  clears stale `api_key`/`api_mode` when switching to a built-in
  provider, so a previous MiniMax custom endpoint's credentials can't
  leak into a later OpenRouter session.

- Truncation continuation: length-continuation and tool-call-truncation
  retry now cover `anthropic_messages` in addition to `chat_completions`
  and `bedrock_converse`.  Reuses the existing `_build_assistant_message`
  path via `normalize_anthropic_response()` so the interim message
  shape is byte-identical to the non-truncated path.

Tests: 6 new files, 42 test cases.  Targeted run + tests/run_agent,
tests/agent, tests/hermes_cli all pass (4554 passed).

Synthesized from (credits preserved via Co-authored-by trailers):
  #7410  @nocoo           — URL detection helper
  #7393  @keyuyuan        — OAuth 5-site guard
  #7367  @n-WN            — OAuth guard (narrower cousin, kept comment)
  #8636  @sgaofen         — caching helper + native-vs-proxy layout split
  #10954 @Only-Code-A     — caching on anthropic_messages+Claude
  #7648  @zhongyueming1121 — aux client anthropic_messages branch
  #6096  @hansnow         — /model switch clears stale api_mode
  #9691  @TroyMitchell911 — anthropic_messages truncation continuation

Closes: #7366, #8294 (third-party Anthropic identity + caching).
Supersedes: #7410, #7367, #7393, #8636, #10954, #7648, #6096, #9691.
Rejects:    #9621 (OpenAI-wire caching with incomplete blocklist — risky),
            #7242 (superseded by #9691, stale branch),
            #8321 (targets smart_model_routing which was removed in #12732).

Co-authored-by: nocoo <nocoo@users.noreply.github.com>
Co-authored-by: Keyu Yuan <leoyuan0099@gmail.com>
Co-authored-by: Zoee <30841158+n-WN@users.noreply.github.com>
Co-authored-by: sgaofen <135070653+sgaofen@users.noreply.github.com>
Co-authored-by: Only-Code-A <bxzt2006@163.com>
Co-authored-by: zhongyueming <mygamez@163.com>
Co-authored-by: Xiaohan Li <hansnow@users.noreply.github.com>
Co-authored-by: Troy Mitchell <i@troy-y.org>
2026-04-19 22:43:09 -07:00
Teknium
ca3a0bbc54 fix(model-picker): dedup overlapping providers: dict and custom_providers: list entries
When a user's config has the same endpoint in both the providers: dict
(v12+ keyed schema) and custom_providers: list (legacy schema) — which
happens automatically when callers pass the output of
get_compatible_custom_providers() alongside the raw providers dict —
list_authenticated_providers() emitted two picker rows for the same
endpoint: one bare-slug from section 3 and one 'custom:<name>' from
section 4. The slug shapes differed, so seen_slugs dedup never fired,
and users saw the same endpoint twice with identical display labels.

Fix: section 3 records the (display_name, base_url) of each emitted
entry in _section3_emitted_pairs; section 4 skips groups whose
(name, api_url) pair was already emitted. Preserves existing behaviour
for users on either schema alone, and for distinct entries across both.

Test: test_list_authenticated_providers_no_duplicate_labels_across_schemas.
2026-04-19 22:15:49 -07:00
Teknium
323e827f4a
test: remove 8 flaky tests that fail under parallel xdist scheduling (#12784)
These tests all pass in isolation but fail in CI due to test-ordering
pollution on shared xdist workers.  Each has a different root cause:

- tests/tools/test_send_message_tool.py (4 tests): racing session ContextVar
  pollution — get_session_env returns '' instead of 'cli' default when an
  earlier test on the same worker leaves HERMES_SESSION_PLATFORM set.
- tests/tools/test_skills_tool.py (2 tests): KeyError: 'gateway_setup_hint'
  from shared skill state mutation.
- tests/tools/test_tts_mistral.py::test_telegram_produces_ogg_and_voice_compatible:
  pre-existing intermittent failure.
- tests/hermes_cli/test_update_check.py::test_get_update_result_timeout:
  racing a background git-fetch thread that writes a real commits-behind
  value into module-level _update_result before assertion.

All 8 have been failing on main for multiple runs with no clear path to a
safe fix that doesn't require restructuring the tests' isolation story.
Removing is cheaper than chasing — the code paths they cover are
exercised elsewhere (send_message has 73+ other tests, skills_tool has
extensive coverage, TTS has other backend tests, update check has other
tests for check_for_updates proper).

Validation: all 4 files now pass cleanly: 169/169 under CI-parity env.
2026-04-19 19:38:02 -07:00
Teknium
b2f8e231dd fix(test): test get_update_result timeout behavior, not result-value identity
My previous attempt (patching check_for_updates) still lost the race:
the background update-check thread captures check_for_updates via
global lookup at call time, but on CI the thread was already past that
point (mid-git-fetch) by the time the test's patch took effect.  The
real fetch returned 4954 commits-behind and wrote that to
banner._update_result before the test's assertion ran.

Fix: test what we actually care about — that get_update_result respects
its timeout parameter — and drop the asserting-on-result-value that
races with legitimate background activity.  The get_update_result
function's job is to return after `timeout` seconds if the event isn't
set.  The value of `_update_result` is incidental to that test.

Validation: tests/hermes_cli/test_update_check.py now 9/9 pass under
CI-parity env, and the test no longer has a correctness dependency on
module-level state that other threads can write.
2026-04-19 19:18:19 -07:00
Teknium
ad4680cf74 fix(ci): stub resolve_runtime_provider in cron wake-gate tests + shield update-check timeout test from thread race
Two additional CI failures surfaced when the first PR ran through GHA —
both were pre-existing but blocked merge.

1) tests/cron/test_scheduler.py::TestRunJobWakeGate (3 tests)
   run_job calls resolve_runtime_provider BEFORE constructing AIAgent, so
   patching run_agent.AIAgent alone isn't enough — the resolver raises
   'No inference provider configured' in hermetic CI (no API keys) and
   the test never reaches the mocked AIAgent.  Added autouse fixture
   that stubs resolve_runtime_provider with a fake openrouter runtime.

2) tests/hermes_cli/test_update_check.py::test_get_update_result_timeout
   Observed on CI: assert 4950 is None.  A background update-check
   thread (from an earlier test or hermes_cli.main's own
   prefetch_update_check call) raced a real git-fetch result
   (4950 commits behind origin/main) into banner._update_result during
   this test's wait(0.1).  Wrap the test in patch.object(banner,
   'check_for_updates', return_value=None) so any in-flight thread
   writes None rather than a real value.

Validation:
  Under CI-parity env (env -i, no creds): 6/6 pass
  Broader suite (tests/hermes_cli + cron + gateway + run_agent/streaming
  + toolsets + discord_tool): 6033 passed, pre-existing failures in
  telegram_approval_buttons (3) and internal_event_bypass_pairing (1)
  are unrelated.
2026-04-19 19:18:19 -07:00
kshitijk4poor
d393104bad fix(gemini): tighten native routing and streaming replay
- only use the native adapter for the canonical Gemini native endpoint
- keep custom and /openai base URLs on the OpenAI-compatible path
- preserve Hermes keepalive transport injection for native Gemini clients
- stabilize streaming tool-call replay across repeated SSE events
- add follow-up tests for base_url precedence, async streaming, and duplicate tool-call chunks
2026-04-19 12:40:08 -07:00
kshitijk4poor
3dea497b20 feat(providers): route gemini through the native AI Studio API
- add a native Gemini adapter over generateContent/streamGenerateContent
- switch the built-in gemini provider off the OpenAI-compatible endpoint
- preserve thought signatures and native functionResponse replay
- route auxiliary Gemini clients through the same adapter
- add focused unit coverage plus native-provider integration checks
2026-04-19 12:40:08 -07:00
Teknium
aa5bd09232
fix(tests): unstick CI — sweep stale tests from recent merges (#12670)
One source fix (web_server category merge) + five test updates that
didn't travel with their feature PRs. All 13 failures on the 04-19
CI run on main are now accounted for (5 already self-healed on main;
8 fixed here).

Changes
- web_server.py: add code_execution → agent to _CATEGORY_MERGE (new
  singleton section from #11971 broke no-single-field-category invariant).
- test_browser_camofox_state: bump hardcoded _config_version 18 → 19
  (also from #11971).
- test_registry: add browser_cdp_tool (#12369) and discord_tool (#4753)
  to the expected built-in tool set.
- test_run_agent::test_tool_call_accumulation: rewrite fragment chunks
  — #0f778f77 switched streaming name-accumulation from += to = to
  fix MiniMax/NIM duplication; the test still encoded the old
  fragment-per-chunk premise.
- test_concurrent_interrupt::_Stub: no-op
  _apply_pending_steer_to_tool_results — #12116 added this call after
  concurrent tool batches; the hand-rolled stub was missing it.
- test_codex_cli_model_picker: drop the two obsolete tests that
  asserted auto-import from ~/.codex/auth.json into the Hermes auth
  store. #12360 explicitly removed that behavior (refresh-token reuse
  races with Codex CLI / VS Code); adoption is now explicit via
  `hermes auth openai-codex`. Remaining 3 tests in the file (normal
  path, Claude Code fallback, negative case) still cover the picker.

Validation
- scripts/run_tests.sh across all 6 affected files + surrounding tests
  (54 tests total) all green locally.
2026-04-19 12:39:58 -07:00
Teknium
c11ab6f64d feat(providers): enforce request_timeout_seconds on OpenAI-wire primary calls
Live test with timeout_seconds: 0.5 on claude-sonnet-4.6 proved the
initial wiring was insufficient: run_agent.py was overriding the
client-level timeout on every call via hardcoded per-request kwargs.

Root cause: run_agent.py had two sites that pass an explicit timeout=
kwarg into chat.completions.create() — api_kwargs['timeout'] at line
7075 (HERMES_API_TIMEOUT=1800s default) and the streaming path's
_httpx.Timeout(..., read=HERMES_STREAM_READ_TIMEOUT=120s, ...) at line
5760. Both override the per-provider config value the client was
constructed with, so a 0.5s config timeout would silently not enforce.

This commit:
- Adds AIAgent._resolved_api_call_timeout() — config > HERMES_API_TIMEOUT env > 1800s default.
- Uses it for the non-streaming api_kwargs['timeout'] field.
- Uses it for the streaming path's httpx.Timeout(connect, read, write, pool)
  so both connect and read respect the configured value when set.
  Local-provider auto-bump (Ollama/vLLM cold-start) only applies when
  no explicit config value is set.
- New test: test_resolved_api_call_timeout_priority covers all three
  precedence cases (config, env, default).

Live verified: 0.5s config on claude-sonnet-4.6 now triggers
APITimeoutError at ~3s per retry, exhausts 3 retries in ~15s total
(was: 29-47s success with timeout ignored). Positive case (60s config
+ gpt-4o-mini) still succeeds at 1.3s.
2026-04-19 11:23:00 -07:00
Teknium
f1fe29d1c3 feat(providers): extend request_timeout_seconds to all client paths
Follow-up on top of mvanhorn's cherry-picked commit. Original PR only
wired request_timeout_seconds into the explicit-creds OpenAI branch at
run_agent.py init; router-based implicit auth, native Anthropic, and the
fallback chain were still hardcoded to SDK defaults.

- agent/anthropic_adapter.py: build_anthropic_client() accepts an optional
  timeout kwarg (default 900s preserved when unset/invalid).
- run_agent.py: resolve per-provider/per-model timeout once at init; apply
  to Anthropic native init + post-refresh rebuild + stale/interrupt
  rebuilds + switch_model + _restore_primary_runtime + the OpenAI
  implicit-auth path + _try_activate_fallback (with immediate client
  rebuild so the first fallback request carries the configured timeout).
- tests: cover anthropic adapter kwarg honoring; widen mock signatures
  to accept the new timeout kwarg.
- docs/example: clarify that the knob now applies to every transport,
  the fallback chain, and rebuilds after credential rotation.
2026-04-19 11:23:00 -07:00
Matt Van Horn
3143d32330 feat(providers): add per-provider and per-model request_timeout_seconds config
Adds optional providers.<id>.request_timeout_seconds and
providers.<id>.models.<model>.timeout_seconds config, resolved via a new
hermes_cli/timeouts.py helper and applied where client_kwargs is built
in run_agent.py. Zero default behavior change: when both keys are unset,
the openai SDK default takes over.

Mirrors the existing _get_task_timeout pattern in agent/auxiliary_client.py
for auxiliary tasks - the primary turn path just never got the equivalent
knob.

Cross-project demand: openclaw/openclaw#43946 (17 reactions) asks for
exactly this config - specifically calls out Ollama cold-start hanging
the client.
2026-04-19 11:23:00 -07:00
Teknium
5a23f3291a fix(model_switch): section 3 base_url/model/dedup follow-up
On top of the salvaged PR #12505 (Jason/farion1231, which adds dict-format
models: enumeration to both sections), three section-3 refinements from
competing PR #11534 (YangManBOBO):

- accept base_url as canonical (matches Hermes's writer and custom_providers
  entries); keep api/url as fallbacks for legacy/hand-edited configs
- accept singular model as a default_model synonym, matching custom_providers
- add seen_slugs guard so the same provider slug appearing in both
  providers: dict and custom_providers: list emits exactly one picker row
  (providers: dict wins since section 3 runs first)

Two regression tests cover the new behavior. AUTHOR_MAP entry added for
farion1231 so CI doesn't reject the cherry-picked commit.
2026-04-19 11:07:29 -07:00
Jason
bca03eab20 fix(model_switch): enumerate dict-format models in /model picker
list_authenticated_providers() builds /model picker rows for CLI, TUI and
gateway flows, but fails to enumerate custom provider models stored in
dict form:

- custom_providers[] entries surface only the singular `model:` field,
  hiding every other model in the `models:` dict.
- providers: dict entries with dict-format `models:` are silently dropped
  and render as `(0 models)`.

Hermes's own writer (main.py::_save_custom_provider) persists configured
models as a dict keyed by model id, and most downstream readers
(agent/models_dev.py, gateway/run.py, run_agent.py, hermes_cli/config.py)
already consume that dict format. The /model picker was the only stale
path.

Add a dict branch in both sections of list_authenticated_providers(),
preferring dict (canonical) and keeping the list branch as fallback for
hand-edited / legacy configs. Dedup against the already-added default
model so nothing duplicates when the default is also a dict key.

Six new regression tests in tests/hermes_cli/ cover: dict models with a
default, dict models without a default, and default dedup against a
matching dict key.

Fixes #11677
Fixes #9148
Related: #11017
2026-04-19 11:07:29 -07:00
Teknium
ea0bd81b84 feat(skills): consolidate find-nearby into maps as a single location skill
find-nearby and the (new) maps optional skill both used OpenStreetMap's
Overpass + Nominatim to answer the same question — 'what's near this
location?' — so shipping both would be duplicate code for overlapping
capability. Consolidate into one active-by-default skill at
skills/productivity/maps/ that is a strict superset of find-nearby.

Moves + deletions:
- optional-skills/productivity/maps/ → skills/productivity/maps/ (active,
  no install step needed)
- skills/leisure/find-nearby/ → DELETED (fully superseded)

Upgrades to maps_client.py so it covers everything find-nearby did:
- Overpass server failover — tries overpass-api.de then
  overpass.kumi.systems so a single-mirror outage doesn't break the skill
  (new overpass_query helper, used by both nearby and bbox)
- nearby now accepts --near "<address>" as a shortcut that auto-geocodes,
  so one command replaces the old 'search → copy coords → nearby' chain
- nearby now accepts --category (repeatable) for multi-type queries in
  one call (e.g. --category restaurant --category bar), results merged
  and deduped by (osm_type, osm_id), sorted by distance, capped at --limit
- Each nearby result now includes maps_url (clickable Google Maps search
  link) and directions_url (Google Maps directions from the search point
  — only when a ref point is known)
- Promoted commonly-useful OSM tags to top-level fields on each result:
  cuisine, hours (opening_hours), phone, website — instead of forcing
  callers to dig into the raw tags dict

SKILL.md:
- Version bumped 1.1.0 → 1.2.0, description rewritten to lead with
  capability surface
- New 'Working With Telegram Location Pins' section replacing
  find-nearby's equivalent workflow
- metadata.hermes.supersedes: [find-nearby] so tooling can flag any
  lingering references to the old skill

External references updated:
- optional-skills/productivity/telephony/SKILL.md — related_skills
  find-nearby → maps
- website/docs/reference/skills-catalog.md — removed the (now-empty)
  'leisure' section, added 'maps' row under productivity
- website/docs/user-guide/features/cron.md — find-nearby example
  usages swapped to maps
- tests/tools/test_cronjob_tools.py, tests/hermes_cli/test_cron.py,
  tests/cron/test_scheduler.py — fixture string values swapped
- cli.py:5290 — /cron help-hint example swapped

Not touched:
- RELEASE_v0.2.0.md — historical record, left intact

E2E-verified live (Nominatim + Overpass, one query each):
- nearby --near "Times Square" --category restaurant --category bar → 3 results,
  sorted by distance, all with maps_url, directions_url, cuisine, phone, website
  where OSM had the tags

All 111 targeted tests pass across tests/cron/, tests/tools/, tests/hermes_cli/.
2026-04-19 05:19:22 -07:00
helix4u
361675018f fix(setup): stop hardcoding max-iterations copy 2026-04-19 00:28:25 -07:00
Teknium
b02833f32d
fix(codex): Hermes owns its own Codex auth; stop touching ~/.codex/auth.json (#12360)
Codex OAuth refresh tokens are single-use and rotate on every refresh.
Sharing them with the Codex CLI / VS Code via ~/.codex/auth.json made
concurrent use of both tools a race: whoever refreshed last invalidated
the other side's refresh_token.  On top of that, the silent auto-import
path picked up placeholder / aborted-auth data from ~/.codex/auth.json
(e.g. literal {"access_token":"access-new","refresh_token":"refresh-new"})
and seeded it into the Hermes pool as an entry the selector could
eventually pick.

Hermes now owns its own Codex auth state end-to-end:

Removed
- agent/credential_pool.py: _sync_codex_entry_from_cli() method,
  its pre-refresh + retry + _available_entries call sites, and the
  post-refresh write-back to ~/.codex/auth.json.
- agent/credential_pool.py: auto-import from ~/.codex/auth.json in
  _seed_from_singletons() — users now run `hermes auth openai-codex`
  explicitly.
- hermes_cli/auth.py: silent runtime migration in
  resolve_codex_runtime_credentials() — now surfaces
  `codex_auth_missing` directly (message already points to `hermes auth`).
- hermes_cli/auth.py: post-refresh write-back in
  _refresh_codex_auth_tokens().
- hermes_cli/auth.py: dead helper _write_codex_cli_tokens() and its 4
  tests in test_auth_codex_provider.py.

Kept
- hermes_cli/auth.py: _import_codex_cli_tokens() — still used by the
  interactive `hermes auth openai-codex` setup flow for a user-gated
  one-time import (with "a separate login is recommended" messaging).

User-visible impact
- On existing installs with Hermes auth already present: no change.
- On a fresh install where the user has only logged in via Codex CLI:
  `hermes chat --provider openai-codex` now fails with "No Codex
  credentials stored. Run `hermes auth` to authenticate." The
  interactive setup flow then detects ~/.codex/auth.json and offers a
  one-time import.
- On an install where Codex CLI later refreshes its token: Hermes is
  unaffected (we no longer read from that file at runtime).

Tests
- tests/hermes_cli/test_auth_codex_provider.py: 15/15 pass.
- tests/hermes_cli/test_auth_commands.py: 20/20 pass.
- tests/agent/test_credential_pool.py: 31/31 pass.
- Live E2E on openai-codex/gpt-5.4: 1 API call, 1.7s latency,
  3 log lines, no refresh events, no auth drama.

The related 14:52 refresh-loop bug (hundreds of rotations/minute on a
single entry) is a separate issue — that requires a refresh-attempt
cap on the auth-recovery path in run_agent.py, which remains open.
2026-04-18 19:19:46 -07:00
helix4u
ca32a2a60b fix(gemini): restore bearer auth on openai route 2026-04-18 12:52:01 -07:00
helix4u
a7dd6a3449 fix(gemini): hide stale and low-TPM Google models 2026-04-18 12:52:01 -07:00
helix4u
2eab7ee15f fix(gemini): hide low-TPM Gemma models from exposed lists 2026-04-18 12:52:01 -07:00
Teknium
285bb2b915
feat(execute_code): add project/strict execution modes, default to project (#11971)
Weaker models (Gemma-class) repeatedly rediscover and forget that
execute_code uses a different CWD and Python interpreter than terminal(),
causing them to flip-flop on whether user files exist and to hit import
errors on project dependencies like pandas.

Adds a new 'code_execution.mode' config key (default 'project') that
brings execute_code into line with terminal()'s filesystem/interpreter:

  project (new default):
    - cwd       = session's TERMINAL_CWD (falls back to os.getcwd())
    - python    = active VIRTUAL_ENV/bin/python or CONDA_PREFIX/bin/python
                  with a Python 3.8+ version check; falls back cleanly to
                  sys.executable if no venv or the candidate fails
    - result    : 'import pandas' works, '.env' resolves, matches terminal()

  strict (opt-in):
    - cwd       = staging tmpdir (today's behavior)
    - python    = sys.executable (today's behavior)
    - result    : maximum reproducibility and isolation; project deps
                  won't resolve

Security-critical invariants are identical across both modes and covered by
explicit regression tests:

  - env scrubbing (strips *_API_KEY, *_TOKEN, *_SECRET, *_PASSWORD,
    *_CREDENTIAL, *_PASSWD, *_AUTH substrings)
  - SANDBOX_ALLOWED_TOOLS whitelist (no execute_code recursion, no
    delegate_task, no MCP from inside scripts)
  - resource caps (5-min timeout, 50KB stdout, 50 tool calls)

Deliberately avoids 'sandbox'/'isolated'/'cloud' language in tool
descriptions (regression from commit 39b83f34 where agents on local
backends falsely believed they were sandboxed and refused networking).

Override via env var: HERMES_EXECUTE_CODE_MODE=strict|project
2026-04-18 01:46:25 -07:00
Teknium
598cba62ad
test: update stale tests to match current code (#11963)
Seven test files were asserting against older function signatures and
behaviors. CI has been red on main because of accumulated test debt
from other PRs; this catches the tests up.

- tests/agent/test_subagent_progress.py: _build_child_progress_callback
  now takes (task_index, goal, parent_agent, task_count=1); update all
  call sites and rewrite tests that assumed the old 'batch-only' relay
  semantics (now relays per-tool AND flushes a summary at BATCH_SIZE).
  Renamed test_thinking_not_relayed_to_gateway → test_thinking_relayed_to_gateway
  since thinking IS now relayed as subagent.thinking.
- tests/tools/test_delegate.py: _build_child_agent now requires
  task_count; add task_count=1 to all 8 call sites.
- tests/cli/test_reasoning_command.py: AIAgent gained _stream_callback;
  stub it on the two test agent helpers that use spec=AIAgent / __new__.
- tests/hermes_cli/test_cmd_update.py: cmd_update now runs npm install
  in repo root + ui-tui/ + web/ and 'npm run build' in web/; assert
  all four subprocess calls in the expected order.
- tests/hermes_cli/test_model_validation.py: dissimilar unknown models
  now return accepted=False (previously True with warning); update
  both affected tests.
- tests/tools/test_registry.py: include feishu_doc_tool and
  feishu_drive_tool in the expected builtin tool set.
- tests/gateway/test_voice_command.py: missing-voice-deps message now
  suggests 'pip install PyNaCl' not 'hermes-agent[messaging]'.

411/411 pass locally across these 7 files.
2026-04-17 21:35:30 -07:00
AviArora02-commits
994faacce8 fix: suppress Authorization: Bearer for Gemini provider to prevent HTTP 400 (#7893) 2026-04-17 21:30:17 -07:00
Teknium
8a59f8a9ed
fix(update): survive mid-update terminal disconnect (#11960)
hermes update no longer dies when the controlling terminal closes
(SSH drop, shell close) during pip install.  SIGHUP is set to SIG_IGN
for the duration of the update, and stdout/stderr are wrapped so writes
to a closed pipe are absorbed instead of cascading into process exit.
All update output is mirrored to ~/.hermes/logs/update.log so users can
see what happened after reconnecting.

SIGINT (Ctrl-C) and SIGTERM (systemd) are intentionally still honored —
those are deliberate cancellations, not accidents.  In gateway mode the
helper is a no-op since the update is already detached.

POSIX preserves SIG_IGN across exec(), so pip and git subprocesses
inherit hangup protection automatically — no changes to subprocess
spawning needed.
2026-04-17 21:29:24 -07:00
Teknium
53e4a2f2c6
feat(update): warn about legacy hermes.service units during hermes update (#11918)
Follow-up to #11909: surface the legacy-unit warning where users are most
likely to see it. After a 'hermes update', if a pre-rename hermes.service
is still installed alongside the current hermes-gateway.service, print
the list of legacy units + the 'hermes gateway migrate-legacy' command.

Profile-safe: reuses _find_legacy_hermes_units() which is an explicit
allowlist of hermes.service only — profile units never match.
Platform-gated: only prints on systemd hosts (the rename is Linux-only).
Non-blocking: just prints, never prompts, so gateway-spawned
hermes update --gateway runs aren't affected.
2026-04-17 19:35:12 -07:00
Teknium
07db20c72d
fix(gateway): detect legacy hermes.service + mark --replace SIGTERM as planned (#11909)
* fix(gateway): detect legacy hermes.service units from pre-rename installs

Older Hermes installs used a different service name (hermes.service) before
the rename to hermes-gateway.service. When both units remain installed, they
fight over the same bot token — after PR #5646's signal-recovery change,
this manifests as a 30-second SIGTERM flap loop between the two services.

Detection is an explicit allowlist (no globbing) plus an ExecStart content
check, so profile units (hermes-gateway-<profile>.service) and unrelated
third-party services named 'hermes' are never matched.

Wired into systemd_install, systemd_status, gateway_setup wizard, and the
main hermes setup flow — anywhere we already warn about scope conflicts now
also warns about legacy units.

* feat(gateway): add migrate-legacy command + install-time removal prompt

- New hermes_cli.gateway.remove_legacy_hermes_units() removes legacy
  unit files with stop → disable → unlink → daemon-reload. Handles user
  and system scopes separately; system scope returns path list when not
  running as root so the caller can tell the user to re-run with sudo.
- New 'hermes gateway migrate-legacy' subcommand (with --dry-run and -y)
  routes to remove_legacy_hermes_units via gateway_command dispatch.
- systemd_install now offers to remove legacy units BEFORE installing
  the new hermes-gateway.service, preventing the SIGTERM flap loop that
  hits users who still have pre-rename hermes.service around.

Profile units (hermes-gateway-<profile>.service) remain untouched in
all paths — the legacy allowlist is explicit (_LEGACY_SERVICE_NAMES)
and the ExecStart content check further narrows matches.

* fix(gateway): mark --replace SIGTERM as planned so target exits 0

PR #5646 made SIGTERM exit the gateway with code 1 so systemd's
Restart=on-failure revives it after unexpected kills. But when a user has
two gateway units fighting for the same bot token (e.g. legacy
hermes.service + hermes-gateway.service from a pre-rename install), the
--replace takeover itself becomes the 'unexpected' SIGTERM — the loser
exits 1, systemd revives it 30s later, and the cycle flaps indefinitely.

Before calling terminate_pid(), --replace now writes a short-lived marker
file naming the target PID + start_time. The target's shutdown_signal_handler
consumes the marker and, when it names this process, leaves
_signal_initiated_shutdown=False so the final exit code stays 0.

Staleness defences:
- PID + start_time combo prevents PID reuse matching an old marker
- Marker older than 60s is treated as stale and discarded
- Marker is unlinked on first read even if it doesn't match this process
- Replacer clears the marker post-loop + on permission-denied give-up
2026-04-17 19:27:58 -07:00
helix4u
47010e0757 fix(gateway): allow systemd-backed distrobox services 2026-04-17 19:24:30 -07:00
Teknium
2297c5f5ce fix(auth): restore --label for hermes auth add nous --type oauth
persist_nous_credentials() now accepts an optional label kwarg which
gets embedded in providers.nous under the 'label' key.
_seed_from_singletons() prefers the embedded label over the
auto-derived label_from_token() fingerprint when materialising the
pool entry, so re-seeding on every load_pool('nous') preserves the
user's chosen label.

auth_commands.py threads --label through to the helper, restoring
parity with how other OAuth providers (anthropic, codex, google,
qwen) honor the flag.

Tests: 4 new (embed, reseed-survives, no-label fallback, end-to-end
through auth_add_command). All 390 nous/auth/credential_pool tests
pass.
2026-04-17 19:13:40 -07:00
Antoine Khater
c7fece1f9d fix: normalise Nous device-code pool source to avoid duplicates
Review feedback on the original commit: the helper wrote a pool entry
with source `manual:device_code` while `_seed_from_singletons()` upserts
with `device_code` (no `manual:` prefix), so the pool grew a duplicate
row on every `load_pool()` after login.

Normalise: the helper now writes `providers.nous` and delegates the pool
write entirely to `_seed_from_singletons()` via a follow-up
`load_pool()` call. The canonical source is `device_code`; the helper
never materialises a parallel `manual:device_code` entry.

- `persist_nous_credentials()` loses its `label` and `source` kwargs —
  both are now derived by the seed path from the singleton state.
- CLI and web dashboard call sites simplified accordingly.
- New test `test_persist_nous_credentials_idempotent_no_duplicate_pool_entries`
  asserts that two consecutive persists leave exactly one pool row and
  no stray `manual:` entries.
- Existing `test_auth_add_nous_oauth_persists_pool_entry` updated to
  assert the canonical source and single-entry invariant.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 19:13:40 -07:00
Antoine Khater
c096a6935f fix(auth): mirror Nous OAuth credentials to providers.nous on CLI login
`hermes auth add nous --type oauth` only wrote credential_pool.nous,
leaving providers.nous empty. When the Nous agent_key's 24h TTL expired,
run_agent.py's 401-recovery path called resolve_nous_runtime_credentials
(which reads providers.nous), got AuthError "Hermes is not logged into
Nous Portal", caught it as logger.debug (suppressed at INFO level), and
the agent died with "Non-retryable client error" — no signal to the
user that recovery even tried.

Introduce persist_nous_credentials() as the single source of truth for
Nous device-code login persistence. Both auth_commands (CLI) and
web_server (dashboard) now route through it, so pool and providers
stay in sync at write time.

Why: CLI-provisioned profiles couldn't recover from agent_key expiry,
producing silent daily outages 24h after first login. PR #6856/#6869
addressed adjacent issues but assumed providers.nous was populated;
this one wasn't being written.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-04-17 19:13:40 -07:00
Teknium
04a0c3cb95
fix(config): preserve env refs when save_config rewrites config (#11892)
Co-authored-by: binhnt92 <84617813+binhnt92@users.noreply.github.com>
2026-04-17 19:03:26 -07:00
Teknium
8444f66890
feat(hermes model): add Configure auxiliary models UI to hermes model (#11891)
Previously users had to hand-edit config.yaml to route individual auxiliary
tasks (vision, compression, web_extract, etc.) to a specific provider+model.
Add a first-class picker reachable from the bottom of the existing `hermes
model` provider list.

Flow:
  hermes model
    → Configure auxiliary models...
      → <task picker: 9 tasks, shows current setting inline>
        → <provider picker: authenticated providers + auto + custom>
          → <model picker: curated list + live pricing>

The aux picker does NOT re-run credential/OAuth setup; users authenticate
providers through the normal `hermes model` flow, then route aux tasks to
them here.  `list_authenticated_providers()` gates the list to providers
the user has configured.

Also:
  - 'Cancel' entry relabeled 'Leave unchanged' (sentinel still 'cancel'
    internally, so dispatch logic is unchanged)
  - 'Reset all to auto' entry to bulk-clear aux overrides; preserves
    user-tuned timeout / download_timeout values
  - Adds `title_generation` task to DEFAULT_CONFIG.auxiliary — the task
    was called from agent/title_generator.py but was missing from defaults,
    so config-backed timeout overrides never worked for it

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 19:02:06 -07:00
Sara Reynolds
8ab1aa2efc fix(gateway): fix discrepancies in gateway status 2026-04-17 18:58:29 -07:00
Teknium
304fb921bf
fix: two process leaks (agent-browser daemons, paste.rs sleepers) (#11843)
Both fixes close process leaks observed in production (18+ orphaned
agent-browser node daemons, 15+ orphaned paste.rs sleep interpreters
accumulated over ~3 days, ~2.7 GB RSS).

## agent-browser daemon leak

Previously the orphan reaper (_reap_orphaned_browser_sessions) only ran
from _start_browser_cleanup_thread, which is only invoked on the first
browser tool call in a process. Hermes sessions that never used the
browser never swept orphans, and the cross-process orphan detection
relied on in-process _active_sessions, which doesn't see other hermes
PIDs' sessions (race risk).

- Write <session>.owner_pid alongside the socket dir recording the
  hermes PID that owns the daemon (extracted into _write_owner_pid for
  direct testability).
- Reaper prefers owner_pid liveness over in-process _active_sessions.
  Cross-process safe: concurrent hermes instances won't reap each
  other's daemons. Legacy tracked_names fallback kept for daemons
  that predate owner_pid.
- atexit handler (_emergency_cleanup_all_sessions) now always runs
  the reaper, not just when this process had active sessions —
  every clean hermes exit sweeps accumulated orphans.

## paste.rs auto-delete leak

_schedule_auto_delete spawned a detached Python subprocess per call
that slept 6 hours then issued DELETE requests. No dedup, no tracking —
every 'hermes debug share' invocation added ~20 MB of resident Python
interpreters that stuck around until the sleep finished.

- Replaced the spawn with ~/.hermes/pastes/pending.json: records
  {url, expire_at} entries.
- _sweep_expired_pastes() synchronously DELETEs past-due entries on
  every 'hermes debug' invocation (run_debug() dispatcher).
- Network failures stay in pending.json for up to 24h, then give up
  (paste.rs's own retention handles the 'user never runs hermes again'
  edge case).
- Zero subprocesses; regression test asserts subprocess/Popen/time.sleep
  never appear in the function source (skipping docstrings via AST).

## Validation

|                              | Before        | After        |
|------------------------------|---------------|--------------|
| Orphan agent-browser daemons | 18 accumulated| 2 (live)     |
| paste.rs sleep interpreters  | 15 accumulated| 0            |
| RSS reclaimed                | -             | ~2.7 GB      |
| Targeted tests               | -             | 2253 pass    |

E2E verified: alive-owner daemons NOT reaped; dead-owner daemons
SIGTERM'd and socket dirs cleaned; pending.json sweep deletes expired
entries without spawning subprocesses.
2026-04-17 18:46:30 -07:00
Brooklyn Nicholson
aa583cb14e Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 17:51:40 -05:00
Teknium
c6fd2619f7
fix(gemini-cli): surface MODEL_CAPACITY_EXHAUSTED cleanly + drop retired gemma-4-26b (#11833)
Google-side 429 Code Assist errors now flow through Hermes' normal rate-limit
path (status_code on the exception, Retry-After preserved via error.response)
instead of being opaque RuntimeErrors. User sees a one-line capacity message
instead of a 500-char JSON dump.

Changes
- CodeAssistError grows status_code / response / retry_after / details attrs.
  _extract_status_code in error_classifier picks up status_code and classifies
  429 as FailoverReason.rate_limit, so fallback_providers triggers the same
  way it does for SDK errors. run_agent.py line ~10428 already walks
  error.response.headers for Retry-After — preserving the response means that
  path just works.
- _gemini_http_error parses the Google error envelope (error.status +
  error.details[].reason from google.rpc.ErrorInfo, retryDelay from
  google.rpc.RetryInfo). MODEL_CAPACITY_EXHAUSTED / RESOURCE_EXHAUSTED / 404
  model-not-found each produce a human-readable message; unknown shapes fall
  back to the previous raw-body format.
- Drop gemma-4-26b-it from hermes_cli/models.py, hermes_cli/setup.py, and
  agent/model_metadata.py — Google returned 404 for it today in local repro.
  Kept gemma-4-31b-it (capacity-constrained but not retired).

Validation
|                           | Before                         | After                                     |
|---------------------------|--------------------------------|-------------------------------------------|
| Error message             | 'Code Assist returned HTTP 429: {500 chars JSON}' | 'Gemini capacity exhausted for gemini-2.5-pro (Google-side throttle...)' |
| status_code on error      | None (opaque RuntimeError)     | 429                                       |
| Classifier reason         | unknown (string-match fallback) | FailoverReason.rate_limit                |
| Retry-After honored       | ignored                        | extracted from RetryInfo or header        |
| gemma-4-26b-it picker     | advertised (404s on Google)    | removed                                   |

Unit + E2E tests cover non-streaming 429, streaming 429, 404 model-not-found,
Retry-After header fallback, malformed body, and classifier integration.
Targeted suites: tests/agent/test_gemini_cloudcode.py (81 tests), full
tests/hermes_cli (2203 tests) green.

Co-authored-by: teknium1 <teknium@nousresearch.com>
2026-04-17 15:34:12 -07:00
Teknium
3207b9bda0
test: speed up slow tests (backoff + subprocess + IMDS network) (#11797)
Cuts shard-3 local runtime in half by neutralizing real wall-clock
waits across three classes of slow test:

## 1. Retry backoff mocks

- tests/run_agent/conftest.py (NEW): autouse fixture mocks
  jittered_backoff to 0.0 so the `while time.time() < sleep_end`
  busy-loop exits immediately. No global time.sleep mock (would
  break threading tests).
- test_anthropic_error_handling, test_413_compression,
  test_run_agent_codex_responses, test_fallback_model: per-file
  fixtures mock time.sleep / asyncio.sleep for retry / compression
  paths.
- test_retaindb_plugin: cap the retaindb module's bound time.sleep
  to 0.05s via a per-test shim (background writer-thread retries
  sleep 2s after errors; tests don't care about exact duration).
  Plus replace arbitrary time.sleep(N) waits with short polling
  loops bounded by deadline.

## 2. Subprocess sleeps in production code

- test_update_gateway_restart: mock time.sleep. Production code
  does time.sleep(3) after `systemctl restart` to verify the
  service survived. Tests mock subprocess.run \u2014 nothing actually
  restarts \u2014 so the wait is dead time.

## 3. Network / IMDS timeouts (biggest single win)

- tests/conftest.py: add AWS_EC2_METADATA_DISABLED=true plus
  AWS_METADATA_SERVICE_TIMEOUT=1 and ATTEMPTS=1. boto3 falls back
  to IMDS (169.254.169.254) when no AWS creds are set. Any test
  hitting has_aws_credentials() / resolve_aws_auth_env_var() (e.g.
  test_status, test_setup_copilot_acp, anything that touches
  provider auto-detect) burned ~2-4s waiting for that to time out.
- test_exit_cleanup_interrupt: explicitly mock
  resolve_runtime_provider which was doing real network auto-detect
  (~4s). Tests don't care about provider resolution \u2014 the agent
  is already mocked.
- test_timezone: collapse the 3-test "TZ env in subprocess" suite
  into 2 tests by checking both injection AND no-leak in the same
  subprocess spawn (was 3 \u00d7 3.2s, now 2 \u00d7 4s).

## Validation

| Test | Before | After |
|---|---|---|
| test_anthropic_error_handling (8 tests) | ~80s | ~15s |
| test_413_compression (14 tests) | ~18s | 2.3s |
| test_retaindb_plugin (67 tests) | ~13s | 1.3s |
| test_status_includes_tavily_key | 4.0s | 0.05s |
| test_setup_copilot_acp_skips_same_provider_pool_step | 8.0s | 0.26s |
| test_update_gateway_restart (5 tests) | ~18s total | ~0.35s total |
| test_exit_cleanup_interrupt (2 tests) | 8s | 1.5s |
| **Matrix shard 3 local** | **108s** | **50s** |

No behavioral contract changed \u2014 tests still verify retry happens,
service restart logic runs, etc.; they just don't burn real seconds
waiting for it.

Supersedes PR #11779 (those changes are included here).
2026-04-17 14:21:22 -07:00
asurla
3b569ff576 feat(providers): add native NVIDIA NIM provider
Adds NVIDIA NIM as a first-class provider: ProviderConfig in
auth.py, HermesOverlay in providers.py, curated models
(Nemotron plus other open source models hosted on
build.nvidia.com), URL mapping in model_metadata.py, aliases
(nim, nvidia-nim, build-nvidia, nemotron), and env var tests.

Docs updated: providers page, quickstart table, fallback
providers table, and README provider list.
2026-04-17 13:47:46 -07:00
Brooklyn Nicholson
d5b9db8b4a Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 15:13:36 -05:00
Teknium
d0e1388ca9
fix(tests): make AIAgent constructor calls self-contained (#11755)
* fix(tests): make AIAgent constructor calls self-contained (no env leakage)

Tests in tests/run_agent/ were constructing AIAgent() without passing
both api_key and base_url, then relying on leaked state from other
tests in the same xdist worker (or process-level env vars) to keep
provider resolution happy. Under hermetic conftest + pytest-split,
that state is gone and the tests fail with 'No LLM provider configured'.

Fix: pass both api_key and base_url explicitly on 47 AIAgent()
construction sites across 13 files. AIAgent.__init__ with both set
takes the direct-construction path (line 960 in run_agent.py) and
skips the resolver entirely.

One call site (test_none_base_url_passed_as_none) left alone — that
test asserts behavior for base_url=None specifically.

This is a prerequisite for any future matrix-split or stricter
isolation work, and lands cleanly on its own.

Validation:
- tests/run_agent/ full: 760 passed, 0 failed (local)
- Previously relied on cross-test pollution; now self-contained

* fix(tests): update opencode-go model order assertion to match kimi-k2.5-first

commit 78a74bb promoted kimi-k2.5 to first position in model suggestion
lists but didn't update this test, which has been failing on main since.
Reorder expected list to match the new canonical order.
2026-04-17 12:32:03 -07:00
Brooklyn Nicholson
1f37ef2fd1 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-17 08:59:33 -05:00
Jorge
86f02d8d71 refactor(cli): align model picker viewport with PR #11260 vocabulary
Match the row-budget naming introduced in PR #11260 for the approval and
clarify panels: rename chrome_reserve=14 into reserved_below=6 (input
chrome below the panel) + panel_chrome=6 (this panel's borders, blanks,
and hint row) + min_visible=3 (floor on visible items). Same arithmetic
as before, but a reviewer reading both files now sees the same handle.

Compact-chrome mode is intentionally not adopted — that pattern fits the
"fixed mandatory content might overflow" shape of approval/clarify
(solved by truncating with a marker), whereas the picker's overflow is
already handled by the scrolling viewport.
2026-04-17 06:33:21 -07:00
Jorge
5fbe16635b fix(cli): scroll the /model picker viewport so long catalogs aren't clipped
The /model picker rendered every choice into a prompt_toolkit Window
with no max height. Providers with many models (e.g. Ollama Cloud's 36+)
overflowed the terminal, clipping the bottom border and the last items.

- Add HermesCLI._compute_model_picker_viewport() to slide a scroll
  offset that keeps the cursor on screen, sized from the live terminal
  rows minus chrome reserved for input/status/border.
- Render only the visible slice in _get_model_picker_display() and
  persist the offset on _model_picker_state across redraws.
- Bind ESC (eager) to close the picker, matching the Cancel button.
- Cover the viewport math with 8 unit tests in
  tests/hermes_cli/test_model_picker_viewport.py.
2026-04-17 06:33:21 -07:00
Teknium
c60b6dc317 test(dingtalk): cover get_connected_platforms + null platform_toolsets
Follow-ups to the salvaged commits in this PR:

* gateway/config.py — strip trailing whitespace from youngDoo's diff
  (line 315 had ~140 trailing spaces).

* hermes_cli/tools_config.py — replace `config.get("platform_toolsets", {})`
  with `config.get("platform_toolsets") or {}`. Handles the case where the
  YAML key is present but explicitly null (parses as None, previously
  crashed with AttributeError on the next line's .get(platform)).
  Cherry-picked from yyq4193's #9003 with attribution.

* tests/gateway/test_config.py — 4 new tests for TestGetConnectedPlatforms
  covering DingTalk via extras, via env vars, disabled, and missing creds.

* tests/hermes_cli/test_tools_config.py — regression test for the null
  platform_toolsets edge case.

* scripts/release.py — add kagura-agent, youngDoo, yyq4193 to AUTHOR_MAP.

Co-authored-by: yyq4193 <39405770+yyq4193@users.noreply.github.com>
2026-04-17 06:26:18 -07:00
Teknium
d404849351
test: make test env hermetic; enforce CI parity via scripts/run_tests.sh (#11577)
* test: make test env hermetic; enforce CI parity via scripts/run_tests.sh

Fixes the recurring 'works locally, fails in CI' (and vice versa) class
of flakes by making tests hermetic and providing a canonical local runner
that matches CI's environment.

## Layer 1 — hermetic conftest.py (tests/conftest.py)

Autouse fixture now unsets every credential-shaped env var before every
test, so developer-local API keys can't leak into tests that assert
'auto-detect provider when key present'.

Pattern: unset any var ending in _API_KEY, _TOKEN, _SECRET, _PASSWORD,
_CREDENTIALS, _ACCESS_KEY, _PRIVATE_KEY, etc. Plus an explicit list of
credential names that don't fit the suffix pattern (AWS_ACCESS_KEY_ID,
FAL_KEY, GH_TOKEN, etc.) and all the provider BASE_URL overrides that
change auto-detect behavior.

Also unsets HERMES_* behavioral vars (HERMES_YOLO_MODE, HERMES_QUIET,
HERMES_SESSION_*, etc.) that mutate agent behavior.

Also:
  - Redirects HOME to a per-test tempdir (not just HERMES_HOME), so
    code reading ~/.hermes/* directly can't touch the real dir.
  - Pins TZ=UTC, LANG=C.UTF-8, LC_ALL=C.UTF-8, PYTHONHASHSEED=0 to
    match CI's deterministic runtime.

The old _isolate_hermes_home fixture name is preserved as an alias so
any test that yields it explicitly still works.

## Layer 2 — scripts/run_tests.sh canonical runner

'Always use scripts/run_tests.sh, never call pytest directly' is the
new rule (documented in AGENTS.md). The script:
  - Unsets all credential env vars (belt-and-suspenders for callers
    who bypass conftest — e.g. IDE integrations)
  - Pins TZ/LANG/PYTHONHASHSEED
  - Uses -n 4 xdist workers (matches GHA ubuntu-latest; -n auto on
    a 20-core workstation surfaces test-ordering flakes CI will never
    see, causing the infamous 'passes in CI, fails locally' drift)
  - Finds the venv in .venv, venv, or main checkout's venv
  - Passes through arbitrary pytest args

Installs pytest-split on demand so the script can also be used to run
matrix-split subsets locally for debugging.

## Remove 3 module-level dotenv stubs that broke test isolation

tests/hermes_cli/test_{arcee,xiaomi,api_key}_provider.py each had a
module-level:

    if 'dotenv' not in sys.modules:
        fake_dotenv = types.ModuleType('dotenv')
        fake_dotenv.load_dotenv = lambda *a, **kw: None
        sys.modules['dotenv'] = fake_dotenv

This patches sys.modules['dotenv'] to a fake at import time with no
teardown. Under pytest-xdist LoadScheduling, whichever worker collected
one of these files first poisoned its sys.modules; subsequent tests in
the same worker that imported load_dotenv transitively (e.g.
test_env_loader.py via hermes_cli.env_loader) got the no-op lambda and
saw their assertions fail.

dotenv is a required dependency (python-dotenv>=1.2.1 in pyproject.toml),
so the defensive stub was never needed. Removed.

## Validation

- tests/hermes_cli/ alone: 2178 passed, 1 skipped, 0 failed (was 4
  failures in test_env_loader.py before this fix)
- tests/test_plugin_skills.py, tests/hermes_cli/test_plugins.py,
  tests/test_hermes_logging.py combined: 123 passed (the caplog
  regression tests from PR #11453 still pass)
- Local full run shows no F/E clusters in the 0-55% range that were
  previously present before the conftest hardening

## Background

See AGENTS.md 'Testing' section for the full list of drift sources
this closes. Matrix split (closed as #11566) will be re-attempted
once this foundation lands — cross-test pollution was the root cause
of the shard-3 hang in that PR.

* fix(conftest): don't redirect HOME — it broke CI subprocesses

PR #11577's autouse fixture was setting HOME to a per-test tempdir.
CI started timing out at 97% complete with dozens of E/F markers and
orphan python processes at cleanup — tests (or transitive deps)
spawn subprocesses that expect a stable HOME, and the redirect broke
them in non-obvious ways.

Env-var unsetting and TZ/LANG/hashseed pinning (the actual CI-drift
fixes) are unchanged and still in place. HERMES_HOME redirection is
also unchanged — that's the canonical way to isolate tests from
~/.hermes/, not HOME.

Any code in the codebase reading ~/.hermes/* via `Path.home() / ".hermes"`
instead of `get_hermes_home()` is a bug to fix at the callsite, not
something to paper over in conftest.
2026-04-17 06:09:09 -07:00
Teknium
13f2d997b0 test(dingtalk): cover QR device-flow auth + OpenClaw branding disclosure
Adds 15 regression tests for hermes_cli/dingtalk_auth.py covering:
  * _api_post — network error mapping, errcode-nonzero mapping, success path
  * begin_registration — 2-step chain, missing-nonce/device_code/uri
    error cases
  * wait_for_registration_success — success path, missing-creds guard,
    on_waiting callback invocation
  * render_qr_to_terminal — returns False when qrcode missing, prints
    when available
  * Configuration — BASE_URL default + override, SOURCE default

Also adds a one-line disclosure in dingtalk_qr_auth() telling users
the scan page will be OpenClaw-branded. Interim measure: DingTalk's
registration portal is hardcoded to route all sources to /openapp/
registration/openClaw, so users see OpenClaw branding regardless of
what 'source' value we send. We keep 'openClaw' as the source token
until DingTalk-Real-AI registers a Hermes-specific template.

Also adds meng93 to scripts/release.py AUTHOR_MAP.
2026-04-17 05:08:07 -07:00
Teknium
29d5d36b14
fix(copilot): normalize vendor-prefixed and dash-notation model IDs (#6879) (#11561)
The Copilot API returns HTTP 400 "model_not_supported" when it receives a
model ID it doesn't recognize (vendor-prefixed like
`anthropic/claude-sonnet-4.6` or dash-notation like `claude-sonnet-4-6`).
Two bugs combined to leave both formats unhandled:

1. `_COPILOT_MODEL_ALIASES` in hermes_cli/models.py only covered bare
   dot-notation and vendor-prefixed dot-notation.  Hermes' default Claude
   IDs elsewhere use hyphens (anthropic native format), and users with an
   aggregator-style config who switch `model.provider` to `copilot`
   inherit `anthropic/claude-X-4.6` — neither case was in the table.

2. The Copilot branch of `normalize_model_for_provider()` only stripped
   the vendor prefix when it matched the target provider (`copilot/`) or
   was the special-cased `openai/` for openai-codex.  Every other vendor
   prefix survived to the Copilot request unchanged.

Fix:

- Add dash-notation aliases (`claude-{opus,sonnet,haiku}-4-{5,6}` and the
  `anthropic/`-prefixed variants) to the alias table.
- Rewire the Copilot / Copilot-ACP branch of
  `normalize_model_for_provider()` to delegate to the existing
  `normalize_copilot_model_id()`.  That function already does alias
  lookups, catalog-aware resolution, and vendor-prefix fallback — it was
  being bypassed for the generic normalisation entry point.

Because `switch_model()` already calls `normalize_model_for_provider()`
for every `/model` switch (line 685 in model_switch.py), this single fix
covers the CLI startup path (cli.py), the `/model` slash command path,
and the gateway load-from-config path.

Closes #6879

Credits dsr-restyn (#6743) who independently diagnosed the dash-notation
case; their aliases are folded into this consolidated fix alongside the
vendor-prefix stripping repair.
2026-04-17 04:19:36 -07:00
Teknium
f268215019
fix(auth): codex auth remove no longer silently undone by auto-import (#11485)
* feat(skills): add 'hermes skills reset' to un-stick bundled skills

When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.

Adds an escape hatch for this case.

  hermes skills reset <name>
      Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
      re-baselines against the user's current copy. Future 'hermes update'
      runs accept upstream changes again. Non-destructive.

  hermes skills reset <name> --restore
      Also deletes the user's copy and re-copies the bundled version.
      Use when you want the pristine upstream skill back.

Also available as /skills reset in chat.

- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
  handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
  repro, --restore, unknown-skill error, upstream-removed-skill, and
  no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
  section explaining the origin-hash mechanic + reset usage

* fix(auth): codex auth remove no longer silently undone by auto-import

'hermes auth remove openai-codex' appeared to succeed but the credential
reappeared on the next command.  Two compounding bugs:

1. _seed_from_singletons() for openai-codex unconditionally re-imports
   tokens from ~/.codex/auth.json whenever the Hermes auth store is
   empty (by design — the Codex CLI and Hermes share that file).  There
   was no suppression check, unlike the claude_code seed path.

2. auth_remove_command's cleanup branch only matched
   removed.source == 'device_code' exactly.  Entries added via
   'hermes auth add openai-codex' have source 'manual:device_code', so
   for those the Hermes auth store's providers['openai-codex'] state was
   never cleared on remove — the next load_pool() re-seeded straight
   from there.

Net effect: there was no way to make a codex removal stick short of
manually editing both ~/.hermes/auth.json and ~/.codex/auth.json before
opening Hermes again.

Fix:

- Add unsuppress_credential_source() helper (mirrors
  suppress_credential_source()).
- Gate the openai-codex branch in _seed_from_singletons() with
  is_source_suppressed(), matching the claude_code pattern.
- Broaden auth_remove_command's codex match to handle both
  'device_code' and 'manual:device_code' (via endswith check), always
  call suppress_credential_source(), and print guidance about the
  unchanged ~/.codex/auth.json file.
- Clear the suppression marker in auth_add_command's openai-codex
  branch so re-linking via 'hermes auth add openai-codex' works.

~/.codex/auth.json is left untouched — that's the Codex CLI's own
credential store, not ours to delete.

Tests cover: unsuppress helper behavior, remove of both source
variants, add clears suppression, seed respects suppression.  E2E
verified: remove → load → add → load flow now behaves correctly.
2026-04-17 04:10:17 -07:00
Teknium
2367c6ffd5
test: remove 169 change-detector tests across 21 files (#11472)
First pass of test-suite reduction to address flaky CI and bloat.

Removed tests that fall into these change-detector patterns:

1. Source-grep tests (tests/gateway/test_feishu.py, test_email.py): tests
   that call inspect.getsource() on production modules and grep for string
   literals. Break on any refactor/rename even when behavior is correct.

2. Platform enum tautologies (every gateway/test_X.py): assertions like
   `Platform.X.value == 'x'` duplicated across ~9 adapter test files.

3. Toolset/PLATFORM_HINTS/setup-wizard registry-presence checks: tests that
   only verify a key exists in a dict. Data-layout tests, not behavior.

4. Argparse wiring tests (test_argparse_flag_propagation, test_subparser_routing
   _fallback): tests that do parser.parse_args([...]) then assert args.field.
   Tests Python's argparse, not our code.

5. Pure dispatch tests (test_plugins_cmd.TestPluginsCommandDispatch): patch
   cmd_X, call plugins_command with matching action, assert mock called.
   Tests the if/elif chain, not behavior.

6. Kwarg-to-mock verification (test_auxiliary_client ~45 tests,
   test_web_tools_config, test_gemini_cloudcode, test_retaindb_plugin): tests
   that mock the external API client, call our function, and assert exact
   kwargs. Break on refactor even when behavior is preserved.

7. Schedule-internal "function-was-called" tests (acp/test_server scheduling
   tests): tests that patch own helper method, then assert it was called.

Kept behavioral tests throughout: error paths (pytest.raises), security
tests (path traversal, SSRF, redaction), message alternation invariants,
provider API format conversion, streaming logic, memory contract, real
config load/merge tests.

Net reduction: 169 tests removed. 38 empty classes cleaned up.

Collected before: 12,522 tests
Collected after:  12,353 tests
2026-04-17 01:05:09 -07:00
Teknium
3f74dafaee
fix(nous): respect 'Skip (keep current)' after OAuth login (#11476)
* feat(skills): add 'hermes skills reset' to un-stick bundled skills

When a user edits a bundled skill, sync flags it as user_modified and
skips it forever. The problem: if the user later tries to undo the edit
by copying the current bundled version back into ~/.hermes/skills/, the
manifest still holds the old origin hash from the last successful
sync, so the fresh bundled hash still doesn't match and the skill stays
stuck as user_modified.

Adds an escape hatch for this case.

  hermes skills reset <name>
      Drops the skill's entry from ~/.hermes/skills/.bundled_manifest and
      re-baselines against the user's current copy. Future 'hermes update'
      runs accept upstream changes again. Non-destructive.

  hermes skills reset <name> --restore
      Also deletes the user's copy and re-copies the bundled version.
      Use when you want the pristine upstream skill back.

Also available as /skills reset in chat.

- tools/skills_sync.py: new reset_bundled_skill(name, restore=False)
- hermes_cli/skills_hub.py: do_reset() + wired into skills_command and
  handle_skills_slash; added to the slash /skills help panel
- hermes_cli/main.py: argparse entry for 'hermes skills reset'
- tests/tools/test_skills_sync.py: 5 new tests covering the stuck-flag
  repro, --restore, unknown-skill error, upstream-removed-skill, and
  no-op on already-clean state
- website/docs/user-guide/features/skills.md: new 'Bundled skill updates'
  section explaining the origin-hash mechanic + reset usage

* fix(nous): respect 'Skip (keep current)' after OAuth login

When a user already set up on another provider (e.g. OpenRouter) runs
`hermes model` and picks Nous Portal, OAuth succeeds and then a model
picker is shown.  If the user picks 'Skip (keep current)', the previous
provider + model should be preserved.

Previously, \_update_config_for_provider was called unconditionally after
login, which flipped config.yaml model.provider to 'nous' while keeping
the old model.default (e.g. anthropic/claude-opus-4.6 from OpenRouter),
leaving the user with a mismatched provider/model pair on the next
request.

Fix: snapshot the prior active_provider before login, and if no model is
selected (Skip, or no models available, or fetch failure), restore the
prior active_provider and leave config.yaml untouched.  The Nous OAuth
tokens stay saved so future `hermes model` -> Nous works without
re-authenticating.

Test plan:
- New tests cover Skip path (preserves provider+model, saves creds),
  pick-a-model path (switches to nous), and fresh-install Skip path
  (active_provider cleared, not stuck as 'nous').
2026-04-17 00:52:42 -07:00
Teknium
a55a133387
fix(tests): attach caplog to specific logger in 3 order-dependent tests (#11453)
Three tests in tests/test_plugin_skills.py and tests/hermes_cli/test_plugins.py
used caplog.at_level(logging.WARNING) without specifying a logger. When another
test earlier in the same xdist worker touched propagation on tools.skills_tool
or hermes_cli.plugins, caplog would miss the warning and the assertion would
fail intermittently in CI.

These three tests accounted for 15 of the last ~30 Tests workflow failures
(5 each), including the recent main failure on commit 436a7359 (PR #11398).

Fix: pass logger="tools.skills_tool" / logger="hermes_cli.plugins" to
caplog.at_level() so the handler attaches directly to the logger under test
and capture is independent of global propagation state.

Affected tests:
- tests/test_plugin_skills.py::TestSkillViewPluginGuards::test_injection_logged_but_served
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_empty_name_rejected
- tests/hermes_cli/test_plugins.py::TestPluginCommands::test_register_command_builtin_conflict_rejected

No production code change. Verified passing under xdist (-n 4) alongside
test_hermes_logging.py (the test most likely to poison the logger state).
2026-04-17 00:20:40 -07:00
Teknium
70768665a4
fix(mcp): consolidate OAuth handling, pick up external token refreshes (#11383)
* feat(mcp-oauth): scaffold MCPOAuthManager

Central manager for per-server MCP OAuth state. Provides
get_or_build_provider (cached), remove (evicts cache + deletes
disk), invalidate_if_disk_changed (mtime watch, core fix for
external-refresh workflow), and handle_401 (dedup'd recovery).

No behavior change yet — existing call sites still use
build_oauth_auth directly. Task 1 of 8 in the MCP OAuth
consolidation (fixes Cthulhu's BetterStack reliability issues).

* feat(mcp-oauth): add HermesMCPOAuthProvider with pre-flow disk watch

Subclasses the MCP SDK's OAuthClientProvider to inject a disk
mtime check before every async_auth_flow, via the central
manager. When a subclass instance is used, external token
refreshes (cron, another CLI instance) are picked up before
the next API call.

Still dead code: the manager's _build_provider still delegates
to build_oauth_auth and returns the plain OAuthClientProvider.
Task 4 wires this subclass in. Task 2 of 8.

* refactor(mcp-oauth): extract build_oauth_auth helpers

Decomposes build_oauth_auth into _configure_callback_port,
_build_client_metadata, _maybe_preregister_client, and
_parse_base_url. Public API preserved. These helpers let
MCPOAuthManager._build_provider reuse the same logic in Task 4
instead of duplicating the construction dance.

Also updates the SDK version hint in the warning from 1.10.0 to
1.26.0 (which is what we actually require for the OAuth types
used here). Task 3 of 8.

* feat(mcp-oauth): manager now builds HermesMCPOAuthProvider directly

_build_provider constructs the disk-watching subclass using the
helpers from Task 3, instead of delegating to the plain
build_oauth_auth factory. Any consumer using the manager now gets
pre-flow disk-freshness checks automatically.

build_oauth_auth is preserved as the public API for backwards
compatibility. The code path is now:

    MCPOAuthManager.get_or_build_provider  ->
      _build_provider  ->
        _configure_callback_port
        _build_client_metadata
        _maybe_preregister_client
        _parse_base_url
        HermesMCPOAuthProvider(...)

Task 4 of 8.

* feat(mcp): wire OAuth manager + add _reconnect_event

MCPServerTask gains _reconnect_event alongside _shutdown_event.
When set, _run_http / _run_stdio exit their async-with blocks
cleanly (no exception), and the outer run() loop re-enters the
transport to rebuild the MCP session with fresh credentials.
This is the recovery path for OAuth failures that the SDK's
in-place httpx.Auth cannot handle (e.g. cron externally consumed
the refresh_token, or server-side session invalidation).

_run_http now asks MCPOAuthManager for the OAuth provider
instead of calling build_oauth_auth directly. Config-time,
runtime, and reconnect paths all share one provider instance
with pre-flow disk-watch active.

shutdown() defensively sets both events so there is no race
between reconnect and shutdown signalling.

Task 5 of 8.

* feat(mcp): detect auth failures in tool handlers, trigger reconnect

All 5 MCP tool handlers (tool call, list_resources, read_resource,
list_prompts, get_prompt) now detect auth failures and route
through MCPOAuthManager.handle_401:

  1. If the manager says recovery is viable (disk has fresh tokens,
     or SDK can refresh in-place), signal MCPServerTask._reconnect_event
     to tear down and rebuild the MCP session with fresh credentials,
     then retry the tool call once.

  2. If no recovery path exists, return a structured needs_reauth
     JSON error so the model stops hallucinating manual refresh
     attempts (the 'let me curl the token endpoint' loop Cthulhu
     pasted from Discord).

_is_auth_error catches OAuthFlowError, OAuthTokenError,
OAuthNonInteractiveError, and httpx.HTTPStatusError(401). Non-auth
exceptions still surface via the generic error path unchanged.

Task 6 of 8.

* feat(mcp-cli): route add/remove through manager, add 'hermes mcp login'

cmd_mcp_add and cmd_mcp_remove now go through MCPOAuthManager
instead of calling build_oauth_auth / remove_oauth_tokens
directly. This means CLI config-time state and runtime MCP
session state are backed by the same provider cache — removing
a server evicts the live provider, adding a server populates
the same cache the MCP session will read from.

New 'hermes mcp login <name>' command:
  - Wipes both the on-disk tokens file and the in-memory
    MCPOAuthManager cache
  - Triggers a fresh OAuth browser flow via the existing probe
    path
  - Intended target for the needs_reauth error Task 6 returns
    to the model

Task 7 of 8.

* test(mcp-oauth): end-to-end integration tests

Five new tests exercising the full consolidation with real file
I/O and real imports (no transport mocks):

  1. external_refresh_picked_up_without_restart — Cthulhu's cron
     workflow. External process writes fresh tokens to disk;
     on the next auth flow the manager's mtime-watch flips
     _initialized and the SDK re-reads from storage.

  2. handle_401_deduplicates_concurrent_callers — 10 concurrent
     handlers for the same failed token fire exactly ONE recovery
     attempt (thundering-herd protection).

  3. handle_401_returns_false_when_no_provider — defensive path
     for unknown servers.

  4. invalidate_if_disk_changed_handles_missing_file — pre-auth
     state returns False cleanly.

  5. provider_is_reused_across_reconnects — cache stickiness so
     reconnects preserve the disk-watch baseline mtime.

Task 8 of 8 — consolidation complete.
2026-04-16 21:57:10 -07:00
Teknium
24fa055763
fix(ci): resolve 4 pre-existing main failures (docs lint + 3 stale tests) (#11373)
* docs: fix ascii-guard border alignment errors

Three docs pages had ASCII diagram boxes with off-by-one column
alignment issues that failed docs-site-checks CI:

- architecture.md: outer box is 71 cols but inner-box content lines
  and border corners were offset by 1 col, making content-line right
  border at col 70/72 while top/bottom border was at col 71. Inner
  boxes also had border corners at cols 19/36/53 but content pipes
  at cols 20/37/54. Rewrote the diagram with consistent 71-col width
  throughout, aligned inner boxes at cols 4-19, 22-37, 40-55 with
  2-space gaps and 15-space trailing padding.

- gateway-internals.md: same class of issue — outer box at 51 cols,
  inner content lines varied 52-54 cols. Rewrote with consistent
  51-col width, inner boxes at cols 4-15, 18-29, 32-43. Also
  restructured the bottom-half message flow so it's bare text
  (not half-open box cells) matching the intent of the original.

- agent-loop.md line 112-114: box 2 (API thread) content lines had
  one extra space pushing the right border to col 46 while the top
  and bottom borders of that box sat at col 45. Trimmed one trailing
  space from each of the three content lines.

All 123 docs files now pass `npm run lint:diagrams`:
  ✓ Errors: 0  (warnings: 6, non-fatal)

Pre-existing failures on main — unrelated to any open PR.

* test(setup): accept description kwarg in prompt_choice mock lambdas

setup.py's `_curses_prompt_choice` gained an optional `description`
parameter (used for rendering context hints alongside the prompt).
`prompt_choice` forwards it via keyword arg. The two existing tests
mocked `_curses_prompt_choice` with lambdas that didn't accept the
new kwarg, so the forwarded call raised TypeError.

Fix: add `description=None` to both mock lambda signatures so they
absorb the new kwarg without changing behavior.

* test(matrix): update stale audio-caching assertion

test_regular_audio_has_http_url asserted that non-voice audio
messages keep their HTTP URL and are NOT downloaded/cached. That
was true when the caching code only triggered on
`is_voice_message`. Since bec02f37 (encrypted-media caching
refactor), matrix.py caches all media locally — photos, audio,
video, documents — so downstream tools can read them as real
files via media_urls. This applies to regular audio too.

Renamed the test to `test_regular_audio_is_cached_locally`,
flipped the assertions accordingly, and documented the
intentional behavior change in the docstring. Other tests in
the file (voice-specific caching, message-type detection,
reply-to threading) continue to pass.

* test(413): allow multi-pass preflight compression

run_agent.py's preflight compression runs up to 3 passes in a loop
for very large sessions (each pass summarizes the middle N turns,
then re-checks tokens). The loop breaks when a pass returns a
message list no shorter than its input (can't compress further).

test_preflight_compresses_oversized_history used a static mock
return value that returned the same 2 messages regardless of input,
so the loop ran pass 1 (41 -> 2) and pass 2 (2 -> 2 -> break),
making call_count == 2. The assert_called_once() assertion was
strictly wrong under the multi-pass design.

The invariant the test actually cares about is: preflight ran, and
its first invocation received the full oversized history. Replaced
the count assertion with those two invariants.

* docs: drop '...' from gateway diagram, merge side-by-side boxes

ascii-guard 2.3.0 flagged two remaining issues after the initial fix
pass:

1. gateway-internals.md L33: the '...' suffix after inner box 3's
   right border got parsed as 'extra characters after inner-box right
   border'. Dropped the '...' — the surrounding prose already conveys
   'and more platforms' without needing the visual hint.

2. agent-loop.md: ascii-guard can't cleanly parse two side-by-side
   boxes of different heights (main thread 7 rows, API thread 5 rows).
   Even equalizing heights didn't help — the linter treats the left
   box's right border as the end of the diagram. Merged into a single
   54-char-wide outer box with both threads labeled as regions inside,
   keeping the ▶ arrow to preserve the main→API flow direction.
2026-04-16 20:43:41 -07:00
Brooklyn Nicholson
41d3d7afb7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 22:35:27 -05:00
Teknium
01906e99dd
feat(image_gen): multi-model FAL support with picker in hermes tools (#11265)
* feat(image_gen): multi-model FAL support with picker in hermes tools

Adds 8 FAL text-to-image models selectable via `hermes tools` →
Image Generation → (FAL.ai | Nous Subscription) → model picker.

Models supported:
- fal-ai/flux-2/klein/9b (new default, <1s, $0.006/MP)
- fal-ai/flux-2-pro (previous default, kept backward-compat upscaling)
- fal-ai/z-image/turbo (Tongyi-MAI, bilingual EN/CN)
- fal-ai/nano-banana (Gemini 2.5 Flash Image)
- fal-ai/gpt-image-1.5 (with quality tier: low/medium/high)
- fal-ai/ideogram/v3 (best typography)
- fal-ai/recraft-v3 (vector, brand styles)
- fal-ai/qwen-image (LLM-based)

Architecture:
- FAL_MODELS catalog declares per-model size family, defaults, supports
  whitelist, and upscale flag. Three size families handled uniformly:
  image_size_preset (flux family), aspect_ratio (nano-banana), and
  gpt_literal (gpt-image-1.5).
- _build_fal_payload() translates unified inputs (prompt + aspect_ratio)
  into model-specific payloads, merges defaults, applies caller overrides,
  wires GPT quality_setting, then filters to the supports whitelist — so
  models never receive rejected keys.
- IMAGEGEN_BACKENDS registry in tools_config prepares for future imagegen
  providers (Replicate, Stability, etc.); each provider entry tags itself
  with imagegen_backend: 'fal' to select the right catalog.
- Upscaler (Clarity) defaults off for new models (preserves <1s value
  prop), on for flux-2-pro (backward-compat). Per-model via FAL_MODELS.

Config:
  image_gen.model           = fal-ai/flux-2/klein/9b  (new)
  image_gen.quality_setting = medium                  (new, GPT only)
  image_gen.use_gateway     = bool                    (existing)

Agent-facing schema unchanged (prompt + aspect_ratio only) — model
choice is a user-level config decision, not an agent-level arg.

Picker uses curses_radiolist (arrow keys, auto numbered-fallback on
non-TTY). Column-aligned: Model / Speed / Strengths / Price.

Docs: image-generation.md rewritten with the model table and picker
walkthrough. tools-reference, tool-gateway, overview updated to drop
the stale "FLUX 2 Pro" wording.

Tests: 42 new in tests/tools/test_image_generation.py covering catalog
integrity, all 3 size families, supports filter, default merging, GPT
quality wiring, model resolution fallback. 8 new in
tests/hermes_cli/test_tools_config.py for picker wiring (registry,
config writes, GPT quality follow-up prompt, corrupt-config repair).

* feat(image_gen): translate managed-gateway 4xx to actionable error

When the Nous Subscription managed FAL proxy rejects a model with 4xx
(likely portal-side allowlist miss or billing gate), surface a clear
message explaining:
  1. The rejected model ID + HTTP status
  2. Two remediation paths: set FAL_KEY for direct access, or
     pick a different model via `hermes tools`

5xx, connection errors, and direct-FAL errors pass through unchanged
(those have different root causes and reasonable native messages).

Motivation: new FAL models added to this release (flux-2-klein-9b,
z-image-turbo, nano-banana, gpt-image-1.5, ideogram-v3, recraft-v3,
qwen-image) are untested against the Nous Portal proxy. If the portal
allowlists model IDs, users on Nous Subscription will hit cryptic
4xx errors without guidance on how to work around it.

Tests: 8 new cases covering status extraction across httpx/fal error
shapes and 4xx-vs-5xx-vs-ConnectionError translation policy.

Docs: brief note in image-generation.md for Nous subscribers.

Operator action (Nous Portal side): verify that fal-queue-gateway
passes through these 7 new FAL model IDs. If the proxy has an
allowlist, add them; otherwise Nous Subscription users will see the
new translated error and fall back to direct FAL.

* feat(image_gen): pin GPT-Image quality to medium (no user choice)

Previously the tools picker asked a follow-up question for GPT-Image
quality tier (low / medium / high) and persisted the answer to
`image_gen.quality_setting`. This created two problems:

1. Nous Portal billing complexity — the 22x cost spread between tiers
   ($0.009 low / $0.20 high) forces the gateway to meter per-tier per
   user, which the portal team can't easily support at launch.
2. User footgun — anyone picking `high` by mistake burns through
   credit ~6x faster than `medium`.

This commit pins quality at medium by baking it into FAL_MODELS
defaults for gpt-image-1.5 and removes all user-facing override paths:

- Removed `_resolve_gpt_quality()` runtime lookup
- Removed `honors_quality_setting` flag on the model entry
- Removed `_configure_gpt_quality_setting()` picker helper
- Removed `_GPT_QUALITY_CHOICES` constant
- Removed the follow-up prompt call in `_configure_imagegen_model()`
- Even if a user manually edits `image_gen.quality_setting` in
  config.yaml, no code path reads it — always sends medium.

Tests:
- Replaced TestGptQualitySetting (6 tests) with TestGptQualityPinnedToMedium
  (5 tests) — proves medium is baked in, config is ignored, flag is
  removed, helper is removed, non-gpt models never get quality.
- Replaced test_picker_with_gpt_image_also_prompts_quality with
  test_picker_with_gpt_image_does_not_prompt_quality — proves only 1
  picker call fires when gpt-image is selected (no quality follow-up).

Docs updated: image-generation.md replaces the quality-tier table
with a short note explaining the pinning decision.

* docs(image_gen): drop stale 'wires GPT quality tier' line from internals section

Caught in a cleanup sweep after pinning quality to medium. The
"How It Works Internally" walkthrough still described the removed
quality-wiring step.
2026-04-16 20:19:53 -07:00
Teknium
ab33ce1c86
fix(opencode): strip /v1 from base_url on mid-session /model switch to Anthropic-routed models (#11286)
PR #4918 fixed the double-/v1 bug at fresh agent init by stripping the
trailing /v1 from OpenCode base URLs when api_mode is anthropic_messages
(so the Anthropic SDK's own /v1/messages doesn't land on /v1/v1/messages).
The same logic was missing from the /model mid-session switch path.

Repro: start a session on opencode-go with GLM-5 (or any chat_completions
model), then `/model minimax-m2.7`. switch_model() correctly sets
api_mode=anthropic_messages via opencode_model_api_mode(), but base_url
passes through as https://opencode.ai/zen/go/v1. The Anthropic SDK then
POSTs to https://opencode.ai/zen/go/v1/v1/messages, which returns the
OpenCode website 404 HTML page (title 'Not Found | opencode').

Same bug affects `/model claude-sonnet-4-6` on opencode-zen.

Verified upstream: POST /v1/messages returns clean JSON 401 with x-api-key
auth (route works), while POST /v1/v1/messages returns the exact HTML 404
users reported.

Fix mirrors runtime_provider.resolve_runtime_provider:
- hermes_cli/model_switch.py::switch_model() strips /v1 after the OpenCode
  api_mode override when the resolved mode is anthropic_messages.
- run_agent.py::AIAgent.switch_model() applies the same strip as
  defense-in-depth, so any direct caller can't reintroduce the double-/v1.

Tests: 9 new regression tests in tests/hermes_cli/test_model_switch_opencode_anthropic.py
covering minimax on opencode-go, claude on opencode-zen, chat_completions
(GLM/Kimi/Gemini) keeping /v1 intact, codex_responses (GPT) keeping /v1
intact, trailing-slash handling, and the agent-level defense-in-depth.
2026-04-16 19:41:41 -07:00
Brooklyn Nicholson
ca30803d89 chore(tui): strip noise comments 2026-04-16 19:14:05 -05:00
Brooklyn Nicholson
7f1204840d test(tui): fix stale mocks + xdist flakes in TUI test suite
All 61 TUI-related tests green across 3 consecutive xdist runs.

tests/tui_gateway/test_protocol.py:
- rename `get_messages` → `get_messages_as_conversation` on mock DB (method
  was renamed in the real backend, test was still stubbing the old name)
- update tool-message shape expectation: `{role, name, context}` matches
  current `_history_to_messages` output, not the legacy `{role, text}`

tests/hermes_cli/test_tui_resume_flow.py:
- `cmd_chat` grew a first-run provider-gate that bailed to "Run: hermes
  setup" before `_launch_tui` was ever reached; 3 tests stubbed
  `_resolve_last_session` + `_launch_tui` but not the gate
- factored a `main_mod` fixture that stubs `_has_any_provider_configured`,
  reused by all three tests

tests/test_tui_gateway_server.py:
- `test_config_set_personality_resets_history_and_returns_info` was flaky
  under xdist because the real `_write_config_key` touches
  `~/.hermes/config.yaml`, racing with any other worker that writes
  config. Stub it in the test.
2026-04-16 19:07:49 -05:00
helix4u
6ba4bb6b8e fix(models): add glm-5.1 to opencode-go catalogs 2026-04-16 16:49:22 -07:00
Ben
79156ab19c dashboard: show GATEWAY_HEALTH_URL instead of PID for remote gateways
When the dashboard connects to a remote gateway via GATEWAY_HEALTH_URL,
display the URL instead of the remote PID (which is meaningless locally).
Falls back to PID display for local gateways as before.

- Backend: expose gateway_health_url in /api/status response
- Frontend: prefer gateway_health_url over PID in gatewayValue()
- Add truncate + title tooltip for long URLs that overflow the card
- Add min-w-0/overflow-hidden on status cards for proper truncation
- Tests: verify gateway_health_url in remote and no-URL scenarios
2026-04-16 16:48:14 -07:00
Brooklyn Nicholson
cb2a737bc8 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 14:48:33 -05:00
emozilla
f188ac74f0 feat: ungate Tool Gateway — subscription-based access with per-tool opt-in
Replace the HERMES_ENABLE_NOUS_MANAGED_TOOLS env-var feature flag with
subscription-based detection. The Tool Gateway is now available to any
paid Nous subscriber without needing a hidden env var.

Core changes:
- managed_nous_tools_enabled() checks get_nous_auth_status() +
  check_nous_free_tier() instead of an env var
- New use_gateway config flag per tool section (web, tts, browser,
  image_gen) records explicit user opt-in and overrides direct API
  keys at runtime
- New prefers_gateway(section) shared helper in tool_backend_helpers.py
  used by all 4 tool runtimes (web, tts, image gen, browser)

UX flow:
- hermes model: after Nous login/model selection, shows a curses
  prompt listing all gateway-eligible tools with current status.
  User chooses to enable all, enable only unconfigured tools, or skip.
  Defaults to Enable for new users, Skip when direct keys exist.
- hermes tools: provider selection now manages use_gateway flag —
  selecting Nous Subscription sets it, selecting any other provider
  clears it
- hermes status: renamed section to Nous Tool Gateway, added
  free-tier upgrade nudge for logged-in free users
- curses_radiolist: new description parameter for multi-line context
  that survives the screen clear

Runtime behavior:
- Each tool runtime (web_tools, tts_tool, image_generation_tool,
  browser_use) checks prefers_gateway() before falling back to
  direct env-var credentials
- get_nous_subscription_features() respects use_gateway flags,
  suppressing direct credential detection when the user opted in

Removed:
- HERMES_ENABLE_NOUS_MANAGED_TOOLS env var and all references
- apply_nous_provider_defaults() silent TTS auto-set
- get_nous_subscription_explainer_lines() static text
- Override env var warnings (use_gateway handles this properly now)
2026-04-16 12:36:49 -07:00
Brooklyn Nicholson
9c71f3a6ea Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 10:47:41 -05:00
Jorge
5b4773fc20 fix: wire up Ollama Cloud dynamic model discovery in /model TUI picker
provider_model_ids() and list_authenticated_providers() had no case for
"ollama-cloud", so the /model slash command showed 0 models despite
fetch_ollama_cloud_models() being fully implemented. The CLI subcommand
worked because it called fetch_ollama_cloud_models() directly.

- Add ollama-cloud case to provider_model_ids() in models.py
- Populate curated dict for ollama-cloud in list_authenticated_providers()
- Add tests for both code paths
2026-04-16 07:17:45 -07:00
Teknium
3c42064efc
fix: enforce config.yaml as sole CWD source + deprecate .env CWD vars + add hermes memory reset (#11029)
config.yaml terminal.cwd is now the single source of truth for working
directory. MESSAGING_CWD and TERMINAL_CWD in .env are deprecated with a
migration warning.

Changes:

1. config.py: Remove MESSAGING_CWD from OPTIONAL_ENV_VARS (setup wizard
   no longer prompts for it). Add warn_deprecated_cwd_env_vars() that
   prints a migration hint when deprecated env vars are detected.

2. gateway/run.py: Replace all MESSAGING_CWD reads with TERMINAL_CWD
   (which is bridged from config.yaml terminal.cwd). MESSAGING_CWD is
   still accepted as a backward-compat fallback with deprecation warning.
   Config bridge skips cwd placeholder values so they don't clobber
   the resolved TERMINAL_CWD.

3. cli.py: Guard against lazy-import clobbering — when cli.py is
   imported lazily during gateway runtime (via delegate_tool), don't
   let load_cli_config() overwrite an already-resolved TERMINAL_CWD
   with os.getcwd() of the service's working directory. (#10817)

4. hermes_cli/main.py: Add 'hermes memory reset' command with
   --target all/memory/user and --yes flags. Profile-scoped via
   HERMES_HOME.

Migration path for users with .env settings:
  Remove MESSAGING_CWD / TERMINAL_CWD from .env
  Add to config.yaml:
    terminal:
      cwd: /your/project/path

Addresses: #10225, #4672, #10817, #7663
2026-04-16 06:48:33 -07:00
Bartok9
73befa505d fix(cli): handle null/non-dict display config in skin initialization
display: null or display: <non-dict> in config.yaml crashed skin init
with AttributeError. Now falls back to default skin gracefully.

Cherry-picked from #10867 by @Bartok9. Consolidates #10876 by @cola-runner.

Co-authored-by: cola-runner <cola-runner@users.noreply.github.com>
2026-04-16 06:35:31 -07:00
Brooklyn Nicholson
f81dba0da2 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-16 08:23:20 -05:00
kshitijk4poor
1b61ec470b feat: add Ollama Cloud as built-in provider
Add ollama-cloud as a first-class provider with full parity to existing
API-key providers (gemini, zai, minimax, etc.):

- PROVIDER_REGISTRY entry with OLLAMA_API_KEY env var
- Provider aliases: ollama -> custom (local), ollama_cloud -> ollama-cloud
- models.dev integration for accurate context lengths
- URL-to-provider mapping (ollama.com -> ollama-cloud)
- Passthrough model normalization (preserves Ollama model:tag format)
- Default auxiliary model (nemotron-3-nano:30b)
- HermesOverlay in providers.py
- CLI --provider choices, CANONICAL_PROVIDERS entry
- Dynamic model discovery with disk caching (1hr TTL)
- 37 provider-specific tests

Cherry-picked from PR #6038 by kshitijk4poor. Closes #3926
2026-04-16 02:22:09 -07:00
helix4u
4093982f19 fix: recompute Copilot api_mode after model switch
Recomputes GitHub Copilot api_mode from the selected model in the
shared /model switch path.  Before this change, Copilot could carry a
stale codex_responses mode forward from a GPT-5 selection into a later
Claude model switch, causing unsupported_api_for_model errors.

Cherry-picked from #10533 by @helix4u with:
- Comment specificity (Provider-specific → Copilot api_mode override)
- Fix pre-existing duplicate opencode-go in set literal
- Extract test mock helper to reduce duplication
- Add GPT-5 → GPT-5 regression test (keeps codex_responses)
2026-04-16 01:16:14 -07:00
Teknium
36b54afbc4
feat(plugins): add dispatch_tool() to PluginContext (#10763)
Expands the plugin interface so slash command handlers can dispatch tool
calls through the registry with parent agent context wired up automatically.

This is the public API for plugins that need to orchestrate tools like
delegate_task — they call ctx.dispatch_tool() instead of reaching into
framework internals. The parent agent is resolved lazily from _cli_ref
when available (CLI mode) and omitted in gateway mode (tools degrade
gracefully).

Enables the hermes-deliver-plugin pattern where /deliver and /fanout
slash commands spawn subagents via delegate_task without touching the
agent conversation loop.

7 new tests covering: registry delegation, parent_agent injection from
cli_ref, gateway mode (no cli_ref), uninitialized agent, explicit
parent_agent override, kwargs forwarding, return value passthrough.
2026-04-15 22:23:01 -07:00
kshitijk4poor
ff5bf0d6c8 fix(tests): resolve CI test failures — pool auto-seeding, stale assertions, mock isolation
Salvaged from PR #10643 by kshitijk4poor, updated for current main.

Root causes fixed:
1. Telegram xdist mock pollution — new tests/gateway/conftest.py with shared
   mock that runs at collection time (prevents ChatType=None caching)
2. VIRTUAL_ENV env var leak — monkeypatch.delenv in _detect_venv_dir tests
3. Copilot base_url missing — add fallback in _resolve_runtime_from_pool_entry
4. Stale vision model assertion — zai now uses glm-5v-turbo
5. Reasoning item id intentionally stripped — assert 'id' not in (store=False)
6. Context length warning unreachable — pass base_url to AIAgent in test
7. Kimi provider label updated — 'Kimi / Kimi Coding Plan' matches models.py
8. Google Workspace calendar tests — rewritten for current production code,
   properly mock subprocess on api_module, removed stale +agenda assertions
9. Credential pool auto-seeding — mock _select_pool_entry / _resolve_auto /
   _import_codex_cli_tokens to prevent real credentials from leaking into tests
2026-04-15 22:05:21 -07:00
Teknium
498b995c13
feat: implement register_command() on plugin context (#10626)
Complete the half-built plugin slash command system. The dispatch
code in cli.py and gateway/run.py already called
get_plugin_command_handler() but the registration side was never
implemented.

Changes:
- Add register_command() to PluginContext — stores handler,
  description, and plugin name; normalizes names; rejects conflicts
  with built-in commands
- Add _plugin_commands dict to PluginManager
- Add commands_registered tracking on LoadedPlugin
- Add get_plugin_command_handler() and get_plugin_commands()
  module-level convenience functions
- Fix commands.py to use actual plugin description in Telegram
  bot menu (was hardcoded 'Plugin command')
- Add plugin commands to SlashCommandCompleter autocomplete
- Show command count in /plugins display
- 12 new tests covering registration, conflict detection,
  normalization, handler dispatch, and introspection

Closes #10495
2026-04-15 19:53:11 -07:00
Brooklyn Nicholson
097702c8a7 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 19:11:07 -05:00
Brenner Spear
2fbdc2c8fa feat(discord): add channel_prompts config
Add native Discord channel_prompts support with parent forum fallback,
ephemeral runtime injection, config migration updates, docs, and tests.
2026-04-15 16:31:28 -07:00
Brooklyn Nicholson
72aebfbb24 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 17:43:41 -05:00
Harish Kukreja
f1df83179f fix(doctor): skip health check for OpenCode Go (no shared /models endpoint)
OpenCode Go does not expose a shared /models endpoint, so the doctor
probe was always failing and producing a false warning. Set the default
URL to None and disable the health check for this provider.
2026-04-15 15:05:32 -07:00
Teknium
3b4ecf8ee7
fix: remove 'q' alias from /quit so /queue's 'q' alias works (#10467) (#10538)
Both /queue and /quit registered 'q' as an alias. Since /quit appeared
later in COMMAND_REGISTRY, _build_command_lookup() silently overwrote
/queue's claim, making the documented /queue shorthand unusable.

Fix: remove 'q' from /quit's aliases. /quit already has 'exit' as an
alias plus the full '/quit' command. /queue has no other short alias.

Closes #10467
2026-04-15 15:04:01 -07:00
Brooklyn Nicholson
baa0de7649 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 16:35:01 -05:00
Teknium
19142810ed
fix: /debug privacy — auto-delete pastes after 1 hour, add privacy notices (#10510)
- Pastes uploaded by /debug now auto-delete after 1 hour via a detached
  background process that sends DELETE to paste.rs
- CLI: shows privacy notice listing what data will be uploaded
- Gateway: only uploads summary report (system info + log tails), NOT
  full log files containing conversation content
- Added 'hermes debug delete <url>' for immediate manual deletion
- 16 new tests covering auto-delete scheduling, paste deletion, privacy
  notices, and the delete subcommand

Addresses user privacy concern where /debug uploaded full conversation
logs to a public paste service with no warning or expiry.
2026-04-15 13:40:27 -07:00
Brooklyn Nicholson
cc15b55bb9 chore: uptick 2026-04-15 10:23:15 -05:00
Brooklyn Nicholson
371166fe26 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-15 10:21:00 -05:00
Arihant Sethia
857b543543 feat: add skill analytics to the dashboard
Expose skill usage in analytics so the dashboard and insights output can
show which skills the agent loads and manages over time.

This adds skill aggregation to the InsightsEngine by extracting
`skill_view` and `skill_manage` calls from assistant tool_calls,
computing per-skill totals, and including the results in both terminal
and gateway insights formatting. It also extends the dashboard analytics
API and Analytics page to render a Top Skills table.

Terminology is aligned with the skills docs:
  - Agent Loaded = `skill_view` events
  - Agent Managed = `skill_manage` actions

Architecture:
  - agent/insights.py collects and aggregates per-skill usage
  - hermes_cli/web_server.py exposes `skills` on `/api/analytics/usage`
  - web/src/lib/api.ts adds analytics skill response types
  - web/src/pages/AnalyticsPage.tsx renders the Top Skills table
  - web/src/i18n/{en,zh}.ts updates user-facing labels

Tests:
  - tests/agent/test_insights.py covers skill aggregation and formatting
  - tests/hermes_cli/test_web_server.py covers analytics API contract
    including the `skills` payload
  - verified with `cd web && npm run build`

Files changed:
  - agent/insights.py
  - hermes_cli/web_server.py
  - tests/agent/test_insights.py
  - tests/hermes_cli/test_web_server.py
  - web/src/i18n/en.ts
  - web/src/i18n/types.ts
  - web/src/i18n/zh.ts
  - web/src/lib/api.ts
  - web/src/pages/AnalyticsPage.tsx
2026-04-15 06:44:43 +00:00
Ubuntu
da8bab77fb fix(cli): restore messaging toolset for gateway platforms 2026-04-14 23:13:35 -07:00
Teknium
9932366f3c feat(doctor): add Command Installation check for hermes bin symlink
hermes doctor now checks whether the ~/.local/bin/hermes symlink exists
and points to the correct venv entry point. With --fix, it creates or
repairs the symlink automatically.

Covers:
- Missing symlink at ~/.local/bin/hermes (or $PREFIX/bin on Termux)
- Symlink pointing to wrong target
- Missing venv entry point (venv/bin/hermes or .venv/bin/hermes)
- PATH warning when ~/.local/bin is not on PATH
- Skipped on Windows (different mechanism)

Addresses user report: 'python -m hermes_cli.main doesn't have an option
to fix the local bin/install'

10 new tests covering all scenarios.
2026-04-14 23:13:11 -07:00
Teknium
029938fbed
fix(cli): defensive subparser routing for argparse bpo-9338 (#10113)
On some Python versions, argparse fails to route subcommand tokens when
the parent parser has nargs='?' optional arguments (--continue).  The
symptom: 'hermes model' produces 'unrecognized arguments: model' even
though 'model' is a registered subcommand.

Fix: when argv contains a token matching a known subcommand, set
subparsers.required=True to force deterministic routing.  If that fails
(e.g. 'hermes -c model' where 'model' is consumed as the session name
for --continue), fall back to the default optional-subparsers behaviour.

Adds 13 tests covering all key argument combinations.

Reported via user screenshot showing the exact error on an installed
version with the model subcommand listed in usage but rejected at parse
time.
2026-04-14 23:13:02 -07:00
Brooklyn Nicholson
496bfb3c59 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 22:30:22 -05:00
Teknium
da528a8207 fix: detect and strip non-ASCII characters from API keys (#6843)
API keys containing Unicode lookalike characters (e.g. ʋ U+028B instead
of v) cause UnicodeEncodeError when httpx encodes the Authorization
header as ASCII.  This commonly happens when users copy-paste keys from
PDFs, rich-text editors, or web pages with decorative fonts.

Three layers of defense:

1. **Save-time validation** (hermes_cli/config.py):
   _check_non_ascii_credential() strips non-ASCII from credential values
   when saving to .env, with a clear warning explaining the issue.

2. **Load-time sanitization** (hermes_cli/env_loader.py):
   _sanitize_loaded_credentials() strips non-ASCII from credential env
   vars (those ending in _API_KEY, _TOKEN, _SECRET, _KEY) after dotenv
   loads them, so the rest of the codebase never sees non-ASCII keys.

3. **Runtime recovery** (run_agent.py):
   The UnicodeEncodeError recovery block now also sanitizes self.api_key
   and self._client_kwargs['api_key'], fixing the gap where message/tool
   sanitization succeeded but the API key still caused httpx to fail on
   the Authorization header.

Also: hermes_logging.py RotatingFileHandler now explicitly sets
encoding='utf-8' instead of relying on locale default (defensive
hardening for ASCII-locale systems).
2026-04-14 20:20:31 -07:00
Brooklyn Nicholson
77cd5bf565 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 19:33:03 -05:00
Teknium
2a98098035 fix: hermes gateway restart waits for service to come back up (#8260)
Previously, systemd_restart() sent SIGUSR1 to the gateway, printed
'restart requested', and returned immediately. The gateway still
needed to drain active agents, exit with code 75, wait for systemd's
RestartSec=30, and start the new process. The user saw 'success' but
the gateway was actually down for 30-60 seconds.

Now the SIGUSR1 path blocks with progress feedback:

Phase 1 — wait for old process to die:
   User service draining active work...
  Polls os.kill(pid, 0) until ProcessLookupError (up to 90s)

Phase 2 — wait for new process to become active:
   Waiting for hermes-gateway to restart...
  Polls systemctl is-active + verifies new PID (up to 60s)

Success:
  ✓ User service restarted (PID 12345)

Timeout:
  ⚠ User service did not become active within 60s.
    Check status: hermes gateway status
    Check logs: journalctl --user -u hermes-gateway --since '2 min ago'

The reload-or-restart fallback path (line 1189) already blocks because
systemctl reload-or-restart is synchronous.

Test plan:
- Updated test to verify wait-for-restart behavior
- All 118 gateway CLI tests pass
2026-04-14 17:12:58 -07:00
Teknium
10494b42a1
feat(discord): register skills under /skill command group with category subcommands (#9909)
Instead of consuming one top-level slash command slot per skill (hitting the
100-command limit with ~26 built-ins + 74 skills), skills are now organized
under a single /skill group command with category-based subcommand groups:

  /skill creative ascii-art [args]
  /skill media gif-search [args]
  /skill mlops axolotl [args]

Discord supports 25 subcommand groups × 25 subcommands = 625 max skills,
well beyond the previous 74-slot ceiling.

Categories are derived from the skill directory structure:
- skills/creative/ascii-art/ → category 'creative'
- skills/mlops/training/axolotl/ → category 'mlops' (top-level parent)
- skills/dogfood/ → uncategorized (direct subcommand)

Changes:
- hermes_cli/commands.py: add discord_skill_commands_by_category() with
  category grouping, hub/disabled filtering, Discord limit enforcement
- gateway/platforms/discord.py: replace top-level skill registration with
  _register_skill_group() using app_commands.Group hierarchy
- tests: 7 new tests covering group creation, category grouping,
  uncategorized skills, hub exclusion, deep nesting, empty skills,
  and handler dispatch

Inspired by Discord community suggestion from bottium.
2026-04-14 16:27:02 -07:00
Brooklyn Nicholson
bf54f1fb2f Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-14 18:26:05 -05:00
Teknium
353b5bacbd test: add tests for /health/detailed endpoint and gateway health probe
- TestHealthDetailedEndpoint: 3 tests for the new API server endpoint
  (returns runtime data, handles missing status, no auth required)
- TestProbeGatewayHealth: 5 tests for _probe_gateway_health()
  (URL normalization, successful/failed probes, fallback chain)
- TestStatusRemoteGateway: 4 tests for /api/status remote fallback
  (remote probe triggers, skipped when local PID found, null PID handling)
2026-04-14 15:41:30 -07:00
Teknium
99bcc2de5b
fix(security): harden dashboard API against unauthenticated access (#9800)
Addresses responsible disclosure from FuzzMind Security Lab (CVE pending).

The web dashboard API server had 36 endpoints, of which only 5 checked
the session token. The token itself was served from an unauthenticated
GET /api/auth/session-token endpoint, rendering the protection circular.
When bound to 0.0.0.0 (--host flag), all API keys, config, and cron
management were accessible to any machine on the network.

Changes:
- Add auth middleware requiring session token on ALL /api/ routes except
  a small public whitelist (status, config/defaults, config/schema,
  model/info)
- Remove GET /api/auth/session-token endpoint entirely; inject the token
  into index.html via a <script> tag at serve time instead
- Replace all inline token comparisons (!=) with hmac.compare_digest()
  to prevent timing side-channel attacks
- Block non-localhost binding by default; require --insecure flag to
  override (with warning log)
- Update frontend fetchJSON() to send Authorization header on all
  requests using the injected window.__HERMES_SESSION_TOKEN__

Credit: Callum (@0xca1x) and @migraine-sudo at FuzzMind Security Lab
2026-04-14 10:57:56 -07:00
Teknium
b867171291 fix: preserve profile name completion in dynamic shell completion
The dynamic parser walker from the contributor's commit lost the profile
name tab-completion that existed in the old static generators. This adds
it back for all three shells:

- Bash: _hermes_profiles() helper, -p/--profile completion, profile
  action→name completion (use/delete/show/alias/rename/export)
- Zsh: _hermes_profiles() function, -p/--profile argument spec, profile
  action case with name completion
- Fish: __hermes_profiles function, -s p -l profile flag, profile action
  completions

Also removes the dead fallback path in cmd_completion() that imported
the old static generators from profiles.py (parser is always available
via the lambda wiring) and adds 11 regression-prevention tests for
profile completion.
2026-04-14 10:45:42 -07:00
leozeli
a686dbdd26 feat(cli): add dynamic shell completion for bash, zsh, and fish
Replaces the hardcoded completion stubs in profiles.py with a dynamic
generator that walks the live argparse parser tree at runtime.

- New hermes_cli/completion.py: _walk() recursively extracts all
  subcommands and flags; generate_bash/zsh/fish() produce complete
  scripts with nested subcommand support
- cmd_completion now accepts the parser via closure so completions
  always reflect the actual registered commands (including plugin-
  registered ones like honcho)
- completion subcommand now accepts bash | zsh | fish (fish requested
  in issue comments)
- Fix _SUBCOMMANDS set: add honcho, claw, plugins, acp, webhook,
  memory, dump, debug, backup, import, completion, logs so that
  multi-word session names after -c/-r are not broken by these commands
- Add tests/hermes_cli/test_completion.py: 17 tests covering parser
  extraction, alias deduplication, bash/zsh/fish output content,
  bash syntax validation, fish syntax validation, and subcommand
  drift prevention

Tested on Linux (Arch). bash and fish completion verified live.
zsh script passes syntax check (zsh not installed on test machine).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-14 10:45:42 -07:00
shijianzhi
70611879de fix(cli): fix doctor checks for Kimi China credentials 2026-04-14 10:16:30 -07:00
Brooklyn Nicholson
9a3a2925ed feat: scroll aware sticky prompt 2026-04-14 11:49:32 -05:00
Teknium
7ad47ace51
fix: resolve remaining 4 CI test failures (#9543)
- test_auth_commands: suppress _seed_from_singletons auto-seeding that
  adds extra credentials from CI env (same pattern as nearby tests)
- test_interrupt: clear stale _interrupted_threads set to prevent
  thread ident reuse from prior tests in same xdist worker
- test_code_execution: add watch_patterns to _BLOCKED_TERMINAL_PARAMS
  to match production _TERMINAL_BLOCKED_PARAMS
2026-04-14 02:18:38 -07:00
Teknium
2558d28a9b
fix: resolve CI test failures — add missing functions, fix stale tests (#9483)
Production fixes:
- Add clear_session_context() to hermes_logging.py (fixes 48 teardown errors)
- Add clear_session() to tools/approval.py (fixes 9 setup errors)
- Add SyncError M_UNKNOWN_TOKEN check to Matrix _sync_loop (bug fix)
- Fall back to inline api_key in named custom providers when key_env
  is absent (runtime_provider.py)

Test fixes:
- test_memory_user_id: use builtin+external provider pair, fix honcho
  peer_name override test to match production behavior
- test_display_config: remove TestHelpers for non-existent functions
- test_auxiliary_client: fix OAuth tokens to match _is_oauth_token
  patterns, replace get_vision_auxiliary_client with resolve_vision_provider_client
- test_cli_interrupt_subagent: add missing _execution_thread_id attr
- test_compress_focus: add model/provider/api_key/base_url/api_mode
  to mock compressor
- test_auth_provider_gate: add autouse fixture to clean Anthropic env
  vars that leak from CI secrets
- test_opencode_go_in_model_list: accept both 'built-in' and 'hermes'
  source (models.dev API unavailable in CI)
- test_email: verify email Platform enum membership instead of source
  inspection (build_channel_directory now uses dynamic enum loop)
- test_feishu: add bot_added/bot_deleted handler mocks to _Builder
- test_ws_auth_retry: add AsyncMock for sync_store.get_next_batch,
  add _pending_megolm and _joined_rooms to Matrix adapter mocks
- test_restart_drain: monkeypatch-delete INVOCATION_ID (systemd sets
  this in CI, changing the restart call signature)
- test_session_hygiene: add user_id to SessionSource
- test_session_env: use relative baseline for contextvar clear check
  (pytest-xdist workers share context)
2026-04-14 01:43:45 -07:00
Teknium
6dc8f8e9c0 feat(skin): add warm-lightmode skin from PR #4811
Add a second light-mode skin option with warm brown/parchment tones,
adapted from ygd58's contribution in PR #4811. Includes completion
menu and status bar color keys for full light-terminal support.

Co-authored-by: buray <78954051+ygd58@users.noreply.github.com>
2026-04-13 23:51:21 -07:00
Liu Chongwei
bc93641c4f feat(skins): add built-in daylight skin 2026-04-13 23:51:21 -07:00
Teknium
38ad158b6b
fix: auto-correct close model name matches in /model validation (#9424)
* feat(skills): add fitness-nutrition skill to optional-skills

Cherry-picked from PR #9177 by @haileymarshall.

Adds a fitness and nutrition skill for gym-goers and health-conscious users:
- Exercise search via wger API (690+ exercises, free, no auth)
- Nutrition lookup via USDA FoodData Central (380K+ foods, DEMO_KEY fallback)
- Offline body composition calculators (BMI, TDEE, 1RM, macros, body fat %)
- Pure stdlib Python, no pip dependencies

Changes from original PR:
- Moved from skills/ to optional-skills/health/ (correct location)
- Fixed BMR formula in FORMULAS.md (removed confusing -5+10, now just +5)
- Fixed author attribution to match PR submitter
- Marked USDA_API_KEY as optional (DEMO_KEY works without signup)

Also adds optional env var support to the skill readiness checker:
- New 'optional: true' field in required_environment_variables entries
- Optional vars are preserved in metadata but don't block skill readiness
- Optional vars skip the CLI capture prompt flow
- Skills with only optional missing vars show as 'available' not 'setup_needed'

* fix: auto-correct close model name matches in /model validation

When a user types a model name with a minor typo (e.g. gpt5.3-codex instead
of gpt-5.3-codex), the validation now auto-corrects to the closest match
instead of accepting the wrong name with a warning.

Uses difflib get_close_matches with cutoff=0.9 to avoid false corrections
(e.g. gpt-5.3 should not silently become gpt-5.4). Applied consistently
across all three validation paths: codex provider, custom endpoints, and
generic API-probed providers.

The validate_requested_model() return dict gains an optional corrected_model
key that switch_model() applies before building the result.

Reported by Discord user — /model gpt5.3-codex was accepted with a warning
but would fail at the API level.

---------

Co-authored-by: haileymarshall <haileymarshall@users.noreply.github.com>
2026-04-13 23:09:39 -07:00
Teknium
62fb6b2cd8 fix: guard zero context length display + add 19 tests for model info
- ModelInfoCard: hide card when effective_context_length <= 0 instead
  of showing 'Context Window: 0 auto-detected'
- Add tests for _normalize_config_for_web model_context_length extraction
- Add tests for _denormalize_config_from_web round-trip (write back,
  remove on zero, upgrade bare string to dict, coerce string input)
- Add tests for CONFIG_SCHEMA ordering (model_context_length after model)
- Add tests for GET /api/model/info endpoint (dict config, bare string,
  empty model, capabilities, graceful error handling)
2026-04-13 22:04:35 -07:00
Gianfranco Piana
eabc0a2f66 feat(plugins): let pre_tool_call hooks block tool execution
Plugins can now return {"action": "block", "message": "reason"} from
their pre_tool_call hook to prevent a tool from executing. The error
message is returned to the model as a tool result so it can adjust.

Covers both execution paths: handle_function_call (model_tools.py) and
agent-level tools (run_agent.py _invoke_tool + sequential/concurrent).
Blocked tools skip all side effects (counter resets, checkpoints,
callbacks, read-loop tracker).

Adds skip_pre_tool_call_hook flag to avoid double-firing the hook when
run_agent.py already checked and then calls handle_function_call.

Salvaged from PR #5385 (gianfrancopiana) and PR #4610 (oredsecurity).
2026-04-13 22:01:49 -07:00
Teknium
5621fc449a
chore: rename AI Gateway → Vercel AI Gateway, move Xiaomi to #5 (#9326)
- Rename 'AI Gateway' to 'Vercel AI Gateway' across auth, models,
  doctor, setup, and tests.
- Move Xiaomi MiMo to position #5 in the provider picker.
2026-04-13 19:51:54 -07:00
Brooklyn Nicholson
1b573b7b21 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 21:17:41 -05:00
arthurbr11
0a4cf5b3e1 feat(providers): add Arcee AI as direct API provider
Adds Arcee AI as a standard direct provider (ARCEEAI_API_KEY) with
Trinity models: trinity-large-thinking, trinity-large-preview, trinity-mini.

Standard OpenAI-compatible provider checklist: auth.py, config.py,
models.py, main.py, providers.py, doctor.py, model_normalize.py,
model_metadata.py, setup.py, trajectory_compressor.py.

Based on PR #9274 by arthurbr11, simplified to a standard direct
provider without dual-endpoint OpenRouter routing.
2026-04-13 18:40:06 -07:00
Teknium
ac80bd61ad test: add regression tests for custom_providers multi-model dedup and grouping
Tests for salvaged PRs #9233 and #8011.
2026-04-13 16:41:30 -07:00
Brooklyn Nicholson
7e4dd6ea02 Merge branch 'main' of github.com:NousResearch/hermes-agent into feat/ink-refactor 2026-04-13 18:32:13 -05:00
Teknium
32cea0c08d
fix: dashboard shows Nous Portal as 'not connected' despite active auth (#9261)
The dashboard device-code flow (_nous_poller in web_server.py) saved
credentials to the credential pool only, while get_nous_auth_status()
only checked the auth store (auth.json). This caused the Keys tab to
show 'not connected' even when the backend was fully authenticated.

Two fixes:
1. get_nous_auth_status() now checks the credential pool first (like
   get_codex_auth_status() already does), then falls back to the auth
   store.
2. _nous_poller now also persists to the auth store after saving to
   the credential pool, matching the CLI flow (_login_nous).

Adds 3 tests covering pool-only, auth-store-fallback, and empty-state
scenarios.
2026-04-13 16:32:11 -07:00
Teknium
8d023e43ed
refactor: remove dead code — 1,784 lines across 77 files (#9180)
Deep scan with vulture, pyflakes, and manual cross-referencing identified:
- 41 dead functions/methods (zero callers in production)
- 7 production-dead functions (only test callers, tests deleted)
- 5 dead constants/variables
- ~35 unused imports across agent/, hermes_cli/, tools/, gateway/

Categories of dead code removed:
- Refactoring leftovers: _set_default_model, _setup_copilot_reasoning_selection,
  rebuild_lookups, clear_session_context, get_logs_dir, clear_session
- Unused API surface: search_models_dev, get_pricing, skills_categories,
  get_read_files_summary, clear_read_tracker, menu_labels, get_spinner_list
- Dead compatibility wrappers: schedule_cronjob, list_cronjobs, remove_cronjob
- Stale debug helpers: get_debug_session_info copies in 4 tool files
  (centralized version in debug_helpers.py already exists)
- Dead gateway methods: send_emote, send_notice (matrix), send_reaction
  (bluebubbles), _normalize_inbound_text (feishu), fetch_room_history
  (matrix), _start_typing_indicator (signal), parse_feishu_post_content
- Dead constants: NOUS_API_BASE_URL, SKILLS_TOOL_DESCRIPTION,
  FILE_TOOLS, VALID_ASPECT_RATIOS, MEMORY_DIR
- Unused UI code: _interactive_provider_selection,
  _interactive_model_selection (superseded by prompt_toolkit picker)

Test suite verified: 609 tests covering affected files all pass.
Tests for removed functions deleted. Tests using removed utilities
(clear_read_tracker, MEMORY_DIR) updated to use internal APIs directly.
2026-04-13 16:32:04 -07:00
Brooklyn Nicholson
aeb53131f3 fix(ui-tui): harden TUI error handling, model validation, command UX parity, and gateway lifecycle 2026-04-13 18:29:24 -05:00
Brooklyn Nicholson
ebe3270430 fix: fake models 2026-04-13 14:57:42 -05:00